Miscellaneous: |
- Indexed in: USPTO Patent Grants
- Languages: English
- Patent Number: 11,848,001
- Publication Date: December 19, 2023
- Appl. No: 17/848,028
- Application Filed: June 23, 2022
- Assignees: Intel Corporation (Santa Clara, CA, US)
- Claim: 1. A system comprising: at least one memory; machine readable instructions; and processor circuitry to at least one of instantiate or execute the machine readable instructions to: generate a breathing cue to enhance speech to be synthesized from text; determine a first insertion point of the breathing cue in the text, wherein the breathing cue is identified by a first tag of a markup language; generate a prosody cue to enhance speech to be synthesized from the text; determine a second insertion point of the prosody cue in the text, wherein the prosody cue is identified by a second tag of the markup language; insert the breathing cue at the first insertion point based on the first tag and the prosody cue at the second insertion point based on the second tag; and trigger a synthesis of the speech from the text, the breathing cue, and the prosody cue.
- Claim: 2. The system of claim 1, wherein the breathing cue is to fill a pause.
- Claim: 3. The system of claim 1, wherein the prosody cue is to fill a pause.
- Claim: 4. The system of claim 1, wherein the processor circuitry is to: generate a phrasal stress cue to enhance the speech to be synthesized from the text; and determine a third insertion point of the phrasal stress cue in the text.
- Claim: 5. The system of claim 4, wherein the processor circuitry is to trigger the synthesis of the speech from the phrasal stress cue.
- Claim: 6. The system of claim 1, wherein the processor circuitry is to: generate an intonation cue to enhance the speech to be synthesized from the text; and determine a third insertion point of the intonation cue in the text.
- Claim: 7. The system of claim 6, wherein the processor circuitry is to trigger the synthesis of the speech from the intonation cue.
- Claim: 8. The system of claim 1, wherein the processor circuitry is to: generate a disfluency cue to enhance the speech to be synthesized from the text; and determine a third insertion point of the disfluency cue in the text.
- Claim: 9. The system of claim 8, wherein the processor circuitry is to trigger the synthesis of the speech from the disfluency cue.
- Claim: 10. The system of claim 1, wherein the processor circuitry is to: identify an intent of a user; and identify the text based on the intent.
- Claim: 11. The system of claim 1, wherein the processor circuitry is to: identify an intent of a user; and execute a command based on the intent.
- Claim: 12. At least one storage device comprising computer readable instructions that, when executed, cause at least one machine to at least: generate a breathing cue to enhance speech to be synthesized from text, the breathing cue identified by a first tag of a markup language; generate a prosody cue to enhance speech to be synthesized from the text, the prosody cue identified by a second tag of the markup language; insert the breathing cue at a first insertion point in the text based on the first tag; insert the prosody cue at a second insertion point in the text based on the second tag; and transmit data to cause a device to synthesize the speech from the text, the breathing cue, and the prosody cue.
- Claim: 13. The at least one storage device of claim 12, wherein the breathing cue is to fill a pause.
- Claim: 14. The at least one storage device of claim 12, wherein the prosody cue is to fill a pause.
- Claim: 15. The at least one storage device of claim 12, wherein the instructions cause the at least one machine to: generate a phrasal stress cue to enhance the speech to be synthesized from the text; and insert the phrasal stress cue at a third insertion point in the text.
- Claim: 16. The at least one storage device of claim 12, wherein the instructions cause the at least one machine to: generate an intonation cue to enhance the speech to be synthesized from the text; and insert the intonation cue at a third insertion point in the text.
- Claim: 17. The at least one storage device of claim 12, wherein the instructions cause the at least one machine to: generate a disfluency cue to enhance the speech to be synthesized from the text; and insert the disfluency cue at a third insertion point in the text.
- Claim: 18. The at least one storage device of claim 12, wherein the instructions cause the at least one machine to: identify an intent of a user; and identify the text based on the intent.
- Claim: 19. The at least one storage device of claim 12, wherein the instructions cause the at least one machine to: identify an intent of a user; and execute a command based on the intent.
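The claims above describe generating cues identified by tags of a markup language, inserting each cue at an insertion point in the text, and then triggering synthesis from the tagged text. A minimal sketch of that flow, under stated assumptions: the `<prosody>` element is standard W3C SSML, while the `<breath/>` tag, the helper function, and the insertion-point choices are hypothetical illustrations, not taken from the patent.

```python
# Sketch only (not the patented implementation): insert markup-language cue
# tags into text before speech synthesis, in the spirit of claims 1 and 12.
# <prosody> follows W3C SSML; <breath/> is a hypothetical breathing-cue tag.

def insert_cue(text: str, position: int, tag: str) -> str:
    """Insert a cue tag at a character offset (an 'insertion point')."""
    return text[:position] + tag + text[position:]

text = "Well that was unexpected. Let me think about it."

# First insertion point: a breathing cue right after the first sentence.
breath_point = text.index(".") + 1
text = insert_cue(text, breath_point, " <breath/>")

# Second insertion point: a prosody cue wrapping the final clause.
prosody_point = text.index("Let")
text = insert_cue(text, prosody_point, '<prosody rate="slow">')
text += "</prosody>"

# The tagged text would then be handed to a synthesizer to trigger synthesis.
ssml = f"<speak>{text}</speak>"
print(ssml)
```

Running this prints the tagged text with both cues in place; a real system would pass the result to a speech synthesizer rather than print it.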
- Patent References Cited: 6226614 May 2001 Mizuno et al. ; 6236966 May 2001 Fleming ; 6282599 August 2001 Gallick et al. ; 7617188 November 2009 Hu et al. ; 7685140 March 2010 Jackson ; 7689617 March 2010 Parikh ; 8935151 January 2015 Petrov et al. ; 8972259 March 2015 Tepperman et al. ; 9223537 December 2015 Brown et al. ; 9223547 December 2015 Endresen et al. ; 9305544 April 2016 Petrov et al. ; 9524650 December 2016 Yavari ; 9542929 January 2017 Christian ; 9721573 August 2017 Fritsch ; 9767788 September 2017 Li ; 10026393 July 2018 Christian ; 10445668 October 2019 Oehrle ; 10679606 June 2020 Christian ; 11398217 July 2022 Christian ; 11404043 August 2022 Christian ; 20060217966 September 2006 Hu et al. ; 20070094030 April 2007 Xu ; 20120065977 March 2012 Tepperman et al. ; 20120084248 April 2012 Gavrilescu ; 20130006952 January 2013 Wong et al. ; 20130282688 October 2013 Wong et al. ; 20130289998 October 2013 Eller et al. ; 20150371626 December 2015 Li ; 20170256252 September 2017 Christian et al. ; 20180227417 August 2018 Segalis et al. ; 20190115007 April 2019 Christian et al. ; 20200243064 July 2020 Christian et al. ; 20200243065 July 2020 Christian et al. ; 1208910 February 1999 ; 1602483 March 2005 ; 1604183 April 2005 ; 1945693 April 2007 ; 101000764 July 2007 ; 101000765 July 2007 ; 101504643 August 2009 ; 102368256 March 2012 ; 103366731 October 2013 ; 103620605 March 2014 ; 104021784 September 2014 ; 1363200 November 2003
- Other References: Allen, “Linguistic Aspects of Speech Synthesis,” Proceedings of the National Academy of Sciences, vol. 92, Colloquium Paper, Oct. 1995, pp. 9946-9952. cited by applicant ; Sproat, “Multilingual Text Analysis for Text-to-Speech Synthesis,” IEEE, vol. 3, Oct. 3, 1996, pp. 1365-1368. cited by applicant ; United States Patent and Trademark Office, “Non-final Office Action,” issued in connection with U.S. Appl. No. 14/497,994, dated May 5, 2016, 5 pages. cited by applicant ; United States Patent and Trademark Office, “Notice of Allowance,” issued in connection with U.S. Appl. No. 14/497,994, dated Sep. 19, 2016, 7 pages. cited by applicant ; Arnold et al., “Disfluencies Signal Theee, Um, New Information,” Journal of Psycholinguistic Research, vol. 32, No. 1, Jan. 2003, pp. 25-36. cited by applicant ; International Searching Authority, “International Search Report and Written Opinion,” issued in connection with International Patent Application No. PCT/US2015/047534, dated Oct. 30, 2015, 11 pages. cited by applicant ; Shriver et al., “Audio Signals in Speech Interfaces,” Language Technologies Institute, Carnegie Mellon University, 2000, 7 pages. cited by applicant ; Tang et al., “Humanoid Audio-Visual Avatar with Emotive Text-to-Speech Synthesis,” IEEE Transactions on Multimedia, vol. 10, No. 6, Oct. 6, 2008, pp. 969-981. cited by applicant ; United States Patent and Trademark Office, “Non-final Office Action,” issued in connection with U.S. Appl. No. 15/384,148, dated Oct. 18, 2017, 7 pages. cited by applicant ; United States Patent and Trademark Office, “Notice of Allowance,” issued in connection with U.S. Appl. No. 15/384,148, dated Mar. 27, 2018, 5 pages. cited by applicant ; European Patent Office, “Extended European Search Report,” issued in connection with European Patent application No. 15844926.4, dated Apr. 30, 2018, 7 pages. 
cited by applicant ; United States Patent and Trademark Office, “Non-Final Office action,” issued in connection with U.S. Appl. No. 16/037,872, dated Aug. 12, 2019, 13 pages. cited by applicant ; United States Patent and Trademark Office, “Notice of Allowance,” issued in connection with U.S. Appl. No. 16/037,872, dated Feb. 3, 2020, 12 pages. cited by applicant ; Patent Cooperation Treaty, “International Preliminary Report on Patentability,” dated Mar. 28, 2017, issued in International Application No. PCT/US2015/047534, 7 pages. cited by applicant ; United States Patent and Trademark Office, “Non-Final Office Action,” dated Nov. 3, 2021, issued in related U.S. Appl. No. 16/851,457, 12 pages. cited by applicant ; United States Patent and Trademark Office, “Notice of Allowance,” dated Mar. 3, 2022, issued in related U.S. Appl. No. 16/851,457, 8 pages. cited by applicant ; Nick Campbell, “Specifying Affect and Emotion for Expressive Speech Synthesis,” Lecture Notes in Computer Science, 2004, pp. 395-406, vol. 2945, ATR Human Information Science Laboratories, Kyoto, Japan. cited by applicant ; European Patent Office, “Examination Report,” issued in connection with Patent Application No. 15 844 926.4-1231, dated Apr. 5, 2021, 9 pages. cited by applicant ; Jonathan Allen, “Linguistic aspects of speech synthesis,” Human-Machine Communication by Voice, Feb. 8-9, 1993, 7 pages, Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, United States of America. cited by applicant ; Shiva Sundaram and Shrikanth Narayanan, “An Empirical Text Transformation Method for Spontaneous Speech Synthesizers,” Eurospeech, 2003, 4 pages, Department of Electrical Engineering-Systems and Integrated Media Systems Center, University of Southern California, Los Angeles, CA, United States of America. cited by applicant ; Chinese Patent Office, “Notification to Grant Patent Right for Invention,” issued in connection with Chinese patent application No. 
201580045620.X, dated Dec. 25, 2020, 7 pages. cited by applicant ; Wata et al., “Speaker's intentions conveyed to listeners by sentence-final particles and their intonations in Japanese conversational speech,” retrieved from https://waseda.pure.elsevier.com/en/publications/speakers-intentions-conveyed-to-listeners-by-sentence-final-parti, on Feb. 24, 2021, 4 pages. Abstract only. cited by applicant ; United States Patent and Trademark Office, “Notice of Allowance,” issued in connection with U.S. Appl. No. 16/851,457, dated Jun. 2, 2022, 5 pages. cited by applicant ; Chinese Patent Office, “Second Office Action and Search Report,” issued in connection with Chinese patent application No. 201580045620.X, dated Sep. 10, 2020, 8 pages. cited by applicant ; Chinese Patent Office, “First Office Action,” issued in connection with Chinese patent application No. 201580045620.X, dated Mar. 26, 2020, 39 pages. cited by applicant ; United States Patent and Trademark Office, “Notice of Allowability,” issued in connection with U.S. Appl. No. 16/851,444, dated Jun. 10, 2022, 6 pages. cited by applicant ; United States Patent and Trademark Office, “Notice of Allowance,” issued in connection with U.S. Appl. No. 16/851,444, dated May 18, 2022, 5 pages. cited by applicant ; United States Patent and Trademark Office, “Notice of Allowance,” issued in connection with U.S. Appl. No. 16/851,444, dated Mar. 2, 2022, 8 pages. cited by applicant ; United States Patent and Trademark Office, “Non-Final Office action,” issued in connection with U.S. Appl. No. 16/851,444, dated Mar. 2, 2022, 15 pages. cited by applicant ; United States Patent and Trademark Office, “Restriction Election,” issued in connection with U.S. Appl. No. 16/851,444, dated Jul. 21, 2022, 6 pages. cited by applicant ; United States Patent and Trademark Office, “Restriction Election,” issued in connection with U.S. Appl. No. 16/851,457, dated Jul. 30, 2022, 6 pages. 
cited by applicant ; European Patent Office, “Communication pursuant to Article 94(3) EPC,” dated Apr. 5, 2021, issued in connection with European application No. 15844926.4, 8 pages. cited by applicant
- Primary Examiner: McFadden, Susan I
- Attorney, Agent or Firm: HANLEY, FLIGHT & ZIMMERMAN, LLC