Neural Text-to-Speech Synthesis
Book by Xu Tan
Language: English

160,49 €*

incl. VAT

Free shipping via Post / DHL

Currently unavailable

Description
Text-to-speech (TTS) aims to synthesize intelligible and natural speech from a given text. It is a hot topic in language, speech, and machine learning research and has broad applications in industry. This book introduces neural network-based TTS in the era of deep learning, aiming to provide a good understanding of neural TTS, current research and applications, and future research trends.

This book first introduces the history of TTS technologies and gives an overview of neural TTS, and provides preliminary knowledge on language and speech processing, neural networks and deep learning, and deep generative models. It then introduces neural TTS from the perspective of key components (text analyses, acoustic models, vocoders, and end-to-end models) and advanced topics (expressive and controllable, robust, model-efficient, and data-efficient TTS). It also points out some future research directions and collects some resources related to TTS.

This book is the first to introduce neural TTS in a comprehensive and easy-to-understand way and can serve both academic researchers and industry practitioners working on TTS.
About the Author

Xu Tan is a Principal Researcher and Research Manager at Microsoft Research Asia. His research interests cover deep learning and its applications in language/speech/music processing and digital human creation. He has extensive research experience in text-to-speech synthesis. He has developed high-quality TTS systems such as FastSpeech 1/2 (widely used in the TTS community), DelightfulTTS (which won the Blizzard TTS Challenge), and NaturalSpeech (which achieved human-level quality on the TTS benchmark dataset), and has transferred many research results into Microsoft Azure TTS services. He has given a series of tutorials on TTS at top conferences such as IJCAI, ICASSP, and INTERSPEECH, and has written a comprehensive survey paper on TTS.

Besides speech synthesis, he has designed several popular language models (e.g., MASS) and AI music systems (e.g., Muzic), and has developed machine translation systems that achieved human parity in Chinese-English translation and took first place several times in WMT machine translation competitions. He has published over 100 papers at prestigious conferences and in journals such as ICML, NeurIPS, ICLR, AAAI, IJCAI, ACL, EMNLP, NAACL, ICASSP, INTERSPEECH, KDD, and IEEE/ACM Transactions, and has served as an area chair or action editor for several AI conferences and journals (e.g., NeurIPS, AAAI, ICASSP, TMLR).

Summary

The first book to comprehensively introduce neural text-to-speech synthesis

Illustrates the complete process of text-to-speech synthesis technology

Equips readers to implement text-to-speech synthesis, whether for research or for products

Table of Contents
Chapter 1. Introduction
Part 1. Preliminary
Chapter 2. Basics of Spoken Language Processing
Chapter 3. Basics of Deep Learning
Part 2. Key Components in TTS
Chapter 4. Text Analyses
Chapter 5. Acoustic Models
Chapter 6. Vocoders
Chapter 7. Fully End-to-End TTS
Part 3. Advanced Topics in TTS
Chapter 8. Expressive and Controllable TTS
Chapter 9. Robust TTS
Chapter 10. Model-Efficient TTS
Chapter 11. Data-Efficient TTS
Chapter 12. Beyond Text-to-Speech Synthesis
Part 4. Summary and Outlook
Chapter 13. Summary and Outlook
Details
Year of publication: 2023
Genre: Computer Science
Category: Science & Technology
Medium: Book
Series: Artificial Intelligence: Foundations, Theory, and Algorithms
Contents: xxv, 201 pages, 24 color illustrations
ISBN-13: 9789819908264
ISBN-10: 9819908264
Language: English
Features / Supplement: Hardcover, rounded spine, laminated
Binding: Hardcover
Author: Tan, Xu
Manufacturer: Springer Singapore; Springer Nature Singapore
Dimensions: 241 x 160 x 18 mm
By/With: Xu Tan
Publication date: 30 May 2023
Weight: 0.512 kg
Item ID: 126519992