An overview of affective speech synthesis and conversion in the deep learning era

  • Speech is the fundamental mode of human communication, and its synthesis has long been a core priority in human–computer interaction research. In recent years, machines have managed to master the art of generating speech that is understandable by humans. However, the linguistic content of an utterance encompasses only a part of its meaning. Affect, or expressivity, has the capacity to turn speech into a medium capable of conveying intimate thoughts, feelings, and emotions—aspects that are essential for engaging and naturalistic interpersonal communication. While the goal of imparting expressivity to synthesized utterances has so far remained elusive, following recent advances in text-to-speech synthesis, a paradigm shift is well under way in the fields of affective speech synthesis and conversion as well. Deep learning, as the technology that underlies most of the recent advances in artificial intelligence, is spearheading these efforts. In this overview, we outline ongoing trends andSpeech is the fundamental mode of human communication, and its synthesis has long been a core priority in human–computer interaction research. In recent years, machines have managed to master the art of generating speech that is understandable by humans. However, the linguistic content of an utterance encompasses only a part of its meaning. Affect, or expressivity, has the capacity to turn speech into a medium capable of conveying intimate thoughts, feelings, and emotions—aspects that are essential for engaging and naturalistic interpersonal communication. While the goal of imparting expressivity to synthesized utterances has so far remained elusive, following recent advances in text-to-speech synthesis, a paradigm shift is well under way in the fields of affective speech synthesis and conversion as well. Deep learning, as the technology that underlies most of the recent advances in artificial intelligence, is spearheading these efforts. In this overview, we outline ongoing trends and summarize state-of-the-art approaches in an attempt to provide a broad overview of this exciting field.show moreshow less

Download full text files

  • 104403.pdfeng
    (1617KB)

    Postprint. © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Export metadata

Statistics

Number of document requests

Additional Services

Share in Twitter Search Google Scholar
Metadaten
Author:Andreas TriantafyllopoulosORCiD, Bjorn W. SchullerORCiDGND, Gokce Iymen, Metin Sezgin, Xiangheng He, Zijiang YangORCiD, Panagiotis Tzirakis, Shuo LiuORCiD, Silvan MertesORCiDGND, Elisabeth AndréORCiDGND, Ruibo Fu, Jianhua Tao
URN:urn:nbn:de:bvb:384-opus4-1044036
Frontdoor URLhttps://opus.bibliothek.uni-augsburg.de/opus4/104403
ISSN:0018-9219OPAC
ISSN:1558-2256OPAC
Parent Title (English):Proceedings of the IEEE
Publisher:IEEE
Place of publication:Piscataway, NJ
Type:Article
Language:English
Year of first Publication:2023
Publishing Institution:Universität Augsburg
Release Date:2023/05/16
Tag:Electrical and Electronic Engineering
Volume:111
Issue:10
First Page:1355
Last Page:1381
DOI:https://doi.org/10.1109/jproc.2023.3250266
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Menschzentrierte Künstliche Intelligenz
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Embedded Intelligence for Health Care and Wellbeing
Dewey Decimal Classification:0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 004 Datenverarbeitung; Informatik
Licence (German):Deutsches Urheberrecht