Voice command generation using progressive WaveGANs

Generative Adversarial Networks (GANs) have become exceedingly popular in a wide range of data-driven research fields, due in part to their success in image generation. Their ability to generate new samples, often from only a small amount of input data, makes them an exciting research tool in areas with limited data resources. One less-explored application of GANs is the synthesis of speech and audio samples. Herein, we propose a set of extensions to the WaveGAN paradigm, a recently proposed approach for sound generation using GANs. The aim of these extensions - preprocessing, Audio-to-Audio generation, skip connections and progressive structures - is to improve the human likeness of synthetic speech samples. Scores from listening tests with 30 volunteers demonstrated a moderate improvement (Cohen's d coefficient of 0.65) in human likeness using the proposed extensions compared to the original WaveGAN approach.
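The effect size quoted above can be illustrated with a minimal sketch of Cohen's d. It assumes the two sets of listening scores are treated as independent samples and standardised by the pooled standard deviation; the abstract does not say which variant the authors used. The function name `cohens_d` and the ratings below are hypothetical and invented for illustration only; they are not data from the study.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical human-likeness ratings (NOT from the paper).
extended = [4, 5, 3, 4, 4]   # assumed scores for the extended model
baseline = [3, 4, 3, 3, 2]   # assumed scores for the original WaveGAN
d = cohens_d(extended, baseline)
print(f"Cohen's d = {d:.2f}")
```

By Cohen's conventional benchmarks (roughly 0.2 small, 0.5 medium, 0.8 large), a value of 0.65 lies between medium and large, which is consistent with the "moderate improvement" wording.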

Metadata
Author:	Nicholas Cummins, T. Wiest, Alice Baird, Simone Hantke, J. Dineley, Björn Schuller
URN:	urn:nbn:de:bvb:384-opus4-668747
Frontdoor URL:	https://opus.bibliothek.uni-augsburg.de/opus4/66874
Parent Title (English):	arXiv
Type:	Preprint
Language:	English
Date of Publication (online):	2019/12/09
Year of first Publication:	2019
Publishing Institution:	Universität Augsburg
Release Date:	2019/12/11
First Page:	arXiv: 1903.07395
DOI:	https://doi.org/10.48550/arXiv.1903.07395
Institutes:	Fakultät für Angewandte Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Embedded Intelligence for Health Care and Wellbeing
Dewey Decimal Classification:	0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Licence (German):	Deutsches Urheberrecht (German copyright law)

Open Access