Multi-type outer product-based fusion of respiratory sounds for detecting COVID-19
This work presents an outer product-based approach to fuse the embedded representations learnt from the spectrograms of cough, breath, and speech samples for the automatic detection of COVID-19. To extract deep learnt representations from the spectrograms, we compare the performance of specific Convolutional Neural Networks (CNNs) trained from scratch and ResNet18-based CNNs fine-tuned for the task at hand. Furthermore, we investigate whether the patients' sex and the use of contextual attention mechanisms are beneficial. Our experiments use the dataset released as part of the Second Diagnosing COVID-19 using Acoustics (DiCOVA) Challenge. The results suggest the suitability of fusing breath and speech information to detect COVID-19. An Area Under the Curve (AUC) of 84.06 % is obtained on the test partition when using specific CNNs trained from scratch with contextual attention mechanisms. When using ResNet18-based CNNs for feature extraction, the baseline model scores the highest performance with an AUC of 84.26 %.
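As a rough illustration of the fusion idea named in the abstract, the sketch below computes the outer product of two modality embeddings (e.g., breath and speech) and flattens it into a joint feature vector for a downstream classifier. The function name, embedding dimensionalities, and the PyTorch formulation are assumptions for illustration only; the paper's exact architecture (including the cough modality and contextual attention) is not reproduced here.

```python
import torch

def outer_product_fusion(e_breath: torch.Tensor, e_speech: torch.Tensor) -> torch.Tensor:
    """Fuse two modality embeddings via their outer product.

    e_breath: (batch, d1) embedding from a breath-spectrogram CNN
    e_speech: (batch, d2) embedding from a speech-spectrogram CNN
    Returns a flattened (batch, d1 * d2) joint representation.
    """
    # Batched outer product: (batch, d1, 1) x (batch, 1, d2) -> (batch, d1, d2)
    joint = torch.bmm(e_breath.unsqueeze(2), e_speech.unsqueeze(1))
    return joint.flatten(start_dim=1)

# Toy usage with hypothetical 128-dimensional embeddings
breath = torch.randn(4, 128)
speech = torch.randn(4, 128)
fused = outer_product_fusion(breath, speech)  # shape: (4, 16384)
# 'fused' would then feed a classification head predicting COVID-19 positivity.
```

Unlike plain concatenation, the outer product exposes every pairwise interaction between the two embeddings, at the cost of a quadratically larger joint representation.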
Author: | Adria Mallol-Ragolta, Helena Cuesta, Emilia Gomez, Björn Schuller |
---|---|
URN: | urn:nbn:de:bvb:384-opus4-992930 |
Frontdoor URL: | https://opus.bibliothek.uni-augsburg.de/opus4/99293 |
Parent Title (English): | Interspeech 2022, Incheon, Korea, 18-22 September 2022 |
Publisher: | ISCA |
Place of publication: | Baixas |
Editor: | Hanseok Ko, John H. L. Hansen |
Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2022 |
Publishing Institution: | Universität Augsburg |
Release Date: | 2022/11/15 |
First Page: | 2163 |
Last Page: | 2167 |
DOI: | https://doi.org/10.21437/interspeech.2022-10291 |
Institutes: | Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Embedded Intelligence for Health Care and Wellbeing |
Dewey Decimal Classification: | 0 Computer science, information & general works / 00 Computer science, knowledge & systems / 004 Data processing; computer science |
Licence (German): | Deutsches Urheberrecht (German copyright law) |