
Open Access


Refine

Has Fulltext

  • yes (1)

Author

  • Eyben, Florian (1)
  • Feiner, Marlies (1)
  • Gerstenberger, Claus (1)
  • Gugatschka, Markus (1)
  • Hagmüller, Martin (1)
  • Haspl, Katja (1)
  • Kubin, Gernot (1)
  • Linke, Julian (1)
  • Lohrmann, Simon (1)
  • Pokorny, Florian B. (1)

Year of publication

  • 2023 (1)

Document Type

  • Article (1)

Language

  • English (1)

Keywords

  • Biomedical Engineering (1)
  • Health Informatics (1)
  • Signal Processing (1)

Institute

  • Fakultät für Angewandte Informatik (1)
  • Fakultätsübergreifende Institute und Einrichtungen (1)
  • Institut für Informatik (1)
  • Lehrstuhl für Embedded Intelligence for Health Care and Wellbeing (1)
  • Zentrum für Interdisziplinäre Gesundheitsforschung (ZIG) (1)

1 search hit

VocDoc, what happened to my voice? Towards automatically capturing vocal fatigue in the wild (2023)
Pokorny, Florian B. ; Linke, Julian ; Seddiki, Nico ; Lohrmann, Simon ; Gerstenberger, Claus ; Haspl, Katja ; Feiner, Marlies ; Eyben, Florian ; Hagmüller, Martin ; Schuppler, Barbara ; Kubin, Gernot ; Gugatschka, Markus
Objective: Voice problems that arise during everyday vocal use can hardly be captured by standard outpatient voice assessments. In preparation for a digital health application – the VocDoc – that automatically assesses longitudinal voice data ‘in the wild’, the aim of this paper was to study vocal fatigue from the speaker’s perspective, the healthcare professional’s perspective, and the ‘machine’s’ perspective.

Methods: We collected data from four vocally healthy speakers completing a 90-min reading task. Every 10 min, the speakers were asked about subjective voice characteristics. We then elaborated on the task of elapsed speaking time recognition: we carried out listening experiments with speech and language therapists and employed random forests on the basis of extracted acoustic features. We validated our models speaker-dependently and speaker-independently and analysed the underlying feature importances. For an additional, clinical application-oriented scenario, we extended our dataset with lecture recordings of two further speakers.

Results: Self- and expert assessments were not consistent. With mean F1 scores of up to 0.78, automatic elapsed speaking time recognition worked reliably only in the speaker-dependent scenario. A small set of acoustic features – other than the features previously reported to reflect vocal fatigue – was found to universally describe long-term variations of the voice.

Conclusion: Vocal fatigue seems to have individual effects across different speakers. Machine learning has the potential to automatically detect and characterise vocal changes over time.

Significance: Our study provides the technical underpinnings for a future mobile solution to objectively capture pathological long-term voice variations in everyday life settings and make them clinically accessible.
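The evaluation setup described in the abstract – random forests trained on acoustic features and validated speaker-independently – can be sketched as follows. This is an illustrative sketch only, not the authors' code: the data is synthetic, and the feature count and scoring choice are assumptions. The key idea is leave-one-speaker-out cross-validation, where each fold holds out all segments of one speaker so the model is tested on a voice it has never seen.

```python
# Sketch (not the authors' code): elapsed speaking time recognition framed
# as multiclass classification with a random forest, validated
# speaker-independently via leave-one-speaker-out cross-validation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in data: 4 speakers x 9 segments (one per 10-min mark of
# a 90-min reading task), with 20 hypothetical acoustic features each.
n_speakers, n_segments, n_features = 4, 9, 20
X = rng.normal(size=(n_speakers * n_segments, n_features))
y = np.tile(np.arange(n_segments), n_speakers)         # elapsed-time class 0..8
groups = np.repeat(np.arange(n_speakers), n_segments)  # speaker identity

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Speaker-independent validation: each fold holds out one entire speaker.
scores = cross_val_score(clf, X, y, groups=groups,
                         cv=LeaveOneGroupOut(), scoring="f1_macro")
print(scores)  # one macro-F1 score per held-out speaker

# Feature importances (which acoustic features drive the predictions)
# are available after fitting, as analysed in the paper.
clf.fit(X, y)
print(clf.feature_importances_)
```

With random synthetic features the F1 scores stay near chance; the point is the validation structure, in which speaker-dependent evaluation would instead use splits within each speaker's own recordings.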
