Emotion and themes recognition in music with convolutional and recurrent attention-blocks
- Emotion is an essential aspect of music, and its recognition is a prevalent research topic in the field of computer audition. Machine learning-based Music Emotion Recognition (MER) systems could boost the accessibility of music collections by providing standardised methodologies of music categorisation. In this paper, we introduce our (team name: AugsBurger) machine learning architecture sequentially composed of a convolutional feature extractor with block attention modules and a recurrent stack with self-attention for automatic MER. We train 5 models and conduct various late fusion experiments. Utilising a Convolutional Recurrent Neural Network (CRNN) with convolutional block attention applied throughout an 18-layer ResNet and a single recurrent layer with a Gated Recurrent Unit cell, a ROC-AUC of 73.9 % can be achieved on the test partition of the MediaEval 2020 Emotion & Themes in Music task. Applying late fusion on the individual model predictions and another challenge submission, this result is further increased to 75.3 % ROC-AUC.
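
The abstract does not say which late fusion scheme was used. As an illustration only, the sketch below assumes one common variant: a weighted average of the per-tag scores produced by each model. The function name `late_fusion`, the nested-list layout, and the weights are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence


def late_fusion(
    predictions: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[List[float]]:
    """Fuse multi-label scores from several models by weighted averaging.

    predictions[m][i][t] is the score of model m for track i and tag t.
    This is an assumed fusion rule, not necessarily the one in the paper.
    """
    n_models = len(predictions)
    if n_models == 0:
        raise ValueError("at least one model prediction is required")
    if weights is None:
        weights = [1.0] * n_models  # equal weighting by default
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_tracks = len(predictions[0])
    n_tags = len(predictions[0][0]) if n_tracks else 0
    fused = [[0.0] * n_tags for _ in range(n_tracks)]
    for weight, model_scores in zip(weights, predictions):
        for i in range(n_tracks):
            for t in range(n_tags):
                # Normalise weights so the fused scores stay in the input range.
                fused[i][t] += (weight / total) * model_scores[i][t]
    return fused
```

Called with the score matrices of the individual models (and, if available, those of another submission), the function returns one fused score per track and tag, which can then be evaluated with ROC-AUC in the same way as a single model's output.
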
Author: | Maurice Gerczuk, Shahin Amiriparian, Sandra Ottl, Srividya Tirunellai Rajamani, Björn W. Schuller |
---|---|
URN: | urn:nbn:de:bvb:384-opus4-1169655 |
Frontdoor URL: | https://opus.bibliothek.uni-augsburg.de/opus4/116965 |
URL: | https://nbn-resolving.org/urn:nbn:de:0074-2882-4 |
Parent Title (English): | MediaEval 2020 - Multimedia Benchmark Workshop 2020: Workshop Notes, Proceedings of the MediaEval 2020 Workshop, Online, 14-15 December 2020 |
Publisher: | RWTH Aachen University |
Place of publication: | Aachen |
Editor: | Steven Hicks, Debesh Jha, Konstantin Pogorelov, Alba García Seco De Herrera, Dmitry Bogdanov, Pierre-Etienne Martin, Stelios Andreadis, Minh-Son Dao, Zhuoran Liu, José Vargas-Quirós, Benjamin Kille, Martha Larson |
Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2020 |
Publishing Institution: | Universität Augsburg |
Release Date: | 2024/11/25 |
First Page: | 60 |
Series: | CEUR Workshop Proceedings ; 2882 |
Institutes: | Fakultät für Angewandte Informatik; Fakultät für Angewandte Informatik / Institut für Informatik; Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Embedded Intelligence for Health Care and Wellbeing; Sustainability Goals; Sustainability Goals / Goal 3 - Good Health and Well-being |
Dewey Decimal Classification: | 6 Technology, Medicine, Applied Sciences / 61 Medicine and Health / 610 Medicine and Health |