Emotion and themes recognition in music with convolutional and recurrent attention-blocks

Emotion is an essential aspect of music, and its recognition is a prevalent research topic in the field of computer audition. Machine learning-based Music Emotion Recognition (MER) systems could boost the accessibility of music collections by providing standardised methodologies of music categorisation. In this paper, we introduce our (team name: AugsBurger) machine learning architecture for automatic MER, composed sequentially of a convolutional feature extractor with block attention modules and a recurrent stack with self-attention. We train 5 models and conduct various late fusion experiments. Utilising a Convolutional Recurrent Neural Network (CRNN) with convolutional block attention applied throughout an 18-layer ResNet and a single recurrent layer with a Gated Recurrent Unit cell, a ROC-AUC of 73.9 % can be achieved on the test partition of the MediaEval 2020 Emotion & Themes in Music task. Applying late fusion on the individual model predictions and another challenge submission, this result is further increased to 75.3 % ROC-AUC.
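
The late fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule was used, so the unweighted mean of per-tag probabilities below is an assumption, as is the macro-averaging of ROC-AUC over tags. All function names and the toy data are hypothetical and serve illustration only; they are not the authors' implementation or results.

```python
from statistics import mean


def late_fusion(model_probs):
    """Fuse predictions by averaging per-tag probabilities across models.

    model_probs: one entry per model, each a list of tracks, each track a
    list of per-tag probabilities. Unweighted averaging is an assumption.
    """
    n_tracks = len(model_probs[0])
    n_tags = len(model_probs[0][0])
    return [
        [mean(m[i][j] for m in model_probs) for j in range(n_tags)]
        for i in range(n_tracks)
    ]


def roc_auc(labels, scores):
    """Binary ROC-AUC as the fraction of (positive, negative) pairs that are
    ranked correctly; ties count as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_roc_auc(label_matrix, score_matrix):
    """Average the per-tag ROC-AUC over all tags (macro average)."""
    n_tags = len(label_matrix[0])
    return mean(
        roc_auc([row[j] for row in label_matrix],
                [row[j] for row in score_matrix])
        for j in range(n_tags)
    )


# Hypothetical toy data: 4 tracks, 2 tags, 2 models.
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
model_a = [[0.9, 0.2], [0.4, 0.7], [0.3, 0.6], [0.2, 0.5]]
model_b = [[0.6, 0.4], [0.1, 0.8], [0.7, 0.3], [0.5, 0.1]]

fused = late_fusion([model_a, model_b])
print(macro_roc_auc(labels, model_a))
print(macro_roc_auc(labels, model_b))
print(macro_roc_auc(labels, fused))
```

In this toy setup the two models make different ranking errors, so averaging their scores ranks the tracks better than either model alone. This is the general motivation for late fusion and does not reproduce the figures reported above.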

Metadata
Author:Maurice Gerczuk, Shahin Amiriparian, Sandra Ottl, Srividya Tirunellai Rajamani, Björn W. Schuller
URN:urn:nbn:de:bvb:384-opus4-1169655
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/116965
URL:https://nbn-resolving.org/urn:nbn:de:0074-2882-4
Parent Title (English):MediaEval 2020 - Multimedia Benchmark Workshop 2020: Workshop Notes, Proceedings of the MediaEval 2020 Workshop, Online, 14-15 December 2020
Publisher:RWTH Aachen University
Place of publication:Aachen
Editor:Steven Hicks, Debesh Jha, Konstantin Pogorelov, Alba García Seco De Herrera, Dmitry Bogdanov, Pierre-Etienne Martin, Stelios Andreadis, Minh-Son Dao, Zhuoran Liu, José Vargas-Quirós, Benjamin Kille, Martha Larson
Type:Conference Proceeding
Language:English
Year of first Publication:2020
Publishing Institution:Universität Augsburg
Release Date:2024/11/25
First Page:60
Series:CEUR Workshop Proceedings ; 2882
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Embedded Intelligence for Health Care and Wellbeing
Sustainable Development Goals
Sustainable Development Goals / Goal 3 - Good Health and Well-being
Dewey Decimal Classification:6 Technology, medicine, applied sciences / 61 Medicine and health / 610 Medicine and health
Licence:CC-BY 4.0: Creative Commons: Attribution (with Print on Demand)