TY - JOUR
T1 - Genre Classification Empowered by Knowledge-Embedded Music Representation
AU - Ding, Han
AU - Zhai, Linwei
AU - Zhao, Cui
AU - Wang, Fei
AU - Wang, Ge
AU - Xi, Wei
AU - Wang, Zhi
AU - Zhao, Jizhong
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2024
Y1 - 2024
N2 - This paper introduces a pioneering framework for music representation learning, which harnesses knowledge graph embeddings to enrich genre classification. Leveraging metadata from publicly available datasets like FMA and OpenMIC-2018, the constructed knowledge graph delineates intricate relationships among genres, artists, and instruments, offering valuable insights for genre representation. Within this framework, we propose two models tailored for distinct genre classification scenarios: fixed-set genre classification and open-set genre classification. These models exploit the knowledge graph to unveil correlations among different genres and integrate this knowledge into the audio representation. Notably, our approach is the first to merge audio data with high-level knowledge for music genre classification. Experimental results demonstrate that our proposed methods outperform state-of-the-art approaches, achieving an average genre classification accuracy of 68.07% on the FMA-medium dataset and 42.4% for open-set classification on the FMA-large dataset.
KW - Music genre classification
KW - knowledge graph embedding
KW - multi-modality fusion
UR - https://www.scopus.com/pages/publications/85193486686
U2 - 10.1109/TASLP.2024.3402115
DO - 10.1109/TASLP.2024.3402115
M3 - Article
AN - SCOPUS:85193486686
SN - 2329-9290
VL - 32
SP - 2764
EP - 2776
JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing
JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing
ER -