
Singing Melody Extraction Based on Combined Frequency-Temporal Attention and Attentional Feature Fusion with Self-Attention

  • Xi Qi
  • Lihua Tian
  • Chen Li
  • Hui Song
  • Jiahui Yan

Xi'an Jiaotong University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Extracting the main melody from polyphonic music is a challenging task in music information retrieval. Traditional convolutional and recurrent neural networks have effectively improved performance on this task. In recent years, with the development of attention mechanisms in neural networks, the frequency and temporal attention information in audio has been fully exploited, and the amplitude properties of audio can also be better integrated with a good fusion module. This paper improves on the frequency-temporal attention proposed in prior work. Attention information is extracted with the frequency-temporal attention and the resulting features are fused additively to obtain the combined frequency-temporal attention. We then apply attentional feature fusion based on multi-scale channel attention, and finally learn the temporal dependencies through a self-attention module. Experimental results on four datasets demonstrate that our model outperforms existing models.
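
The abstract names three stages but gives no layer details, so the following is only a toy sketch of the data flow under loud assumptions: it has no learned weights, the fusion gate uses a single global context rather than multi-scale channel attention, and every function name (`combined_ft_attention`, `attentional_fusion`, `self_attention_over_time`) is hypothetical rather than taken from the paper. It operates on a tiny spectrogram-like list of frequency rows by time columns.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def combined_ft_attention(spec):
    """Assumed stand-in: weight bins and frames separately, then add the results."""
    n_freq, n_time = len(spec), len(spec[0])
    freq_w = softmax([sum(row) / n_time for row in spec])
    time_w = softmax([sum(row[t] for row in spec) / n_freq for t in range(n_time)])
    # Additive fusion of the frequency-attended and time-attended maps.
    return [[spec[f][t] * freq_w[f] + spec[f][t] * time_w[t]
             for t in range(n_time)] for f in range(n_freq)]

def attentional_fusion(x, y):
    """Assumed simplification: a sigmoid gate blends x and y element-wise."""
    n_freq, n_time = len(x), len(x[0])
    summed = [[x[f][t] + y[f][t] for t in range(n_time)] for f in range(n_freq)]
    global_ctx = sum(sum(row) for row in summed) / (n_freq * n_time)
    fused = []
    for f in range(n_freq):
        row = []
        for t in range(n_time):
            # Gate from local value plus global mean (not true multi-scale attention).
            gate = 1.0 / (1.0 + math.exp(-(summed[f][t] + global_ctx)))
            row.append(gate * x[f][t] + (1.0 - gate) * y[f][t])
        fused.append(row)
    return fused

def self_attention_over_time(feat):
    """Scaled dot-product self-attention across frames, without learned projections."""
    n_freq, n_time = len(feat), len(feat[0])
    frames = [[feat[f][t] for f in range(n_freq)] for t in range(n_time)]
    out_frames = []
    for query in frames:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(n_freq)
                  for key in frames]
        weights = softmax(scores)
        out_frames.append([sum(weights[j] * frames[j][f] for j in range(n_time))
                           for f in range(n_freq)])
    # Transpose back to frequency rows by time columns.
    return [[out_frames[t][f] for t in range(n_time)] for f in range(n_freq)]

# Made-up 2-bin, 3-frame input used only to exercise the pipeline.
spec = [[0.1, 0.9, 0.2], [0.8, 0.3, 0.7]]
attended = combined_ft_attention(spec)
fused = attentional_fusion(spec, attended)
context = self_attention_over_time(fused)
print(len(context), len(context[0]))  # 2 3
```

The only point of the sketch is the ordering of the stages (attention, additive combination, gated fusion, then self-attention over time); a real model would replace each function with trained layers.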

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Symposium on Multimedia, ISM 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 220-227
Number of pages: 8
ISBN (Electronic): 9781665471725
State: Published - 2022
Event: 24th IEEE International Symposium on Multimedia, ISM 2022 - Virtual, Online, Italy
Duration: 5 Dec 2022 to 7 Dec 2022

Publication series

Name: Proceedings - 2022 IEEE International Symposium on Multimedia, ISM 2022

Conference

Conference: 24th IEEE International Symposium on Multimedia, ISM 2022
Country/Territory: Italy
City: Virtual, Online
Period: 5/12/22 to 7/12/22

Keywords

  • feature fusion
  • music information retrieval
  • self-attention
  • singing melody extraction
