Singing Melody Extraction Based on Combined Frequency-Temporal Attention and Attentional Feature Fusion with Self-Attention

  • Xi Qi
  • Lihua Tian
  • Chen Li
  • Hui Song
  • Jiahui Yan
  • Xi'an Jiaotong University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Extracting the main melody from polyphonic music is a challenging task in music information retrieval. Convolutional and recurrent neural networks have substantially improved performance on this task. In recent years, with the development of attention mechanisms in neural networks, the frequency and temporal attention information in audio has been exploited more fully, and the amplitude properties of audio can be integrated more effectively given a good fusion module. This paper improves on the frequency-temporal attention proposed in prior work. Attention information is extracted with the frequency-temporal attention, and the resulting features are fused additively to obtain the combined frequency-temporal attention. Attentional feature fusion based on multi-scale channel attention is then applied, and finally temporal dependencies are learned through a self-attention module. Experimental results on four datasets show that the proposed model outperforms existing models.
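
The abstract does not give formulas, so the following is only a minimal sketch of how the additive fusion of frequency and temporal attention, followed by a gated feature fusion, might look on a toy spectrogram. Every function name here is hypothetical, the softmax-based attention is an assumed stand-in for the paper's learned attention modules, and the sigmoid gate is a simplified placeholder for multi-scale channel attention. The self-attention stage is omitted.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def frequency_attention(spec):
    """Reweight each frame with a softmax taken across frequency bins.

    spec[f][t] is the magnitude at frequency bin f and time frame t.
    (Assumed form; the paper's attention is learned, not a fixed softmax.)
    """
    n_freq, n_time = len(spec), len(spec[0])
    out = [[0.0] * n_time for _ in range(n_freq)]
    for t in range(n_time):
        weights = softmax([spec[f][t] for f in range(n_freq)])
        for f in range(n_freq):
            out[f][t] = weights[f] * spec[f][t]
    return out


def temporal_attention(spec):
    """Reweight each frequency bin with a softmax taken across time frames."""
    return [[w * v for w, v in zip(softmax(row), row)] for row in spec]


def combined_attention(spec):
    """Additive fusion: element-wise sum of the two attended feature maps."""
    freq = frequency_attention(spec)
    temp = temporal_attention(spec)
    return [[a + b for a, b in zip(fr, tr)] for fr, tr in zip(freq, temp)]


def attentional_fusion(x, y):
    """Gated fusion of two feature maps: w * x + (1 - w) * y.

    Here w = sigmoid(x + y) is a placeholder gate; a multi-scale channel
    attention module would compute w from local and global context instead.
    """
    fused = []
    for row_x, row_y in zip(x, y):
        fused_row = []
        for a, b in zip(row_x, row_y):
            w = 1.0 / (1.0 + math.exp(-(a + b)))
            fused_row.append(w * a + (1.0 - w) * b)
        fused.append(fused_row)
    return fused


# Toy 3-bin x 3-frame magnitude map with most energy in the middle bin.
spec = [
    [0.1, 0.2, 0.1],
    [0.9, 1.0, 0.8],
    [0.2, 0.1, 0.3],
]
attended = combined_attention(spec)
output = attentional_fusion(attended, spec)
```

In this toy input the middle frequency bin keeps the largest values after attention, which is the behaviour one would hope for when a single melodic line dominates the mixture.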

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Symposium on Multimedia, ISM 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 220-227
Number of pages: 8
ISBN (Electronic): 9781665471725
DOI
Publication status: Published - 2022
Event: 24th IEEE International Symposium on Multimedia, ISM 2022 - Virtual, Online, Italy
Duration: 5 Dec 2022 to 7 Dec 2022

Publication series

Name: Proceedings - 2022 IEEE International Symposium on Multimedia, ISM 2022

Conference

Conference: 24th IEEE International Symposium on Multimedia, ISM 2022
Country/Territory: Italy
Virtual, Online
Period: 5 Dec 2022 to 7 Dec 2022
