TY - GEN
T1 - Beat Tracking Algorithm Based on Multi-scale Feature Fusion and Attention Mechanism
AU - Dong, Yunlong
AU - Li, Chen
AU - Tian, Lihua
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2026.
PY - 2026
Y1 - 2026
AB - Automatic beat and downbeat tracking is an important research direction in music information retrieval. This paper proposes an algorithm based on multi-scale feature fusion and attention mechanisms for jointly tracking beats and downbeats. First, we propose a convolutional feature extraction layer based on multi-scale feature fusion, which lets the model attend to different levels of musical information and exchange instrument information across separated tracks. Then, building on dilated self-attention, we introduce a dilated neighborhood attention module and a global attention module with multi-scale features. The former reduces time complexity and enables information exchange across the time and instrument dimensions, improving the accuracy of beat detection; the latter determines the globally optimal beat sequence while fusing temporal information at different scales, improving the stability of beat detection. By combining information from different musical levels with several attention mechanisms, the model better captures both the global and local characteristics of the beat. We evaluated the model on four widely used datasets: Ballroom, Hainsworth, harmonic, and Carnatic. The results show that the proposed model outperforms recent deep learning methods in both beat tracking and downbeat tracking. Compared with the baseline, the F-measures for beat tracking and downbeat tracking on the Ballroom dataset improve by 1.2% and 2.8%, respectively.
KW - Attention mechanism
KW - Beat-tracking
KW - Transformer
UR - https://www.scopus.com/pages/publications/105028292983
DO - 10.1007/978-981-95-4828-6_2
M3 - Conference contribution
AN - SCOPUS:105028292983
SN - 9789819548279
T3 - Communications in Computer and Information Science
SP - 8
EP - 18
BT - Artificial Intelligence and Robotics - 10th International Symposium, ISAIR 2025, Revised Selected Papers
A2 - Lu, Huimin
PB - Springer Science and Business Media Deutschland GmbH
T2 - 10th International Symposium on Artificial Intelligence and Robotics, ISAIR 2025
Y2 - 24 August 2025 through 26 August 2025
ER -