VITS-Based Singing Voice Conversion System with DSPGAN Post-Processing for SVCC2023

  • Yiquan Zhou
  • Meng Chen
  • Yi Lei
  • Jihua Zhu
  • Weifeng Zhao
  • Xi'an Jiaotong University
  • Northwestern Polytechnical University Xian
  • Tencent

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

6 Citations (Scopus)

Abstract

This paper presents the T02 team's system for the Singing Voice Conversion Challenge 2023 (SVCC2023). Our system entails a VITS-based SVC model, incorporating three modules: a feature extractor, a voice converter, and a postprocessor. Specifically, the feature extractor provides F0 contours and extracts speaker-independent linguistic content from the input singing voice by leveraging a HuBERT model. The voice converter is employed to recompose the speaker timbre, F0, and linguistic content to generate the waveform of the target speaker. Besides, to further improve the audio quality, a fine-tuned DSPGAN vocoder is introduced to resynthesise the waveform. Given the limited target speaker data, we utilize a two-stage training strategy to adapt the base model to the target speaker. During model adaptation, several tricks, such as data augmentation and joint training with auxiliary singer data, are involved. Official challenge results show that our system achieves superior performance, especially in the cross-domain task, ranking 1st and 2nd in naturalness and similarity, respectively. Further ablation justifies the effectiveness of our system design.

Original language: English
Title of host publication: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9798350306897
DOI
Publication status: Published - 2023
Event: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023 - Taipei, Taiwan, China
Duration: 16 Dec 2023 to 20 Dec 2023

Publication series

Name: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023

Conference

Conference: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023
Country/Territory: Taiwan, China
City: Taipei
Period: 16/12/23 to 20/12/23
