UltraSpeech: Speech Enhancement by Interaction between Ultrasound and Speech

Research output: Contribution to journal › Article › peer-review

20 Scopus citations

Abstract

Speech enhancement can benefit many practical voice-based interaction applications, where the goal is to recover clean speech from recordings made in noisy ambient conditions. This paper presents UltraSpeech, a practical design that enhances speech by exploiting the correlation between ultrasound (which profiles articulatory gestures) and speech. UltraSpeech uses a commodity smartphone to emit the ultrasound and to collect the composite acoustic signal for analysis. We design a complex masking framework that operates on complex-valued spectrograms and rectifies the magnitude and phase of speech simultaneously. We further introduce an interaction module that shares information between the ultrasound and speech branches and thus enhances their discrimination capabilities. Extensive experiments demonstrate that UltraSpeech increases the scale-invariant SDR (SI-SDR) by 12 dB, effectively improves speech intelligibility and quality, and generalizes to unknown speakers.
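The two technical ideas named in the abstract, complex masking and the scale-invariant SDR metric, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `apply_complex_mask` and `si_sdr` are hypothetical, the mask is supplied by hand rather than predicted by a network, and the SI-SDR formula is the commonly used definition, assumed here to match the paper's metric.

```python
import math


def apply_complex_mask(noisy_bins, mask):
    """Multiply each complex time-frequency bin by a complex mask value.

    One complex multiplication rescales the magnitude and rotates the
    phase of a bin at the same time, which is the general idea behind
    masking complex-valued spectrograms. (Hypothetical helper.)
    """
    return [m * y for m, y in zip(mask, noisy_bins)]


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length real signals.

    The reference is rescaled by the least-squares factor alpha before
    the residual is measured, so the overall gain of the estimate does
    not change the score. (Commonly used definition, assumed here.)
    """
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


if __name__ == "__main__":
    # Toy data, chosen only for illustration.
    reference = [1.0, -1.0, 1.0, -1.0]
    estimate = [1.1, -0.9, 1.1, -0.9]
    print("SI-SDR (dB):", si_sdr(estimate, reference))

    # A purely imaginary mask value halves the magnitude and rotates
    # the phase of the bin by 90 degrees.
    print("Masked bin:", apply_complex_mask([1 + 1j], [0.5j])[0])
```

Because the target is rescaled before the residual is computed, multiplying the estimate by a constant gain leaves the score unchanged, which is what makes the metric suitable for comparing enhancement systems with different output levels.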

Original language: English
Article number: 111
Journal: Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume: 6
Issue number: 3
DOIs
State: Published - 7 Sep 2022

Keywords

  • acoustic sensing
  • multi-modality fusion
  • speech enhancement
