RGB-D Domain adaptive semantic segmentation with cross-modality feature recalibration

  • Qizhe Fan
  • Xiaoqin Shen
  • Shihui Ying
  • Juan Wang
  • Shaoyi Du

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Unsupervised domain adaptation (UDA) semantic segmentation aims to train models that effectively transfer knowledge from synthetic to real-world images, thereby reducing the reliance on manual annotation. Currently, most existing UDA methods focus primarily on RGB image processing, largely overlooking depth information as a valuable geometric cue that complements RGB representations. Additionally, while some approaches attempt to incorporate depth information by inferring it from RGB images as an auxiliary task, inaccuracies in depth estimation can still result in localized blurring or distortion in segmentation outcomes. To comprehensively address these limitations, we propose an RGB-D UDA framework, CMFRDA, which seamlessly integrates both RGB and depth images as inputs, fully leveraging their distinct yet complementary properties to improve segmentation performance. Specifically, to mitigate the prevalent object boundary noise in depth information, we propose a Depth Feature Rectification Module (DFRM), which effectively suppresses noise while enhancing the representation of fine structural details. Nevertheless, despite the effectiveness of DFRM, challenges remain due to noisy signals arising from incomplete surface data beyond the operational range of depth sensors, as well as potential mismatches between modalities. To overcome these challenges, we further introduce a Cross-Modality Feature Recalibration (CMFR) block. CMFR comprises two key components: Channel-wise Consistency Recalibration (CCR) and Spatial-wise Consistency Recalibration (SCR). CCR suppresses noise from incomplete surfaces in depth by leveraging the complementary information provided by RGB features, while SCR exploits the distinctive advantages of both modalities to mutually recalibrate each other, thereby ensuring consistency between the RGB and depth modalities.
By seamlessly integrating DFRM and CMFR, our CMFRDA framework effectively improves the performance of UDA semantic segmentation. Extensive experiments demonstrate that CMFRDA achieves competitive performance on two widely used UDA benchmarks, GTA → Cityscapes and Synthia → Cityscapes.
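The abstract does not give the exact formulation of CCR and SCR, but both follow the general pattern of cross-modality attention gating: one modality's features produce a gate that rescales the other modality's features along the channel or spatial dimension. The sketch below is a minimal NumPy illustration of that pattern, assuming sigmoid-gated global-average-pooled channel attention and channel-mean spatial attention; the function names and gating choices are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_recalibration(rgb_feat, depth_feat):
    """CCR-style sketch (hypothetical): RGB features gate depth channels.

    rgb_feat, depth_feat: arrays of shape (C, H, W).
    A per-channel gate in (0, 1) is derived from globally pooled RGB
    statistics and used to rescale the depth channels, so channels where
    RGB carries little support are attenuated.
    """
    gate = sigmoid(rgb_feat.mean(axis=(1, 2)))          # shape (C,)
    return depth_feat * gate[:, None, None]

def spatial_recalibration(rgb_feat, depth_feat):
    """SCR-style sketch (hypothetical): each modality spatially gates the other.

    A per-pixel map in (0, 1) is computed from the channel mean of one
    modality and applied to the other, so the two modalities mutually
    recalibrate each other location by location.
    """
    rgb_gate = sigmoid(depth_feat.mean(axis=0))         # shape (H, W)
    depth_gate = sigmoid(rgb_feat.mean(axis=0))         # shape (H, W)
    return rgb_feat * rgb_gate[None], depth_feat * depth_gate[None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.normal(size=(4, 8, 8))
    depth = rng.normal(size=(4, 8, 8))
    depth_cal = channel_recalibration(rgb, depth)
    rgb_cal, depth_cal2 = spatial_recalibration(rgb, depth)
    print(depth_cal.shape, rgb_cal.shape, depth_cal2.shape)
```

In the actual framework these operations would act on learned convolutional feature maps inside the segmentation network; the sketch only makes the recalibration pattern concrete, with no learned parameters.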

Original language: English
Article number: 103117
Journal: Information Fusion
Volume: 120
DOIs
State: Published - Aug 2025

Keywords

  • Cross-modality feature recalibration
  • RGB-D semantic segmentation
  • Unsupervised domain adaptation
