
RGB-D Domain adaptive semantic segmentation with cross-modality feature recalibration

  • Qizhe Fan
  • Xiaoqin Shen
  • Shihui Ying
  • Juan Wang
  • Shaoyi Du
  • Xi'an University of Technology
  • Shanghai University
  • The Second Affiliated Hospital of Xi'an Jiaotong University

Research output: Journal article, peer-reviewed

4 Citations (Scopus)

Abstract

Unsupervised domain adaptive (UDA) semantic segmentation aims to train models that effectively transfer knowledge from synthetic to real-world images, thereby reducing the reliance on manual annotation. Currently, most existing UDA methods primarily focus on RGB image processing, largely overlooking depth information as a valuable geometric cue that complements RGB representations. Additionally, while some approaches attempt to incorporate depth information by inferring it from RGB images as an auxiliary task, inaccuracies in depth estimation can still result in localized blurring or distortion in segmentation outcomes. To comprehensively address these limitations, we propose an innovative RGB-D UDA framework CMFRDA, which seamlessly integrates both RGB and depth images as inputs, fully leveraging their distinct yet complementary properties to improve segmentation performance. Specifically, to mitigate the prevalent object boundary noise in depth information, we propose a Depth Feature Rectification Module (DFRM), which effectively suppresses noise while enhancing the representation of fine structural details. Nevertheless, despite the effectiveness of DFRM, challenges remain due to the presence of noisy signals arising from incomplete surface data beyond the operational range of depth sensors, as well as potential mismatches between modalities. In order to overcome these challenges, we further introduce a Cross-Modality Feature Recalibration (CMFR) block. CMFR comprises two key components: Channel-wise Consistency Recalibration (CCR) and Spatial-wise Consistency Recalibration (SCR). CCR suppresses noise from incomplete surfaces in depth by leveraging the complementary information provided by RGB features, while SCR exploits the distinctive advantages of both modalities to mutually recalibrate each other, thereby ensuring consistency between RGB and depth modalities. 
By integrating DFRM and CMFR, our CMFRDA framework improves the performance of UDA semantic segmentation. Extensive experiments demonstrate that CMFRDA achieves competitive performance on two widely used UDA benchmarks: GTA → Cityscapes and Synthia → Cityscapes.
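
The abstract names the two CMFR components but does not describe their internals. As a rough illustration only, the following NumPy sketch shows one generic way channel-wise and spatial-wise cross-modality gating can be written. It is not the CMFRDA implementation: the function names, the weight arrays `w_c`, `w_rgb`, and `w_depth`, and the choice of sigmoid gates are all assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_recalibrate(rgb, depth, w_c):
    """Hypothetical channel-wise gate for the depth features.

    rgb, depth: feature maps of shape (C, H, W).
    w_c: assumed learned weights of shape (C, 2C).
    Each depth channel is scaled by a gate computed from the globally
    pooled statistics of both modalities.
    """
    pooled = np.concatenate([rgb.mean(axis=(1, 2)), depth.mean(axis=(1, 2))])
    gate = sigmoid(w_c @ pooled)           # shape (C,)
    return depth * gate[:, None, None]     # broadcast over H and W


def spatial_recalibrate(rgb, depth, w_rgb, w_depth):
    """Hypothetical mutual spatial gate.

    w_rgb, w_depth: assumed 1x1-convolution weights of shape (C,).
    Each modality is scaled per pixel by a map derived from the other one.
    """
    map_from_rgb = sigmoid(np.tensordot(w_rgb, rgb, axes=1))        # (H, W)
    map_from_depth = sigmoid(np.tensordot(w_depth, depth, axes=1))  # (H, W)
    return rgb * map_from_depth, depth * map_from_rgb


# Usage with random stand-in features (weights would normally be learned).
rng = np.random.default_rng(0)
C, H, W = 4, 3, 5
rgb_feat = rng.normal(size=(C, H, W))
depth_feat = rng.normal(size=(C, H, W))

depth_ccr = channel_recalibrate(rgb_feat, depth_feat, rng.normal(size=(C, 2 * C)))
rgb_out, depth_out = spatial_recalibrate(
    rgb_feat, depth_ccr, rng.normal(size=C), rng.normal(size=C)
)
```

Because every gate lies in (0, 1), this sketch can only attenuate features. A trained module would typically add residual connections and learn the weights end to end, which the abstract does not detail.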

Original language: English
Article number: 103117
Journal: Information Fusion
Volume: 120
DOI
Publication status: Published - Aug 2025
