Towards Handling Sudden Changes in Feature Maps During Depth Estimation

  • Yao Xue
  • Yu Cao
  • Xubin Feng
  • Meilin Xie
  • Ke Li
  • Xingjun Zhang
  • Xueming Qian

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Depth estimation aims to predict a depth map from RGB images without high-cost equipment. Deep-learning-based depth estimation methods have proven effective. However, existing methods represent depth information as a per-pixel depth map. Such a representation is fragile in the face of different kinds of depth changes. This paper proposes a Compressive Sensing based Depth Representation (CSDR) scheme, which reformulates depth estimation in pixel space as fixed-length vector regression in representation space. In this way, training errors of the deep model do not directly interfere with depth estimation, and distortions in the estimated depth maps are restrained to the greatest extent. In addition, we improve depth estimation in two other respects: model structure and loss function. To capture features at different scales, we propose a Multiscale Encoder & Multiscale Decoder (MEMD) structure as the vector regression model. To further handle depth changes, we also modify the loss function to directly incorporate the curvature difference between the ground truth and the estimate. With the support of CSDR, MEMD, and the curvature loss, the proposed approach achieves superior performance on a challenging depth estimation dataset: NYU-Depth-v2. A range of experiments supports our claim that regression in CSDR space performs better than traditional direct depth map estimation in pixel space.
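
The abstract does not say how CSDR builds its sensing matrix, how depth is made sparse, or how a depth map is recovered from the regressed vector. As a rough illustration of the general compressive-sensing idea only, the sketch below encodes a sparse toy signal into a shorter fixed-length vector and recovers it. The random Gaussian matrix, the orthogonal matching pursuit decoder, the signal sizes, and all function names are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def make_sensing_matrix(m, n, seed=0):
    """Random Gaussian sensing matrix (an assumed choice, not the paper's)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((m, n)) / np.sqrt(m)


def encode(phi, x):
    """Compress a length-n signal into a fixed-length vector of size m."""
    return phi @ x


def omp_decode(phi, y, sparsity):
    """Recover a sparse signal from y = phi @ x by orthogonal matching pursuit."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(phi.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(phi[:, support], y, rcond=None)
        residual = y - phi[:, support] @ coef
    x_hat = np.zeros(phi.shape[1])
    x_hat[support] = coef
    return x_hat


# Toy signal: 256 entries, only 4 of them nonzero (hypothetical values).
n, m = 256, 128
x = np.zeros(n)
x[[10, 50, 120, 200]] = [1.0, -1.2, 0.8, 1.5]

phi = make_sensing_matrix(m, n)
y = encode(phi, x)              # fixed-length vector a model could regress
x_hat = omp_decode(phi, y, 4)   # signal recovered from that vector
print("max reconstruction error:", np.max(np.abs(x_hat - x)))
```

In a setup like this, a regression model would predict `y` rather than the per-pixel signal, and the decoder would map the prediction back to pixel space; whether the paper's pipeline matches this layout in detail is not stated in the abstract.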

Original language: English
Pages (from-to): 4002-4012
Number of pages: 11
Journal: IEEE Transactions on Multimedia
Volume: 25
State: Published - 2023

Keywords

  • Depth estimation
  • depth representation
  • multiscale feature
