UML-MVSNet: Uncertainty-guided Multi-task Learning for Multi-view Stereo

  • Xi'an Jiaotong University

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Multi-view stereo methods have achieved remarkable progress in recent years, benefiting from advancements in depth and confidence estimation. Existing multi-view stereo methods estimate depth through regression or classification, but both approaches have notable limitations: regression methods are prone to overfitting, while classification methods struggle to achieve precise depth prediction. Combining the strengths of these approaches is crucial for accurate reconstruction. To address these issues, we propose a novel network, termed UML-MVSNet, to enable accurate feature extraction and depth estimation. Specifically, we introduce a Local Transformer (LT) module that applies attention mechanisms to local features, effectively capturing local detail information and enhancing feature matching accuracy. Additionally, we propose an Uncertainty-guided Multi-task Learning (UML) module that integrates the advantages of both regression and classification branches for robust depth estimation. In each sub-branch, the Uncertainty-Guided Optimization (UGO) module is designed to refine the probability volume guided by uncertainty. To guide the network toward low-uncertainty regions and balance multi-task losses, we introduce the Uncertainty-Aware Loss (UA Loss). Extensive experiments on the DTU and Tanks & Temples datasets demonstrate that our UML-MVSNet achieves competitive results in both qualitative and quantitative performance compared to other state-of-the-art methods.
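
The abstract contrasts regression and classification as two ways of reading depth out of a probability volume, and refers to uncertainty without defining it. The sketch below is not the paper's implementation. It illustrates the general idea for a single pixel under assumptions of my own: depth hypotheses are a short list of values, regression is taken as the probability-weighted expectation, classification as the most probable hypothesis, and uncertainty as normalized entropy. All function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw matching scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def regression_depth(probs, depths):
    """Regression-style readout: expected depth under the distribution."""
    return sum(p * d for p, d in zip(probs, depths))

def classification_depth(probs, depths):
    """Classification-style readout: depth of the most probable hypothesis."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return depths[best]

def normalized_entropy(probs):
    """Assumed uncertainty measure: entropy scaled to [0, 1].

    Requires at least two hypotheses, otherwise log(len(probs)) is zero.
    """
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

# Hypothetical depth hypotheses for one pixel.
depths = [1.0, 2.0, 3.0, 4.0]

# A sharply peaked distribution and an ambiguous two-peak distribution.
peaked = softmax([0.0, 8.0, 0.0, 0.0])
bimodal = softmax([5.0, 0.0, 0.0, 5.0])

for name, probs in (("peaked", peaked), ("bimodal", bimodal)):
    print(name,
          "regression:", round(regression_depth(probs, depths), 3),
          "classification:", classification_depth(probs, depths),
          "uncertainty:", round(normalized_entropy(probs), 3))
```

On the two-peak distribution the expectation falls between the peaks, at a depth that neither peak supports, while the argmax readout can only return one of the discrete hypotheses. The entropy is low for the peaked case and higher for the ambiguous one, which is the kind of signal an uncertainty-guided method could use to weight or refine its estimates.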

Original language: English
Title of host publication: International Joint Conference on Neural Networks, IJCNN 2025 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331510428
State: Published - 2025
Event: 2025 International Joint Conference on Neural Networks, IJCNN 2025 - Rome, Italy
Duration: 30 Jun 2025 to 5 Jul 2025

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
ISSN (Print): 2161-4393
ISSN (Electronic): 2161-4407

Conference

Conference: 2025 International Joint Conference on Neural Networks, IJCNN 2025
Country/Territory: Italy
City: Rome
Period: 30/06/25 to 5/07/25

Keywords

  • Deep learning
  • Local Transformer
  • Multi-view stereo
  • Uncertainty-guided Multi-task Learning
