Multi-Granularity Sparse Relationship Matrix Prediction Network for End-to-End Scene Graph Generation

  • Xi'an Jiaotong University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Current end-to-end Scene Graph Generation (SGG) methods rely solely on visual representations to detect sparse relations and entities in an image separately. As a result, entity predictions do not contribute to relation prediction, and post-processing is needed to assign the corresponding subjects and objects to the predicted relations. In this paper, we introduce a sparse relationship matrix that bridges entity detection and relation detection. Our approach not only eliminates the need for relation matching, but also leverages the semantic and positional information of predicted entities to enhance relation prediction. Specifically, we propose a multi-granularity sparse relationship matrix prediction network that uses three gated pooling modules, each filtering negative samples at a different granularity, to obtain a sparse relationship matrix containing the entity pairs most likely to form relations. From this matrix, a sparse set of the most probable subject-object pairs is constructed and used for relation decoding. Experimental results on multiple datasets demonstrate that our method achieves new state-of-the-art overall performance. Our code is available at https://github.com/wanglei0618/Mg-RMPN.
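The final step described in the abstract, turning a relationship matrix into a small set of candidate subject-object pairs, can be illustrated with a minimal sketch. This is not the authors' implementation: the gated pooling modules are not reproduced here, and the function name `top_k_pairs`, the parameter `k`, and the toy scores are all assumptions made for illustration. It only shows the general idea of keeping the highest-scoring entries of an entity-by-entity score matrix.

```python
from typing import List, Tuple


def top_k_pairs(scores: List[List[float]], k: int) -> List[Tuple[int, int, float]]:
    """Return the k highest-scoring (subject, object, score) triples.

    `scores[i][j]` is a hypothetical confidence that entity i (subject)
    forms a relation with entity j (object). Self-pairs (i == j) are skipped.
    """
    candidates = [
        (i, j, s)
        for i, row in enumerate(scores)
        for j, s in enumerate(row)
        if i != j
    ]
    # Sort by score, highest first; the sort is stable, so ties keep row-major order.
    candidates.sort(key=lambda t: t[2], reverse=True)
    return candidates[:k]


# Toy example with three entities and made-up scores (not taken from the paper).
scores = [
    [0.0, 0.9, 0.1],
    [0.2, 0.0, 0.7],
    [0.05, 0.3, 0.0],
]
pairs = top_k_pairs(scores, k=2)
print(pairs)
```

In this toy setting only the selected pairs would be passed on to relation decoding, so each predicted relation already carries its subject and object indices and no separate matching step is needed.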

Original language: English
Title of host publication: Computer Vision – ECCV 2024 - 18th European Conference, Proceedings
Editors: Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 105-121
Number of pages: 17
ISBN (Print): 9783031730061
DOIs
State: Published - 2025
Event: 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: 29 Sep 2024 → 4 Oct 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15140 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th European Conference on Computer Vision, ECCV 2024
Country/Territory: Italy
City: Milan
Period: 29/09/24 → 4/10/24

Keywords

  • End-to-End
  • Multi-Granularity
  • Scene Graph Generation
  • Sparse Relationship Matrix
