Relation-Specific Feature Augmentation for unbiased scene graph generation

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Scene Graph Generation (SGG) models suffer from the long-tailed distribution of relations, which results in biased predictions that favor head relations (e.g., on) over informative tail ones (e.g., sitting on, laying on, standing on). Existing solutions typically adopt class re-balancing strategies to balance the data distribution. However, they do not fundamentally address the lack of information caused by insufficient tail data. To this end, we propose a Relation-Specific Feature Augmentation (RSFA) framework to mitigate the long-tailed bias by augmenting relations in the feature space. To perform augmentation effectively, we design an augmentation scheme and a novel Dual Attention Network (DAN). The augmentation scheme augments each relation in proportion to the reciprocal of its number of samples to avoid over-fitting. By extracting relation-specific information from new object features generated by a Conditional Variational AutoEncoder (CVAE), DAN generates reliable virtual relation representations that provide useful information to guide the optimization of the relation classifier. Extensive ablation studies and comprehensive analysis demonstrate the effectiveness of our method in debiasing, and results on the Visual Genome benchmark show that our method significantly outperforms existing state-of-the-art methods.
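The reciprocal-frequency allocation described in the abstract can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the function name, the fixed augmentation budget, and the toy relation counts are all assumptions for illustration.

```python
# Hypothetical sketch of a reciprocal-frequency augmentation scheme:
# relations with fewer training samples receive proportionally more
# synthetic feature samples, counteracting the long-tailed distribution.
def augmentation_counts(sample_counts, total_budget):
    """Allocate a feature-augmentation budget across relation classes
    in proportion to the reciprocal of each class's sample count."""
    reciprocals = {rel: 1.0 / n for rel, n in sample_counts.items()}
    norm = sum(reciprocals.values())
    return {rel: round(total_budget * r / norm)
            for rel, r in reciprocals.items()}

# Toy long-tailed distribution: "on" is a head relation, the rest are tail.
counts = {"on": 1000, "sitting on": 50, "laying on": 20, "standing on": 10}
budget = augmentation_counts(counts, total_budget=400)
print(budget)  # tail relations receive far larger shares of the budget
```

Under this allocation the rarest relation ("standing on") receives the largest share of synthetic samples, while the head relation ("on") receives almost none, which is the debiasing effect the scheme targets.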

Original language: English
Article number: 110936
Journal: Pattern Recognition
Volume: 157
DOIs
State: Published - Jan 2025

Keywords

  • Feature augmentation
  • Image understanding
  • Long-tailed distribution
  • Scene graph generation
