
CSCC: Cross-Scene Crowd Counting via Learning to Diversify for Domain Generalization

  • Xi'an Jiaotong University
  • Baidu Inc

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

Crowd counting models struggle to generalize to new scenes because of domain shifts between training and test data. Although domain adaptation approaches have made notable progress in bridging the domain gap, they require target domain data. In this paper, we propose a novel framework for cross-scene crowd counting that unifies domain generalization and domain adaptation. For domain generalization, we train a model using only single-domain data, and the model can be generalized to any scene with satisfactory performance. For domain adaptation, we use both source and target domain data to further improve performance. We first design a generation network that, by minimizing mutual information, diversifies the generated samples to cover the unseen target domains as much as possible. This approach simulates training data from various domains, thereby enhancing the model's generalization ability. We then develop a pixel-wise supervised contrastive loss function that pulls the human heads in the source images and generated images closer to each other and pushes them further away from the background. This loss helps extract a domain-invariant feature representation, further improving the model's generalization ability. Moreover, if information about the target domain is available, our generalization method can easily be applied as an adaptation method by replacing the mutual information minimization loss with a mutual information maximization loss, which can further improve cross-scene crowd counting performance. The experimental results demonstrate the strong generalizability of our method across different datasets.
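To make the pixel-wise supervised contrastive idea in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes per-pixel embeddings pooled from source and generated images, a boolean head/background mask per pixel, and a standard supervised-contrastive (InfoNCE-style) form with a temperature. The function name `pixel_supcon_loss`, the `temperature` default, and the choice to use only head pixels as anchors are all illustrative assumptions.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pixel_supcon_loss(features, is_head, temperature=0.1):
    """Hypothetical pixel-wise supervised contrastive loss (illustrative only).

    features: list of per-pixel embedding vectors, pooled from source and
              generated images (assumed input layout).
    is_head:  list of bools, True where the pixel lies on a human head.
    Head pixels act as anchors; other head pixels are positives and
    background pixels are negatives.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    losses = []
    for i in range(n):
        if not is_head[i]:
            continue  # only head pixels are anchors in this sketch
        positives = [p for p in range(n) if p != i and is_head[p]]
        if not positives:
            continue
        # Similarity of the anchor to every other pixel, scaled by temperature.
        logits = {a: dot(z[i], z[a]) / temperature for a in range(n) if a != i}
        # Numerically stable log-sum-exp over all non-anchor pixels.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Average negative log-probability of picking a positive.
        losses.append(-sum(logits[p] - log_denom for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy check: head embeddings that cluster together give a lower loss than
# head embeddings that sit next to background embeddings.
separated = pixel_supcon_loss(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]],
    [True, True, False, False],
)
mixed = pixel_supcon_loss(
    [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]],
    [True, True, False, False],
)
print(separated < mixed)
```

In this sketch the loss shrinks as head embeddings from both image sets move toward one another and away from background embeddings, which is the behaviour the abstract attributes to its loss. The exact formulation used in the paper may differ.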

Original language: English
Pages (from-to): 3320-3330
Number of pages: 11
Journal: IEEE Transactions on Multimedia
Volume: 27
DOI
Publication status: Published - 2025
