Abstract
Few-shot action recognition (FSAR) has made substantial progress, but it primarily addresses problems within a single domain, and its effectiveness is often questioned when applied across domains. This is mainly due to inductive biases in the data distribution during meta-training, including spatial and temporal distribution biases. Combined, these biases further complicate adaptation in videos, making it challenging for models trained in one domain to adapt to another. To address this problem, we first enhance the source-domain videos with frames from unlabeled target-domain videos. We then employ a dual-branch structure to process the videos. The first branch, named the Domain Temporal branch, simultaneously handles global sequences of videos from both the source and target domains, while the second branch, named the Local-Global Adapter branch, compares local tuples of videos with global sequences from the source domain. We align the source-domain meta-learning results of the first branch with those of the second branch, enabling us to obtain domain-invariant information solely from the source domain. Concurrently, in the first branch, we perform a reconstruction operation on the target-domain videos, allowing the model to extract features that approach the target domain. Our code is available at https://github.com/cofly2014/GSLTA.git.
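As a rough illustration of two ideas in the abstract, the sketch below enumerates local frame tuples from a video and computes an alignment term between the class scores of two branches. It is not taken from the GSLTA repository. The helper names (`local_tuples`, `alignment_loss`), the reading of "local tuples" as ordered combinations of frame indices, and the use of a KL divergence for alignment are all assumptions made for illustration only.

```python
import math
from itertools import combinations


def local_tuples(num_frames, size=2):
    """Ordered frame-index tuples of a given size (hypothetical helper).

    Assumption: a "local tuple" is an ordered subset of frame indices,
    while the "global sequence" is simply range(num_frames).
    """
    return list(combinations(range(num_frames), size))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def alignment_loss(global_logits, local_logits):
    """Assumed alignment term: KL between the two branches' class distributions."""
    return kl_divergence(softmax(global_logits), softmax(local_logits))


if __name__ == "__main__":
    # An 8-frame video yields 28 frame pairs under this reading.
    print(len(local_tuples(8, size=2)))
    # Branches that agree give a loss of zero; disagreement gives a positive loss.
    print(alignment_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print(alignment_loss([2.0, 0.0], [0.0, 2.0]))
```

Under these assumptions, minimizing the alignment term pushes the branch that sees local tuples toward the predictions of the branch that sees global sequences; the actual objective used by the authors may differ.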
| Original language | English |
|---|---|
| Article number | 113041 |
| Journal | Knowledge-Based Systems |
| Volume | 311 |
| DOIs | |
| State | Published - 28 Feb 2025 |
Keywords
- Cross-domain
- Few-shot action recognition
- Multiple-level distillation
Fingerprint
Dive into the research topics of 'GSLTA-CDFSAR: Global Sequences and Local Tuples Alignment for Cross-Domain Few-Shot Action Recognition'. Together they form a unique fingerprint.