
Semantic-Anchored Cross-modal Distillation framework with Foundation models for SAR Ship Recognition

  • Northwestern Polytechnical University Xian
  • Zhejiang University

Research output: Contribution to journal › Article › peer-review

Abstract

Synthetic Aperture Radar (SAR) offers day-and-night capability for ship recognition, but its scattering mechanism results in limited textural and spectral detail compared to optical imagery, hindering fine-grained semantic interpretation and recognition. Existing cross-modal transfer methods mainly align features or pixels between SAR and optical imagery, yet they fail to guarantee semantic consistency across modalities. To address this, we propose a Semantic-anchored Cross-modal Distillation Framework (SCDF) with foundation models. SCDF introduces textual semantic descriptors for each ship category as semantic anchors to ensure cross-modal semantic consistency, while incorporating scattering topology maps into SAR images, thus enabling effective transfer without sacrificing modality-specific discriminability. Within this framework, a language foundation model encodes semantic anchors into text embeddings as class references, formulating ship recognition as aligning visual features with semantic anchors. To enhance the alignment between SAR features and anchors, a scattering-aware student model integrates scattering topology maps with SAR imagery, emphasizing key ship structures. This alignment is further guided by a vision foundation model acting as the optical teacher, which provides reliable optical-semantic similarity for distillation. Instead of simply transferring labels or features, the semantic-anchored distillation transfers semantic discriminability from the optical domain to SAR while preserving SAR-specific scattering topology features. Extensive experiments on the FUSAR-Ship dataset and fine-grained optical datasets (FGSC-23 and FGSCR-42) demonstrate that SCDF effectively bridges SAR and optical imagery and enhances SAR ship recognition.
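
The abstract frames recognition as aligning visual features with per-class text embeddings (the semantic anchors) and distilling the optical teacher's similarity to those anchors into the SAR student. The following is a minimal NumPy sketch of that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the cosine-similarity logits, the temperature of 0.07, the KL-divergence distillation term, the loss weighting, and the premise that a teacher feature is available for each student sample. Random vectors stand in for the outputs of the language model, the optical teacher, and the scattering-aware student.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def log_softmax(z, axis=-1):
    """Numerically stable log-softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def anchor_logits(features, anchors, temperature=0.07):
    """Cosine similarity of each feature to each class anchor, scaled by a temperature.

    features: (N, d) visual features; anchors: (C, d) text embeddings, one per class.
    The temperature value is an assumed hyperparameter.
    """
    return l2_normalize(features) @ l2_normalize(anchors).T / temperature

def semantic_anchored_loss(student_feats, teacher_feats, anchors, labels,
                           temperature=0.07, kd_weight=1.0):
    """Hypothetical objective: cross-entropy against anchors plus KL(teacher || student).

    Both models are scored against the same anchors, so the distilled signal is a
    distribution over class semantics rather than raw features or hard labels.
    """
    s_logp = log_softmax(anchor_logits(student_feats, anchors, temperature))
    t_logp = log_softmax(anchor_logits(teacher_feats, anchors, temperature))
    # Supervised term: pull each student feature toward its own class anchor.
    ce = -s_logp[np.arange(len(labels)), labels].mean()
    # Distillation term: match the teacher's similarity distribution over anchors.
    kd = (np.exp(t_logp) * (t_logp - s_logp)).sum(axis=1).mean()
    return ce + kd_weight * kd, ce, kd

# Toy data standing in for real embeddings (4 classes, 16-dim features).
rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 16))
labels = np.array([0, 1, 2, 3, 0, 1])
teacher_feats = anchors[labels] + 0.1 * rng.normal(size=(6, 16))  # teacher near anchors
student_feats = rng.normal(size=(6, 16))                          # untrained student

total, ce, kd = semantic_anchored_loss(student_feats, teacher_feats, anchors, labels)
_, _, kd_same = semantic_anchored_loss(teacher_feats, teacher_feats, anchors, labels)
teacher_pred = anchor_logits(teacher_feats, anchors).argmax(axis=1)
print(f"total={total:.3f} ce={ce:.3f} kd={kd:.3f} kd_same={kd_same:.3e}")
```

In this sketch the distillation term vanishes when the student reproduces the teacher's similarities to the anchors. The scattering topology maps mentioned in the abstract would enter upstream, in whatever network produces `student_feats`, and are not modelled here.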

Original language: English
Journal: IEEE Transactions on Aerospace and Electronic Systems
State: Accepted/In press - 2025
Externally published: Yes

Keywords

  • Cross-modal imagery
  • Foundation models
  • Knowledge distillation
  • Ship recognition
  • Textual descriptors
