Abstract
In bandwidth-constrained scenarios, achieving high compression efficiency while preserving visual fidelity remains a major challenge. This paper proposes a novel generative representation fusion framework for ultra-low bitrate natural image compression. On the transmitter side, we introduce SketchFusionNet, a lightweight encoder that transforms input images into compact, sketch-like representations by fusing structural and perceptual cues into a unified, compression-friendly format. This fused representation is optimized through adversarial training, guided jointly by compression and preview objectives, to ensure both compactness and semantic richness. On the receiver side, a two-stage decoding strategy is employed. The preview module rapidly reconstructs a coarse yet semantically meaningful approximation of the original image, providing immediate structural context. A diffusion-based generative module then progressively enhances the visual quality by recovering fine-grained details, leveraging learned generative priors to mitigate the information loss caused by extreme compression. Experimental results on benchmark datasets show that our method surpasses state-of-the-art approaches in both rate-distortion performance and perceptual quality. Compared to PICS, which also leverages edge-based representations and generative AI, our approach achieves a 0.29-0.42 reduction in LPIPS (lower is better) and a 10.64-12.09 dB improvement in PSNR, with only a 0.01-0.02 bpp increase in bitrate under the same dataset conditions. By integrating symbolic compression with generative reconstruction, our approach demonstrates a practical and efficient realization of generative information fusion for high-fidelity image communication under constrained bandwidth.
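The bitrate and PSNR figures quoted in the abstract follow standard definitions, which the short sketch below illustrates. It is not the authors' evaluation code: the function names `bits_per_pixel` and `psnr` are hypothetical, images are assumed to be flat sequences of 8-bit pixel values, and LPIPS is left out because it requires a pretrained network.

```python
import math


def bits_per_pixel(num_bytes: int, width: int, height: int) -> float:
    """Bitrate in bits per pixel (bpp) for a compressed payload of num_bytes."""
    return 8 * num_bytes / (width * height)


def psnr(original, reconstructed, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)
```

Under these definitions, a 512×512 image stored in 1024 bytes corresponds to 0.03125 bpp, so a 0.01-0.02 bpp difference amounts to only a few hundred bytes at that resolution.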
| Original language | English |
|---|---|
| Article number | 103954 |
| Journal | Information Fusion |
| Volume | 128 |
| State | Published - Apr 2026 |
Keywords
- Image compression
- Representation fusion
- Sketch
- Ultra-low bitrate
- Visual fidelity