LVM-Empowered Image Semantic Communication via Next-pixel Prediction: A Separate Source-Channel Coding Perspective

REN Tianqi; LI Rongpeng

doi:10.12466/xhcl.2025.10.006

REN Tianqi, LI Rongpeng. LVM-empowered image semantic communication via next-pixel prediction: a separate source-channel coding perspective[J]. Journal of Signal Processing, 2025, 41(10): 1657-1669. DOI: 10.12466/xhcl.2025.10.006.


Abstract

As the vision for 6G unfolds, semantic communication is emerging as a core technology. The prevailing paradigm, deep learning-based joint source-channel coding (JSCC), performs well under specific conditions but suffers from inherent limitations: poor compatibility with digital systems, weak generalization, and low design flexibility. To address these challenges, this study revisits the separate source-channel coding (SSCC) paradigm and proposes a large vision model-based separate source-channel coding framework (LVM-SSCC). On the source-coding side, the framework uses a large vision model (e.g., ImageGPT) for autoregressive next-pixel prediction and combines it with arithmetic coding to achieve highly efficient lossless source compression. On the channel-coding side, an error correction code transformer (ECCT) is introduced to make low-density parity-check (LDPC) decoding more robust. To ensure a fair comparison, a unified signal-to-noise ratio metric based on energy consumption (SNR_unified) is used. Extensive simulations on the CIFAR-10 dataset show that, under both additive white Gaussian noise (AWGN) and Rayleigh fading channels, the proposed scheme significantly outperforms mainstream JSCC schemes such as DeepJSCC and SparseSBC in image reconstruction quality, measured by peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). The advantage is most pronounced in the mid-to-high SNR region, where the scheme achieves near-lossless, high-fidelity reconstruction while remaining fully compatible with digital communication systems. These results provide compelling evidence for the SSCC paradigm in future image semantic communication, highlighting its combined advantages in performance, compatibility, and flexibility.
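To illustrate why next-pixel prediction pairs naturally with arithmetic coding, the sketch below computes the ideal code length of a pixel sequence, i.e. the sum of -log2 p(x_i | x_<i) over all pixels, which an arithmetic coder can approach to within a small constant overhead. It is a minimal illustration and not the paper's implementation: the adaptive count-based predictor is a hypothetical stand-in for a large vision model such as ImageGPT, and the function name `ideal_code_length_bits` is an assumption introduced here.

```python
import math
from typing import Sequence


def ideal_code_length_bits(symbols: Sequence[int], alphabet_size: int) -> float:
    """Return the sum of -log2 p(x_i | x_<i) for a symbol sequence.

    The predictor is a toy stand-in (an assumption, not the paper's model):
    it estimates the next-symbol distribution from Laplace-smoothed counts
    of the symbols seen so far.
    """
    counts = [1] * alphabet_size  # Laplace smoothing: every symbol starts at 1
    total = alphabet_size
    bits = 0.0
    for s in symbols:
        p = counts[s] / total     # predicted probability of the true symbol
        bits += -math.log2(p)     # cost an ideal arithmetic coder would pay
        counts[s] += 1            # update the model after coding the symbol
        total += 1
    return bits


if __name__ == "__main__":
    flat = [0] * 64            # highly predictable "image row"
    varied = list(range(64))   # every pixel value is new to the model
    print(f"flat:   {ideal_code_length_bits(flat, 256):.1f} bits")
    print(f"varied: {ideal_code_length_bits(varied, 256):.1f} bits")
    print(f"raw:    {64 * 8} bits at 8 bits per pixel")
```

The better the predictor assigns probability to the pixel that actually occurs, the fewer bits the sequence costs, which is the motivation for using a strong autoregressive model as the source coder's probability estimator.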
