CHEN Mingkai, LIU Minghao, WANG Wenjun, WANG Lei, ZHENG Baoyu. Codec for Cross-modal Semantic Communication in 6G[J]. JOURNAL OF SIGNAL PROCESSING, 2023, 39(7): 1141-1154. DOI: 10.16798/j.issn.1003-0530.2023.07.001

Codec for Cross-modal Semantic Communication in 6G

In the 6G era, semantic communication is regarded as one of the most promising research directions. It aims to meet users' requirements for an immersive, multi-modal experience with low latency and high reliability. To this end, this paper proposes a deep-learning-based cross-modal semantic communication system and designs its semantic encoder and semantic decoder. The Frobenius norm is used to judge the similarity of the three encoded modal semantic intermediate vectors; features shared across modalities are discarded, and the unique features are preserved for feature-weighted summation. Cross-modal semantic fusion is also designed, realizing end-to-end data transmission driven by the task requirements of different users in multi-modal services. The codec framework supports cross-modal transmission of multi-modal data including voice, text, and image, provides solutions for pragmatics-oriented communication tasks, and greatly enhances the user experience. The paper also proposes an architecture for evaluating the semantic similarity between the receiver and the transmitter. The architecture consists of a siamese network and a pseudo-siamese network: the siamese network discriminates content of the same modality, and the pseudo-siamese network discriminates content of different modalities, so that the matching loss between modal contents is obtained accurately. This loss is fed back to guide the optimization of the network in the reverse direction until the loss value is minimized and the whole network converges, achieving accurate semantic translation in both the encoder and the decoder. Simulation results show that the proposed cross-modal semantic communication clearly outperforms the traditional communication system. At high SNR, the similarity of the modalities almost always exceeds 90%. At low SNR, the advantage of the cross-modal semantic communication system is more pronounced: similarity is improved by more than 53% compared with traditional communication. These results demonstrate the superiority and feasibility of cross-modal semantic communication.
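The Frobenius-norm fusion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the feature shapes, the similarity threshold, the modality weights, or the order in which modalities are compared, so `frobenius_distance`, `fuse_modalities`, `threshold`, and the weight values below are all hypothetical choices made for illustration. The sketch assumes each modality has already been encoded into equally shaped feature matrices in a shared semantic space.

```python
import math


def frobenius_distance(a, b):
    """Frobenius norm of (a - b) for two equally shaped matrices (lists of rows)."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def fuse_modalities(features, weights, threshold=0.1):
    """Drop features that duplicate an already kept one, then weight and sum the rest.

    features:  {modality: [matrix, ...]}  (assumed layout, not from the paper)
    weights:   {modality: float}          (assumed per-modality weights)
    threshold: assumed cut-off below which two features count as "the same"
    """
    kept = []  # (modality, matrix) pairs judged unique so far
    for modality, matrices in features.items():
        for m in matrices:
            if all(frobenius_distance(m, k) > threshold for _, k in kept):
                kept.append((modality, m))
    if not kept:
        raise ValueError("no features to fuse")

    rows, cols = len(kept[0][1]), len(kept[0][1][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for modality, m in kept:
        w = weights[modality]
        for i in range(rows):
            for j in range(cols):
                fused[i][j] += w * m[i][j]
    return fused, kept


# Toy input: the first voice feature duplicates the text feature and is dropped.
features = {
    "text":  [[[1.0, 0.0], [0.0, 1.0]]],
    "voice": [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]],
    "image": [[[3.0, 3.0], [3.0, 3.0]]],
}
weights = {"text": 0.5, "voice": 0.3, "image": 0.2}
fused, kept = fuse_modalities(features, weights)
print([modality for modality, _ in kept])
print(fused)
```

In this toy setup a feature is kept only if it is farther than `threshold` from every feature kept so far, so the result depends on the order in which modalities are visited. A learned fusion network would likely replace both the fixed threshold and the fixed weights.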