ZHANG Hanqi, HUANG Congyu, WANG Jing, et al. Predicting viewports for multi-user panoramic streams using an attention mechanism[J]. Journal of Signal Processing, 2025, 41(2): 302-311. DOI: 10.12466/xhcl.2025.02.009.

Predicting Viewports for Multi-User Panoramic Streams Using an Attention Mechanism

  • With the development of immersive technologies such as virtual reality, the application prospects of panoramic video have steadily expanded. Panoramic video offers a realistic experience but places a heavy load on network bandwidth, so reducing transmission bandwidth has become a research focus, and viewport prediction has emerged as a popular topic in the field. Mainstream viewport prediction methods typically combine viewpoint trajectories with scene content and use neural networks to produce the prediction. Most existing methods perform poorly in long-term prediction and do not fully exploit the information available in multi-user scenarios. This paper proposes a viewport prediction method inspired by Transformer networks. Because different users watching the same video tend to follow similar viewpoint trajectories, the paper first proposes a scheme for comparing multi-user viewport trajectory similarity, which uses the viewport trajectory data of both the target user and historical users to predict the target user’s future viewport trajectory. Because panoramic video viewport trajectories are discontinuous, the paper also applies a mapping to the discontinuous trajectory so that the trajectory data used for a single prediction are no longer discontinuous. The method was evaluated experimentally on a dataset. Comparisons with similar algorithms from recent years show reduced error on the mean absolute error, the Manhattan distance, and the angle distance error proposed in this paper, with some metrics reduced by more than 10%. These results indicate that the proposed method achieves higher accuracy in long-term viewport prediction, and that the attention mechanism and the multi-user similarity comparison both help improve model performance.
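The abstract does not say how the discontinuous trajectory is mapped or how the angle distance error is defined. As an illustration only, the following Python sketch shows one common way to handle both, under assumptions that are not taken from the paper: viewpoints are given as yaw and pitch in degrees, the discontinuity is the yaw wrap-around at ±180°, and the angular error is the great-circle angle between two viewing directions. The function names unwrap_yaw, wrap_yaw, and angular_distance_deg are hypothetical.

```python
import math


def unwrap_yaw(yaw_deg):
    """Turn a yaw sequence in [-180, 180) into a continuous one.

    Assumption (not from the paper): the discontinuity is the jump that
    appears when the viewpoint crosses the +/-180 degree seam.
    """
    out = []
    for i, y in enumerate(yaw_deg):
        if i == 0:
            out.append(float(y))
            continue
        # Smallest signed step from the previous raw sample, in [-180, 180).
        step = (y - yaw_deg[i - 1] + 180.0) % 360.0 - 180.0
        out.append(out[-1] + step)
    return out


def wrap_yaw(yaw):
    """Map any yaw value back into [-180, 180)."""
    return (yaw + 180.0) % 360.0 - 180.0


def angular_distance_deg(yaw1, pitch1, yaw2, pitch2):
    """Great-circle angle in degrees between two viewing directions.

    This is a generic spherical distance, not necessarily the angle
    distance error defined in the paper.
    """
    y1, p1, y2, p2 = map(math.radians, (yaw1, pitch1, yaw2, pitch2))
    cos_d = (math.sin(p1) * math.sin(p2)
             + math.cos(p1) * math.cos(p2) * math.cos(y1 - y2))
    cos_d = max(-1.0, min(1.0, cos_d))  # guard against rounding error
    return math.degrees(math.acos(cos_d))


# A head turn that crosses the seam becomes a smooth ramp.
continuous = unwrap_yaw([170, 179, -178, -170])
```

With this mapping, a model can regress on the continuous yaw values and wrap_yaw can convert its outputs back to the usual range. Measuring error on the sphere avoids penalising a prediction of 179° against a ground truth of -179° as if it were 358° off.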