Dual-Branch No-Reference Quality Assessment Method for Screen Content Videos Based on Multi-Dimensional Spatiotemporal Features
Abstract
The widespread adoption of smart devices has led to the extensive use of screen content videos in fields such as remote education and live streaming, making the quality assessment of these videos crucial for ensuring a satisfactory visual experience. Unlike natural scene videos, screen content contains many synthetic elements, such as text and graphics, resulting in more complex distortion types. There is therefore a need for a no-reference quality assessment model that aligns with human visual characteristics. However, existing methods struggle to handle high dynamic range and composite distortions effectively, and the high redundancy and strong temporal dependencies in video data limit both feature extraction efficiency and the accuracy of quality perception. To address these challenges, this study proposes a dual-branch architecture for no-reference screen content video quality assessment. For complex distortion patterns, we construct a spatial perception branch that extracts spatial structural information and noise distribution from key frames. To reduce video redundancy and suppress shallow dependencies, we introduce a tube-based masked spatiotemporal encoding mechanism that captures deeper motion features. To address the difficulties of temporal modeling, we design a temporal perception enhancement module that integrates multi-dimensional features to generate the final quality score. Experiments show that our method improves the weighted Spearman Rank-Order Correlation Coefficient (SROCC) by 2.3% over the second-best model on two mainstream datasets, significantly enhancing both perceptual consistency and generalization capability in screen content video quality assessment.
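To make the tube-based masking idea concrete, the sketch below shows the generic form of tube masking as popularized by VideoMAE-style encoders: the same randomly chosen spatial patches are hidden in every frame, forming "tubes" along the temporal axis so the encoder cannot trivially copy co-located content from neighboring frames. This is an illustrative sketch of the general technique, not the paper's exact masking scheme; the function name `tube_mask` and all parameter values are assumptions for illustration.

```python
import random

def tube_mask(num_frames, num_patches, mask_ratio=0.75, seed=0):
    """Generic tube masking sketch (not the paper's exact scheme).

    Samples one set of spatial patch indices and masks those same
    positions in every frame, so masked regions form temporal tubes.
    Returns a list of per-frame boolean masks (True = masked).
    """
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    masked = set(rng.sample(range(num_patches), num_masked))
    # Identical spatial pattern in every frame -> temporal "tubes".
    return [[p in masked for p in range(num_patches)] for _ in range(num_frames)]

mask = tube_mask(num_frames=8, num_patches=196, mask_ratio=0.75)
assert all(row == mask[0] for row in mask)  # same patches hidden in each frame
assert sum(mask[0]) == 147                  # 75% of 196 patches are masked
```

Because entire tubes are removed, the surviving tokens carry complementary motion cues across frames, which is why this style of masking both cuts redundancy and encourages deeper temporal feature learning.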
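The SROCC reported above measures how well the model's predicted quality scores preserve the rank order of subjective mean opinion scores (MOS). A minimal sketch of the standard no-ties formula, with hypothetical score values chosen purely for illustration:

```python
def srocc(x, y):
    """Spearman rank-order correlation, assuming no tied values:
    rank both sequences, then apply 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

mos  = [62.1, 48.3, 75.0, 55.4, 81.2]   # hypothetical subjective MOS values
pred = [60.5, 50.1, 73.8, 49.0, 79.9]   # hypothetical model predictions
print(srocc(mos, pred))  # → 0.9
```

A value near 1 means the model orders videos by quality almost exactly as human viewers do, which is why SROCC is the standard consistency metric in video quality assessment.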