Abstract:
Distortions are common in self-media videos and pose new challenges for video quality assessment. Research on visual perception suggests that the human visual system perceives iteratively: the perceptual evaluation of a video proceeds through repeated forward and backward correction. Inspired by this, this paper introduces the iterative mechanism into video quality assessment and proposes a quality evaluation method based on high-order deep spatio-temporal information. Specifically, the method uses second-order covariance aggregation to extract high-order intra-frame information, introduces a fast iterative GRU structure to model deep inter-frame information, and then applies feature-layer pooling aggregation and multi-layer perceptron regression to obtain the video quality score. Experimental results show that the predictions agree well with human subjective quality scores and that the method significantly outperforms existing no-reference quality assessment algorithms.
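
To make the idea of second-order aggregation concrete, the following is a minimal sketch of generic covariance pooling over one frame's feature map. It is not the paper's implementation: the abstract does not specify the exact formulation, so the function names `covariance_pooling` and `upper_triangle`, the population (divide by N) normalization, and the toy input are all assumptions chosen for illustration.

```python
def covariance_pooling(features):
    """Aggregate a frame's features into second-order statistics.

    features: list of N spatial positions, each a list of C channel values
              (a hypothetical flattened feature map).
    Returns the C x C population covariance matrix across positions.
    """
    n = len(features)
    c = len(features[0])
    # First-order statistics: per-channel mean over all spatial positions.
    means = [sum(row[j] for row in features) / n for j in range(c)]
    # Center every feature vector around the channel means.
    centered = [[row[j] - means[j] for j in range(c)] for row in features]
    # Second-order statistics: average outer product of centered vectors.
    return [
        [sum(v[i] * v[j] for v in centered) / n for j in range(c)]
        for i in range(c)
    ]


def upper_triangle(matrix):
    """Flatten the upper triangle (the matrix is symmetric) into a vector."""
    size = len(matrix)
    return [matrix[i][j] for i in range(size) for j in range(i, size)]


# Toy frame: 3 spatial positions, 2 channels (assumed values).
frame_features = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
cov = covariance_pooling(frame_features)
descriptor = upper_triangle(cov)
```

Under these assumptions, each frame yields a descriptor of C(C+1)/2 values that captures channel co-activation rather than only mean activation; a sequence of such per-frame descriptors is what a recurrent model such as a GRU would then consume for inter-frame modeling.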