Omnidirectional Image Quality Assessment Algorithm Based on Stereo Perception
-
Abstract
Blind omnidirectional image quality assessment (BOIQA) aims to objectively predict the human-perceived quality of omnidirectional images without access to the original (reference) image. As virtual reality (VR) technology evolves, BOIQA is becoming increasingly important. However, existing omnidirectional image quality assessment algorithms have two main limitations: they do not accurately simulate a viewer's browsing behavior, and they do not effectively incorporate the viewer's stereoscopic perception process. These limitations reduce their accuracy. To address this problem, this paper proposes an omnidirectional image quality assessment algorithm based on the interaction of viewport perception and immersive stereoscopic perception. Specifically, the proposed method uses the SPHORB algorithm as a systematic approach to viewport extraction. With this algorithm, a large number of significant viewpoints are extracted in the spherical domain. From these feature-rich viewpoints, a selection procedure then retains the final 20 key viewpoints. The selected viewpoints serve as the coordinates at which viewports are sampled, and they cover the regions most likely to attract human visual attention. After viewport content filtering and sampling, the extracted contents are concurrently fed into the feature encoder to extract viewport-specific features. Because both shallow and deep features matter for quality score prediction, multi-scale features are extracted for each viewport, which enriches the perceptual feature space.
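The viewpoint selection step described above can be illustrated with a minimal sketch. The abstract does not state how the final 20 viewpoints are chosen, so everything here is an assumption made for illustration: candidates are taken to be (longitude, latitude, response) triples such as a spherical keypoint detector might return, selection is a greedy pick by descending response, and a hypothetical minimum angular separation `min_sep` keeps the chosen viewports from overlapping. SPHORB detection itself is not implemented.

```python
import math


def angular_distance(p, q):
    """Great-circle angle (radians) between two (lon, lat) points given in radians."""
    lon1, lat1 = p
    lon2, lat2 = q
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    # Clamp against floating-point drift before calling acos.
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def select_viewport_centers(candidates, k=20, min_sep=math.radians(20)):
    """Greedily keep up to k candidates, strongest response first.

    candidates: list of (lon, lat, response) triples (hypothetical detector output).
    min_sep:    assumed minimum angular spacing between kept centers.
    """
    chosen = []
    for lon, lat, _ in sorted(candidates, key=lambda c: c[2], reverse=True):
        # Skip a candidate that lies too close to an already chosen center.
        if all(angular_distance((lon, lat), c) >= min_sep for c in chosen):
            chosen.append((lon, lat))
            if len(chosen) == k:
                break
    return chosen
```

For example, `select_viewport_centers(candidates, k=20)` would return at most 20 (lon, lat) centers, each of which could then be used to render one viewport from the sphere. The separation threshold trades coverage against concentration on the strongest responses; the method in the paper may use a different criterion.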
However, existing methods for exchanging information among multiple viewport contents are limited. To address this, we introduce a viewport feature interaction module that enables cross-viewport information exchange among the input viewport contents. Within this module, the self-attention mechanism of the Transformer computes the correlation between each viewport and every other viewport, capturing the interdependencies among viewports and thereby modeling the information interaction across the input viewport sequence. Furthermore, this paper explores acquiring stereoscopic information from the entire omnidirectional image, without viewport sampling, in order to model the stereoscopic perception process. Here, features are extracted directly on the spherical surface, which avoids the geometric distortions caused by projecting panoramic images onto a plane. This approach also allows continuous perceptual features to be extracted from the spherical surface, which supports the construction of a sense of stereopsis. By additionally assessing the immersive stereoscopic viewing experience, the algorithm further improves evaluation accuracy. Experimental results validate the effectiveness of the proposed algorithm. Compared with current state-of-the-art quality assessment algorithms, it improves both the Spearman Rank-Order Correlation Coefficient (SROCC) and the Pearson Linear Correlation Coefficient (PLCC). Specifically, on the publicly available CVIQD dataset, SROCC and PLCC improve by 0.72% and 0.70%, respectively; on the OIQA dataset, they improve by 1.10% and 0.54%, respectively.
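To make the cross-viewport interaction concrete, the following is a minimal sketch of scaled dot-product self-attention over a sequence of viewport feature vectors. It is a simplification under stated assumptions, not the module used in the paper: the learned query, key, and value projections of a Transformer layer are omitted (each feature vector plays all three roles), there is a single head, and the function name `viewport_self_attention` is hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def viewport_self_attention(features):
    """Single-head self-attention without learned projections (assumption).

    features: list of N viewport feature vectors, each of length d.
    Returns (weights, mixed):
      weights[i][j] is how strongly viewport i attends to viewport j;
      mixed[i] is the attention-weighted combination of all viewport features.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    weights = []
    for q in features:
        # Correlation of this viewport with every viewport, itself included.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
        weights.append(softmax(scores))
    mixed = [
        [sum(w * v[j] for w, v in zip(row, features)) for j in range(d)]
        for row in weights
    ]
    return weights, mixed
```

Each row of `weights` sums to one, so every output vector in `mixed` is a convex combination of the input viewport features; this is the sense in which information from all viewports flows into each viewport's representation.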
-
-