Omnidirectional Image Quality Assessment Algorithm Based on Stereo Perception

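  • Summary: No-reference omnidirectional image quality assessment aims to objectively measure the quality of omnidirectional images as perceived by humans, without relying on quality information from the original image. With the rapid development of virtual reality technology, omnidirectional image quality assessment has become increasingly important. However, existing algorithms still have limitations: they do not simulate the observer's browsing process well, and they do not effectively account for the viewer's stereoscopic perception. These limitations seriously reduce assessment accuracy. To address this, this paper proposes a no-reference omnidirectional image quality assessment algorithm based on the interaction of immersive stereoscopic perception and viewport perception. First, a viewport extraction strategy is designed that extracts feature viewpoints on the spherical domain and selects those with a high probability of being observed. Viewport content is extracted for the selected viewpoints, and the viewport contents are fed in parallel into a feature encoder to extract multi-scale viewport features. Next, because current ways of exchanging information among multiple viewports remain limited, a viewport feature interaction module is proposed to perform cross-viewport information interaction over the input viewport contents. Finally, the paper explores obtaining stereoscopic information from the entire omnidirectional image, without viewport sampling, in order to model the stereoscopic perception process and improve overall assessment performance. Experimental results demonstrate the effectiveness of the proposed algorithm: compared with current state-of-the-art quality assessment algorithms, the Spearman Rank-Order Correlation Coefficient (SROCC) and the Pearson Linear Correlation Coefficient (PLCC) improve by 0.72% and 0.70%, respectively, on the public CVIQD dataset, and by 1.10% and 0.54%, respectively, on the OIQA dataset.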
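The two reported metrics, SROCC and PLCC, compare predicted quality scores with subjective scores. The following is a minimal sketch of how they can be computed, using only the Python standard library. The score lists are hypothetical and are not taken from the paper. Quality assessment studies often fit a nonlinear mapping to the predictions before computing PLCC; that step is omitted here and is not confirmed by the source.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """SROCC: the Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical predicted scores and mean opinion scores (not from the paper).
predicted = [62.1, 48.3, 75.0, 30.2, 55.7]
mos = [60.0, 50.0, 80.0, 25.0, 52.0]
print(f"PLCC  = {pearson(predicted, mos):.4f}")
print(f"SROCC = {spearman(predicted, mos):.4f}")
```

SROCC depends only on the ordering of the scores, so it measures prediction monotonicity, while PLCC is sensitive to how linearly the predictions track the subjective scores.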

     

    Abstract: Blind omnidirectional image quality assessment (BOIQA) aims to objectively assess the human-perceived quality of omnidirectional images without relying on quality information from the original image. As virtual reality (VR) technology continues to evolve, BOIQA is becoming increasingly important. However, existing omnidirectional image quality assessment algorithms have limitations: they do not accurately simulate a viewer's browsing behavior, and they do not effectively incorporate the viewer's stereoscopic perception process. These limitations reduce the accuracy of omnidirectional image quality assessment. To solve this problem, this paper proposes an omnidirectional image quality assessment algorithm based on the interaction of viewport perception and immersive stereoscopic perception. Specifically, the proposed method uses the SPHORB algorithm as the basis of a systematic viewport extraction strategy. With this algorithm, a large number of salient viewpoints are extracted from the spherical domain. From these feature-rich viewpoints, a selection procedure retains the final 20 key viewpoints. The selected viewpoints serve as the coordinates at which viewports are sampled, and they cover the regions most likely to attract human visual attention. After viewport filtering and sampling, the extracted viewport contents are fed in parallel into the feature encoder to extract viewport-specific features. Because both shallow and deep features matter for quality score prediction, multi-scale features are extracted for each viewport, enriching the perceptual feature space.
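To illustrate the viewport sampling step, the sketch below maps the pixels of a square viewport back to positions in an equirectangular panorama using the inverse gnomonic (rectilinear) projection. It is an assumption-laden illustration, not the paper's implementation: the abstract does not state the projection, field of view, or viewport size; centring each viewport on its viewpoint is also assumed; and the viewpoint list is hypothetical, since SPHORB keypoint detection and the selection of the 20 viewpoints are not reproduced here.

```python
import math

def viewport_sampling_grid(lon0, lat0, fov_deg, size, pano_w, pano_h):
    """Return a size x size grid of (col, row) positions in an equirectangular
    image for a square viewport centred at (lon0, lat0), given in radians."""
    half = math.tan(math.radians(fov_deg) / 2)  # half-extent on the tangent plane
    grid = []
    for r in range(size):
        # y points up; pixel centres are spread over [-half, half].
        y = half * (1 - 2 * (r + 0.5) / size)
        line = []
        for c in range(size):
            x = half * (2 * (c + 0.5) / size - 1)
            rho = math.hypot(x, y)
            if rho == 0.0:
                lon, lat = lon0, lat0
            else:
                ang = math.atan(rho)  # angular distance from the centre
                sin_a, cos_a = math.sin(ang), math.cos(ang)
                s = cos_a * math.sin(lat0) + y * sin_a * math.cos(lat0) / rho
                lat = math.asin(max(-1.0, min(1.0, s)))
                lon = lon0 + math.atan2(
                    x * sin_a,
                    rho * math.cos(lat0) * cos_a - y * math.sin(lat0) * sin_a,
                )
            # Longitude wraps around the panorama; latitude is clamped.
            col = ((lon / (2 * math.pi) + 0.5) * pano_w) % pano_w
            row = min(max((0.5 - lat / math.pi) * pano_h, 0.0), pano_h - 1)
            line.append((col, row))
        grid.append(line)
    return grid

# Hypothetical viewpoints (longitude, latitude) in degrees. In the paper they
# would come from SPHORB keypoints, which this sketch does not reproduce.
viewpoints_deg = [(0.0, 0.0), (90.0, 30.0), (-120.0, -45.0)]
for lon_deg, lat_deg in viewpoints_deg:
    grid = viewport_sampling_grid(
        math.radians(lon_deg), math.radians(lat_deg), 90.0, 3, 2048, 1024
    )
    col, row = grid[1][1]  # centre pixel of the 3 x 3 grid
    print(f"viewpoint ({lon_deg:6.1f}, {lat_deg:5.1f}) -> pixel ({col:7.1f}, {row:6.1f})")
```

The returned positions are fractional, so producing the actual viewport image would additionally require interpolating the panorama at those positions (for example, bilinearly).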
However, current methods for exchanging information among multiple viewport contents have limitations. To address this, we introduce a viewport feature interaction module that performs cross-viewport information exchange among the input viewport contents. Within the module, the self-attention mechanism of the Transformer computes the correlation between each viewport and every other viewport, capturing the interdependencies among viewports and thereby modeling the information interaction across the input viewport sequence. Furthermore, this paper explores acquiring stereoscopic information from the entire omnidirectional image, without viewport sampling, to model the stereoscopic perception process. In this module, features are extracted directly on the spherical surface, which avoids the geometric distortions caused by projecting panoramic images onto a plane. This approach also allows continuous perceptual features to be extracted from the spherical surface, which supports the construction of a sense of stereopsis. By assessing the immersive stereoscopic viewing experience, the algorithm further improves assessment accuracy. The experimental results validate the effectiveness of the proposed algorithm. Compared with current state-of-the-art quality assessment algorithms, the Spearman Rank-Order Correlation Coefficient (SROCC) and Pearson Linear Correlation Coefficient (PLCC) both improve. On the public CVIQD dataset, SROCC and PLCC improve by 0.72% and 0.70%, respectively; on the OIQA dataset, they improve by 1.10% and 0.54%, respectively.
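The cross-viewport interaction described above relies on Transformer self-attention. The sketch below shows a single-head scaled dot-product self-attention over one feature vector per viewport, in plain Python. It is a simplified illustration under stated assumptions: the feature vectors and the identity projection matrices are hypothetical, and the multi-head structure, learned projections, residual connections, and normalization of a full Transformer layer are omitted. The paper's actual module design is not specified in the abstract.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax of one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    tokens: N x D list, one feature vector per viewport.
    Returns (outputs, weights), where weights[i][j] is how strongly
    viewport i attends to viewport j.
    """
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

# Hypothetical 2-D features for three viewports, with identity projections.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = self_attention(tokens, identity, identity, identity)
for i, row in enumerate(weights):
    print(f"viewport {i} attends with weights {[round(w, 3) for w in row]}")
```

Each output vector is a weighted mix of all viewport features, so every viewport's representation is informed by the others, which is the kind of cross-viewport exchange the module is intended to provide.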

     
