Subspace Learning and Virtual Label Regression Based Unsupervised Feature Selection

  • Abstract: The rapid development of information technology has produced large amounts of unlabeled high-dimensional data. To handle such data more effectively, we propose an unsupervised feature selection method based on subspace learning and virtual label (pseudo-label) regression. First, from the viewpoint of matrix factorization, we combine subspace learning and feature selection in a joint framework and constrain the feature selection matrix with an L_{2,1}-norm to guarantee sparsity, so that features are selected while a low-dimensional representation of the original data space is found. Second, we use a regression function to learn the mapping between the feature subspace and the virtual labels; the virtual labels and the regression function then guide the unsupervised feature selection so that the selected features are more discriminative. Finally, we introduce a graph Laplacian to exploit the local structure information hidden in the sample space and the feature space. Experiments on six public datasets show that the proposed method outperforms several state-of-the-art unsupervised feature selection algorithms.
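The abstract names two generic building blocks without giving their formulas: an L_{2,1}-norm on the feature selection matrix and a graph Laplacian that encodes local structure. The sketch below illustrates only these standard definitions, not the paper's objective function or optimization procedure. The function names, the 0/1 k-nearest-neighbor weighting, and the rule of ranking features by row norm are assumptions made for illustration.

```python
import math


def row_norms(W):
    """Euclidean norm of each row of W (a list of lists)."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]


def l21_norm(W):
    """L_{2,1}-norm: the sum of the Euclidean norms of the rows of W."""
    return sum(row_norms(W))


def rank_features(W, k):
    """Indices of the k rows of W with the largest norm.

    Assumption: row i of W corresponds to feature i, so rows driven
    towards zero by the sparsity penalty mark features to discard.
    """
    scores = row_norms(W)
    order = sorted(range(len(W)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def knn_graph_laplacian(X, n_neighbors):
    """Unnormalized Laplacian L = D - S of a symmetric 0/1 kNN graph.

    Assumption: rows of X are the points (samples or features) and
    edges carry weight 1; other weightings such as a heat kernel are
    equally common.
    """
    n = len(X)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted((math.dist(X[i], X[j]), j) for j in range(n) if j != i)
        for _, j in dists[:n_neighbors]:
            S[i][j] = S[j][i] = 1.0
    return [
        [(sum(S[i]) if i == j else 0.0) - S[i][j] for j in range(n)]
        for i in range(n)
    ]
```

For example, with `W = [[3, 4], [0, 0], [1, 0]]` the row norms are 5, 0 and 1, so the second feature (an all-zero row) is the first to be dropped. Building the Laplacian from the rows of a data matrix gives the sample-space graph, and building it from the columns gives the feature-space graph.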

     
