Range Space Hyperspherical Discriminant Analysis

  • Abstract: Fisher's linear discriminant analysis (LDA) is one of the most widely used linear methods in pattern recognition. In practice, however, the number of samples is small relative to the dimension of the sample space, so the samples are sparsely distributed in the high-dimensional space. Because LDA relies on a Euclidean distance metric, its discriminant vectors are biased toward large between-class distances, which can cause classes that lie close together to be merged. We use a hypersphere model to represent the structural information of data in high-dimensional space and propose a range space hyperspherical discriminant analysis (RHDA) method. RHDA maps the data onto the unit hypersphere of their range space, computes a discriminant subspace for each subclass on that hypersphere, and then computes the distance between a test sample and the subclass mean vector in each discriminant subspace. RHDA assigns the test sample to class i only if its distance to the mean vector of class i is the smallest. Hyperspherical discriminant analysis represents the structural information of a sample vector by its normalized vector on the unit hypersphere; it is aimed at the deviation of the discriminant vectors caused by discriminant analysis based on Euclidean distance. Finally, the paper also presents a kernel version of range space hyperspherical discriminant analysis, which makes it possible to apply different mappings to different data in high-dimensional space. Classification experiments on different databases confirm that RHDA performs better than LDA and its related extensions.
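The pipeline described in the abstract (map to the range space, normalize onto the unit hypersphere, classify by the nearest class mean) can be illustrated with a rough sketch. This is not the paper's RHDA algorithm: the abstract does not give the construction of the per-subclass discriminant subspaces, so that step is replaced here by a plain nearest-class-mean rule on the hypersphere. Centering the data before projection, the SVD-based range space, and all function names (`fit_range_space`, `to_hypersphere`, `fit_class_means`, `predict`) are assumptions made for illustration only.

```python
import numpy as np


def fit_range_space(X, tol=1e-10):
    """Return the training mean and an orthonormal basis of the range space.

    Assumption: the range space is taken as the span of the centered
    training samples, obtained from a thin SVD.
    """
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    # Keep only directions with non-negligible singular values.
    basis = vt[s > tol * s.max()]
    return mean, basis


def to_hypersphere(X, mean, basis):
    """Project samples into the range space and scale them to unit length."""
    Z = (X - mean) @ basis.T
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    # Guard against division by zero for samples that project to the origin.
    return Z / np.maximum(norms, 1e-12)


def fit_class_means(Z, y):
    """Compute one mean vector per class from the normalized samples."""
    classes = np.unique(y)
    means = np.stack([Z[y == c].mean(axis=0) for c in classes])
    return classes, means


def predict(Z, classes, means):
    """Assign each sample to the class whose mean vector is closest.

    Simplification: distances are measured in the shared range space,
    not in per-class discriminant subspaces as in the paper.
    """
    dists = np.linalg.norm(Z[:, None, :] - means[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]
```

With few samples in a high-dimensional space, the range space has at most one dimension fewer than the number of training samples, so the projection also acts as a dimensionality reduction before the normalization step.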

     
