Theoretical Developments and Applications of the RPCL Algorithm
Abstract: Conventional clustering analysis generally requires the correct or an appropriate number of clusters for the given dataset to be chosen in advance; otherwise, the clustering algorithm cannot produce a reasonable result. Competitive learning (CL) algorithms face the same problem when applied to clustering. In practice, however, inferring the actual number of clusters (or of competitive units in a CL algorithm) for a general dataset is very difficult. The Rival Penalized Competitive Learning (RPCL) algorithm offers an effective idea and method for addressing this difficulty. Starting from a deliberately overestimated number of clusters, it introduces a rival-penalizing mechanism into conventional competitive learning, so that an appropriate number of cluster centers is automatically allocated to the data while the extra centers are pushed toward infinity or far away from the data. This distinctive idea and method has opened a new avenue for clustering analysis. This paper reviews the theoretical developments of the RPCL algorithm, including the origin of its idea, its theoretical foundations, and its generalizations and variants for different settings and types of data, and summarizes its applications in various fields.
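The rival-penalizing mechanism described in the abstract can be illustrated with a minimal sketch. The abstract does not state the update rule, so everything below is an assumption based on the commonly described form of RPCL: for each sample the winner is the unit with the smallest frequency-weighted squared distance, the runner-up is treated as the rival, the winner moves toward the sample, and the rival is pushed away at a much smaller rate. The names `rpcl_step` and `rpcl_fit` and the parameters `lr_win` and `lr_rival` are hypothetical and chosen only for illustration.

```python
import random


def rpcl_step(centers, wins, x, lr_win=0.05, lr_rival=0.002):
    """Apply one assumed RPCL update for sample x (mutates centers and wins).

    Returns the indices (winner, rival). Requires at least two centers.
    """
    total = sum(wins)
    scores = []
    for c, n in zip(centers, wins):
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        # Weight by relative winning frequency so frequent winners are handicapped.
        scores.append((n / total) * d2)
    order = sorted(range(len(centers)), key=scores.__getitem__)
    winner, rival = order[0], order[1]
    # Winner moves toward the sample, as in ordinary competitive learning.
    centers[winner] = [ci + lr_win * (xi - ci) for xi, ci in zip(x, centers[winner])]
    # Rival is pushed away from the sample (the penalty), with lr_rival << lr_win.
    centers[rival] = [ci - lr_rival * (xi - ci) for xi, ci in zip(x, centers[rival])]
    wins[winner] += 1
    return winner, rival


def rpcl_fit(data, k, epochs=50, lr_win=0.05, lr_rival=0.002, seed=0):
    """Run the assumed RPCL updates with k deliberately overestimated."""
    rng = random.Random(seed)
    # Initialise the k centers on randomly chosen data points.
    centers = [list(p) for p in rng.sample(data, k)]
    wins = [1] * k
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            rpcl_step(centers, wins, x, lr_win, lr_rival)
    return centers, wins
```

In this sketch, a center that keeps ending up as the rival drifts steadily away from the data and can be discarded afterwards, for example by dropping units whose win count stays near its initial value. How the learning rates and the pruning criterion should be set is not covered by the abstract and is left open here.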