Broad Learning System Based on Generalized Maximum Correntropy Criterion
Abstract: The broad learning system (BLS) is a discriminative learning method proposed in recent years. It has a simple structure and trains quickly, and it has been widely applied to regression and classification problems. However, the standard BLS is derived under the minimum mean square error (MMSE) criterion, which is highly sensitive to outliers and therefore degrades the accuracy of the system. To improve the robustness of BLS, a BLS based on the maximum correntropy criterion (MCC), denoted C-BLS, has been proposed. Compared with the MMSE criterion, the MCC captures more higher-order information about the error, so C-BLS is robust to outliers. However, the kernel function in correntropy is fixed to the Gaussian kernel by default, which is not suitable for the vast majority of cases. This paper introduces the generalized correntropy, which uses the generalized Gaussian density (GGD) function as its kernel, applies the generalized maximum correntropy criterion (GMCC) to BLS, and proposes a new robust algorithm (GC-BLS). The GGD function is more flexible than the Gaussian kernel, which can be regarded as a special case of it. With an appropriate choice of parameters, GC-BLS degenerates to C-BLS, so the new algorithm can achieve performance at least comparable to that of C-BLS. In the experiments, the root mean square error (RMSE) is used as the evaluation metric, and the new algorithm is tested on regression data sets and time series data sets. In the vast majority of cases, GC-BLS achieves a smaller RMSE than the other algorithms. The experiments show that the algorithm is very stable. The simulation results confirm the theoretical expectations and verify the performance of the new algorithm.
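The relationship between the GGD kernel and the Gaussian kernel described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used GGD parameterization with a shape parameter `alpha` and a scale parameter `beta`; the function names, parameter names, and sample error values below are illustrative assumptions and are not taken from the paper. Under this parameterization, setting `alpha = 2` and `beta = sqrt(2) * sigma` recovers the Gaussian density with standard deviation `sigma`.

```python
import math


def ggd_kernel(e, alpha, beta):
    """Generalized Gaussian density kernel (assumed parameterization).

    alpha > 0 is the shape parameter, beta > 0 is the scale parameter.
    """
    norm = alpha / (2.0 * beta * math.gamma(1.0 / alpha))
    return norm * math.exp(-abs(e / beta) ** alpha)


def gaussian_kernel(e, sigma):
    """Standard Gaussian density with standard deviation sigma."""
    return math.exp(-e * e / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def generalized_correntropy(errors, alpha, beta):
    """Sample estimate of generalized correntropy: mean kernel value over the errors."""
    return sum(ggd_kernel(e, alpha, beta) for e in errors) / len(errors)


def mean_squared_error(errors):
    """Mean squared error, the quantity minimized under the MMSE criterion."""
    return sum(e * e for e in errors) / len(errors)


if __name__ == "__main__":
    sigma = 1.5
    # With alpha = 2 and beta = sqrt(2) * sigma, the GGD kernel equals the Gaussian kernel.
    for e in (0.0, 0.7, -2.3):
        print(e, ggd_kernel(e, 2.0, math.sqrt(2.0) * sigma), gaussian_kernel(e, sigma))

    # Hypothetical error samples: the second list adds one large outlier.
    clean = [0.1, -0.2, 0.05, 0.15]
    with_outlier = clean + [50.0]
    print("MSE:", mean_squared_error(clean), mean_squared_error(with_outlier))
    print("GC :", generalized_correntropy(clean, 2.0, 1.0),
          generalized_correntropy(with_outlier, 2.0, 1.0))
```

Because each kernel value is bounded, a single large error can change the correntropy estimate only by a limited amount, whereas it enters the mean squared error quadratically. This is one way to read the robustness argument in the abstract; the training procedure of GC-BLS itself is not shown here.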