Abstract:
Classification of omics data suffers from high error rates because such data are high-dimensional and have small sample sizes. To overcome this problem, this paper proposes an ensemble feature selection method for omics data classification based on constrained niching binary particle swarm optimization (PSO). Specifically, the binary PSO identifies optimal feature subsets in terms of classification accuracy. The proposed method introduces a constraint on the particle encoding to limit the number of selected features, and applies a niching technique from multimodal optimization so that the algorithm obtains multiple diverse feature subsets in a single run. Afterward, multiple base classifiers built on the obtained feature subsets are combined into a stronger classifier, which is then used to classify the omics data. Experimental results on real-world omics datasets demonstrate that the proposed feature selection method can stably select compact feature subsets and obtain promising classification performance.
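As a rough illustration of two of the steps described above, the following minimal sketch shows one way a binary particle could be kept within a feature budget and how the predictions of several base classifiers could be combined. It is not the paper's implementation: the abstract does not specify the encoding, the constraint mechanism, or the combination rule, so the sigmoid sampling, the "keep the most probable bits" repair step, the majority vote, and the names `constrained_position`, `majority_vote`, and `max_features` are all assumptions made for illustration. The niching step and the fitness evaluation are omitted.

```python
import math
import random
from collections import Counter


def sigmoid(v):
    """Map a velocity component to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def constrained_position(velocity, max_features, rng):
    """Sample a binary feature mask and keep at most max_features bits set.

    Assumption: the constraint is enforced by a repair step that retains
    the selected features with the highest sampling probability.
    """
    probs = [sigmoid(v) for v in velocity]
    bits = [1 if rng.random() < p else 0 for p in probs]
    selected = [i for i, b in enumerate(bits) if b]
    if len(selected) > max_features:
        ranked = sorted(selected, key=lambda i: probs[i], reverse=True)
        keep = set(ranked[:max_features])
        bits = [1 if i in keep else 0 for i in range(len(bits))]
    return bits


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    Assumption: base classifiers are combined by simple majority vote.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


if __name__ == "__main__":
    rng = random.Random(0)
    mask = constrained_position([0.5] * 50, max_features=5, rng=rng)
    print(sum(mask), "features selected out of", len(mask))
    # Three hypothetical base classifiers, each predicting three samples.
    print(majority_vote([[0, 1, 1], [0, 0, 1], [1, 1, 1]]))
```

In this sketch each niche would contribute one mask, one base classifier would be trained on the features each mask selects, and `majority_vote` would merge their outputs; an odd number of base classifiers avoids ties in the binary case.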