Abstract:
A Fast Multi-label Learning algorithm based on Hashing (HFMLL) is proposed to address the problem that many current multi-label learning algorithms are time-consuming and scale poorly to large data sets. The method combines hashing with multi-label learning. For each unseen instance, HFMLL uses Locality Sensitive Hashing (LSH) to retrieve its neighboring instances, and it computes label correlation by estimating the similarity of labels with a min-wise independent permutations locality sensitive hashing (MinHash) scheme. The maximum a posteriori (MAP) principle is then used to predict the label set of each unseen instance from statistical information gathered over all related labels of its neighboring instances. Experiments show that HFMLL maintains high classification performance: it is significantly faster than state-of-the-art multi-label learning methods while achieving comparable performance, and can therefore be applied to large-scale data sets.
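To illustrate the MinHash step mentioned above, the following is a minimal sketch of how the similarity of two labels might be estimated. It is not the HFMLL implementation: representing each label by the set of training-instance indices that carry it, the salted BLAKE2b hash family, and the function names `minhash_signature` and `estimate_similarity` are all assumptions made for this example.

```python
import hashlib


def _hash(item, salt_index):
    # One member of an illustrative hash family: BLAKE2b with a per-function salt.
    # (Assumption: the abstract does not say which hash family is used.)
    digest = hashlib.blake2b(
        str(item).encode("utf-8"),
        digest_size=8,
        salt=salt_index.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=256):
    """Return the MinHash signature of a non-empty set of items.

    For each of the num_hashes hash functions, keep the smallest hash value
    observed over the set.
    """
    if not items:
        raise ValueError("items must be non-empty")
    return [min(_hash(x, i) for x in items) for i in range(num_hashes)]


def estimate_similarity(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


def jaccard(a, b):
    """Exact Jaccard similarity, used here only to check the estimate."""
    return len(a & b) / len(a | b)


# Hypothetical data: indices of training instances tagged with each label.
label_a = set(range(0, 100))
label_b = set(range(50, 150))

sig_a = minhash_signature(label_a)
sig_b = minhash_signature(label_b)
estimated = estimate_similarity(sig_a, sig_b)
exact = jaccard(label_a, label_b)
```

The probability that two sets share the same minimum under a random hash function equals their Jaccard similarity, so the fraction of matching slots approximates it, and the error shrinks as `num_hashes` grows. Signatures have a fixed length regardless of how many instances carry a label, which is what makes this kind of estimate attractive for large data sets.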