Learning k for kNN Classification

Shichao Zhang; Xuelong Li; Ming Zong; Xiaofeng Zhu; Debo Cheng

doi:10.1145/2990508

2017, ACM Transactions on Intelligent Systems and Technology

https://doi.org/10.1145/2990508

19 pages

Abstract

The K Nearest Neighbor (kNN) method has been widely used in data mining and machine learning applications because of its simple implementation and strong performance. However, assigning the same k value to all test data, as previous kNN methods do, has been shown to make these methods impractical in real applications. This article proposes learning a correlation matrix that reconstructs test data points from training data so that different k values can be assigned to different test data points; the resulting method is referred to as Correlation Matrix kNN (CM-kNN) classification. Specifically, a least-squares loss function is employed to minimize the error of reconstructing each test data point from all training data points. A graph Laplacian regularizer is then used to preserve the local structure of the data during reconstruction. Moreover, an ℓ1-norm regularizer and an ℓ2,1-norm regularizer are applied to learn different k values for different test data and to result ...
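To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' CM-kNN implementation. It keeps only the least-squares loss and the ℓ1-norm regularizer, handles one test point at a time, and omits the graph Laplacian and ℓ2,1-norm terms. The function names, the ISTA solver, the parameter `lam`, and the rule "k = number of nonzero reconstruction weights" are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_reconstruction_weights(X_train, x_test, lam=0.1, n_iter=500):
    """Approximately solve min_w 0.5 * ||X_train.T @ w - x_test||^2 + lam * ||w||_1.

    Uses plain ISTA (an illustrative solver choice, not taken from the article).
    X_train has shape (n_samples, n_features); x_test has shape (n_features,).
    """
    A = X_train.T                       # columns are training points
    L = np.linalg.norm(A, 2) ** 2       # Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - x_test)   # gradient of the least-squares term
        w = soft_threshold(w - grad / L, lam / L)
    return w


def predict_with_learned_k(X_train, y_train, x_test, lam=0.1):
    """Pick k from the sparsity of the reconstruction, then run a plain kNN vote.

    Assumption: k is the number of nonzero weights (at least 1).
    """
    w = sparse_reconstruction_weights(X_train, x_test, lam)
    k = max(int(np.count_nonzero(np.abs(w) > 1e-8)), 1)
    dists = np.linalg.norm(X_train - x_test, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)], k
```

Calling `predict_with_learned_k(X_train, y_train, x_test)` returns a predicted label together with the k chosen for that particular test point, so different test points can end up with different k values. A larger `lam` gives a sparser reconstruction and therefore a smaller k.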
