An Effective Algorithm for Computing Reducts in Decision Tables
Journal of Computer Science and Cybernetics
https://doi.org/10.15625/1813-9663/38/3/17450
Abstract
Attribute reduction is an important research topic in rough set theory. A reduct of a decision table is a minimal subset of the conditional attributes that provides the same information for classification purposes as the entire set of available attributes. Classification on a high-dimensional decision table can be performed faster if a reduct is used instead of the full attribute set. In this paper, we propose a reduct-computing algorithm based on attribute clustering. The proposed algorithm works in three main stages. In the first stage, irrelevant attributes are eliminated. In the second stage, the relevant attributes are divided into an appropriately selected number of clusters by the Partitioning Around Medoids (PAM) clustering method, using a special metric on the attribute space, the normalized variation of information. In the third stage, the most class-related attribute in each cluster is selected as its representative. The selected attrib...
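The attribute-space metric used in the second stage can be illustrated with a minimal sketch. It assumes discrete-valued attributes and one common normalization of the variation of information, namely VI(X, Y) / H(X, Y) with VI(X, Y) = 2H(X, Y) - H(X) - H(Y); the abstract does not state the exact formula, so this choice, along with the helper names `entropy` and `normalized_vi`, is an assumption for illustration rather than the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def normalized_vi(x, y):
    """Normalized variation of information between two attribute columns.

    Assumed form: (2*H(X,Y) - H(X) - H(Y)) / H(X,Y), which lies in [0, 1].
    0 means the attributes induce the same partition of the objects;
    1 means they are statistically independent.
    """
    if len(x) != len(y):
        raise ValueError("attribute columns must have the same length")
    h_x = entropy(x)
    h_y = entropy(y)
    h_xy = entropy(list(zip(x, y)))  # joint entropy over value pairs
    if h_xy == 0.0:
        # Both attributes are constant: treat them as identical.
        return 0.0
    return (2 * h_xy - h_x - h_y) / h_xy


# Hypothetical toy columns of a decision table with four objects.
a = [0, 0, 1, 1]
b = ["u", "u", "v", "v"]  # same partition as a, different labels
c = [0, 1, 0, 1]          # independent of a

print(normalized_vi(a, b))
print(normalized_vi(a, c))
```

A pairwise matrix of such distances over the relevant attributes is the kind of input a medoid-based method such as PAM operates on, since PAM needs only dissimilarities between items and not coordinates.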