Explainable extreme boosting model for breast cancer diagnosis

International Journal of Electrical and Computer Engineering (IJECE)

https://doi.org/10.11591/IJECE.V13I5.PP5764-5769

Abstract

This study investigates Shapley additive explanations (SHAP) of an extreme boosting (XGBoost) model for breast cancer diagnosis. The study uses the Wisconsin breast cancer dataset, in which each sample is characterized by 30 features extracted from an image of a breast cell. The SHAP module generates explainer values that represent the impact of each feature on the diagnosis. The experiment computes SHAP values for the 569 samples in the dataset. The SHAP explanation indicates that perimeter and concave points have the highest impact on breast cancer diagnosis. SHAP thus explains the XGBoost model's diagnostic outcome by showing which features affect its predictions. The developed XGBoost model achieves an accuracy of 98.42%.
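To make the idea of a per-feature "impact" concrete, the following is a minimal sketch of exact Shapley value computation on a toy scoring function. It is not the authors' XGBoost model or the SHAP library's tree algorithm. The function `toy_model`, its coefficients, the input `x`, and the all-zero `baseline` are hypothetical and chosen only for illustration; the feature names echo those in the Wisconsin dataset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to baseline.

    Features absent from a coalition take their baseline value.
    The cost is exponential in the number of features, so this is
    only practical for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Coalition features come from x, all others from the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy score over three features (not the paper's model).
FEATURES = ["perimeter", "concave_points", "texture"]


def toy_model(z):
    perimeter, concave_points, texture = z
    return (0.5 * perimeter + 2.0 * concave_points + 0.1 * texture
            + 0.3 * perimeter * concave_points)


x = [2.0, 1.0, 3.0]          # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
for name, v in zip(FEATURES, phi):
    print(f"{name}: {v:+.3f}")
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additive property that gives SHAP its name. In this toy example the interaction term between perimeter and concave points is split equally between the two features.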

Int J Elec & Comp Eng, Vol. 13, No. 5, October 2023: 5764-5769

ISSN: 2088-8708