Astha Malviya

CardioCare: A Heart Disease Prediction System using Random Forest Machine Learning Model

Astha Malviya, 2025

Background: Cardiovascular diseases (CVDs) continue to pose a critical public health challenge, contributing to approximately 31% of global deaths, according to the World Health Organization (WHO). Among these, hypertensive and ischemic heart diseases are particularly prevalent and often go undetected until serious complications arise. Traditional diagnostic procedures, though accurate, are often time-consuming, resource-intensive, and inaccessible to many, especially in remote and underserved regions. Moreover, reliance on clinical judgment and subjective interpretation can lead to delayed or missed diagnoses. With the increasing availability of healthcare data and computational tools, machine learning (ML) has emerged as a powerful approach for developing intelligent systems that assist in the early detection and prevention of chronic illnesses. By analysing large volumes of patient data and identifying hidden patterns, ML-based models have the potential to enhance clinical decision-making, reduce diagnostic errors, and improve patient outcomes. However, despite promising research, real-world implementation of such systems remains limited, and many existing models lack proper deployment or validation across diverse populations. Addressing this gap, our study introduces CardioCare, a machine learning-powered heart disease prediction tool that combines high predictive accuracy with practical usability through a web-based interface, bridging the divide between academic models and accessible digital healthcare solutions.

Materials and Methods: The study used the Cleveland Heart Disease dataset from the UCI repository, which contains 14 clinical features from 303 patient records. After cleaning and preprocessing the data through encoding and normalization, the dataset was split into training and testing sets (80:20). Four supervised machine learning models were applied: Logistic Regression, K-Nearest Neighbours, Support Vector Machine (SVM), and Random Forest. Performance was evaluated using accuracy, precision, recall, and F1-score. The Random Forest classifier outperformed the others, achieving an accuracy of 85%. To support practical usability, the final model was deployed through a Flask-based web application for real-time heart disease risk prediction.

Results: Among the evaluated models, the Random Forest classifier achieved the highest performance, with an accuracy of 85%, precision of 83%, recall of 84%, and F1-score of 83.5%. It outperformed Logistic Regression, K-Nearest Neighbours, and SVM in both consistency and predictive capability, and it generalized well to the unseen test data. These results support the model's effectiveness in classifying the presence of heart disease from key clinical attributes.
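The four evaluation metrics named in the abstract are linked by simple formulas, and the reported F1-score can be checked against the reported precision and recall. The sketch below is illustrative only: the function names `f1_score` and `metrics_from_counts` and the confusion-matrix counts in the usage lines are hypothetical and are not taken from the study, which does not report its confusion matrix.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Consistency check using the precision and recall reported in the abstract.
reported_f1 = f1_score(0.83, 0.84)
print(f"F1 from reported precision/recall: {reported_f1:.3f}")  # 0.835

# Hypothetical counts, for illustration only (not the study's confusion matrix).
example = metrics_from_counts(tp=8, fp=2, fn=2, tn=8)
print(example)
```

With precision 0.83 and recall 0.84, the harmonic mean rounds to 0.835, which agrees with the reported F1-score of 83.5%.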
