Comparative analysis of machine learning methods for analyzing security practice in electronic health records’ logs
2020 IEEE International Conference on Big Data (Big Data), 2020
Electronic health record (EHR) systems see broad, numerous and erratic accesses through self-authorization and "break the glass" scenarios. These exist to fulfil the availability aspect of the CIA triad (confidentiality, integrity, availability), given the time-sensitive nature of healthcare, especially during health emergencies. Adversaries can use this as an opportunity to illegitimately access patients' records, thereby compromising the entire EHR system. To avert this, a comparative analysis of machine learning classification methods was conducted with simulated EHR logs. The methods compared are Multinomial Naive Bayes (multnb), Bernoulli Naive Bayes (bernnb), Support Vector Machine (svm), Neural Network (nn), K-Nearest Neighbours (knn), Logistic Regression (lr), Random Forest (rf), and Decision Tree (dt). The experimental results show that all of the machine learning models used in this work performed very well on the role classification task, but Decision Tree (dt) and Random Forest (rf) obtained the best result among all of the methods, with the same accuracy of 0.889 on all three datasets. For the anomaly detection task, our proposed approach generally obtained high recall and accuracy but low precision and F1-score. The Soft Classification approach performed better than the Hard Classification approach. The best performance was achieved with Bernoulli Naive Bayes on non-normalised data, with an F1-score of 0.893.
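The combination of high recall and accuracy with low precision and F1-score is typical when anomalies are rare among many legitimate accesses. A minimal Python sketch of how the four metrics relate is given here. The function name `classification_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical scenario (not the paper's data): 10,000 log entries,
# of which only 20 are true anomalies. The detector flags 198 entries
# and catches 18 of the 20 anomalies.
metrics = classification_metrics(tp=18, fp=180, fn=2, tn=9800)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, accuracy and recall are high because most entries are correctly left unflagged and most anomalies are caught, while precision and F1 are low because false alarms far outnumber true detections.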
Papers by Prosper Yeng