CS Academic Advising Committee

Adaboost Ensemble with Simple Genetic Algorithm for Student Prediction Model

International Journal of Computer Science and Information Technology, 2013

Predicting student performance is a great concern to higher education management. This prediction helps to identify and improve students' performance. Several factors may improve this performance. In the present study, we employ data mining processes, particularly classification, to enhance the quality of the higher educational system. Recently, a new direction has been used to improve classification accuracy by combining classifiers. In this paper, we design and evaluate a fast learning algorithm using an AdaBoost ensemble with a simple genetic algorithm, called "Ada-GA", where the genetic algorithm is demonstrated to successfully improve the accuracy of the combined classifier. The Ada-GA algorithm proved to be of considerable usefulness in identifying students at risk early, especially in very large classes. This early prediction allows the instructor to provide appropriate advising to those students. The Ada-GA algorithm is implemented and tested on the ASSISTments dataset; the results showed that this algorithm successfully improved the detection accuracy and reduced the computational complexity.

Feature Selection and Classification Using CatBoost Method for Improving the Performance of Predicting Parkinson’s Disease

Advances on Smart and Soft Computing, 2020

Several studies have investigated the diagnosis of Parkinson’s disease (PD) using machine learning methods such as support vector machines, neural networks, Naive Bayes and K-nearest neighbor. In addition, different ensemble methods have been used, such as bagging, random forest and boosting. On the other hand, different feature ranking methods have been used to reduce the data dimensionality by selecting the most important features. In this paper, the ensemble methods random forest, XGBoost and CatBoost were used to find the most important features for predicting PD. The effect of these features with different thresholds was investigated in order to obtain the best performance for predicting PD. The results showed that the CatBoost method obtained the best results.
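As a rough illustration of the threshold step mentioned in this abstract, the sketch below keeps only the features whose importance score reaches a given threshold. It is a minimal sketch under stated assumptions, not the paper's procedure: the function name `select_features`, the feature names and the scores are all hypothetical, and in practice the scores would come from a trained ensemble model.

```python
def select_features(importances, threshold):
    """Return the names of features whose importance is >= threshold,
    ordered from most to least important."""
    kept = [(name, score) for name, score in importances.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]


# Hypothetical importance scores, invented for illustration only.
importances = {
    "feature_a": 0.30,
    "feature_b": 0.25,
    "feature_c": 0.20,
    "feature_d": 0.15,
    "feature_e": 0.10,
}

# Try several thresholds and inspect which features survive each one.
for threshold in (0.10, 0.20, 0.30):
    print(threshold, select_features(importances, threshold))
```

Each surviving subset would then be used to retrain and evaluate a classifier, so that the threshold giving the best performance can be chosen.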

Internet of Drones Intrusion Detection Using Deep Learning

Electronics

Flying Ad Hoc Network (FANET) or drone technologies have gained much attention in the last few years due to their critical applications. Therefore, various studies have been conducted on facilitating FANET applications in different fields. In fact, civil airspaces have gradually adopted FANET technology in their systems. However, FANET’s special roles made it complex to support emerging security threats, especially intrusion detection. This paper is a step forward towards the advances in FANET intrusion detection techniques. It investigates FANET intrusion detection threats by introducing a real-time data analytics framework based on deep learning. The framework consists of Recurrent Neural Networks (RNN) as a base. It also involves collecting data from the network and analyzing it using big data analytics for anomaly detection. The data collection is performed through an agent working inside each FANET. The agent is assumed to log the FANET real-time information. In addition, it...

Performance of authorship attribution classifiers with short texts: application of religious Arabic fatwas

International Journal of Data Mining, Modelling and Management, 2020

Although authorship attribution is a well-known problem in the authorship analysis domain, research on Arabic contexts is still limited. In addition, the performance of attribution methods on training sets of short textual documents has not been examined well in other languages, such as English, Chinese, Spanish and Dutch. Therefore, the current work aims at examining the performance of attribution classifiers in the context of short Arabic textual documents. The experimental part of this work is conducted with well-known classifiers, namely the decision tree C4.5 method, the naive Bayes model, the K-NN method, the Markov model, SMO and the Burrows Delta method. We experiment with various feature combinations. The results show that combining the word-based lexical features with the structural features yields the best accuracy. We therefore use this combination as a baseline for further investigation. We also examine the effect of combining the n-gram features. The results indicate that some classifiers show an improvement while the others do not. In addition, the results show that the naive Bayes method gives the highest accuracy among all the attribution classifiers.

Analysis the Arabic Authorship Attribution Using Machine Learning Methods: Application on Islamic Fatwā

In the context of Arabic, the authorship attribution (AA) problem is not addressed well compared with other natural languages such as English, Chinese and Dutch. This paper addresses the attribution problem in the context of Islamic fatwā. To the best of our knowledge, this is the first study of its kind that addresses this problem in such a domain. In terms of attribution methods, three machine-learning classifiers are used, namely the locally weighted learning (LWL) classifier, decision tree C4.5, and Random Forest (RF). The experiment is performed with a selected list of stylometric features. To extract the most discriminating features, various feature selection techniques are used. The experimental results show that the classifiers behave differently with respect to each feature reduction technique. Among the classifiers used, the C4.5 method gives the best accuracy.

Combination of Stylo-based Features and Frequency-based Features for Identifying the Author of Short Arabic Text

International Conference on Intelligent Systems: Theories and Applications, 2018

Authorship verification (AV) is a binary classification task which aims at verifying whether a given text is written by a specific author. For the Arabic language, this task is poorly addressed, especially with short texts. The current study examines the performance of authorship verification in the context of short Arabic documents. The Bagging classifier is applied to two different datasets. First, a balanced dataset is examined with different feature combinations. In terms of authorship features, two feature types are used: stylo-based features (SF) and frequency-based features (FF). Second, the same experiment is conducted with an unbalanced dataset.

The effect of training set size in authorship attribution: application on short arabic texts

International Journal of Electrical and Computer Engineering (IJECE), Feb 1, 2019

Authorship attribution (AA) is a subfield of linguistic analysis that aims to identify the original author among a set of candidate authors. Several research papers have been published, and several methods and models have been developed for many languages. However, the number of related works for Arabic is limited. Moreover, the impact of short text length and training set size is not well addressed. To the best of our knowledge, no published work in this direction is available, even in other languages. Therefore, we propose to investigate this effect, taking into account different stylometric combinations. The Mahalanobis distance (MD), Linear Regression (LR), and Multilayer Perceptron (MP) are selected as AA classifiers. During the experiment, the training dataset size is increased and the accuracy of the classifiers is recorded. The results are quite interesting and show different classifier behaviors. Combining word-based stylometric features with n-grams provides the best accuracy, reaching 93% on average.
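To make the idea of combining word-based stylometric features with n-grams more concrete, here is a minimal sketch of one possible feature extractor. It is an illustration under assumptions, not the feature set used in the paper: the function names, the two word-level measures (average word length and type-token ratio) and the choice of character bigrams are all hypothetical.

```python
from collections import Counter


def word_features(text):
    """Two simple word-based measures: average word length and type-token ratio."""
    words = text.split()
    if not words:
        return {"avg_word_len": 0.0, "type_token_ratio": 0.0}
    return {
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),
    }


def char_ngram_features(text, n=2):
    """Relative frequency of each character n-gram in the text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    return {f"ng:{gram}": count / len(grams) for gram, count in counts.items()}


def combined_features(text, n=2):
    """Merge word-based and n-gram features into one feature dictionary."""
    features = word_features(text)
    features.update(char_ngram_features(text, n))
    return features


# Works on any Unicode string, including Arabic text.
print(combined_features("ab ab"))
```

The resulting dictionary can be turned into a fixed-length vector over a shared vocabulary of feature names before being passed to a classifier.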
