Classifying the different categories of small peptides is a challenging problem in bioinformatics. Machine learning approaches have nonetheless been applied widely in the literature with considerable success. For a classifier to learn well, a small number of informative features is important. This research presents a comparative study of several supervised feature selection methods: Document Frequency (DF), Chi-Squared (χ²), Information Gain (IG), Gain Ratio (GR), ReliefF (RF), and OneR (OR). The corpus of small-peptide data is taken from the ARAPEP repository. A Bayesian classifier is used to classify the categories in this corpus using the features selected by each of these techniques. The results show that RF is the best of the feature selection techniques in terms of classification accuracy and false positive rate, whereas DF and χ² were less effective. The Bayesian classifier proved its worth in this study, with good accuracy and a low false positive rate.

Keywords: Small Peptide Identification, Machine Learning Classifiers, Pattern Recognition
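To make two of the ranking criteria named above concrete, the following is a minimal sketch of how Information Gain and Gain Ratio could be computed for discrete features. It is not the study's implementation: the feature names (`motif_a`, `motif_b`, `motif_c`), the class labels, and the toy values are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """IG = H(class) - H(class | feature) for one discrete feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """GR = IG / H(feature); taken as 0 when the feature never varies."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 1/0 marks presence/absence of a sequence motif.
features = {
    "motif_a": [1, 1, 1, 0, 0, 0],  # separates the two classes perfectly
    "motif_b": [1, 0, 1, 0, 1, 0],  # only weakly related to the class
    "motif_c": [1, 1, 1, 1, 1, 1],  # constant, so it carries no information
}
labels = ["A", "A", "A", "B", "B", "B"]  # hypothetical peptide categories

# Rank features from most to least informative by Information Gain.
ranking = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)

for name in ranking:
    ig = information_gain(features[name], labels)
    gr = gain_ratio(features[name], labels)
    print(f"{name}: IG={ig:.4f} GR={gr:.4f}")
```

In a comparison like the one described, the top-ranked features under each criterion would then be passed to the classifier, and the resulting accuracy and false positive rate compared across criteria.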