Data clustering, a means of gaining insight into data in various fields, is a challenging task. In this paper, an Enhanced Quantum-Inspired Evolutionary Fuzzy C-Means (EQIE-FCM) algorithm is proposed for data clustering. In EQIE-FCM, quantum computing concepts are combined with the FCM algorithm to improve the clustering process by evolving the clustering parameters. The improved clustering process in turn improves the quality of the clustering results. To validate the quality of the results achieved by the proposed EQIE-FCM approach, its performance is compared with other quantum-based fuzzy clustering approaches and with other evolutionary clustering approaches. To evaluate these approaches, extensive experiments are carried out on various benchmark datasets and on a protein database that comprises four superfamilies. The results indicate that the proposed EQIE-FCM approach finds the optimal values of the fitness function and the fuzzifier parameter for the reported datasets. In addition, it finds the optimal number of clusters and more accurate locations of the initial cluster centers for these benchmark datasets. Thus, it can be regarded as a more efficient approach for data clustering.
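For readers unfamiliar with the FCM step that EQIE-FCM builds on, the following is a minimal sketch of one standard fuzzy c-means iteration in plain Python. It is not the authors' EQIE-FCM code: the function name `fcm_step`, the toy points, and the starting centers are assumptions chosen for illustration, and the quantum-inspired evolution of the parameters is not shown.

```python
import math

def fcm_step(points, centers, m=2.0):
    """One standard FCM iteration (hypothetical helper, not from the paper).

    points and centers are lists of equal-length tuples; m > 1 is the fuzzifier.
    Returns the membership matrix and the recomputed centers.
    """
    eps = 1e-12                      # guards against a zero distance
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [max(math.dist(x, v), eps) for v in centers]
        # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))
        row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
               for d_i in dists]
        memberships.append(row)

    dim = len(points[0])
    new_centers = []
    for i in range(len(centers)):
        # Each center is the mean of the points weighted by u^m.
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        new_centers.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)))
    return memberships, new_centers

# Toy data: two well-separated groups (illustrative values only).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = [(1.0, 1.0), (9.0, 9.0)]
for _ in range(20):
    memberships, centers = fcm_step(points, centers, m=2.0)
```

The fuzzifier `m`, the number of centers, and the initial centers are exactly the inputs that the abstract says are evolved rather than fixed by hand.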
This paper presents a comprehensive study on clustering: existing methods and the developments made at various times. Clustering is defined as a form of unsupervised learning in which objects are grouped on the basis of some similarity inherent among them. There are different methods for clustering objects, such as hierarchical, partitional, grid-based, density-based, and model-based methods. The approaches used in these methods are discussed along with their respective state of the art and applicability. The measures of similarity and the evaluation criteria, which are the central components of clustering, are also presented. Applications of clustering in fields such as image segmentation, object and character recognition, and data mining are highlighted.
A large proportion of the United States' underinsured population relies on free health clinics for their health care needs. With only a few free health clinics nationwide hosting specialty clinics, a small subset of which are dermatology clinics, there is a dearth of information in the literature on which dermatological pathologies and treatment modalities are most common in this setting. The purpose of this study was to establish the most common dermatological conditions and treatments in the free health care setting and to understand which facets of care need improvement. A total of 57 patients with dermatological findings were identified at an urban student-run free health clinic in the southern United States over a two-year period (2019-2021). Information reviewed for each patient included general demographics, chief complaint, medical/surgical history, treatments/procedures required for each visit, treatments/procedures available for each visit, referrals, and follow-up rate. Qualitative analysis was performed. The median age of the patients who presented with dermatological findings was 40, and the most common ethnicities were white (26.2%), Hispanic/Latino (28.6%), and black (28.6%). The most common chief complaints were rashes and cysts, with a majority (63.2%) of these patients presenting to this particular clinic for the first time. Seven patients (12.3%) were unable to receive treatment due to expense, procedure unavailability, or an unknown reason. The most commonly prescribed treatment was a topical steroid. A majority (71.9%) of the patients were unable to follow up as scheduled. A majority (81.2%) of the patients who were able to follow up were adherent to their prescribed medication. Although dermatological conditions are plentiful in the free health care setting, the literature currently contains no information regarding this topic. This may be due to low patient follow-up rates and inadequately charted outcomes on often outdated electronic health records. To best care for dermatology patients in this setting, it is necessary to understand the barriers to care and the available treatment options.
Genome sequencing projects are rapidly increasing the number of high-dimensional protein sequence datasets. Clustering a high-dimensional protein sequence dataset using traditional machine learning approaches poses many challenges. Many different feature extraction methods exist and are widely used. However, extracting features from millions of protein sequences becomes impractical because these methods do not scale with current algorithms. Therefore, there is a need for an efficient feature extraction approach that extracts significant features. We propose two scalable feature extraction approaches for extracting features from huge protein sequence datasets using Apache Spark, termed 60d-SPF (60-dimensional Scalable Protein Feature) and 6d-SCPSF (6-dimensional Scalable Co-occurrence-based Probability-Specific Feature). The proposed 60d-SPF and 6d-SCPSF approaches capture the statistical properties of amino acids to create a fixed-length numeric feature vector that represents each protein sequence.
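The general idea of turning a variable-length protein sequence into a fixed-length numeric vector can be sketched with plain amino acid composition. This is only an illustration under stated assumptions: it is a generic 20-dimensional frequency vector, not the 60d-SPF or 6d-SCPSF features (whose definitions are not given here), it does not use Apache Spark, and the name `composition_vector` is hypothetical.

```python
# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition_vector(sequence):
    """Map a protein sequence of any length to 20 relative frequencies.

    Hypothetical helper for illustration; non-standard residue codes are ignored.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Two sequences of different lengths yield vectors of the same length.
short_vec = composition_vector("MKV")
long_vec = composition_vector("MKVLAAGIVGLLLAQWXX")
```

Because each sequence is processed independently, a function of this shape could be applied per record in a distributed framework, which is the property the abstract relies on for scalability.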
A Novel Scalable Feature Extraction Approach for COVID-19 Protein Sequences and their Cluster Analysis with Kernelized Fuzzy Algorithm
2022 IEEE International Conference on Big Data and Smart Computing (BigComp), 2022
COVID-19 (Coronavirus Disease-19), a disease caused by the SARS-CoV-2 virus, was declared a pandemic by the World Health Organization on March 11, 2020. To address the global problem of analyzing the different variants of COVID-19 genome sequences, there is a need to develop intelligent, scalable machine learning techniques that can process and analyze important COVID-19 protein data by utilizing a Big Data framework. To this end, we first propose a feature extraction approach for COVID-19 protein data named the Scalable Distributed Co-occurrence-based Probability-Specific Feature extraction approach (SDCPSF). The proposed SDCPSF approach is executed on an Apache Spark cluster to preprocess massive COVID-19 protein sequence data. SDCPSF represents each variable-length COVID-19 protein sequence as a fixed-length, six-dimensional numeric feature vector. The extracted features are then used as input to the kernelized fuzzy clustering algorithms KSRSIO-FCM and KSLFCM, which efficiently cluster big data thanks to their in-memory cluster computing technique and thus form clusters of COVID-19 genome sequences. Furthermore, the performance of KSRSIO-FCM is compared with that of KSLFCM, another scalable clustering algorithm, in terms of the Silhouette index (SI) and the Davies-Bouldin index (DBI).
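As a reminder of what the Silhouette index measures, here is a minimal sketch of its standard definition for hard cluster labels. It is a generic illustration, not the evaluation code used in the paper: the function name `silhouette_index`, the singleton convention, and the toy points are assumptions.

```python
import math

def silhouette_index(points, labels):
    """Mean silhouette score over all points (needs at least two clusters).

    For each point, a is the mean distance to its own cluster and b is the
    smallest mean distance to any other cluster; the score is (b - a) / max(a, b).
    """
    clusters = set(labels)
    scores = []
    for i, x in enumerate(points):
        own = [math.dist(x, y) for j, y in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)       # common convention for singleton clusters
            continue
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(x, y) for j, y in enumerate(points) if labels[j] == c)
            / sum(1 for lab in labels if lab == c)
            for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Toy data (illustrative only): a sensible labeling and a poor one.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = silhouette_index(points, [0, 0, 1, 1])
bad = silhouette_index(points, [0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated clusters, whereas for the Davies-Bouldin index lower values are better, so the two indices are read in opposite directions.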
Ballistic impact is generally a low-mass, high-velocity impact caused by a propelling source. Natural fiber composites are mainly price-driven commodity composites that have usable structural properties at relatively low cost. The manufacture, use, and removal of traditional composite structures, usually made of glass, carbon, and aramid fibres, are viewed critically because of growing environmental consciousness. The advantages of natural fibers over traditional reinforcing fibers such as glass and carbon are: low cost, low density, acceptable specific properties, ease of separation, enhanced energy recovery, CO2 and biodegradability. There is a growing interest in the use of natural/biofibres as reinforcing components for thermoplastics and thermosets. Although thermoplastics have the added advantage of recycling possibilities, thermosets are targeted to obtain much improved mechanical properties compared to thermoplastics in the resulting bio-composites.
The dysregulation of glucose metabolism, which includes the modification of biomolecules through glycation reactions, results in the formation of advanced glycation end products (AGEs). The formation of AGEs may activate receptors for advanced glycation end products, which induce intracellular signaling and ultimately enhance oxidative stress, a well-known contributor to type 2 diabetes mellitus. In addition, AGEs are possible therapeutic targets for the treatment of type 2 diabetes mellitus and its complications. This review article highlights the antioxidant, anti-inflammatory, and antidiabetic properties of the Nymphaea species; screening such aquatic plants for antiglycation activity may provide a safer alternative to the adverse effects related to glucotoxicity. Since oxidation and glycation are relatively similar to each other, there is a possibility that the Nymphaea species may also have antiglycating properties because of its powerful antioxidant pr...
Exploration of artemisinin derivatives and synthetic peroxides in antimalarial drug discovery research
European Journal of Medicinal Chemistry, 2021
Malaria is a life-threatening infectious disease caused by protozoal parasites belonging to the genus Plasmodium. It caused an estimated 405,000 deaths and 228 million cases globally in 2018, according to the World Malaria Report released by the World Health Organization (WHO) in 2019. Artemisinin (ART), a "Nobel medicine", and its derivatives have proven potential for application in antimalarial drug discovery programs. In this review, the antimalarial activities of the most active artemisinin derivatives modified at the C-10/C-11/C-16/C-6 positions and of synthetic peroxides (endoperoxides, 1,2,4-trioxolanes, 1,2,4-trioxanes, and 1,2,4,5-tetraoxanes) are systematically summarized. The developmental trend of ART derivatives and cyclic peroxides is discussed, along with their antimalarial activity and how that activity is affected by structural variations at different sites of the compounds. This compilation should be very useful for scaffold hopping aimed at avoiding unnecessary complexity in cyclic peroxides, and should ultimately act as a handy resource for the development of potential chemotherapeutics against Plasmodium species.
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
In this paper, an enhanced quantum-based neural network learning algorithm (EQNN-S), which constructs a neural network architecture using the quantum computing concept, is proposed for signature verification. The quantum computing concept is used to decide the connection weights and the thresholds of neurons. A boundary threshold parameter is introduced to optimally determine the neuron threshold. This parameter uses min and max functions to decide the threshold, which assists efficient learning. A manually prepared signature dataset is used to test the performance of the proposed algorithm. To uniquely identify a signature, several novel features are selected, such as the number of loops present in the signature, the boundary calculation, the number of vertical and horizontal dense patches, and the angle measurement. A total of 45 features are extracted from each signature. The performance of the proposed algorithm is evaluated by rigorous training and testing with these signatures using 60-40% and 70-30% partitions and tenfold cross-validation. To compare the results derived from the proposed quantum neural network, the same dataset is tested on a support vector machine, a multilayer perceptron, a back propagation neural network, and Naive Bayes. The proposed algorithm is found to perform better than these methods, and the results verify its effectiveness.
In this paper, an Advanced Quantum-based Neural Network Classifier (AQNN) is proposed. The proposed AQNN is used to form an objectionable Web content filtering system (OWF). The aim is to design a neural network with a small number of hidden-layer neurons and with optimal connection weights and neuron thresholds. The proposed algorithm uses quantum computing and genetic concepts to evolve the connection weights and the thresholds of neurons. Quantum computing uses the qubit, its smallest unit of information, as a probabilistic representation. In this algorithm, a threshold boundary parameter is also introduced to find the optimal value of the neuron threshold. The proposed algorithm forms a neural network architecture that is used to build an objectionable Web content filtering system, which detects objectionable Web requests made by the user. To judge the performance of the proposed AQNN, the contents of a total of 2000 Websites (1000 objectionable + 1000 nonobjectionable) have been used. The results of AQNN are also compared with QNN-F and with well-known classifiers such as backpropagation, support vector machine (SVM), multilayer perceptron, decision tree, and artificial neural network. The results show that AQNN performs better as a classifier than the existing classifiers. The performance of the proposed OWF is also compared with well-known objectionable Web filtering software and existing models. It is found that the proposed OWF performs better than existing solutions in terms of filtering objectionable content.
Phishing is the theft of sensitive information, such as credit card details, usernames, passwords, and social security numbers, through a fake page that imitates a trusted website. The attacker designs a similar webpage, either by copying the legitimate page or by making small manipulations to it, so that online users cannot distinguish the legitimate website from the fake one. A Deep Neural Network (DNN) was introduced to detect phishing Uniform Resource Locators (URLs). Initially, a 30-dimensional feature vector was constructed from URL-based, Hypertext Markup Language (HTML)-based, and domain-based features. These features were processed by the DNN to detect phishing URLs. However, irrelevant, redundant, and noisy features in the dataset increase the complexity of the DNN classifier, so feature selection is required for efficient phishing detection. Because feature selection as an independent step is time-consuming, in this paper the feature vector is generated by the DNN itself using a Stacked Denoising Autoencoder (SDAE). Moreover, noisy data such as missing features reduce the efficiency of phishing detection, so the SDAE is trained to reconstruct a clean input feature vector. The initial input feature vector is corrupted by setting some of its features to zero. The corrupted feature vector is then mapped by a basic autoencoder to a hidden representation, from which the input feature vector is reconstructed. The reconstructed features are given as input to the DNN, which selects the most relevant features and predicts phishing URLs. Hence, the sparse feature representation of the SDAE increases the classification accuracy of the DNN. Experiments are conducted on the Ham, Phishing Corpus, and Phishload datasets to demonstrate the effectiveness of DNN-SDAE.
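The corruption step described above, in which some input features are set to zero before the autoencoder learns to reconstruct the clean vector, can be sketched as masking noise. This is a minimal illustration, not the paper's implementation: the function name `mask_corrupt`, the `corruption_rate` of 0.3, and the all-ones toy vector are assumptions, and the encoder, decoder, and training loop are omitted.

```python
import random

def mask_corrupt(features, corruption_rate=0.3, seed=None):
    """Return a copy of features with a random subset set to zero.

    corruption_rate is a hypothetical parameter: the fraction of positions
    to mask. The clean vector is kept separately as the reconstruction target.
    """
    rng = random.Random(seed)
    n_mask = int(round(corruption_rate * len(features)))
    masked = set(rng.sample(range(len(features)), n_mask))
    return [0.0 if i in masked else value for i, value in enumerate(features)]

# A toy 30-dimensional feature vector, matching the dimension in the abstract.
clean = [1.0] * 30
corrupted = mask_corrupt(clean, corruption_rate=0.3, seed=0)
```

During training, the network would receive `corrupted` as input and be penalized on its distance from `clean`, which is what pushes it to tolerate missing features.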
In the past decade, metal-free approaches for C-C bond formation have attracted a great deal of attention due to their ease of use and low cost. This report presents a novel, metal-free synthesis of 3,3'-bisimidazopyridinylmethanes via intermolecular oxidative C(sp2)-H bond functionalization of imidazo[1,2-a]pyridines with dimethyl sulfoxide as the carbon synthon (CH2), using H2O2 as a mild oxidant under air. A library of 3,3'-bis(2-arylimidazo[1,2-a]pyridin-3-yl)methanes has been obtained in good to excellent yields. The present methodology has been successfully applied to imidazo[2,1-b]thiazoles and imidazo[2,1-b]benzothiazoles. Furthermore, the approach was extended to the synthesis of unsymmetrical 3,3'-bisimidazopyridinylmethanes under optimized reaction conditions. A mechanistic pathway is proposed on the basis of experiments with radical scavengers and DMSO-d6 and of ESI-MS observations.
2015 IEEE Symposium Series on Computational Intelligence, 2015
Clustering is one of the most widely used knowledge discovery techniques for revealing structures in a dataset that can be extremely useful to the analyst. In fuzzy clustering algorithms, the procedure adopted for choosing the fuzziness parameter m, the number of clusters C, and the initial cluster centroids VC is extremely important, as it has a direct impact on the formation of the final clusters. Moreover, improper selection of these parameters may lead the algorithms to local optima. In this paper, we propose an Enhanced Quantum-Inspired Evolutionary Fuzzy C-Means (EQIE-FCM) algorithm to compute the globally optimal values of these parameters. In EQIE-FCM, we combine the quantum computing concept with fuzzy clustering to evolve different values of these parameters over several generations. In each generation, the parameters are represented in terms of a quantum bit (Q). At each generation (g), the quantum bit of each parameter is updated using a quantum rotation gate. In this way, after several generations of evolution, we obtain the globally optimal values of these parameters from a large quantum search space. The EQIE-FCM algorithm is applied to the Pima Indians Diabetes dataset, and its performance is compared with that of another Quantum-inspired Fuzzy Clustering algorithm (QIE-FCM) and three other fuzzy evolutionary clustering algorithms from the literature. Extensive experiments indicate that the EQIE-FCM algorithm outperforms many baseline approaches and can be used as an effective clustering algorithm.
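The quantum-bit update mentioned above can be sketched with the rotation gate commonly used in quantum-inspired evolutionary algorithms. This is a generic illustration under stated assumptions, not the EQIE-FCM update rule itself: the function names, the fixed rotation angle of 0.05π, and the equal-superposition start are hypothetical, and how the angle and its sign are chosen from the fitness comparison is not shown.

```python
import math
import random

def rotate_qbit(alpha, beta, theta):
    """Rotate the Q-bit amplitudes (alpha, beta) by angle theta.

    The gate [[cos, -sin], [sin, cos]] preserves alpha**2 + beta**2 = 1.
    """
    new_alpha = math.cos(theta) * alpha - math.sin(theta) * beta
    new_beta = math.sin(theta) * alpha + math.cos(theta) * beta
    return new_alpha, new_beta

def observe(alpha, beta, rng=random):
    """Collapse the Q-bit to a classical bit: 1 with probability beta**2."""
    return 1 if rng.random() < beta ** 2 else 0

# Start from an equal superposition, then rotate toward the state |1>.
alpha0, beta0 = 1 / math.sqrt(2), 1 / math.sqrt(2)
alpha1, beta1 = rotate_qbit(alpha0, beta0, 0.05 * math.pi)
bit = observe(alpha1, beta1, random.Random(0))
```

A string of such bits can be decoded into a candidate value for a parameter such as m or C; repeating the rotation over generations gradually concentrates the probability mass on better candidates.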
Papers by Om Patel