Papers by Zaynab Mousavian

Objectives: Tuberculosis (TB) is a bacterial infectious disease caused by Mycobacterium tuberculosis. Annually, an estimated 10 million people are diagnosed with active TB, and approximately 1.4 million die of the disease. If left untreated, each person with active TB will infect 10 to 15 new individuals every year. Therefore, interrupting disease transmission by accurate early detection and diagnosis, paired with appropriate treatment, is of major importance. In this study, we aimed to identify biomarkers associated with the development of active TB that can then be further developed for clinical testing. Methods: We assessed the relative plasma concentration of 92 proteins associated with inflammation in individuals with active TB (n=19), latent TB (n=13), or healthy controls (n=10). We then constructed weighted protein co-expression networks to reveal correlations between protein expression profiles in all samples. After clustering the networks into four modules, we assessed their ass...

StrongestPath: a Cytoscape application for protein-protein interaction analysis
Background: StrongestPath is a Cytoscape 3 application that searches for one or more cascades of interactions connecting two single proteins or two groups of proteins in a collection of protein-protein interaction (PPI) network or signaling network databases. When the interactions carry different levels of confidence, it can process them and identify the cascade of interactions with the highest total confidence score. Given a set of proteins, StrongestPath can extract and show the network of interactions among them from the given databases, and expand the network by adding new proteins that have the most interactions, with the highest total confidence, to the current proteins. The application can also identify any activation or inhibition regulatory paths between two distinct sets of transcription factors and target genes. It can be used either with a set of built-in human and mouse PPI or signaling databases, or with any user-provided database for some organism. Re...
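To make the idea of a highest-confidence cascade concrete, the following is a minimal sketch and not the application's actual implementation. It assumes that the total confidence of a path is the product of its edge confidences, each in (0, 1], which turns the search into a shortest-path problem over costs of -log(confidence). The function name `strongest_path` and the edge-list input format are hypothetical.

```python
import heapq
import math


def strongest_path(edges, source, target):
    """Return (confidence, path) for the path with the largest product of
    edge confidences. Assumption: path confidence = product of edge scores."""
    graph = {}
    for u, v, conf in edges:
        # Treat each interaction as undirected; -log turns products into sums.
        cost = -math.log(conf)
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))

    best = {source: 0.0}   # lowest accumulated cost found so far per node
    prev = {}              # predecessor on the best path
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))

    if target not in best:
        return 0.0, []  # no cascade connects the two proteins

    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-best[target]), path


# Toy network: the two-step route A-B-C beats the weak direct edge A-C.
edges = [("A", "B", 0.9), ("B", "C", 0.9), ("A", "C", 0.5)]
confidence, path = strongest_path(edges, "A", "C")
```

On this toy network the sketch selects A, B, C with a confidence of about 0.81 rather than the direct edge with confidence 0.5. Other ways of aggregating confidence (for example, a sum or a minimum) would need a different cost transform.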
FRnet-DTI: Deep convolutional neural network for drug-target interaction prediction
Heliyon

Molecular Medicine
Background: Acute lymphoblastic leukemia (ALL) is the most common type of cancer diagnosed in children, and glucocorticoids (GCs) form an essential component of the standard chemotherapy in most treatment regimens. The category of infant ALL patients carrying a translocation involving the mixed lineage leukemia (MLL) gene (gene KMT2A) is characterized by resistance to GCs and poor clinical outcome. Although some studies have examined GC-resistance in infant ALL patients, the understanding of this phenomenon remains limited and impedes efforts to improve prognosis. Methods: This study integrates differential co-expression (DC) and protein-protein interaction (PPI) networks to find active protein modules associated with GC-resistance in MLL-rearranged infant ALL patients. A network was constructed by linking differentially co-expressed gene pairs between GC-resistant and GC-sensitive samples and later integrated with PPI networks by keeping the links that are also present in the PPI network. The resulting network was decomposed into two sub-networks, specific to each phenotype. Finally, both sub-networks were clustered into modules using weighted gene co-expression network analysis (WGCNA) and further analyzed with functional enrichment analysis. Results: Through the integration of DC analysis and the PPI network, four protein modules were found to be active under the GC-resistant phenotype but not under the GC-sensitive one. Functional enrichment analysis revealed that these modules are related to proteasome, electron transport chain, aminoacyl-tRNA biosynthesis, and peroxisome signaling pathways. These findings are in accordance with previous findings related to GC-resistance in other hematological malignancies such as pediatric ALL.
Conclusions: Differential co-expression analysis is a promising approach to incorporate the dynamic context of gene expression profiles into the well-documented protein interaction networks. The approach allows the detection of relevant protein modules that are highly enriched with DC gene pairs. Functional enrichment analysis of detected protein modules generates new biological hypotheses and may help in explaining the GC-resistance in MLL-rearranged infant ALL patients.

Comparison of gene co-expression networks in Pseudomonas aeruginosa and Staphylococcus aureus reveals conservation in some aspects of virulence
Gene, Jan 10, 2018
Pseudomonas aeruginosa and Staphylococcus aureus are two evolutionarily distant bacterial species that are frequently isolated from persistent infections such as chronic infectious wounds and severe lung infections in cystic fibrosis patients. To the best of our knowledge, no comprehensive genome-scale co-expression study has yet been conducted on these two species, and in most cases only the expression of very few genes has been the subject of investigation. In this study, to investigate the level of expressional conservation between these two species, the weighted gene co-expression network analysis (WGCNA) approach was applied to heterogeneous gene expression datasets to study both single-species and cross-species genome-scale co-expression patterns. Single-species co-expression network analysis revealed that in P. aeruginosa, genes involved in quorum sensing (QS), iron uptake, nitrate respiration and type III secretion systems, and in S. aureus, genes associated with the regulation of carbon metabolism, fatty acid-phospholipid metabolism and proteolysis, show considerable co-expression across a variety of experimental conditions. Moreover, the comparison of gene co-expression networks between P. aeruginosa and S. aureus led to the identification of four co-expressed gene modules in both species, comprising 318 genes in total. Several genes related to two-component signal transduction systems, the small colony variant (SCV) morphotype and protein complexes were found in the detected modules. We believe that targeting the key players among the identified co-expressed orthologous genes will be a potential intervention strategy to control refractory co-infections caused by these two bacterial species.

Reconstruction of the genome-scale co-expression network for the Hippo signaling pathway in colorectal cancer
Computers in Biology and Medicine, 2018
The Hippo signaling pathway (HSP) has been identified as an essential and complex signaling pathway for tumor suppression that coordinates proliferation, differentiation, cell death, cell growth and stemness. In the present study, we conducted a genome-scale co-expression analysis to reconstruct the HSP in colorectal cancer (CRC). Five key modules were detected through network clustering, and a detailed discussion of two modules containing, respectively, 18 and 13 over- and down-regulated members of the HSP was provided. Our results suggest new potential regulatory factors in the HSP. The detected modules also suggest novel genes contributing to CRC. Moreover, differential expression analysis confirmed the differential expression pattern of HSP members and the newly suggested regulatory factors between tumor and normal samples. These findings can further reveal the importance of the HSP in CRC.

Scientific Reports, Jan 18, 2017
Prediction of new drug-target interactions is critically important as it can lead researchers to find new uses for old drugs and to disclose their therapeutic profiles or side effects. However, experimental prediction of drug-target interactions is expensive and time-consuming. As a result, computational methods for predicting new drug-target interactions have gained tremendous interest in recent times. Here we present iDTI-ESBoost, a prediction model for identification of drug-target interactions using evolutionary and structural features. Our proposed method uses a novel data balancing and boosting technique to predict drug-target interactions. On four benchmark datasets taken from gold standard data, iDTI-ESBoost outperforms the state-of-the-art methods in terms of area under the receiver operating characteristic (auROC) curve. iDTI-ESBoost also outperforms the latest and best-performing method found in the literature in terms of area under the precision-recall (auPR) curve. This is significant, as auPR curves are argued to be a suitable metric for comparing methods on imbalanced datasets similar to the one studied here. Our reported results show the effectiveness of the classifier, balancing methods and the novel features incorporated in iDTI-ESBoost. iDTI-ESBoost is a novel prediction method that has, for the first time, exploited structural features along with evolutionary features to predict drug-protein interactions. We believe the excellent performance of iDTI-ESBoost in terms of both auROC and auPR will motivate researchers and practitioners to use it to predict drug-target interactions. To facilitate that, iDTI-ESBoost is implemented and made publicly available at: http://farshidrayhan.pythonanywhere.com/iDTI-ESBoost/.

Network-based expression analysis reveals key genes related to glucocorticoid resistance in infant acute lymphoblastic leukemia
Cellular Oncology, 2016
Despite vast improvements that have been made in the treatment of children with acute lymphoblastic leukemia (ALL), the majority of infant ALL patients (~80 %, < 1 year of age) that carry a chromosomal translocation involving the mixed lineage leukemia (MLL) gene shows a poor response to chemotherapeutic drugs, especially glucocorticoids (GCs), which are essential components of all current treatment regimens. Although addressed in several studies, the mechanism(s) underlying this phenomenon have remained largely unknown. A major drawback of most previous studies is their primary focus on individual genes, thereby neglecting the putative significance of inter-gene correlations. Here, we aimed at studying GC resistance in MLL-rearranged infant ALL patients by inferring an associated module of genes using co-expression network analysis. The implications of newly identified candidate genes with associations to other well-known relevant genes from the same module, or with associations to known transcription factor or microRNA interactions, were substantiated using literature data. A weighted gene co-expression network was constructed to identify gene modules associated with GC resistance in MLL-rearranged infant ALL patients. Significant gene ontology (GO) terms and signaling pathways enriched in relevant modules were used to provide guidance towards which module(s) consisted of promising candidates suitable for further analysis. Through gene co-expression network analysis a novel set of genes (module) related to GC-resistance was identified. The presence in this module of the S100 and ANXA genes, both well-known biomarkers for GC resistance in MLL-rearranged infant ALL, supports its validity.
Subsequent gene set net correlation analyses of the novel module provided further support for its validity by showing that the S100 and ANXA genes act as 'hub' genes with potentially major regulatory roles in GC sensitivity, but having lost this role in the GC resistant phenotype. The detected module implicates new genes as being candidates for further analysis through associations with known GC resistance-related genes. From our data we conclude that available systems biology approaches can be employed to detect new candidate genes that may provide further insights into drug resistance of MLL-rearranged infant ALL cases. Such approaches complement conventional gene-wise approaches by taking putative functional interactions between genes into account.
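As a rough illustration of what a weighted gene co-expression network is, the sketch below builds an unsigned soft-threshold adjacency, where the weight between two genes is the absolute Pearson correlation of their expression profiles raised to a power. This follows the general WGCNA idea only; the function names, the toy data, and the power `beta=6` are assumptions for illustration and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(profiles, beta=6):
    """Map each ordered gene pair to |correlation| ** beta.

    `beta` is a hypothetical choice here; in practice it is tuned to the data.
    """
    genes = list(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Toy expression profiles across four samples (made-up numbers).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, 1.0, -1.0],
}
adjacency = soft_threshold_adjacency(profiles)
```

Raising the correlation to a power keeps strongly correlated pairs (g1 and g2 here) close to 1 while pushing weak correlations (g1 and g3) toward 0, which is what lets modules be clustered without a hard cutoff.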

CeFunMO: A centrality based method for discovering functional motifs with application in biological networks
Computers in Biology and Medicine, 2016
Detecting functional motifs in biological networks is one of the challenging problems in systems biology. Given a multiset of colors as a query and a list-colored graph (an undirected graph with a set of colors assigned to each of its vertices), the problem reduces to finding connected subgraphs that best cover the query multiset. To solve this NP-complete problem, we propose a new color-based centrality measure for list-colored graphs. Based on this newly defined measure of centrality, a novel polynomial-time algorithm is developed to discover functional motifs in list-colored graphs, using a greedy strategy. This algorithm, called CeFunMO, has superior running time and acceptable accuracy in comparison with other well-known algorithms, such as RANGI and GraMoFoNe.

Information theory in systems biology. Part I: Gene regulatory and metabolic networks
Seminars in Cell & Developmental Biology, 2015
"A Mathematical Theory of Communication" was published in 1948 by Claude Shannon to establish a framework that is now known as information theory. In recent decades, information theory has gained much attention in the area of systems biology. The aim of this paper is to provide a systematic review of those contributions that have applied information theory to infer or understand biological systems. Based on the type of system components and the interactions between them, we classify biological systems into 4 main classes: gene regulatory, metabolic, protein-protein interaction and signaling networks. In the first part of this review, we attempt to introduce most of the existing studies on two types of biological networks, gene regulatory and metabolic networks, which are founded on the concepts of information theory.

Information Theory in Systems Biology: Protein-Protein Interaction and Signaling Networks
Seminars in Cell & Developmental Biology, Jan 12, 2015
Since Claude Shannon developed information theory in 1948 to address problems in the field of data storage and data communication over (noisy) communication channels, it has been successfully applied in many other research areas, such as bioinformatics and systems biology. In this manuscript, we attempt to review some of the existing literature in systems biology that uses information-theoretic measures in its calculations. As we reviewed most of the existing information-theoretic methods for gene regulatory and metabolic networks in the first part of the review, this second part surveys the application of information theory to other types of biological networks, including protein-protein interaction and signaling networks.

Drug-Target Interaction Prediction from PSSM based Evolutionary Information
Journal of Pharmacological and Toxicological Methods, 2015
The labor-intensive and expensive experimental process of drug-target interaction prediction has motivated many researchers to focus on in silico prediction, which yields helpful information to support the experimental interaction data. They have therefore proposed several computational approaches for discovering new drug-target interactions. A growing number of learning-based methods have been developed, which can be categorized into two main groups: similarity-based and feature-based. In this paper, we first use the bi-gram features extracted from the Position Specific Scoring Matrix (PSSM) of proteins in predicting drug-target interactions. Our results demonstrate the high-confidence prediction ability of the Bigram-PSSM model in terms of several performance indicators, specifically for enzymes and ion channels. Moreover, we investigate the impact of the negative selection strategy on prediction performance, which is not widely taken into account in other relevant studies. This is important, as the number of non-interacting drug-target pairs is usually extremely large in comparison with the number of interacting ones in existing drug-target interaction data. An interesting observation is that different levels of performance reduction were observed for the four datasets when the sampling method was changed from random sampling to balanced sampling.
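The abstract does not spell out how the bi-gram features are computed. The sketch below assumes the commonly used definition, in which the feature for an amino-acid pair (m, n) sums, over consecutive sequence positions, the product of the PSSM score for m at one position and the score for n at the next. With a 20-column PSSM this gives 400 features. The function name and the tiny two-column matrix are hypothetical and chosen only to keep the example readable.

```python
def bigram_pssm_features(pssm):
    """Flattened bi-gram feature vector of a PSSM.

    `pssm` is a list of rows, one per sequence position, each holding one
    score per amino-acid column. Assumed definition:
    feature[m][n] = sum over i of pssm[i][m] * pssm[i + 1][n].
    """
    n_cols = len(pssm[0])
    features = [[0.0] * n_cols for _ in range(n_cols)]
    # Walk over consecutive position pairs (i, i + 1).
    for row, next_row in zip(pssm, pssm[1:]):
        for m, score_m in enumerate(row):
            for n, score_n in enumerate(next_row):
                features[m][n] += score_m * score_n
    # Flatten row by row so the vector length is n_cols * n_cols.
    return [value for feature_row in features for value in feature_row]


# Toy 3-position, 2-column matrix standing in for a real L x 20 PSSM.
toy_pssm = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 0.0],
]
vector = bigram_pssm_features(toy_pssm)
```

Because the vector length depends only on the number of columns, proteins of different lengths map to feature vectors of the same size, which is what makes the representation usable by a standard classifier.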
New Algorithms for Online Unit Clustering
We study the online unit clustering problem introduced by Chan and Zarrabi-Zadeh at WAOA 2006. The problem in one dimension is as follows: Given a sequence of points on the real line, partition the points into clusters, each enclosable by a unit interval, with the objective of minimizing the number of clusters used. In this paper, we give a brief survey on the existing algorithms for this problem, and compare their efficiency in practice by implementing all deterministic and randomized algorithms proposed thus far for this problem in the literature. Meanwhile, we introduce two new deterministic algorithms that achieve better performance ratios on average in practice.
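The one-dimensional problem statement above can be illustrated with a naive greedy baseline: each arriving point joins the first existing cluster that would still fit inside a unit interval, and otherwise opens a new cluster. This sketch is only meant to show the online setting (points are assigned irrevocably, in arrival order); it is not claimed to be one of the new algorithms introduced in the paper, and the function name is hypothetical.

```python
def greedy_unit_clustering(points):
    """Assign points online to clusters that each fit in a unit interval.

    Returns (clusters, assignment), where clusters[k] = [lo, hi] is the span
    of cluster k and assignment[i] is the cluster index of points[i].
    """
    clusters = []
    assignment = []
    for p in points:
        for idx, (lo, hi) in enumerate(clusters):
            # The point fits if the enlarged span is still at most 1 wide.
            if max(hi, p) - min(lo, p) <= 1.0:
                clusters[idx] = [min(lo, p), max(hi, p)]
                assignment.append(idx)
                break
        else:
            # No existing cluster can absorb the point: open a new one.
            clusters.append([p, p])
            assignment.append(len(clusters) - 1)
    return clusters, assignment


clusters, assignment = greedy_unit_clustering([0.0, 0.5, 1.5, 1.0, 3.0])
```

For the sample sequence the sketch opens three clusters. The arrival order matters: a different ordering of the same points can lead an online algorithm to use more clusters than the offline optimum, which is the gap that performance ratios measure.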

Drug-target interaction prediction via chemogenomic space: learning-based methods
Expert Opinion on Drug Metabolism & Toxicology, 2014
Identification of the interaction between drugs and target proteins is a crucial task in genomic drug discovery. In silico prediction is an appropriate alternative to the laborious and costly experimental process of drug-target interaction prediction. The development of a variety of computational methods opens a new direction in analyzing and detecting new drug-target pairs. In this review, we focus on chemogenomic methods that have established a learning framework for predicting drug-target interactions. Learning-based methods are classified into supervised and semi-supervised, and the supervised learning methods are studied in two separate parts: similarity-based methods and feature-based methods. Despite the many improvements that learning-based methods have brought to pharmacology applications, predictive models are often constructed under oversimplified settings, which may lead to over-optimistic results on drug-target interaction prediction.

In Silico Pharmacology, 2013
With the growing understanding of complex diseases, the focus of drug discovery has shifted away from the well-accepted "one target, one drug" model to a new "multi-target, multi-drug" model, aimed at systemically modulating multiple targets. Identification of the interaction between drugs and target proteins plays an important role in genomic drug discovery, in order to discover new drugs or novel targets for existing drugs. Due to the laborious and costly experimental process of drug-target interaction prediction, in silico prediction could be an efficient way of providing useful information in supporting experimental interaction data. An important notion that has emerged in postgenomic drug discovery is that the large-scale integration of genomic, proteomic, signaling and metabolomic data can allow us to construct complex networks of the cell that would provide us with a new framework for understanding the molecular basis of physiological or pathophysiological states. An emerging paradigm of polypharmacology in the postgenomic era is that drug, target and disease spaces can be correlated to study the effect of drugs on different spaces, and their interrelationships can be exploited for designing drugs or cocktails that can effectively target one or more disease states. The future goal, therefore, is to create a computational platform that integrates genome-scale metabolic pathways, protein-protein interaction networks, and gene transcriptional analysis in order to build a comprehensive network for multi-target multi-drug discovery.

Proceeding of the 6th international workshop on Automation of software test - AST '11, 2011
In this paper, a new approach for analyzing program behavioral graphs to detect fault-relevant paths is presented. The existing graph mining approaches for bug localization merely detect discriminative sub-graphs between failing and passing runs. However, they are not applicable when the context of a failure does not appear in a discriminative pattern. In our proposed method, the suspicious transitions are identified by contrasting nearest-neighbor failing and passing dynamic behavioral graphs. To find similar failing and passing graphs, we first convert the graphs into adequate vectors. Then, a combination of Jaccard-Cosine similarity measures is applied to identify the nearest graphs. The new scoring formula takes advantage of null hypothesis testing for ranking weighted transitions. The main advantage of the proposed technique is its scalability, which makes it work on large and complex programs with a huge number of predicates. Another main capability of our approach is providing the faulty paths constructed from fault-suspicious transitions. Considering the weighted execution graphs in the analysis enables us to find those types of bugs that reveal themselves in a specific number of transitions between two particular predicates. The experimental results on the Siemens test suite and the Space program demonstrate the effectiveness of the proposed method on weighted execution graphs for locating bugs in comparison with other methods.