Papers by Marco Grzegorczyk

Oxford University Press eBooks, Oct 6, 2011
A convenient way of modelling complex interactions is by employing graphs or networks which correspond to conditional independence structures in an underlying statistical model. One main class of models in this regard are Bayesian networks, which have the drawback of making parametric assumptions. Bayesian nonparametric mixture models offer a possibility to overcome this limitation, but have hardly been used in combination with networks. This manuscript bridges this gap by introducing nonparametric Bayesian network models. We review (parametric) Bayesian networks, in particular Gaussian Bayesian networks, from a Bayesian perspective as well as nonparametric Bayesian mixture models. Afterwards these two modelling approaches are combined into nonparametric Bayesian networks. The new models are compared both to Gaussian Bayesian networks and to mixture models in a simulation study, where it turns out that the nonparametric network models perform favorably in non-Gaussian situations. The new models are also applied to an example from systems biology.

Bayesian Statistics 9, 2011
A convenient way of modelling complex interactions is by employing graphs or networks which correspond to conditional independence structures in an underlying statistical model. One main class of models in this regard are Bayesian networks, which have the drawback of making parametric assumptions. Bayesian nonparametric mixture models offer a possibility to overcome this limitation, but have hardly been used in combination with networks. This manuscript bridges this gap by introducing nonparametric Bayesian network models. We review (parametric) Bayesian networks, in particular Gaussian Bayesian networks, from a Bayesian perspective as well as nonparametric Bayesian mixture models. Afterwards these two modelling approaches are combined into nonparametric Bayesian networks. The new models are compared both to Gaussian Bayesian networks and to mixture models in a simulation study, where it turns out that the nonparametric network models perform favorably in non-Gaussian situations. The new models are also applied to an example from systems biology.

Journal of Environmental Management, Dec 1, 2020
Drought is a complex natural hazard. It occurs due to a prolonged period of deficient rainfall in a certain region. Unlike other natural hazards, drought recurs. Therefore, comprehensive drought monitoring is essential for regional climate control and water management authorities. In this paper, we propose a new drought indicator: the Seasonally Combinative Regional Drought Indicator (SCRDI). The SCRDI integrates Bayesian networking theory with the Standardized Precipitation Temperature Index (SPTI) at varying gauge stations in various months/seasons. The application of SCRDI is based on five gauging stations in the Northern Area of Pakistan. We have found that the proposed indicator accounts for the effect of climate variation within a specified territory, accurately characterizes drought by capturing seasonal dependencies under geospatial variation, and reduces large/complex data for future drought monitoring. In summary, the proposed indicator can be used for comprehensive characterization and assessment of drought in a given region.
We assess the accuracy of three established regression methods for reconstructing gene and protein regulatory networks in the context of circadian regulation. Data are simulated from a recently published regulatory network of the circadian clock in Arabidopsis thaliana, in which protein and gene interactions are described by a Markov jump process based on Michaelis-Menten kinetics. We closely follow recent experimental protocols, including the entrainment of seedlings to different light-dark cycles and the knock-out of various key regulatory genes. Our study provides relative assessment scores for the comparison of state-of-the-art regression methods, investigates the influence of systematically missing values related to unknown protein concentrations and mRNA transcription rates, and quantifies the dependence of the performance on the degree of recurrency.
Supplementary material for our Bioinformatics paper: 'Improvements in the reconstruction of time-varying gene regulatory networks: dynamic programming and regularization by information sharing among genes'
We propose a novel dynamic Bayesian network approach for modelling non-homogeneous and non-linear dynamic gene-regulatory processes. The new approach is based on a change-point process and a mixture model, using latent variables to assign individual measurements to different components. The practical inference follows the Bayesian paradigm, and we use small synthetic dynamic network domains to demonstrate empirically that this new method reduces the susceptibility to spurious feedback loops. Finally, we apply the new method to a real gene expression data set from Arabidopsis thaliana.

On the more generalized non‐parametric framework for the propagation of uncertainty in drought monitoring
Meteorological Applications, May 1, 2020
Drought has complex climatic and spatio‐temporal features. Therefore, its accurate monitoring is a great challenge for hydrological research. Recently, the use of standardized drought indices (SDIs) for drought monitoring has become common in practice. However, because of the subjective choice of probability distribution, uncertainty related to extreme events always exists in SDI‐based drought‐monitoring tools. The present research extends the generalized non‐parametric framework for drought monitoring. The application of the proposed framework is based on seven meteorological stations in Pakistan. The preliminary analysis considered the standardized precipitation temperature index (SPTI) at different time scales. The significance of the proposed framework is that it addresses extreme values with more accuracy under a non‐parametric framework. It is concluded that a suitable choice of probability‐plotting‐position formulas allows greater accuracy when capturing the probability of extreme drought events.

Theoretical and Applied Climatology, Dec 20, 2019
Drought is a complex natural hazard that has occurred recurrently in many regions across the globe. Therefore, precise drought characterization and its regional monitoring are key challenges for advanced water management and hydrological research. In this research, we provide a novel method to improve annual average time series data for Standardized Drought Index (SDI)-type drought monitoring tools. We propose a multi-auxiliary information-based estimation strategy that improves annual moving average/total precipitation time series records. To this end, we incorporate minimum and maximum temperature as auxiliary variables under a multi-auxiliary regression estimator. In summary, this study proposes a new drought index named the Precision-Weighted Standardized Precipitation Index (PWSDI). We evaluated the performance of PWSDI for 10 meteorological stations in Pakistan. We found that improved estimates of temporal precipitation time series are good candidates for modelling and monitoring hydrological drought in regional settings under the SDI procedure.

Overview and Evaluation of Recent Methods for Statistical Inference of Gene Regulatory Networks from Time Series Data
Methods in molecular biology, Dec 14, 2018
A challenging problem in systems biology is the reconstruction of gene regulatory networks from postgenomic data. A variety of reverse engineering methods from machine learning and computational statistics have been proposed in the literature. However, deciding on the best method to adopt for a particular application or data set might be a confusing task. The present chapter provides a broad overview of state-of-the-art methods with an emphasis on conceptual understanding rather than a deluge of mathematical details, and the pros and cons of the various approaches are discussed. Guidance on practical applications, with pointers to publicly available software implementations, is included. The chapter concludes with a comprehensive comparative benchmark study on simulated data and a real-world application taken from current plant systems biology.
results show that for low perturbations (ε→0), the proposed method outperforms uncoupled NH-DBNs, while for large perturbations (ε→1) it outperforms conventional homogeneous DBNs
Absolute β-catenin concentrations in Wnt pathway-stimulated and non-stimulated cells
Biomarkers, 2006
... Abraham SC, Wu TT, Hruban RH, Lee JH, Yeo CJ, Conlon K, Brennan M, Cameron JL, Klimstra DS. Genetic and immunohistochemical analysis of pancreatic acinar cell carcinoma: frequent allelic loss on chromosome 11p and alterations in the APC/beta-catenin pathway. ...

Tellus A, 2019
Drought is a complex natural hazard. Its many adverse impacts prevail in almost all climatic zones around the world. In this regard, drought monitoring and forecasting play a vital role in making drought mitigation policies. Therefore, several drought monitoring tools based on probabilistic models have been developed for precise and accurate inference of drought severity and its effects. However, a risk of inaccurate determination of drought classes always exists in probabilistic models. To overcome this issue, we propose a new system-based Probabilistic Weighted Joint Aggregative Drought Index (PWJADI) criterion for three multi-scalar drought indices, namely the Standardized Precipitation Index (SPI), the Standardized Precipitation Temperature Index (SPTI), and the Standardized Precipitation Evapotranspiration Index (SPEI), at a one-month time scale. Under the basic assumption of the Markov chain, the PWJADI is based on temporally switched weights that are propagated from the transition probability matrix of each temporal classification of a drought index. The proposed method is applied to three meteorological stations in Pakistan. We found that our proposed model has the ability to restructure the drought classes by capturing and bending the information from the historical behaviour of each drought class. Consequently, to make accurate and precise drought mitigation policies, the proposed method may be integrated into effective drought monitoring systems.

An Introduction to Gaussian Bayesian Networks
Humana Press eBooks, 2010
The extraction of regulatory networks and pathways from postgenomic data is important for drug discovery and development, as the extracted pathways reveal how genes or proteins regulate each other. Following up on the seminal paper of Friedman et al. (J Comput Biol 7:601-620, 2000), Bayesian networks have been widely applied as a popular tool to this end in systems biology research. Their popularity stems from the tractability of the marginal likelihood of the network structure, which is a consistent scoring scheme in the Bayesian context. This score is based on an integration over the entire parameter space, for which highly expensive computational procedures have to be applied when using more complex models based on differential equations; for example, see (Bioinformatics 24:833-839, 2008). This chapter gives an introduction to reverse engineering regulatory networks and pathways with Gaussian Bayesian networks, that is, Bayesian networks with the probabilistic BGe scoring metric [see (Geiger and Heckerman 235-243, 1995)]. In the BGe model, the data are assumed to stem from a Gaussian distribution and a normal-Wishart prior is assigned to the unknown parameters. Gaussian Bayesian network methodology for analysing static observational, static interventional as well as dynamic (observational) time series data will be described in detail in this chapter. Finally, we apply these Bayesian network inference methods (1) to observational and interventional flow cytometry (protein) data from the well-known RAF pathway to evaluate the global network reconstruction accuracy of Bayesian network inference and (2) to dynamic gene expression time series data of nine circadian genes in Arabidopsis thaliana to reverse engineer the unknown regulatory network topology for this domain.
Bioinformatics, Jul 28, 2008
We propose a novel dynamic Bayesian network approach for modelling non-homogeneous and non-linear dynamic gene-regulatory processes. The new approach is based on a change-point process and a mixture model, using latent variables to assign individual measurements to different components. The practical inference follows the Bayesian paradigm, and we use small synthetic dynamic network domains to demonstrate empirically that this new method reduces the susceptibility to spurious feedback loops. Finally, we apply the new method to a real gene expression data set from Arabidopsis thaliana.

Wiley-VCH Verlag GmbH & Co. KGaA eBooks, Sep 16, 2008
Background: To infer gene regulatory networks from time series gene profiles, two important tasks that are related to biological systems must be undertaken. One task is to determine a valid network structure that has topological properties that can influence the network dynamics profoundly. The other task is to optimize the network parameters to minimize the accumulated discrepancy between the gene expression data and the values produced by the inferred network model. Though the above two tasks must be conducted simultaneously, most existing work addresses only one of the tasks. Results: We propose an iterative approach that couples parameter identification and parameter optimization techniques, to address the two tasks simultaneously during network inference. This approach first identifies the most influential parameters against internal perturbations; this identification is based on sensitivity measurements. Then, a hybrid GA-PSO optimization method infers parameters in accordance with their criticalities. The proposed approach has been applied to several datasets, including subsets of the SOS DNA repair system in E. coli, the Rat central nervous system (CNS), and the protein glycosylation system of yeast S. cerevisiae. The result and analysis show that our approach can infer solutions to satisfy both the requirements of network structure and network behavior. Conclusions: Network structure is an important though challenging issue to address in inferring sophisticated networks with biological details. In need of prior structural knowledge, we turn to measure parameter sensitivity instead to account for the network structure in an indirect way. By developing an integrated approach for considering both the network structure and behavior in the inference process, we can successfully infer critical gene interactions as well as valid time expression profiles.

Neural Information Processing Systems, Dec 7, 2009
Dynamic Bayesian networks have been applied widely to reconstruct the structure of regulatory processes from time series data. The standard approach is based on the assumption of a homogeneous Markov chain, which is not valid in many real-world scenarios. Recent research efforts addressing this shortcoming have considered undirected graphs, directed graphs for discretized data, or over-flexible models that lack any information sharing among time series segments. In the present article, we propose a non-stationary dynamic Bayesian network for continuous data, in which parameters are allowed to vary among segments, and in which a common network structure provides essential information sharing across segments. Our model is based on a Bayesian multiple change-point process, where the number and location of the change-points are sampled from the posterior distribution.
Being Bayesian about learning Gaussian Bayesian networks from incomplete data
International Journal of Approximate Reasoning, Sep 1, 2023
Scientific Reports, Mar 9, 2022