Papers by Costas Bouyioukos
SETDB1 modulates the TGFbeta response in Duchenne muscular dystrophy myotubes (DMD)
Zenodo (CERN European Organization for Nuclear Research), Jan 3, 2024

Proceedings of the ... AAAI Conference on Artificial Intelligence, Mar 24, 2024
In recent years, significant progress has been made in the field of protein function prediction with the development of various machine-learning approaches. However, most existing methods formulate the task as a multi-classification problem, i.e. assigning predefined labels to proteins. In this work, we propose a novel approach, Prot2Text, which predicts a protein's function in a free text style, moving beyond the conventional binary or categorical classifications. By combining Graph Neural Networks (GNNs) and Large Language Models (LLMs) in an encoder-decoder framework, our model effectively integrates diverse data types including protein sequence, structure, and textual annotation and description. This multimodal approach allows for a holistic representation of proteins' functions, enabling the generation of detailed and accurate functional descriptions. To evaluate our model, we extracted a multimodal protein dataset from SwissProt and empirically demonstrate the effectiveness of Prot2Text. These results highlight the transformative impact of multimodal models, specifically the fusion of GNNs and LLMs, empowering researchers with powerful tools for more accurate function prediction of existing as well as previously unseen proteins.

bioRxiv (Cold Spring Harbor Laboratory), Jun 28, 2023
Highlights
• TGFβ induces nuclear accumulation of SETDB1 in healthy myotubes
• SETDB1 is enriched in DMD myotube nuclei with intrinsic TGFβ pathway overactivation
• SETDB1 LOF in DMD myotubes attenuates TGFβ-induced pro-fibrotic response
• Secretome of TGFβ-treated DMD myotubes with SETDB1 LOF is less deleterious on myoblast differentiation
The Cytoplasmic Fraction of the Histone Lysine Methyltransferase Setdb1 is Essential for Embryonic Stem Cells
iScience, Aug 1, 2023

Summary: The histone lysine methyltransferase SETDB1 is involved in muscle differentiation and has been shown to regulate the TGFβ pathway in cancer contexts. Here, we investigated the role of SETDB1 in Duchenne muscular dystrophy (DMD) myotubes harboring an overactivated TGFβ pathway. Our data show that challenging healthy myotubes with TGFβ induces nuclear accumulation of SETDB1, while in DMD myotubes SETDB1 is constantly accumulated in nuclei in a TGFβ-dependent fashion. SETDB1 loss-of-function (LOF) leads to decreased expression of TGFβ downstream target genes in DMD myotubes, suggesting the involvement of a TGFβ/SETDB1 axis in DMD physiopathology. Transcriptomics show that many targets of SETDB1 code for secreted factors involved in fibrosis, extracellular matrix remodeling and inflammation, and are downregulated when SETDB1 is silenced. Conditioned medium assays show that SETDB1 LOF in DMD myotubes abrogates the deleterious effect of the secretome on myoblast differentiation and im...

Frontiers in Endocrinology, Aug 5, 2022
Pancreatic beta cell response to glucose is critical for the maintenance of normoglycemia. A strong transcriptional response was classically described in rodent models but, interestingly, not in human cells. In this study, we exposed human pancreatic beta cells to an increased concentration of glucose and analysed, at a global level, steady-state mRNA levels and their translatability. Polysome profiling analysis showed an early acute increase in protein synthesis and a specific translational regulation of more than 400 mRNAs, independently of their transcriptional regulation. We clustered the co-regulated mRNAs according to their translational behaviour in response to glucose and discovered common structural and sequence mRNA features. Among them, mTOR- and eIF2-sensitive elements have a predominant role, mostly increasing the translation of mRNAs encoding proteins of the translational machinery. Furthermore, we show that the mTOR and eIF2α pathways are independently regulated in response to glucose, contributing to a translational reshaping that adapts beta cell metabolism. The early acute increase in translation machinery components prepares the beta cell for further protein demand due to glucose-mediated metabolic changes.
A pipeline for cloning resistance genes effective against African Puccinia graminis tritici races from the diploid wheat relative Aegilops sharonensis
The Two Blades group aims to develop durable resistance to the wheat stem rust fungus Puccinia graminis f. sp. tritici. Our approach is based on exploiting novel major dominant R genes from species related to wheat. These genes have received less attention in recent years because of their race specificity and propensity to break down in the field. We are developing resources in Aegilops sharonensis, a diploid relative of wheat, to facilitate cloning of stem rust R genes. We plan to work with partners to deploy the genes in multi- ...
Motivation: Network biology is a dominant player in today's multi-omics era. Therefore, there is a growing need for visualization tools that can efficiently cope with intra-network heterogeneity.
Results: NORMA-2.0 is a web application that uses efficient layouts to group together areas of interest in a network. In this version, NORMA-2.0 uses three different strategies to make such groupings as distinct as possible, while preserving all of the properties of its first version, in which one can handle multiple networks and annotation files simultaneously.
Availability: The web resource is available at http://norma.pavlopouloslab.info/
The source code is freely available at https://github.com/PavlopoulosLab/NORMA
Contact: pavlopoulos@fleming.gr
Strategy to obtain extended representation of the expressed Ae. sharonensis leaf gene space
Overview showing the sequence input (far left), assembly program (Newbler or CAP3; top), and parameters (default, strict or relaxed; see Methods) for each assembly, and extraction of a non-redundant sequence set.
Preprocessing of 454 transcriptome reads
Preprocessing of 454 transcriptome reads.
Analysis of heterozygosity in Ae. sharonensis accessions 1644 and 2232
Histograms of the ratio between the most and least frequently observed alleles of individual SNPs with read coverage of ≥20 between accessions (A) 1644 and (B) 2232 using the 'Combined Newbler Default Isotigs'. Heterozygous SNP-bearing Unigenes were identified using a ratio threshold of 3.0 (most versus least frequently observed allele within a genotype) and mapped using best reciprocal BLAST hits to genes in the barley consensus genetic map (C). Dot size reflects number of SNPs in a genetic bin (see legend).
Predicted NB-LRR proteins from Ae. sharonensis assemblies and BRBH against grass NB-LRRs
Predicted NB-LRR proteins from Ae. sharonensis assemblies and BRBH against grass NB-LRRs.

Polysome profile glucose beta-cells
Polysome profile analysis of human pancreatic beta cells exposed to glucose. A human pancreatic beta cell line (betaH2) was exposed to glucose after starvation. Total RNA was collected for bulk RNA-seq analysis. Three biological replicates were produced for each of the low and high glucose conditions, resulting in 6 sequence (fastq) files. Three different mRNA-ribosomal fractions were selected from a sucrose gradient, corresponding to single ribosomes (monosomes), light polysomes (2-4 ribosomes) and heavy polysomes (5 or more ribosomes). Three biological replicates were produced for each fraction and for the two conditions (low and high glucose), resulting in 18 different sequence (fastq) files. In total, the study consists of 24 submitted fastq files: 6 corresponding to total RNA from two conditions (three replicates each) and 18 corresponding to the polysome profile (3 fractions × 2 conditions × 3 replicates). All 24 *.fastq.gz files are available in the reads_PolysomeP...
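The file count in this dataset description can be checked by enumerating the design. The sketch below is illustrative only: the condition and fraction labels are placeholder names chosen here, not the identifiers used in the deposited files.

```python
from itertools import product

# Placeholder labels (assumptions); the deposited files may be named differently.
CONDITIONS = ("low_glucose", "high_glucose")
REPLICATES = (1, 2, 3)
FRACTIONS = ("monosomes", "light_polysomes", "heavy_polysomes")


def enumerate_design():
    """Return the (total RNA, polysome fraction) sample lists for the design."""
    total_rna = [("total_RNA", cond, rep)
                 for cond, rep in product(CONDITIONS, REPLICATES)]
    polysome = [(frac, cond, rep)
                for frac, cond, rep in product(FRACTIONS, CONDITIONS, REPLICATES)]
    return total_rna, polysome


total_rna, polysome = enumerate_design()
print(len(total_rna), len(polysome), len(total_rna) + len(polysome))  # 6 18 24
```

Each tuple stands for one fastq file, so the totals match the 6 + 18 = 24 files listed in the description.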
Additional file 2 of Analysis tools for the interplay between genome layout and regulation
SMODIA2014-S-Bouyioukos-S2.png. A screen capture of the main window of GREAT:SCAN:PATTERNS on the iSSB abSYNTH server with all the available command line parameters as options in the web form and the results of the example data (loaded by clicking the link "Try with example data"). (PNG 280 kb)
Additional file 1 of Analysis tools for the interplay between genome layout and regulation
SMODIA2014-S-Bouyioukos-S1.pdf. The full command line help message of GREAT:SCAN:PATTERNS. All the available command line options are specified and are mirrored in the online version of the tool. The document provides an extended description of each of the command line parameters. (PDF 56.7 kb)
Ribpropipe-Alpha-1.0.0
A best-practices ribosome profiling analysis pipeline that uses Docker and requires minimal installation and intervention from users.
Polysome profile beta-cells glucose data files
The raw counts table (countsTOTALS_CodingGenes.tsv) and the Transcripts Per Million (TPM) table (polysomeProfile_TPM_proteinCoding.csv) produced from the polysome profile and RNA-seq experiments. The experimental design is described at: https://github.com/parisepigenetics/Translatome_Bcells_glucose

Aegilops sharonensis Eig (Sharon goatgrass) is a wild diploid relative of wheat within the Sitopsis section of Aegilops. This species represents an untapped reservoir of genetic diversity for traits of agronomic importance, especially as a source of novel disease resistance. To gain a foothold in this genetic resource, we sequenced the cDNA from leaf tissue of two geographically distinct Ae. sharonensis accessions (1644 and 2232) using the 454 Life Sciences platform. We compared the results of two different assembly programs using different parameter sets to generate 13 distinct assemblies in an attempt to maximize representation of the gene space in de novo transcriptome assembly. The most sensitive assembly (71,029 contigs; N50 674 nts) retrieved 18,684 unique best reciprocal BLAST hits (BRBH) against six previously characterised grass proteomes while the most specific assembly (30,609 contigs; N50 815 nts) retrieved 15,687 BRBH. We combined these two assemblies into a set of 62,2...
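The N50 values quoted for the assemblies follow a standard definition: the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that definition is given here; the function name and the toy contig lengths are illustrative and are not taken from the paper's pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in nucleotides)."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Toy example: total length 20, half is 10; 6 + 5 = 11 reaches it at length 5.
print(n50([2, 3, 4, 5, 6]))  # 5
```

A higher N50 with fewer contigs, as in the more specific assembly, indicates that the assembled sequence is concentrated in longer contigs.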

Embryonic stem cell (ESC) fate is regulated at both the transcriptional and post-transcriptional levels. Indeed, several studies have shown that, in addition to gene transcription, mRNA stability and protein synthesis are finely tuned and strongly control ESC pluripotency and fate changes. An increasing number of RNA-binding proteins (RBPs) involved in post-transcriptional and translational regulation of gene expression have been identified as regulators of ESC identity. The major lysine methyltransferase Setdb1 is essential for the self-renewal and viability of ESCs. Setdb1 was primarily known to methylate lysine 9 of histone 3 (H3K9) in the nucleus, where it regulates chromatin functions. However, Setdb1 is also massively localized in the cytoplasm, including in mouse ESCs, where its role remains unknown. Here we show that the cytoplasmic Setdb1 (cSetdb1) is essential for the survival of mESCs. Functional assays further demonstrate that cSetdb1 regulates gene expression post-tra...