Yale University
Molecular Biophysics and Biochemistry
Data reduction techniques are now a vital part of numerical analysis, and principal component analysis is often used to identify important molecular features from a set of descriptors. We now take a different approach and apply data...
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched...
The topology of the gene-regulatory network has been extensively analyzed. Now, given the large amount of available functional genomic data, it is possible to go beyond this and systematically study regulatory circuits in terms of logic...
Background: Pseudogenes have long been considered nonfunctional genomic sequences. However, recent evidence suggests that many of them might have some form of biological activity, and the possibility of functionality has increased...
Summary: H2A.Z is an H2A-type histone variant essential for many aspects of cell biology, ranging from gene expression to genome stability. From deuterostomes, H2A.Z evolved into two paralogues, H2A.Z.1 and H2A.Z.2, that differ by only three...
The 1000 Genomes Project Consortium. A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data...
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the...
Endometriosis is a well-known risk factor for ovarian cancer. The genetic changes that characterise endometriosis are poorly understood; however, the mechanistic target of rapamycin (mTOR) pathway is involved. In this study, we...
The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the...
The most commonly employed mammalian model organism is the laboratory mouse. A wide variety of genetically diverse inbred mouse strains, representing distinct physiological states, disease susceptibilities, and biological mechanisms, have...
Background: Liquid biopsies offer a promising alternative to tissue samples, providing non-invasive diagnostic approaches or serial monitoring of disease evolution. However, certain challenges remain, and the full potential of liquid...
In a 2018 paper posted to bioRxiv, Pertea et al. presented the CHESS database, a new catalog of human gene annotations that includes 1,178 new protein-coding predictions. These are based on evidence of transcription in human tissues and...
Pseudogenes are ideal markers of genome remodeling. In turn, the mouse is an ideal platform for studying them, particularly with the availability of developmental transcriptional data and the sequencing of 18 strains. Here, we present a...
The radial spatial positioning of individual gene loci within interphase nuclei has been associated with up- and downregulation of their expression. In cancer, the genome organization may become disturbed due to chromosomal abnormalities,...
Pseudogenes are ideal markers of genome remodelling. In turn, the mouse is an ideal platform for studying them, particularly with the recent availability of strain-sequencing and transcriptional data. Here, combining both manual curation...
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of a new disease (COVID-19). The risk of severe COVID-19 is increased by certain underlying comorbidities, including asthma,...
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements...