Transcriptome data reveal conserved patterns of fruiting body development and response to heat stress in the mushroom-forming fungus Flammulina filiformis
Genome-Wide Characterization of the C-repeat Binding Factor (CBF) Gene Family Involved in the Response to Abiotic Stresses in Tea Plant (Camellia sinensis)
Tea is among the world’s most widely consumed non-alcoholic beverages and possesses enormous economic, health, and cultural values. It is produced from the cured leaves of tea plants, which are important evergreen crops globally cultivated in over 50 countries. Along with recent innovations and advances in biotechnologies, great progress in tea plant genomics and genetics has been achieved, which has facilitated our understanding of the molecular mechanisms of tea quality and the evolution of the tea plant genome. In this review, we briefly summarize the achievements of the past two decades, which primarily include diverse genome and transcriptome sequencing projects, gene discovery and regulation studies, investigation of the epigenetics and noncoding RNAs, origin and domestication, phylogenetics and germplasm utilization of tea plant as well as newly developed tools/platforms. We also present perspectives and possible challenges for future functional genomic studies that will cont...
African wild rice Oryza longistaminata, one of the eight AA-genome species in the genus Oryza, possesses highly valued traits, such as rhizomatousness for perennial rice breeding, strong tolerance to biotic and abiotic stresses, and high biomass production on poor soils. To obtain a high-quality reference genome for O. longistaminata, we employed a hybrid assembly approach incorporating Illumina and PacBio sequencing datasets. The final genome assembly comprised only 107 scaffolds and was ∼363.5 Mb, representing ∼92.7% of the estimated African wild rice genome (∼392 Mb). The N50 lengths of the assembled contigs and scaffolds were ∼46.49 Kb and ∼6.83 Mb, indicating ∼3.72-fold and ∼18.8-fold improvements in length compared to the earlier released assembly (∼12.5 Kb and 364 Kb, respectively). Aided by Hi-C data and the syntenic relationship with O. sativa, these assembled scaffolds were anchored into 12 pseudo-chromosomes. Genome annotation and comparative genomic analysis reveal that lineage-specific expansions of gene families that respond to biotic and abiotic stresses are of great potential for mining novel alleles to overcome major diseases and abiotic adaptation in rice breeding programs.
Asian cultivated rice is believed to have been domesticated from an immediate ancestral progenitor, Oryza rufipogon, which provides promising sources of novel alleles for world rice improvement. Here we first present a high-quality de novo assembly of the typical O. rufipogon genome through the integration of single-molecule sequencing (SMRT), 10× and Hi-C technologies. This chromosome-based reference genome allows a multi-species comparative analysis of the annual selfing O. sativa and its two wild progenitors, the annual selfing O. nivara and perennial outcrossing O. rufipogon, identifying massive numbers of dispensable genes that are functionally enriched in reproductive processes. Comparative genomic analyses identified millions of genomic variants, of which large-effect mutations (e.g., SVs, CNVs and PAVs) may affect the variation of agronomically significant traits. We demonstrate how lineage-specific expansion of rice gene families may have contributed to the formation of reprod...
Background Tea is the oldest and among the world’s most popular non-alcoholic beverages, which has important economic, health and cultural values. Tea is commonly produced from the leaves of tea plants (Camellia sinensis), which belong to the genus Camellia of family Theaceae. In the last decade, many studies have generated the transcriptomes of tea plants at different developmental stages or under abiotic and/or biotic stresses to investigate the genetic basis of secondary metabolites that determine tea quality. However, these results exhibited large differences, particularly in the total number of reconstructed transcripts and the quality of the assembled transcriptomes. These differences largely result from limited knowledge regarding the optimized sequencing depth and assembler for transcriptome assembly of structurally complex plant species genomes. Results We employed different amounts of RNA-sequencing data, ranging from 4 to 84 Gb, to assemble the tea plant transcriptome usi...
Tea is one of the world's most widely consumed non-alcoholic beverages, with essential economic and health benefits. Given the increasing volume of large-scale omics datasets, particularly the genome sequence released for tea plant, the construction of a comprehensive knowledgebase is urgently needed to facilitate the utilization of these datasets for molecular breeding. We hereby present the first integrative and specially designed web-accessible database, Tea Plant Information Archive (TPIA; http://tpia.teaplant.org). The current release of TPIA employs the comprehensively annotated tea plant genome as its framework, and incorporates abundant well-organized transcriptomes, gene expressions (across species, tissues, and stresses), orthologs, and characteristic metabolites determining tea quality. It also hosts massive transcription factors, polymorphic simple sequence repeats, single nucleotide polymorphisms, correlations, manually curated functional genes, and globally collected germplasm information. A variety of versatile analytic tools (e.g. JBrowse, blast, enrichment analysis, etc.) are provided to help users perform further comparative, evolutionary, and functional analysis. We show a case application of TPIA that provides novel and interesting insights into the phytochemical content variation of section Thea of genus Camellia under a well-resolved phylogenetic framework. The constructed knowledgebase of tea plant will serve as a central gateway for the global tea community to better understand tea plant biology, which will largely benefit the whole tea industry.
Data content
Genomics data
Whole genome sequence data. The current release of the tea plant genome (Camellia sinensis var. sinensis cv shuchazao) is 3.14 Gb (2.89 Gb without N) and consists of 14,051 scaffolds and 94,321 contigs (Wei et al., 2018) (Table 1). The contig and scaffold N50 values are 67.07 kb and 1.39 Mb, respectively. Average GC content is 37.84%. The maximum lengths of contig and scaffold are 538.75 kb and 7.31 Mb, respectively.
Genome-wide analysis of WRKY family of transcription factors in common bean, Phaseolus vulgaris: Chromosomal localization, structure, evolution and expression divergence
De novo transcriptome sequencing of Camellia sasanqua and the analysis of major candidate genes related to floral traits
Plant Physiology and Biochemistry
Camellia sasanqua is one of the most famous horticultural plants in Camellia (Theaceae) due to its aesthetic appeal as a landscape plant. Knowledge regarding the genetic basis of flowering time, floral aroma and color in C. sasanqua is limited, but is essential to breed new varieties with desired floral traits. Here, we described the de novo transcriptome of young leaves, flower buds and flowers of C. sasanqua. A total of 60,127 unigenes were functionally annotated based on sequence similarity. After analysis, we found that two floral integrator genes, SOC1 and AP1, in the flowering time pathway showed evidence of gene family expansion. Compared with the 1-deoxy-D-xylulose-5-phosphate pathway, some genes in the mevalonate pathway were most highly expressed, suggesting that this might represent the major pathway for terpenoid biosynthesis related to floral aroma in C. sasanqua. In the flavonoid biosynthesis pathway, PAL, CHI, DFR and ANS, which showed significantly higher expression levels in flowers and flower buds, might have an important role in the regulation of floral color. The top five transcription factor (TF) families in the C. sasanqua transcriptome were MYB, MIKC, C3H, FAR1 and HD-ZIP, many of which have a direct relationship with floral traits. In addition, we also identified 33,540 simple sequence repeats (SSRs) in the C. sasanqua transcriptome. Collectively, the C. sasanqua transcriptome dataset generated from this study, along with the SSR markers, provides a new resource for the identification of novel regulatory transcripts and will accelerate the genetic improvement of C. sasanqua breeding programs.
Morchella species are well known worldwide as popular and prized edible fungi due to their unique culinary flavor. Recently, several species have been successfully cultivated in China. However, their reproductive modes are still unknown, and their basic biology needs to be elucidated. Here, we use the morel genome information to investigate mating systems and life cycles of fourteen black morel species. Mating type-specific primers were developed to screen and genotype ascospores, hymenia and stipes from 223 ascocarps of the 14 species from Asia and Europe. Our data indicated that they are all heterothallic and their life cycles are predominantly haploid, but sterile haploid fruiting also exists. Ascospores in all species are mostly haploid, homokaryotic, and multinuclear, whereas aborted ascospores without any nuclei were also detected. Interestingly, we monitored divergent spatial distribution of both mating types in natural morel populations and cultivated sites, where the fert...
Tea is the world's oldest and most popular caffeine-containing beverage with immense economic, medicinal, and cultural importance. Here, we present the first high-quality nucleotide sequence of the repeat-rich (80.9%), 3.02-Gb genome of the cultivated tea tree Camellia sinensis. We show that the extraordinarily large genome size of tea tree resulted from the slow, steady, and long-term amplification of a few LTR retrotransposon families. In addition to a recent whole-genome duplication event, lineage-specific expansions of genes associated with flavonoid metabolic biosynthesis were discovered, which enhance catechin production, terpene enzyme activation, and stress tolerance, important features for tea flavor and adaptation. We demonstrate an independent and rapid evolution of the tea caffeine synthesis pathway relative to cacao and coffee. A comparative study among 25 Camellia species revealed that higher expression levels of most flavonoid- and caffeine- but not theanine-rel...
To understand the potential genetic basis of highland adaptation of fungal pathogenicity, we present here the ~116 Mb de novo assembled high-quality genome of Ophiocordyceps sinensis endemic to the Qinghai-Tibetan Plateau. Compared with other plain-dwelling fungi, we find about 3.4-fold inflation of the O. sinensis genome due to a rapid amplification of long terminal repeat retrotransposons that occurred ~38 million years ago in concert with the uplift of the plateau. We also observe massive removal of thousands of genes related to the transport process and energy metabolism. O. sinensis displays considerable lineage-specific expansion of gene families functionally enriched in the adaptability of low-temperature of cold tolerance, fungal pathogenicity and specialized host infection. We detect signals of positive selection for genes involved in peroxidase and hypoxia to enable its highland adaptation. Resequencing and analyzing 31 whole genomes of O. sinensis, representing nearly all...
Panax notoginseng (Burk) F.H. Chen, belonging to the genus Panax (Araliaceae), is one of the most highly valued medicinal plants in the world. The dried root of this plant, known as "Sanchi", is one of the most commonly used traditional Chinese medicines and has been used as a top-class tonic for more than 2000 years. Approximately 60 triterpene saponins that are considered to be the principal bioactive components responsible for the pharmacological features have been isolated from P. notoginseng (Wan et al., 2007). Triterpene saponins are synthesized via the mevalonic acid (MVA) and the 2-C-methyl-D-erythritol 4-phosphate (MEP)-dependent pathways (Zhao et al., 2014). Although several genes likely to be involved in triterpene saponin biosynthesis have been identified (Luo et al., 2011; Liu et al., 2015), we still lack overall knowledge of the genetic basis of the ginsenoside biosynthetic pathway in P. notoginseng.
Prokaryotes possess a simple genome transcription system that is different from that of eukaryotes. In chloroplasts (plastids), it is believed that the prokaryotic gene transcription features govern genome transcription. However, the polycistronic operon transcription model cannot account for all the chloroplast genome (plastome) transcription products at the whole-genome level, especially regarding various RNA isoforms. By systematically analyzing transcriptomes of plastids of algae and higher plants, and cyanobacteria, we find that the entire plastome is transcribed in photosynthetic green plants, and that this pattern originated from prokaryotic cyanobacteria - ancestor of the chloroplast genomes that diverged about 1 billion years ago. We propose a multiple arrangement transcription model in which multiple transcription initiations and terminations combine haphazardly to accomplish the genome transcription followed by subsequent RNA processing events, which explains the full chloroplast...
Camellia reticulata, which is native to Southwest China, is famous for its ornamental flowers and high-quality seed oil. However, the lack of genomic information for this species has largely hampered our understanding of its key pathways related to oil production, photoperiodic flowering process and pigment biosynthesis. Here, we first sequenced and characterized the transcriptome of a diploid C. reticulata in an attempt to identify genes potentially involved in the triacylglycerol biosynthesis (TAGBS), photoperiodic flowering, flavonoid biosynthesis (FlaBS), and carotenoid biosynthesis (CrtBS) pathways. De novo assembly of the transcriptome provided a catalog of 141,460 unigenes with a total length of ~96.1 million nucleotides (Mnt) and an N50 of 1080 nt. Of them, 22,229 unigenes were defined as differentially expressed genes (DEGs) across five sequenced tissues. A large number of annotated genes in C. reticulata were found to have been duplicated, and differential expression patterns of t...
Proceedings of the National Academy of Sciences of the United States of America, Jan 30, 2015
Polyploidy, or whole-genome duplication (WGD), serves as a key innovation in plant evolution and is an important genomic feature for all eukaryotes. Neopolyploids have to overcome difficulties in meiosis, genomic alterations, changes of gene expression, and epigenomic reorganization. However, the underlying mechanisms for these processes are poorly understood. One of the most interesting aspects is that genome doubling events increase the dosage of all genes. Unlike allopolyploids entangled by both hybridization and polyploidization, autopolyploids, especially artificial lines, in relatively uniform genetic background offer a model system to understand mechanisms of genome-dosage effects. To investigate DNA methylation effects in response to WGD rather than hybridization, we produced autotetraploid rice with its diploid donor, Oryza sativa ssp. indica cv. Aijiaonante, both of which were independently self-pollinated over 48 generations, and generated and compared their comprehensive...
Papers by En-Hua Xia