Abstract
Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provide new insights into bacterial physiology and biochemistry.
References (77)
- unclear." A threshold cutoff of log2(12) was chosen, as it is more stringent than log2(4) (used by Langridge et al. [4]) and is consistent with the analysis used by Phan et al. (9). Essential gene lists. The Keio essential gene list is composed of the original essential genes minus three open reading frames (ORFs), JW5190, JW5193, and JW5379, as they are not annotated within strain MG1655 and are thought to be spurious, giving a final list of 300 genes (1, 73). The PEC data set is composed of the 300 genes listed as essential for strain W3110 (2). The lists of essential genes were compared using BioVenn (74). Statistical analysis. For details of the statistical analysis, see Text S1, Fig. S1, and Fig. S2 in the supplemental material. Accession number(s). TraDIS sequencing data are available from the European Nucleotide Archive under accession no. PRJEB24436. SUPPLEMENTAL MATERIAL Supplemental material for this article may be found at https://doi.org/10.1128/mBio.02096-17. TEXT S1, DOCX file, 0.1 MB. FIG S1, PDF file, 0.02 MB. FIG S2, TIF file, 0.1 MB. TABLE S1, XLSX file, 0.2 MB. TABLE S2, PDF file, 0.04 MB. TABLE S3, PDF file, 0.03 MB. TABLE S4, XLSX file, 0.3 MB.
- Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H. 2006. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2:2006.0008. https://doi.org/10.1038/msb4100050.
- Yamazaki Y, Niki H, Kato J. 2008. Profiling of Escherichia coli Chromosome database. Methods Mol Biol 416:385-389. https://doi.org/10.1007/978-1-59745-321-9_26.
- Nguyen BD, Valdivia RH. 2012. Virulence determinants in the obligate intracellular pathogen Chlamydia trachomatis revealed by forward genetic approaches. Proc Natl Acad Sci U S A 109:1263-1268. https://doi.org/10.1073/pnas.1117884109.
- Langridge GC, Phan MD, Turner DJ, Perkins TT, Parts L, Haase J, Charles I, Maskell DJ, Peters SE, Dougan G, Wain J, Parkhill J, Turner AK. 2009. Simultaneous assay of every Salmonella Typhi gene using one million transposon mutants. Genome Res 19:2308-2316. https://doi.org/10.1101/gr.097097.109.
- van Opijnen T, Bodi KL, Camilli A. 2009. Tn-seq: high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms. Nat Methods 6:767-772. https://doi.org/10.1038/nmeth.1377.
- Gawronski JD, Wong SMS, Giannoukos G, Ward DV, Akerley BJ. 2009. Tracking insertion mutants within libraries by deep sequencing and a genome-wide screen for Haemophilus genes required in the lung. Proc Natl Acad Sci U S A 106:16422-16427. https://doi.org/10.1073/pnas.0906627106.
- Goodman AL, McNulty NP, Zhao Y, Leip D, Mitra RD, Lozupone CA, Knight R, Gordon JI. 2009. Identifying genetic determinants needed to establish a human gut symbiont in its habitat. Cell Host Microbe 6:279-289. https://doi.org/10.1016/j.chom.2009.08.003.
- Christen B, Abeliuk E, Collier JM, Kalogeraki VS, Passarelli B, Coller JA, Fero MJ, McAdams HH, Shapiro L. 2011. The essential genome of a bacterium. Mol Syst Biol 7:528. https://doi.org/10.1038/msb.2011.58.
- Phan MD, Peters KM, Sarkar S, Lukowski SW, Allsopp LP, Moriel DG, Achard MES, Totsika M, Marshall VM, Upton M, Beatson SA, Schembri MA. 2013. The serum resistome of a globally disseminated multidrug resistant uropathogenic Escherichia coli clone. PLoS Genet 9:e1003834. https://doi.org/10.1371/journal.pgen.1003834.
- Hassan KA, Cain AK, Huang T, Liu Q, Elbourne LDH, Boinett CJ, Brzoska AJ, Li L, Ostrowski M, Nhu NTK, Nhu TDH, Baker S, Parkhill J, Paulsen IT. 2016. Fluorescence-based flow sorting in parallel with transposon insertion site sequencing identifies multidrug efflux systems in Acinetobacter baumannii. mBio 7:e01200-16. https://doi.org/10.1128/mBio.01200-16.
- Paulsen IT, Cain AK, Hassan KA. 2017. Physical enrichment of transposon mutants from saturation mutant libraries using the TraDISort approach. Mob Genet Elements 7:1-7. https://doi.org/10.1080/2159256X.2017.1313805.
- Parsons AB, Brost RL, Ding H, Li Z, Zhang C, Sheikh B, Brown GW, Kane PM, Hughes TR, Boone C. 2004. Integration of chemical-genetic and genetic interaction data links bioactive compounds to cellular target pathways. Nat Biotechnol 22:62-69. https://doi.org/10.1038/nbt919.
- Grant AJ, Oshota O, Chaudhuri RR, Mayho M, Peters SE, Clare S, Maskell DJ, Mastroeni P. 2016. Genes required for the fitness of Salmonella enterica serovar Typhimurium during infection of immunodeficient gp91−/− phox mice. Infect Immun 84:989-997. https://doi.org/10.1128/IAI.01423-15.
- Troy EB, Lin T, Gao L, Lazinski DW, Lundt M, Camilli A, Norris SJ, Hu LT. 2016. Global Tn-seq analysis of carbohydrate utilization and vertebrate infectivity of Borrelia burgdorferi. Mol Microbiol 101:1003-1023. https://doi.org/10.1111/mmi.13437.
- Mann B, van Opijnen T, Wang J, Obert C, Wang Y-D, Carter R, McGoldrick DJ, Ridout G, Camilli A, Tuomanen EI, Rosch JW. 2012. Control of virulence by small RNAs in Streptococcus pneumoniae. PLoS Pathog 8:e1002788. https://doi.org/10.1371/journal.ppat.1002788.
- Grenov AI, Gerdes SY. 2008. Modeling competitive outgrowth of mutant populations: why do essentiality screens yield divergent results? Methods Mol Biol 416:361-367. https://doi.org/10.1007/978-1-59745-321-9_24.
- Manna D, Porwollik S, McClelland M, Tan R, Higgins NP. 2007. Microarray analysis of Mu transposition in Salmonella enterica, serovar Typhimurium: transposon exclusion by high-density DNA binding proteins. Mol Microbiol 66:315-328. https://doi.org/10.1111/j.1365-2958.2007.05915.x.
- Curtis PD, Brun YV. 2014. Identification of essential alphaproteobacterial genes reveals operational variability in conserved developmental and cell cycle systems. Mol Microbiol 93:713-735. https://doi.org/10.1111/mmi.12686.
- Solaimanpour S, Sarmiento F, Mrázek J. 2015. Tn-seq explorer: a tool for analysis of high-throughput sequencing data of transposon mutant libraries. PLoS One 10:e0126070. https://doi.org/10.1371/journal.pone.0126070.
- Liu G, Draper GC, Donachie WD. 1998. FtsK is a bifunctional protein involved in cell division and chromosome localization in Escherichia coli. Mol Microbiol 29:893-903. https://doi.org/10.1046/j.1365-2958.1998.00986.x.
- Dubarry N, Possoz C, Barre F-X. 2010. Multiple regions along the Escherichia coli FtsK protein are implicated in cell division. Mol Microbiol 78:1088-1100. https://doi.org/10.1111/j.1365-2958.2010.07412.x.
- Draper GC, McLennan N, Begg K, Masters M, Donachie WD. 1998. Only the N-terminal domain of FtsK functions in cell division. J Bacteriol 180:4621-4627.
- Yu XC, Tran AH, Sun Q, Margolin W. 1998. Localization of cell division protein FtsK to the Escherichia coli septum and identification of a potential N-terminal targeting domain. J Bacteriol 180:1296-1304.
- Dorazi R, Dewar SJ. 2000. Membrane topology of the N-terminus of the Escherichia coli FtsK division protein. FEBS Lett 478:13-18. https://doi.org/10.1016/S0014-5793(00)01820-2.
- Wang L, Lutkenhaus J. 1998. FtsK is an essential cell division protein that is localized to the septum and induced as part of the SOS response. Mol Microbiol 29:731-740. https://doi.org/10.1046/j.1365-2958.1998.00958.x.
- Murakami A, Nakatogawa H, Ito K. 2004. Translation arrest of SecM is essential for the basal and regulated expression of SecA. Proc Natl Acad Sci U S A 101:12330-12335. https://doi.org/10.1073/pnas.0404907101.
- DeJesus MA, Ioerger TR. 2013. A Hidden Markov Model for identifying essential and growth-defect regions in bacterial genomes from transposon insertion sequencing data. BMC Bioinformatics 14:303. https://doi.org/10.1186/1471-2105-14-303.
- Freed NE, Bumann D, Silander OK. 2016. Combining Shigella Tn-seq data with gold-standard E. coli gene deletion data suggests rare transitions between essential and non-essential gene functionality. BMC Microbiol 16:203. https://doi.org/10.1186/s12866-016-0818-0.
- Zomer A, Burghout P, Bootsma HJ, Hermans PWM, van Hijum SAFT. 2012. ESSENTIALS: software for rapid analysis of high throughput transposon insertion sequencing data. PLoS One 7:e43012. https://doi.org/10.1371/journal.pone.0043012.
- Sarmiento F, Mrázek J, Whitman WB. 2013. Genome-scale analysis of gene function in the hydrogenotrophic methanogenic archaeon Methanococcus maripaludis. Proc Natl Acad Sci U S A 110:4726-4731. https://doi.org/10.1073/pnas.1220225110.
- Zhang YJ, Ioerger TR, Huttenhower C, Long JE, Sassetti CM, Sacchettini JC, Rubin EJ. 2012. Global assessment of genomic regions required for growth in Mycobacterium tuberculosis. PLoS Pathog 8:e1002946. https://doi.org/10.1371/journal.ppat.1002946.
- Lodge J, Fear J, Busby S, Gunasekaran P, Kamini NR. 1992. Broad host range plasmids carrying the Escherichia coli lactose and galactose operons. FEMS Microbiol Lett 74:271-276. https://doi.org/10.1111/j.1574-6968.1992.tb05378.x.
- Islam MS, Shaw RK, Frankel G, Pallen MJ, Busby SJW. 2012. Translation of a minigene in the 5' leader sequence of the enterohaemorrhagic Escherichia coli LEE1 transcription unit affects expression of the neighbouring downstream gene. Biochem J 441:247-253. https://doi.org/10.1042/BJ20110912.
- Yamamoto N, Nakahigashi K, Nakamichi T, Yoshino M, Takai Y, Touda Y, Furubayashi A, Kinjyo S, Dose H, Hasegawa M, Datsenko KA, Nakayashiki T, Tomita M, Wanner BL, Mori H. 2009. Update on the Keio collection of Escherichia coli single-gene deletion mutants. Mol Syst Biol 5:335. https://doi.org/10.1038/msb.2009.92.
- Claverie-Martin F, Diaz-Torres MR, Yancey SD, Kushner SR. 1991. Analysis of the altered mRNA stability (ams) gene from Escherichia coli. Nucleotide sequence, transcriptional analysis, and homology of its product to MRP3, a mitochondrial ribosomal protein from Neurospora crassa. J Biol Chem 266:2843-2851.
- Ow MC, Liu Q, Mohanty BK, Andrew ME, Maples VF, Kushner SR. 2002. RNase E levels in Escherichia coli are controlled by a complex regulatory system that involves transcription of the rne gene from three promoters. Mol Microbiol 43:159-171. https://doi.org/10.1046/j.1365-2958.2002.02726.x.
- Ades SE, Connolly LE, Alba BM, Gross CA. 1999. The Escherichia coli sigma E-dependent extracytoplasmic stress response is controlled by the regulated proteolysis of an anti-sigma factor. Genes Dev 13:2449-2461. https://doi.org/10.1101/gad.13.18.2449.
- Alba BM, Zhong HJ, Pelayo JC, Gross CA. 2001. degS (hhoB) is an essential Escherichia coli gene whose indispensable function is to provide sigma E activity. Mol Microbiol 40:1323-1333. https://doi.org/10.1046/j.1365-2958.2001.02475.x.
- Bass S, Gu Q, Christen A. 1996. Multicopy suppressors of prc mutant Escherichia coli include two HtrA (DegP) protease homologs (HhoAB), DksA, and a truncated RlpA. J Bacteriol 178:1154-1161. https://doi.org/10.1128/jb.178.4.1154-1161.1996.
- Waller PR, Sauer RT. 1996. Characterization of degQ and degS, Escherichia coli genes encoding homologs of the DegP protease. J Bacteriol 178:1146-1153. https://doi.org/10.1128/jb.178.4.1146-1153.1996.
- Malinverni JC, Silhavy TJ. 2009. An ABC transport system that maintains lipid asymmetry in the Gram-negative outer membrane. Proc Natl Acad Sci U S A 106:8009-8014. https://doi.org/10.1073/pnas.0903229106.
- Thong S, Ercan B, Torta F, Fong ZY, Wong HYA, Wenk MR, Chng S-S. 2016. Defining key roles for auxiliary proteins in an ABC transporter that maintains bacterial outer membrane lipid asymmetry. Elife 5:e19042. https://doi.org/10.7554/eLife.19042.
- Kato J, Katayama T. 2001. Hda, a novel DnaA-related protein, regulates the replication cycle in Escherichia coli. EMBO J 20:4253-4262. https://doi.org/10.1093/emboj/20.15.4253.
- Riber L, Olsson JA, Jensen RB, Skovgaard O, Dasgupta S, Marinus MG, Løbner-Olesen A. 2006. Hda-mediated inactivation of the DnaA protein and dnaA gene autoregulation act in concert to ensure homeostatic maintenance of the Escherichia coli chromosome. Genes Dev 20:2121-2134. https://doi.org/10.1101/gad.379506.
- Camara JE, Skarstad K, Crooke E. 2003. Controlled initiation of chromosomal replication in Escherichia coli requires functional Hda protein. J Bacteriol 185:3244-3248. https://doi.org/10.1128/JB.185.10.3244-3248.2003.
- Wolf J, Gerber AP, Keller W. 2002. tadA, an essential tRNA-specific adenosine deaminase from Escherichia coli. EMBO J 21:3841-3851. https://doi.org/10.1093/emboj/cdf362.
- Bubunenko M, Baker T, Court DL. 2007. Essentiality of ribosomal and transcription antitermination proteins analyzed by systematic gene replacement in Escherichia coli. J Bacteriol 189:2844-2853. https://doi.org/10.1128/JB.01713-06.
- Durand A, Sinha AK, Dard-Dascot C, Michel B. 2016. Mutations affecting potassium import restore the viability of the Escherichia coli DNA polymerase III holD mutant. PLoS Genet 12:e1006114. https://doi.org/10.1371/journal.pgen.1006114.
- Viguera E, Petranovic M, Zahradka D, Germain K, Ehrlich DS, Michel B. 2003. Lethality of bypass polymerases in Escherichia coli cells with a defective clamp loader complex of DNA polymerase III. Mol Microbiol 50:193-204. https://doi.org/10.1046/j.1365-2958.2003.03658.x.
- Duigou S, Silvain M, Viguera E, Michel B. 2014. ssb gene duplication restores the viability of ΔholC and ΔholD Escherichia coli mutants. PLoS Genet 10:e1004719. https://doi.org/10.1371/journal.pgen.1004719.
- Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, Mori H, Perna NT, Plunkett G, Rudd KE, Serres MH, Thomas GH, Thomson NR, Wishart D, Wanner BL. 2006. Escherichia coli K-12: a cooperatively developed annotation snapshot-2005. Nucleic Acids Res 34:1-9. https://doi.org/10.1093/nar/gkj405.
- Hemm MR, Paul BJ, Schneider TD, Storz G, Rudd KE. 2008. Small membrane proteins found by comparative genomics and ribosome binding site models. Mol Microbiol 70:1487-1501. https://doi.org/10.1111/j.1365-2958.2008.06495.x.
- Davies IJ, Drabble WT. 1996. Stringent and growth-rate-dependent control of the gua operon of Escherichia coli K-12. Microbiology 142:2429-2437. https://doi.org/10.1099/00221287-142-9-2429.
- De Lay NR, Cronan JE. 2008. Genetic interaction between the Escherichia coli AcpT phosphopantetheinyl transferase and the YejM inner membrane protein. Genetics 178:1327-1337. https://doi.org/10.1534/genetics.107.081836.
- Daley DO, Rapp M, Granseth E, Melén K, Drew D, von Heijne G. 2005. Global topology analysis of the Escherichia coli inner membrane proteome. Science 308:1321-1323.
- Dalebroux ZD, Edrozo MB, Pfuetzner RA, Ressl S, Kulasekara BR, Blanc M-P, Miller SI. 2015. Delivery of cardiolipins to the Salmonella outer membrane is necessary for survival within host tissues and virulence. Cell Host Microbe 17:441-451. https://doi.org/10.1016/j.chom.2015.03.003.
- Cano DA, Domínguez-Bernal G, Tierrez A, Garcia-Del Portillo F, Casadesús J. 2002. Regulation of capsule synthesis and cell motility in Salmonella enterica by the essential gene igaA. Genetics 162:1513-1523.
- Cho S-H, Szewczyk J, Pesavento C, Zietek M, Banzhaf M, Roszczenko P, Asmar A, Laloux G, Hov A-K, Leverrier P, Van der Henst C, Vertommen D, Typas A, Collet J-F. 2014. Detecting envelope stress by monitoring β-barrel assembly. Cell 159:1652-1664. https://doi.org/10.1016/j.cell.2014.11.045.
- Harrison CJ, Hayer-Hartl M, Di Liberto M, Hartl F, Kuriyan J. 1997. Crystal structure of the nucleotide exchange factor GrpE bound to the ATPase domain of the molecular chaperone DnaK. Science 276:431-435. https://doi.org/10.1126/science.276.5311.431.
- Fields S, Song O. 1989. A novel genetic system to detect protein-protein interactions. Nature 340:245-246. https://doi.org/10.1038/340245a0.
- Karimova G, Pidoux J, Ullmann A, Ladant D. 1998. A bacterial two-hybrid system based on a reconstituted signal transduction pathway. Proc Natl Acad Sci U S A 95:5752-5756.
- Cabantous S, Waldo GS. 2006. In vivo and in vitro protein solubility assays using split GFP. Nat Methods 3:845-854. https://doi.org/10.1038/nmeth932.
- Martorana AM, Sperandeo P, Polissi A, Dehò G. 2011. Complex transcriptional organization regulates an Escherichia coli locus implicated in lipopolysaccharide biogenesis. Res Microbiol 162:470-482. https://doi.org/10.1016/j.resmic.2011.03.007.
- Sperandeo P, Pozzi C, Dehò G, Polissi A. 2006. Non-essential KDO biosynthesis and new essential cell envelope biogenesis genes in the Escherichia coli yrbG-yhbG locus. Res Microbiol 157:547-558. https://doi.org/10.1016/j.resmic.2005.11.014.
- Datsenko KA, Wanner BL. 2000. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc Natl Acad Sci U S A 97:6640-6645. https://doi.org/10.1073/pnas.120163297.
- Chang AC, Cohen SN. 1978. Construction and characterization of amplifiable multicopy DNA cloning vehicles derived from the P15A cryptic miniplasmid. J Bacteriol 134:1141-1156.
- Pearson WR, Wood T, Zhang Z, Miller W. 1997. Comparison of DNA sequences with protein sequences. Genomics 46:24-36. https://doi.org/10.1006/geno.1997.4995.
- Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120. https://doi.org/10.1093/bioinformatics/btu170.
- Tatusova T, Ciufo S, Fedorov B, O'Neill K, Tolstoy I. 2014. RefSeq microbial genomes database: new representation and annotation strategy. Nucleic Acids Res 42:D553-D559. https://doi.org/10.1093/nar/gkt1274.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078-2079. https://doi.org/10.1093/bioinformatics/btp352.
- Quinlan AR, Hall IM. 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26:841-842. https://doi.org/10.1093/bioinformatics/btq033.
- Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, Rajandream MA, Barrell B. 2000. Artemis: sequence visualization and annotation. Bioinformatics 16:944-945. https://doi.org/10.1093/bioinformatics/16.10.944.
- Zhou J, Rudd KE. 2013. EcoGene 3.0. Nucleic Acids Res 41:D613-D624. https://doi.org/10.1093/nar/gks1235.
- Hulsen T, de Vlieg J, Alkema W. 2008. BioVenn - a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics 9:488. https://doi.org/10.1186/1471-2164-9-488.
- Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754-1760. https://doi.org/10.1093/bioinformatics/btp324.