Search and sequence analysis tools services from EMBL-EBI in 2022

2022, Nucleic Acids Research

https://doi.org/10.1093/NAR/GKAC240

Abstract

The EMBL-EBI search and sequence analysis tools frameworks provide integrated access to EMBL-EBI’s data resources and core bioinformatics analytical tools. EBI Search (https://www.ebi.ac.uk/ebisearch) provides a full-text search engine across nearly 5 billion entries, while the Job Dispatcher tools framework (https://www.ebi.ac.uk/services) enables the scientific community to perform a diverse range of sequence analyses using popular bioinformatics applications. Both allow users to interact through user-friendly web applications, as well as via RESTful and SOAP-based APIs. Here, we describe recent improvements to these services and updates made to accommodate the increasing data requirements during the COVID-19 pandemic.
