1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.

Metadata
Author:Supratim Mukherjee, Rekha Seshadri, Neha J. Varghese, Emiley A. Eloe-Fadrosh, Jan P. Meier-Kolthoff, Markus Göker, R. Cameron Coates, Michalis Hadjithomas, Georgios A. Pavlopoulos, David Paez-Espino, Yasuo Yoshikuni, Axel Visel, William B. Whitman, George M. Garrity, Jonathan A. Eisen, Philip Hugenholtz, Amrita Pati, Natalia N. Ivanova, Tanja Woyke, Hans-Peter Klenk, Nikos C. Kyrpides
URN:urn:nbn:de:bvb:384-opus4-1067400
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/106740
ISSN:1087-0156
ISSN:1546-1696
Parent Title (English):Nature Biotechnology
Publisher:Springer Science and Business Media LLC
Place of publication:Berlin
Type:Article
Language:English
Year of first Publication:2017
Publishing Institution:Universität Augsburg
Release Date:2023/08/14
Tag:Biomedical Engineering; Molecular Medicine; Applied Microbiology and Biotechnology; Bioengineering; Biotechnology
Volume:35
Issue:7
First Page:676
Last Page:683
Note:
Erratum published at: https://doi.org/10.1038/nbt0418-368c
DOI:https://doi.org/10.1038/nbt.3886
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Biomedizinische Informatik, Data Mining und Data Analytics
Dewey Decimal Classification:0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Licence (German):CC-BY 4.0: Creative Commons: Attribution (with Print on Demand)