Extensive Unexplored Human Microbiome Diversity Revealed by Over 150,000 Genomes from Metagenomes Spanning Age, Geography, and Lifestyle.


PubMed ID: 30661755


Pasolli E, Asnicar F, Manara S, Zolfo M, Karcher N, Armanini F, Beghini F, Manghi P, Tett A, Ghensi P, Collado MC, Rice BL, DuLong C, Morgan XC, Golden CD, Quince C, Huttenhower C, Segata N

Cell. Jan 2019

COMMENT: This article describes the extensive work of analyzing 9,428 human metagenomic samples and reconstructing the microbial genomes present in each sample using de novo assembly and binning. The results show that a high percentage of the genomes in the human microbiome are unknown. A lot of lab and bioinformatics work will be needed to refine these newly detected genomes until they reach the precision and completeness required of reference genomes but, in any case, knowing that they are there is very useful.

The genomes were assembled, per sample, using metaSPAdes and binned using MetaBAT2:

We leveraged 9,428 metagenomes to reconstruct 154,723 microbial genomes (45% of high quality) spanning body sites, ages, countries, and lifestyles. We recapitulated 4,930 species-level genome bins (SGBs), 77% without genomes in public repositories (unknown SGBs [uSGBs]). uSGBs are prevalent (in 93% of well-assembled samples), expand underrepresented phyla, and are enriched in non-Westernized populations (40% of the total SGBs)

To organize the 154,723 genomes into species-level genome bins (SGBs), we employed an all-versus-all genetic distance quantification followed by clustering and identification of genome bins spanning a 5% genetic diversity …
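The grouping step quoted above can be illustrated with a minimal sketch. It assumes that pairwise genetic distances are already available as fractions between 0 and 1, and it uses average-linkage agglomerative clustering cut at 0.05. The linkage choice, the function name `average_linkage_clusters`, and the toy genomes `g1` to `g4` are assumptions made for illustration, not details taken from the excerpt.

```python
from itertools import combinations


def average_linkage_clusters(names, dist, threshold=0.05):
    """Merge clusters while the closest pair has an average distance below threshold.

    `dist` maps a sorted pair of genome names to a distance in [0, 1].
    This is an illustrative sketch, not the authors' pipeline.
    """
    clusters = [[n] for n in names]

    def pair_distance(a, b):
        return dist[tuple(sorted((a, b)))]

    def average_distance(c1, c2):
        total = sum(pair_distance(a, b) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the two clusters with the smallest average pairwise distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_distance(clusters[p[0]], clusters[p[1]]),
        )
        if average_distance(clusters[i], clusters[j]) >= threshold:
            break  # no remaining pair is within the species-level cutoff
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # i < j, so index i is still valid
    return [sorted(c) for c in clusters]


# Hypothetical toy distances: g1, g2, g3 are close; g4 is far from all of them.
toy_dist = {
    ("g1", "g2"): 0.01, ("g1", "g3"): 0.03, ("g2", "g3"): 0.02,
    ("g1", "g4"): 0.20, ("g2", "g4"): 0.21, ("g3", "g4"): 0.19,
}
sgbs = average_linkage_clusters(["g1", "g2", "g3", "g4"], toy_dist)
print(sgbs)
```

With these toy values the three close genomes end up in one bin and `g4` forms its own bin. This brute-force loop only suits small examples; an all-versus-all comparison of 154,723 genomes would need a far more efficient implementation.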

We identified 3,796 SGBs (i.e., 77.0% of the total) covering unexplored microbial diversity as they represent species without any publicly available genomes from isolate sequencing or previous metagenomic assemblies
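As a quick check of the figures quoted above, the share of unknown SGBs can be recomputed from the two counts given in the excerpts (3,796 uSGBs out of 4,930 SGBs). The variable names are only illustrative.

```python
total_sgbs = 4930    # species-level genome bins reported in the article
unknown_sgbs = 3796  # SGBs without any publicly available genome (uSGBs)

known_sgbs = total_sgbs - unknown_sgbs
unknown_share = 100 * unknown_sgbs / total_sgbs

print(f"Known SGBs: {known_sgbs}")
print(f"Unknown SGBs: {unknown_sgbs} of {total_sgbs} ({unknown_share:.1f}%)")
```

The result agrees with the 77.0% stated in the article, and it implies that only 1,134 of the SGBs had a reference genome available.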

The functional annotation was done mainly based on similarity to UniRef90 and UniRef50 protein entries:

Functional annotation of all the reconstructed genomes assigned a UniRef90 (The UniProt Consortium, 2017) label to 230 M genes and a UniRef50 to 268 M genes (72.7% and 84.8% of the total of 316 M genes, respectively).

The percentage of functionally annotated genes varied depending on the availability of reference proteomes of closely related species:

… the rate of annotation varied greatly in SGBs (e.g., >90% genes annotated for well studied species such as Escherichia coli or Bacteroides fragilis versus 22% for ID 15286, which is the largest SGB without reference genomes)

Some distinctive functional annotations were detected in each body site:

Each of the body sites considered had a clear distinctive set of annotations with the adult fecal microbiome enriched for 101,056 gene families representative of anaerobe-specific functions such as formate oxidation and methanogenesis and a strong representation of biofilm formation functions in the oral cavity and on the skin.

A set of unknown genomes reconstructed from the oral samples belonged to Saccharibacteria (previously named TM7):

For example, the candidate phylum Saccharibacteria (previously named TM7) contains members of the oral microbiome that are particularly difficult to cultivate. For this clade, we reconstructed 387 genomes from 108 SGBs, some representing members observed only using 16S rRNA gene sequencing.

The 107 Saccharibacteria uSGBs thus suggest a substantially undersampled diversity of human associated members of this phylum. Its importance is also confirmed by the occurrence of at least one genome from these 108 SGBs in 33% of oral cavity samples, where they can reach average abundances above 3% (Table S4) and maximum abundances exceeding 10%.

A set of newly reconstructed genomes belonged to archaea:

Among uSGBs, we also reconstructed genomes assigned to Thermoplasmatales (ID 376, 378, 380, 381), Candidatus Methanomethylophilus (ID 372, 382, 384), Methanomassiliicoccus (ID 362, 364), and Methanosphaera (ID 697), all very distant from their nearest reference genomes (average 22.4%, SD 4.0% nucleotide distance). This expanded human-associated archaeal diversity suggests the presence of several as-yet-uncharacterized archaea of potentially unique functional relevance in this ecosystem

The authors concluded that this study would allow better exploitation of metagenomic technologies:

We thus identify thousands of microbial genomes from yet-to-be-named species, expand the pangenomes of human-associated microbes, and allow better exploitation of metagenomic technologies.



Description of datasets and samples analyzed:


Software and Algorithms used in this work:


Raquel Tobes