bbsag20 dataset: total 30,851 single-cell genomes, 51 metagenomes, and 1,544 metagenome-assembled genomes from the human oral/gut microbiota
README
This is the large single amplified genome (SAG) catalog (bbsag20) that accompanies the publication below:
Kawano-Sugaya T, Arikawa K, Saeki T, Endoh T, Kamata K, Matsuhashi A, Hosokawa M. A single amplified genome catalog reveals the dynamics of mobilome and resistome in the human microbiome. Microbiome. 2024;12(1):188 (doi:10.1186/s40168-024-01903-z)
Research abstract
The increase in metagenome-assembled genomes (MAGs) has significantly advanced our understanding of the functional characterization and taxonomic assignment within the human microbiome. However, MAGs, as population consensus genomes, often mask heterogeneity among species and strains, thereby obfuscating the precise relationships between microbial hosts and mobile genetic elements (MGEs). In contrast, single amplified genomes (SAGs) derived via single-cell genome sequencing can capture individual genomic content, including MGEs. We present the bbsag20 dataset, which encompasses 17,202 human-associated prokaryotic SAGs and 869 MAGs, spanning 647 gut and 312 oral bacterial species. The SAGs revealed diverse bacterial lineages and MGEs with a broad host range that were absent in the MAGs and traced the translocation of oral bacteria to the gut. Importantly, our SAGs linked individual mobilomes to resistomes and meticulously charted a dynamic network of antibiotic resistance genes (ARGs) on MGEs, pinpointing potential ARG reservoirs in the microbial community.
This dataset contains five types of data:
- mg.tar.gz: Fecal metagenomes from 51 samples (QLF001-064)
- mag.tar.gz: Fecal metagenome-assembled genomes from 51 samples (QLF001-064; 1,544 genomes, including 869 high-quality/medium-quality (HQ/MQ) genomes)
- QLF001-064sag.tar.gz: Fecal single amplified genomes from 51 samples (QLF001-064; 19,042 genomes, including 10,066 HQ/MQ genomes)
- QLS001-033sag.tar.gz: Oral single amplified genomes from 32 samples (QLS001-033; 11,809 genomes, including 7,136 HQ/MQ genomes)
- bbsag20_plasmid.fna.gz: Contigs from the genomes above that were predicted to be plasmid-derived by Platon (doi: 10.1099/mgen.0.000398).
- Details of genomes (e.g. assembly statistics, quality, taxonomy classification, and number of mobile genetic elements) are described in summary.tsv
- The raw data produced in this study were deposited at NCBI under BioProject ID PRJNA1030952.
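As a first sanity check after downloading, the files listed above can be inspected with a few lines of Python. The following is a minimal sketch, not part of the dataset: it assumes that bbsag20_plasmid.fna.gz is a gzip-compressed FASTA file (as its extension suggests) and that summary.tsv is tab-separated with a header row in its first line. The function names `count_fasta_records` and `summary_columns` are hypothetical helpers introduced here for illustration, and the file paths assume the files sit in the current working directory.

```python
import csv
import gzip
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a gzip-compressed FASTA file by counting '>' header lines."""
    n = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                n += 1
    return n


def summary_columns(path):
    """Return the first row of a tab-separated file (assumed to be the header)."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle, delimiter="\t"))


if __name__ == "__main__":
    # Assumed locations: both files downloaded into the working directory.
    plasmids = Path("bbsag20_plasmid.fna.gz")
    summary = Path("summary.tsv")
    if plasmids.exists():
        print("plasmid contigs:", count_fasta_records(plasmids))
    if summary.exists():
        print("summary.tsv columns:", summary_columns(summary))
```

Printing the column names first avoids guessing how summary.tsv labels assembly statistics, quality, taxonomy, and mobile genetic element counts before filtering on them.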
Funding
This work was supported by the Tokyo Metropolitan Small and Medium Enterprise Support Center.
Research Institution(s)
bitBiome, Inc.
Contact email
masahito.hosokawa@bitbiome.bio