88 files

bbsag20 dataset: total 30,851 single-cell genomes, 51 metagenomes, and 1,544 metagenome-assembled genomes from the human oral/gut microbiota

Version 2 2024-03-06, 22:36
Version 1 2023-12-12, 21:22
online resource
posted on 2024-03-06, 22:36 authored by Tetsuro Kawano-Sugaya, Koji Arikawa, Tatsuya Saeki, Taruho Endoh, Kazuma Kamata, Ayumi Matsuhashi, Masahito Hosokawa


This is the large single amplified genome catalog (bbsag20) corresponding to the publication below:

  • Title: "Single Amplified Genome Catalog Reveals the Dynamics of Mobilome and Resistome in the Human Microbiome" (preprint:
  • Authors: Tetsuro Kawano-Sugaya, Koji Arikawa, Tatsuya Saeki, Taruho Endoh, Kazuma Kamata, Ayumi Matsuhashi, and Masahito Hosokawa

Research abstract

The increase in metagenome-assembled genomes (MAGs) has significantly advanced our understanding of the functional characterization and taxonomic assignment within the human microbiome. However, MAGs, as population consensus genomes, often mask heterogeneity among species and strains, thereby obfuscating the precise relationships between microbial hosts and mobile genetic elements (MGEs). In contrast, single amplified genomes (SAGs) derived via single-cell genome sequencing can capture individual genomic content, including MGEs. We present the bbsag20 dataset, which encompasses 17,202 human-associated prokaryotic SAGs and 869 MAGs, spanning 647 gut and 312 oral bacterial species. The SAGs revealed diverse bacterial lineages and MGEs with a broad host range that were absent in the MAGs and traced the translocation of oral bacteria to the gut. Importantly, our SAGs linked individual mobilomes to resistomes and meticulously charted a dynamic network of antibiotic resistance genes (ARGs) on MGEs, pinpointing potential ARG reservoirs in the microbial community.

This dataset contains five types of data:

  1. mg.tar.gz: Fecal metagenomes from 51 samples (QLF001-064)
  2. mag.tar.gz: Fecal metagenome-assembled genomes from 51 samples (QLF001-064; 1,544 genomes, including 869 HQ/MQ genomes)
  3. QLF001-064sag.tar.gz: Fecal single amplified genomes from 51 samples (QLF001-064; 19,042 genomes, including 10,066 HQ/MQ genomes)
  4. QLS001-033sag.tar.gz: Oral single amplified genomes from 32 samples (QLS001-033; 11,809 genomes, including 7,136 HQ/MQ genomes)
  5. bbsag20_plasmid.fna.gz: Contigs from the genomes above that Platon predicted to be plasmid-derived (doi: 10.1099/mgen.0.000398).
  • Details of genomes (e.g. assembly statistics, quality, taxonomy classification, and number of mobile genetic elements) are described in summary.tsv
  • The raw data produced in this study were deposited at NCBI under BioProject ID PRJNA1030952.
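
For a first look at the downloaded files, the following is a minimal Python sketch using only the standard library. It assumes the files listed above have been saved to the current directory under the names given in the list; the internal layout of the archives and the column names of summary.tsv are not described here, so the sketch only lists archive members, prints the summary.tsv header, and counts FASTA records. The helper function names are illustrative and are not part of the dataset.

```python
import csv
import gzip
import tarfile
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a gzip-compressed FASTA file by counting '>' header lines."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def read_summary_columns(path):
    """Return the column names from the header row of a tab-separated file."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle, delimiter="\t"))


def list_archive_members(path, limit=5):
    """Return the names of the first few members of a .tar.gz archive without extracting it."""
    names = []
    with tarfile.open(path, "r:gz") as archive:
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


if __name__ == "__main__":
    # Assumption: the files were downloaded into the current directory
    # and keep the names used in the list above.
    data_dir = Path(".")

    plasmids = data_dir / "bbsag20_plasmid.fna.gz"
    if plasmids.exists():
        print("plasmid contigs:", count_fasta_records(plasmids))

    summary = data_dir / "summary.tsv"
    if summary.exists():
        # The actual column names are whatever the header row contains.
        print("summary.tsv columns:", read_summary_columns(summary))

    for name in ("mg.tar.gz", "mag.tar.gz",
                 "QLF001-064sag.tar.gz", "QLS001-033sag.tar.gz"):
        archive_path = data_dir / name
        if archive_path.exists():
            print(name, "->", list_archive_members(archive_path))
```

Listing members before extracting is a cheap way to check how each archive is organized, since the SAG archives together hold 30,851 genomes.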


This work was supported by the Tokyo Metropolitan Small and Medium Enterprise Support Center.


Research Institution(s)

bitBiome, Inc.

I confirm there is no human personally identifiable information in the files or description shared

  • Yes

I confirm the files and description shared may be publicly distributed under the license selected

  • Yes

Competing Interest Statement

MH is a founder and shareholder in bitBiome, Inc., which provides single-cell genomics services using the SAG-gel workflow as bit-MAP. TKS, KA, TS, TE, KK, and AM are employed at bitBiome, Inc. MH, TS, TE, KK, and KA are inventors on patent applications submitted by bitBiome, Inc., covering the technique for single-cell sequencing.