Geobacillus is a genus of Gram-positive, aerobic, spore-forming obligate thermophiles. The descriptions
and subsequent affiliations of the species in the genus have mostly been based on polyphasic taxonomy
rules that include traditional sequence-based methods such as DNA–DNA hybridization and comparison
of 16S rRNA gene sequences. Currently, there are fifteen validly described species within the genus. The
availability of whole genome sequences has provided an opportunity to validate and/or re-assess these
conventional estimates of genome relatedness. We have applied whole genome approaches to estimate
the phylogenetic relatedness among the sixty-three Geobacillus strains for which genome sequences are
currently publicly available, including the type strains of eleven validly described species. The phylogenomic
metrics AAI (Average Amino acid Identity), ANI (Average Nucleotide Identity) and dDDH (digital
DNA–DNA hybridization) indicated that the current genus Geobacillus comprises sixteen distinct
genomospecies, including several potentially novel species. Furthermore, a phylogeny constructed on
the basis of the core genes identified from the whole genome analyses indicated that the genus clusters
into two monophyletic clades that clearly differ in terms of nucleotide base composition. The G + C content
ranges for clade I and II were 48.8–53.1% and 42.1–44.4%, respectively. We therefore suggest that
the Geobacillus species currently residing within clade II be considered a new genus.
Supplementary Figure S1: AAI relationships among sixty-three strains of Geobacillus. The dendrogram was constructed from the distance matrices (derived from ANI and dDDH values) using the web server DendroUPGMA.
Supplementary Table S1: Genome features of the sixty-three Geobacillus strains included in this study. The original species and strain designations are indicated, as are the genome size and the GenBank Assembly accession number or Integrated Microbial Genomes database project ID. The status of the original genome sequence (Complete or Draft) and the number of contigs in the original and final assemblies are shown. The number of genes encoded in the genome (as predicted with RAST) and the G+C content (%) are indicated.
Supplementary Table S2: ANI and dDDH relationships among sixty-three strains of Geobacillus species. The lower triangle shows the ANI values, with the 96% threshold highlighted using bottom and left borders. The blue, white and red colour code (0-100%) was used to depict the contrast between the ANI values of the two major clades identified in this study. The upper triangle shows the dDDH values, with the 70% threshold demarcated using upper and right borders. The red, yellow and green colour code (0-100%) was used to highlight the contrast between the dDDH values of the two major clades. The strain names were annotated with different colour fills to indicate the species recognised in this study. The two major clades identified in this study are demarcated by a thick border line.
Supplementary Table S3: AAI relationships among sixty-three strains of Geobacillus species. The green, yellow and red colour code (0-100%) was used to highlight the AAI among the Geobacillus species included in this study. The strain names were annotated with different colour fills to indicate the species recognised in this study. The species groupings within the clades are highlighted using the different colour fills in the species names. The two major clades identified in this study are demarcated by a thick border line.
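
The ANI (96%) and dDDH (70%) thresholds given for Supplementary Table S2 can be used to group strains into candidate genomospecies. The Python sketch below is an illustration only, not the procedure used in this study. It assumes that a pair of strains belongs to the same group when both values meet their thresholds, and that grouping is transitive (single linkage). The function name `group_genomospecies`, the strain labels and all numeric values are hypothetical.

```python
from itertools import combinations

ANI_THRESHOLD = 96.0   # percent, threshold stated for Supplementary Table S2
DDDH_THRESHOLD = 70.0  # percent, threshold stated for Supplementary Table S2


def group_genomospecies(strains, ani, dddh):
    """Group strains whose pairwise ANI and dDDH both meet the thresholds.

    `ani` and `dddh` map frozenset({a, b}) to a percentage.
    Assumption: grouping is transitive (single linkage), via union-find.
    """
    parent = {s: s for s in strains}

    def find(s):
        # Follow parent links to the group representative (path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in combinations(strains, 2):
        pair = frozenset((a, b))
        if ani[pair] >= ANI_THRESHOLD and dddh[pair] >= DDDH_THRESHOLD:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for s in strains:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical values for illustration; not taken from the study.
strains = ["strain_A", "strain_B", "strain_C", "strain_D"]
ani = {
    frozenset(("strain_A", "strain_B")): 98.2,
    frozenset(("strain_A", "strain_C")): 89.0,
    frozenset(("strain_A", "strain_D")): 88.5,
    frozenset(("strain_B", "strain_C")): 89.1,
    frozenset(("strain_B", "strain_D")): 88.7,
    frozenset(("strain_C", "strain_D")): 96.5,
}
dddh = {
    frozenset(("strain_A", "strain_B")): 85.0,
    frozenset(("strain_A", "strain_C")): 37.0,
    frozenset(("strain_A", "strain_D")): 36.0,
    frozenset(("strain_B", "strain_C")): 37.5,
    frozenset(("strain_B", "strain_D")): 36.2,
    frozenset(("strain_C", "strain_D")): 72.0,
}

genomospecies = group_genomospecies(strains, ani, dddh)
print(genomospecies)
```

Requiring both metrics to pass is a conservative choice made for this sketch. Where ANI and dDDH disagree for a pair, a real analysis would need to inspect that pair rather than rely on a fixed rule.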