For nearly 100 years, bacteriologists have strived to attain a natural classification
system based on evolutionary relationships. In the early twentieth century it was
accepted that a phylogeny-based system would be the most appropriate, but in the
absence of molecular data, this approach proved exceedingly difficult. Subsequent
technical advances and the increasing availability of genome sequencing have allowed
for the generation of robust phylogenies at all taxonomic levels. In this study, we
explored the possibility of linking biological characters to higher-level taxonomic groups
in bacteria by making use of whole genome sequence information. For this purpose,
we specifically targeted the genus Pantoea and its four main lineages. The shared
gene sets were determined for Pantoea, the four lineages within the genus, as well
as its sister-genus Tatumella. This was followed by functional characterization of the
gene sets using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. In
comparison to Tatumella, various traits involved in nutrient cycling were identified within
Pantoea, providing evidence for increased efficacy in recycling of metabolites within the
genus. Additionally, a number of traits associated with pathogenicity were identified within
species often associated with opportunistic infections, with some support for adaptation
toward overcoming host defenses. Some traits were also only conserved within specific
lineages, potentially acquired in an ancestor to the lineage and subsequently maintained.
It was also observed that the species isolated from the most diverse sources were
generally the most versatile in their carbon metabolism. Investigating evolution on the
basis of the more variable genomic regions may make it possible to detect biologically
relevant differences associated with the course of evolution and speciation.
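
The shared-gene-set comparison described above can be illustrated with a minimal sketch. It assumes that each genome has already been reduced to a set of orthologous-group identifiers; the genome names, the identifiers (OG1, OG2, ...) and the function name are hypothetical placeholders, and the sketch is not the pipeline used in this study. A shared gene set is taken here as the intersection across all genomes of a group, and group-specific conserved genes as the difference between two shared sets.

```python
# Minimal sketch: shared gene sets as set intersections.
# All genome names and orthologous-group IDs below are hypothetical toy data.

def shared_gene_sets(genomes, groups):
    """Return, for each group, the orthologous groups found in every member genome."""
    shared = {}
    for name, members in groups.items():
        member_sets = [genomes[m] for m in members]
        # Intersection keeps only the genes present in all genomes of the group.
        shared[name] = set.intersection(*member_sets) if member_sets else set()
    return shared


# Hypothetical input: genome -> set of orthologous-group identifiers.
genomes = {
    "Pantoea_A": {"OG1", "OG2", "OG3", "OG4"},
    "Pantoea_B": {"OG1", "OG2", "OG3", "OG5"},
    "Tatumella_A": {"OG1", "OG2", "OG6"},
    "Tatumella_B": {"OG1", "OG2", "OG6", "OG7"},
}

# Hypothetical grouping of genomes into the taxa being compared.
groups = {
    "Pantoea": ["Pantoea_A", "Pantoea_B"],
    "Tatumella": ["Tatumella_A", "Tatumella_B"],
}

shared = shared_gene_sets(genomes, groups)

# Genes conserved in one genus but not part of the other genus's shared set.
pantoea_only = shared["Pantoea"] - shared["Tatumella"]
tatumella_only = shared["Tatumella"] - shared["Pantoea"]

print(sorted(shared["Pantoea"]))
print(sorted(pantoea_only))
print(sorted(tatumella_only))
```

The same function could be applied to the four lineages within the genus by adding one entry per lineage to `groups`; the resulting sets would then be the input for functional annotation.
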
Supplementary Table S1 | Environmental information processing traits with
differences among the lineages.
Supplementary File S1 | Output from GhostKOALA for the Pantoea and
Tatumella shared gene sets. This file contains the differences observed for the
overview maps and the specific pathways for Pantoea and Tatumella. The overlay
figures of the overview and pathway maps are also included.
Supplementary File S2 | Output from GhostKOALA for the lineages within
Pantoea. This file contains the differences observed for the overview maps as well
as the overlay figures of the overview maps for the different lineages.
Supplementary File S3 | The summary of differences in pathways requiring two or
more genes, as well as a summary of the BLAST confirmations of these genes.
Supplementary File S4 | The results from the selection analyses and the figures
for the gene clusters not shown in the text.
Supplementary File S5 | The differences for pathways involved in “Environmental
Information Processing”. A summary of the BLAST confirmation is also included,
as well as the maps for each lineage for the ABC transporters, two-component
systems and the phosphotransferase systems (PTSs).
Supplementary File S6 | Summary of the Blast2GO analyses as well as the
BLAST hits for genes not annotated with Blast2GO. A pie chart indicating the
distribution of BLAST hits is also included.