All genome sequences incorporated in this study are publicly available in
the NCBI Genome database. The NCBI accession numbers for the contigs/
chromosomes on which the target loci are found are indicated in
Additional file 1: Table S1.
Table S1. Enterobacteriaceae strains analysed in this
study. The deep-branching clade to which they belong (Fig. 1), the presence/
absence of fliDCAZ loci, the NCBI accession numbers of the contigs on which
these loci occur, and presence/absence of FGI and FMI loci are indicated. Table
S2. Characteristics of FGI+ Enterobacteriaceae and their flagellin glycosylation
islands. The isolation sources and deep-branching clade to which they belong
in the Enterobacteriaceae phylogeny (Fig. 1) are indicated for each of the FGI+
enterobacterial strains. The sizes, G + C contents, G + C deviation from
the rest of the genome and number of proteins encoded on each FGI
are shown. Table S3. Characteristics of FMI+ Enterobacteriaceae and their
flagellin methylation islands. The isolation sources and deep-branching
clade to which they belong in the Enterobacteriaceae phylogeny (Fig. 1)
are indicated for each of the FMI+ enterobacterial strains. The sizes, G + C
contents, G + C deviation from the rest of the genome and number of
proteins encoded on each FMI are shown. Table S4. Annotations of the
proteins encoded on the enterobacterial FGIs. The number of strains and
genera in which orthologs of each distinct protein are encoded within the
FGIs are indicated, as well as the closest non-enterobacterial BLAST hit, obtained
by BLASTP analysis against the NCBI non-redundant protein database. Orthologs
were considered only if they were among the top 500 BLAST hits and shared
> 30 % amino acid identity with the query protein. The putative function
and conserved domains observed after BLAST analyses against the NCBI
protein and conserved domain databases are shown. Table S5. Genomic
inserts in the fliDCAZ loci of FGI−/FMI− Enterobacteriaceae. The insert size, G + C
content, G + C deviation from the rest of the genome, number of proteins
encoded and putative functions of the encoded proteins in each insert are
shown. (XLSX 234 kb)
Figure S1. Schematic diagrams of inserts within the
fliDCAZ loci of FGI−/FMI− Enterobacteriaceae. Flanking genes are indicated by
yellow arrows, predicted phage genes by blue arrows, fimbrial biogenesis
genes by grey arrows and sugar/amino acid transporter genes by orange
arrows. Black arrows indicate predicted transposase or endonuclease genes,
while red arrows indicate genes with disrupted reading frames. The
flagellin glycan biosynthetic genes in the FGI+ E. tracheiphila strains
Buff/PSU-1, located upstream of the predicted phage integration site, are
indicated by green arrows. (TIF 471 kb)
Figure S2. Schematic diagrams of representative flagellin
glycosylation islands of the forty-two distinct FGI types. Glycosyltransferase
and sugar biosynthetic genes are indicated by dark and light green arrows,
respectively. Formyltransferases, methyltransferases, acetyltransferases and
aminotransferases are encoded by genes represented by dark blue, purple,
light blue and yellow arrows, respectively. Pink arrows indicate genes
involved in fatty acid biosynthesis. Flanking genes are indicated by grey
arrows, genes coding for hypothetical proteins or proteins with no known
role in flagellin glycosylation by white arrows, and transposase and
endonuclease genes by black arrows. (TIF 3168 kb)