Abstract:
To date, the phylogeny of nitrogenase has been analyzed using only the structural proteins NifHDK.
As nifHDKENB has been established as the minimum gene set necessary for in silico prediction
of diazotrophy, we present an updated phylogeny of diazotrophs using both structural (NifHDK)
and cofactor assembly proteins (NifENB). Annotated Nif sequences from 963 culture-derived genomes
were obtained from InterPro. Nif sequences were aligned individually and concatenated
to form one NifHDKENB sequence. Phylogenies obtained using PhyML, FastTree, RapidNJ, and
ASTRAL from individual and concatenated protein sequences were compared and analyzed. All
six genes were found across the Actinobacteria, Aquificae, Bacteroidetes, Chlorobi, Chloroflexi,
Cyanobacteria, Deferribacteres, Firmicutes, Fusobacteria, Nitrospira, Proteobacteria, PVC group,
and Spirochaetes, as well as the Euryarchaeota. The phylogenies of individual Nif proteins were
very similar to the overall NifHDKENB phylogeny, indicating that the assembly proteins have evolved
together. Our higher-resolution database upheld the three-cluster phylogeny but revealed undocumented
horizontal gene transfers across phyla. Only 48% of the 325 genera containing all six nif genes
are currently supported by biochemical evidence of diazotrophy. In addition, this work provides
a reference for any inter-phylum comparison of Nif sequences and a quality database of Nif proteins
that can be used for identifying new Nif sequences.
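
The concatenation step described in the abstract (six individually aligned Nif proteins joined into one NifHDKENB sequence per genome) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `concatenate_alignments`, the genome labels, and the toy sequences are hypothetical, and the sketch assumes each protein has already been aligned so that all rows within one alignment have equal length.

```python
from typing import Dict, List

# Concatenation order follows the NifHDKENB naming used in the abstract.
NIF_ORDER: List[str] = ["NifH", "NifD", "NifK", "NifE", "NifN", "NifB"]


def concatenate_alignments(
    alignments: Dict[str, Dict[str, str]],
    order: List[str] = NIF_ORDER,
) -> Dict[str, str]:
    """Join per-protein aligned sequences into one sequence per genome.

    `alignments` maps protein name -> {genome id -> aligned sequence}.
    Only genomes present in every alignment are kept (assumption: a genome
    must carry all six proteins to be included).
    """
    # Each alignment must be rectangular, otherwise columns would not line up.
    for protein in order:
        lengths = {len(seq) for seq in alignments[protein].values()}
        if len(lengths) > 1:
            raise ValueError(f"{protein} alignment has rows of unequal length")

    # Genomes shared by all six alignments.
    shared = set.intersection(*(set(alignments[p]) for p in order))

    return {
        genome: "".join(alignments[p][genome] for p in order)
        for genome in sorted(shared)
    }


# Toy data (hypothetical): genomeC lacks NifB and is therefore dropped.
toy = {
    "NifH": {"genomeA": "MA-K", "genomeB": "MARK", "genomeC": "MAKK"},
    "NifD": {"genomeA": "TT", "genomeB": "T-", "genomeC": "TT"},
    "NifK": {"genomeA": "GG", "genomeB": "GA", "genomeC": "GG"},
    "NifE": {"genomeA": "L", "genomeB": "L", "genomeC": "L"},
    "NifN": {"genomeA": "PP", "genomeB": "P-", "genomeC": "PP"},
    "NifB": {"genomeA": "CC", "genomeB": "CS"},
}

supermatrix = concatenate_alignments(toy)
for genome, seq in supermatrix.items():
    print(genome, seq)
```

The resulting concatenated sequences could then be written to a FASTA or PHYLIP file and passed to a tree-building program; the individual alignments would be used separately for the per-protein trees.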
Description:
SUPPLEMENTARY MATERIAL:
FIGURE S1: Phylogenetic analysis of individual NifH proteins by FastTree using the JTT+CAT evolution model.
FIGURE S2: Phylogenetic analysis of individual NifD proteins by FastTree using the JTT+CAT evolution model.
FIGURE S3: Phylogenetic analysis of individual NifK proteins by FastTree using the JTT+CAT evolution model.
FIGURE S4: Phylogenetic analysis of individual NifE proteins by FastTree using the JTT+CAT evolution model.
FIGURE S5: Phylogenetic analysis of individual NifN proteins by FastTree using the JTT+CAT evolution model.
FIGURE S6: Phylogenetic analysis of individual NifB proteins by FastTree using the JTT+CAT evolution model.
FIGURE S7: Molecular phylogenetic analysis of concatenated NifHDKENB proteins by Maximum Likelihood (PhyML) with branch support by posterior probability. Each clade is highlighted by bacterial or archaeal phylum, and Proteobacteria are further divided into classes.
FIGURE S8: Molecular phylogenetic analysis of concatenated NifHDKENB proteins by neighbour joining (RapidNJ) using the Kimura model. Each clade is highlighted by bacterial or archaeal phylum, and Proteobacteria are further divided into classes.
FIGURE S9: Cladogram comparing the ASTRAL tree obtained from the six individual trees with the majority consensus tree of the three trees obtained by FastTree, PhyML, and neighbour joining using concatenated NifHDKENB proteins. Black dots on the ASTRAL tree represent the ASTRAL bootstrapping; black dots on the concatenated tree represent branching present in all three trees, and all other splits are present in at least two trees.
FIGURE S10: Chronogram of the evolution of diazotrophs obtained by selecting diazotrophic genera from the microbial evolution tree proposed by Zhu et al., 2019, using the source data file. Node labels represent the time in billion years ago (Ga) in the original tree.
FIGURE S11: Number of genomes containing all six nifHDKENB genes according to the year they were reported. The number of genomes without biochemical evidence increased rapidly with the increase in the number of genomes reported.
SUPPLEMENTAL DATA S1: Nif/Vnf/Anf HDKENB protein sequences used for analyses.
SUPPLEMENTAL DATA S2: Information on the strains and gene/protein sequences used.
SUPPLEMENTAL DATA S3: List of all genera for which biochemical evidence of nitrogen fixation could be found in the literature.
Supplemental tree file: High-resolution tanglegram comparing the concatenated NifHDKENB tree with the 16S rRNA phylogeny of the diazotrophs, both obtained using FastTree. Lines indicate the respective positions of the 963 bacteria in the two trees.