Abstract:
With the advancement of next-generation sequencing, modern phylogenetic studies can benefit from the analysis of complete genome sequences of various microorganisms. Evolutionary inferences based on genome-scale analysis are believed to be more accurate than gene-based ones. However, the computational complexity of current phylogenomic procedures and the lack of reliable annotation- and alignment-free evolutionary models keep microbiologists from making wider use of these opportunities. For example, the supermatrix approach to phylogenomics requires identification of clusters of orthologous genes in the compared genomes, followed by alignment of numerous sequences and reconciliation of multiple trees inferred by traditional phylogenetic tools. In effect, the approach multiplies the problems of gene annotation and sequence alignment, not to mention the computational cost and laboriousness of the methods. In this study, we found that an alignment- and annotation-free method based on the comparison of oligonucleotide usage patterns (OUP) calculated for genome-scale DNA sequences allowed fast inference of phylogenetic trees. The resulting trees were congruent with the corresponding whole-genome supermatrix trees in terms of both topology and branch lengths. OUP phylogenomics was validated and benchmarked against trees published in the current literature and against artificially created sequences with known phylogeny. It was demonstrated that OUP diversification between taxa was driven by global adjustments of codon usage to fit fluctuating tRNA concentrations, and that these adjustments were well aligned with species evolution. A web-based program for OUP-based phylogenomics was released at http://swphylo.bi.up.ac.za/. The applicability of the tool was demonstrated for different taxa from the species to the family level. Resolution between closely related taxonomic units may be improved by providing the program with alignments of marker protein sequences, e.g. gyrA.
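To illustrate the general idea of comparing oligonucleotide usage patterns without alignment or annotation, the following is a minimal sketch only. It assumes relative k-mer frequencies (tetranucleotides by default) and a plain Euclidean distance between patterns. The function names `oligo_usage`, `pattern_distance` and `distance_matrix` are hypothetical. The distance measure is an assumption and may differ from the one implemented in the released program, and the sketch counts one strand only.

```python
from collections import Counter
from itertools import product
import math


def oligo_usage(seq, k=4):
    """Return relative frequencies of all 4**k k-mers over ACGT, in a fixed order.

    Windows containing characters other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def pattern_distance(p, q):
    """Euclidean distance between two usage patterns (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_matrix(seqs, k=4):
    """Pairwise pattern distances for a dict mapping taxon name to sequence."""
    names = sorted(seqs)
    patterns = {n: oligo_usage(seqs[n], k) for n in names}
    matrix = [
        [pattern_distance(patterns[a], patterns[b]) for b in names]
        for a in names
    ]
    return names, matrix
```

The resulting matrix could then be passed to any distance-based tree-building method, such as neighbor joining. That step is omitted here because the abstract does not specify which method is used.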