Abstract:
Modern phylogenetic studies may benefit from the analysis of complete genome sequences of various microorganisms.
Evolutionary inferences based on genome-scale analysis are believed to be more accurate than the gene-based alternative. However, the
computational complexity of current phylogenomic procedures, the unsuitability of standard phylogenetic tools for processing genome-wide data,
and the lack of reliable substitution models that correlate with alignment-free phylogenomic approaches deter microbiologists from using these
opportunities. For example, the super-matrix and super-tree approaches of phylogenomics use multiple integrated genomic loci or individual
gene-based trees to infer an overall consensus tree. However, these approaches potentially multiply errors of gene annotation and sequence
alignment, not to mention the computational complexity and laboriousness of the methods. In this article, we demonstrate that the annotation- and
alignment-free comparison of genome-wide tetranucleotide frequencies, termed oligonucleotide usage patterns (OUPs), allows fast and
reliable inference of phylogenetic trees. These were congruent with the corresponding whole-genome super-matrix trees in terms of tree topology
when compared with other known approaches, including 16S ribosomal RNA and GyrA protein sequence comparison, complete genome-based
MAUVE, and CVTree methods. A Web-based program to perform the alignment-free OUP-based phylogenomic inferences was implemented at
http://swphylo.bi.up.ac.za/. The applicability of the tool was tested on different taxa from subspecies to intergeneric levels. Discrimination between
closely related taxonomic units may be improved by providing the program with alignments of marker protein sequences, e.g., GyrA.
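
The alignment-free comparison of genome-wide tetranucleotide frequencies described above can be illustrated with a minimal Python sketch. It is not the implementation behind the Web-based program: the function names (`tetranucleotide_frequencies`, `profile_distance`, `distance_matrix`) are hypothetical, and the use of raw relative frequencies with a Euclidean distance is an assumption made for simplicity. The published OUP method may normalize and compare patterns differently.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"
K = 4  # word length: tetranucleotides
# All 4**4 = 256 possible words, in a fixed order shared by every profile.
KMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]


def tetranucleotide_frequencies(seq):
    """Return relative frequencies of overlapping 4-mers in KMERS order.

    Words containing characters other than A, C, G, T (e.g. N) are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(seq) - K + 1):
        word = seq[i:i + K]
        if word in counts:
            counts[word] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no valid tetranucleotides")
    return [counts[w] / total for w in KMERS]


def profile_distance(profile_a, profile_b):
    """Euclidean distance between two frequency profiles (an assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def distance_matrix(genomes):
    """Build a pairwise distance matrix from a {name: sequence} mapping.

    Returns the list of names and a square matrix in the same order.
    """
    names = list(genomes)
    profiles = [tetranucleotide_frequencies(genomes[n]) for n in names]
    matrix = [
        [profile_distance(pa, pb) for pb in profiles]
        for pa in profiles
    ]
    return names, matrix
```

A matrix produced this way could be passed to any distance-based tree-building method, such as neighbor joining. No annotation or alignment step is needed because each genome is reduced to a fixed-length vector of 256 values.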