Genomics insights into the global evolution and antibiotic resistance of the Mycobacterium tuberculosis complex
Publisher
University of Pretoria
Abstract
Mycobacterium tuberculosis (Mtb) recently reclaimed its status as the leading global cause of death from a single infectious agent, after COVID-19 held the top position for three years. We used Mtb whole genome sequencing (WGS) data to explore the diversity of human-adapted lineages of Mtb. From publicly available datasets, we curated and characterised a WGS dataset of more than 9000 Mtb strains sampled across the globe. Based on the distribution of single nucleotide polymorphisms, we performed lineage classification, drug resistance prediction and molecular clock estimation to characterise the global evolution of Mtb and to create a non-redundant global reference dataset. Our results suggest that public Mtb WGS datasets are highly redundant and that redundancy minimisation is required before analysing large datasets.
We next sought to explore the evolutionary dynamics that shaped the genetic landscape of Mtb on the African continent, which has been suggested as its place of origin. We demonstrate that Lineage 2 and Lineage 4 are the dominant lineages on the continent. Using Maximum Likelihood and Bayesian phylogenetic techniques, we mapped drug resistance-associated mutations onto time-resolved phylogenies. We estimated that drug resistance on the continent emerged through multiple independent events, the earliest occurring in the mid-20th century. We also identified resistance mutations associated with recently introduced drugs in isolates that were sampled before these drugs came into use. Using Bayesian skyline coalescent inference, we observed an expansion of the Mtb population in Africa over periods that coincided with increased migration from Europe and Asia into Africa. We also inferred a population expansion of Mtb at the time when HIV prevalence was at its peak on the continent.
We next sought to understand the evolutionary dynamics of Lineage 2 and Lineage 4 Mtb in the Southern African Development Community (SADC) region, the part of the continent that carries the highest burden of HIV/TB coinfection. We demonstrate that the diversity of Mtb Lineage 2 in the SADC region is under-characterised: our analysis identifies 13 sublineages of Lineage 2 in the region. To explore the origins of SADC Lineage 2 and Lineage 4, we employed two phylogeographic approaches, both of which place East Asia as the origin of Lineage 2 and Europe as the origin of Lineage 4. We also infer that the two lineages arrived through multiple introduction events, with South Africa acting as a central hub for their northward dispersal. Taken together, our phylogeographic analysis and our Bayesian skyline results suggest that migration and colonialism played a role in shaping the diversity of the two lineages in SADC.
Lastly, using mathematical models, drug susceptibility testing data and genomic data, we sought to model the epistatic dynamics that govern drug resistance in Mtb. We obtained co-dependency estimates that represent the probability of one mutation emerging after another. We then created networks and traced the trajectories from drug-susceptible status to pre-XDR-TB status.
Description
Thesis (PhD (Bioinformatics))--University of Pretoria, 2025.
Keywords
UCTD, Sustainable Development Goals (SDGs), Tuberculosis, Whole genome sequencing, Evolution, Phylogenetics, Single nucleotide polymorphism
Sustainable Development Goals
SDG-03: Good health and well-being