Identifying lineage effects when controlling for population structure improves power in bacterial association studies
Earle, Sarah G.; Wu, Chieh-Hsi; Charlesworth, Jane; Stoesser, Nicole; Gordon, N. Claire; Walker, Timothy M.; Spencer, Chris C.A.; Iqbal, Zamin; Clifton, David A.; Hopkins, Katie L.; Woodford, Neil; Smith, E. Grace; Ismail, Nazir Ahmed; Llewelyn, Martin J.; Peto, Tim E.; Crook, Derrick W.; McVean, Gil; Walker, A. Sarah; Wilson, Daniel J.
Bacteria pose unique challenges for genome-wide association studies because of strong structuring into distinct strains and substantial linkage disequilibrium across the genome [1,2]. Although methods developed for human studies can correct for strain structure [3,4], this risks considerable loss of power because genetic differences between strains often contribute substantial phenotypic variability [5]. Here, we propose a new method that captures lineage-level associations even when locus-specific associations cannot be fine-mapped. We demonstrate its ability to detect genes and genetic variants underlying resistance to 17 antimicrobials in 3,144 isolates from four taxonomically diverse clonal and recombining bacteria: Mycobacterium tuberculosis, Staphylococcus aureus, Escherichia coli and Klebsiella pneumoniae. Strong selection, recombination and penetrance confer high power to recover known antimicrobial resistance mechanisms and reveal a candidate association between the outer membrane porin nmpC and cefazolin resistance in E. coli. Hence, our method pinpoints locus-specific effects where possible and boosts power by detecting lineage-level differences when fine-mapping is intractable.