Abstract:
Indigenous breeds such as the South African (SA) Drakensberger are economically important genetic resources in local beef production because of their adaptive traits and ability to perform competitively at a commercial level. Genomic selection (GS) is a promising technology to accelerate genetic progress in traits relevant to commercial beef production. A major obstacle in applying this methodology has been the cost of genotyping at high densities of single nucleotide polymorphisms (SNPs). Costs can be reduced by incorporating genotype imputation into GS workflows, that is, by genotyping at lower densities and imputing upwards. The overarching aim of this study was to investigate the practicality of applying imputation in such a workflow, using genotypic data for 1 135 SA Drakensberger animals genotyped for 139 480 SNPs. As a pre-imputation step, the first objective was to elucidate inter- and intra-chromosomal patterns in genomic characteristics that may contribute to variability in achievable imputation accuracy across the genome. The estimated proportion of low minor allele frequency (MAF) SNPs differed among chromosomes, from 6.6% for Bos taurus autosome (BTA) 23 to 16.0% for BTA14. Pairwise linkage disequilibrium (LD) between adjacent SNPs ranged from r²=0.11 (BTA28) to r²=0.17 (BTA14). The largest run of homozygosity (ROH), located on BTA13, was 225.82 kilobases (kb) in length and was identified in 23% of the animals sampled. The estimated ROH-based inbreeding coefficients (FROH) (e.g. FROH>1Mb=0.07, where FROH>1Mb denotes FROH calculated for all ROH longer than 1 megabase pair) indicated sufficient within-breed relatedness to achieve accurate imputation. During the imputation step, the imputation accuracies achieved with several custom-derived lower-density panels, varying in SNP density and SNP selection strategy, were compared.
Imputation accuracy increased as SNP density increased; a genotyping panel consisting of 10 000 SNPs, selected based on a combination of their MAF and LD with neighbouring SNPs, could be used to achieve <3% imputation error on average. At this SNP density, a mean correlation coefficient (±standard deviation) between true and imputed SNPs of 0.972±0.024 was achieved in a set of validation animals (n=235). Low-MAF SNPs were imputed less accurately; a difference of 0.071 units was observed between the mean accuracy of imputed SNPs categorized into the low-MAF (0.01<MAF≤0.1) and high-MAF (0.4<MAF<0.5) classes. Post-imputation, the utility of imputed genotypes in genomic estimated breeding value (GEBV) estimation was evaluated by comparing the prediction accuracies achieved when true versus imputed SNPs were used to generate the H-inverse matrix applied in single-step GS. Breeding values were estimated for two growth traits, considering direct and maternal components. Prediction accuracies improved when genomic information was used in addition to traditional pedigree information; the largest improvement (a 0.026 unit increase in accuracy) was observed for maternal birth weight. Only marginal differences were observed between GEBV accuracies produced from true (GEBV_TRUE) versus imputed (GEBV_IMPUTED) genotypes; for example, for direct birth weight the mean accuracy (±standard deviation) was 0.774±0.056 for GEBV_TRUE versus 0.773±0.055 for GEBV_IMPUTED, suggesting that imputation errors had an almost negligible influence. The results presented in this thesis demonstrate that imputation is a viable strategy for the low-cost implementation of genomically enhanced prediction of estimated breeding values (EBVs) for a breed such as the SA Drakensberger.
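
To make the accuracy measures reported above concrete, the following is a minimal sketch of how a per-SNP MAF, an imputation error rate and a true-versus-imputed correlation could be computed. It is an illustration only and not the pipeline used in the thesis: the 0/1/2 genotype coding, the function names and the toy genotypes are assumptions, and the abstract does not state whether correlations were computed per SNP or per animal (the sketch computes one SNP across animals).

```python
from math import sqrt


def minor_allele_frequency(genotypes):
    """MAF of one SNP; genotypes are assumed coded 0/1/2 (allele copies)."""
    p = sum(genotypes) / (2 * len(genotypes))
    return min(p, 1 - p)


def imputation_error_rate(true, imputed):
    """Proportion of genotypes that differ between true and imputed calls."""
    mismatches = sum(1 for t, i in zip(true, imputed) if t != i)
    return mismatches / len(true)


def pearson_correlation(x, y):
    """Pearson correlation; returns NaN if either input has no variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: one SNP observed in eight validation animals.
true_snp = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_snp = [0, 1, 2, 1, 0, 1, 1, 1]  # one genotype imputed incorrectly

maf = minor_allele_frequency(true_snp)
error_rate = imputation_error_rate(true_snp, imputed_snp)
correlation = pearson_correlation(true_snp, imputed_snp)
print(f"MAF={maf:.3f}, error rate={error_rate:.3f}, r={correlation:.3f}")
```

The correlation is undefined for a SNP with no variation in the true or imputed genotypes, which is one reason the two measures can rank low-MAF SNPs differently.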