Abstract:
The research fields of bioinformatics and computational biology are growing rapidly in South Africa.
Bioinformatics pipelines play an integral part in handling sequencing data, which are used to investigate
the aetiology of common and rare diseases. Bioinformatics platforms for common disease aetiology are
well supported and continuously being developed in South Africa. However, the same is not true of
rare disease aetiology research. Investigations into rare diseases rely on international cloud-based tools for
data analysis and, ultimately, confirmation of a genetic disease. Yet these tools are not necessarily
optimised for ethnically diverse population groups. We present an in-house developed bioinformatics
pipeline to enable researchers to annotate and filter variants in either exome or amplicon next-generation
sequencing data. This pipeline was developed using next-generation sequencing data from a predominantly
African cohort of patients diagnosed with rare diseases.
Significance:
• We demonstrate the feasibility of in-country development of ethnicity-sensitive, automated bioinformatics
pipelines using free software in a South African context.
• We provide a roadmap for development of similarly ethnicity-sensitive bioinformatics pipelines.