The aim of this study was to create a genomic resource for a typical plant genome from Illumina short reads, using Psidium guajava as a case study. Here we present a bioinformatics approach to produce a de novo plant genome assembly, annotate it, and compare the newly assembled and annotated genome with a reference genome, in this case that of Eucalyptus grandis. The assembly pipeline combined the best results from four different assemblers (ABySS, ALLPATHS-LG, SGA and MaSuRCA), run on a combination of Illumina paired-end and mate-pair reads. Each assembler uses a different graph-based approach in its assembly strategy, and the outputs of these assemblers were merged with Metassembler to produce a single best assembly. In this way we created a comprehensive genomic resource for the guava fruit tree from Illumina short reads. The annotated genome of Psidium guajava will serve as a major genomic resource for investigating the interaction between the plant and pathogens such as Nalanthamala psidii (N. psidii). In addition, our comparative genomics work is a starting point for learning more about the genetic diversity in the Myrtaceae family.
In Chapter 1, a comprehensive literature review of the current state of sequencing technologies, with a focus on third-generation sequencing technologies, is presented. This is followed by a discussion of different whole-genome assembly approaches and techniques, with an example of a software package that implements each type of approach. The relevance and importance of the non-model organism that we used as a case study, Psidium guajava, are also discussed in Chapter 1.
In Chapter 2, the genome assembly and annotation pipelines and processes are discussed, and the detailed materials and methods used in this study are provided.
The main findings and results of the study are discussed in Chapter 3, and concluding remarks are presented in the final chapter of this dissertation.
The work described here has been presented at the following conferences, and a manuscript on the genome resource is in preparation:
1. Poster Presentation – SAGS/SASBI conference 2016 (Durban)