Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome


dc.contributor.author Visser, Erik A.
dc.contributor.author Wegrzyn, Jill L.
dc.contributor.author Steenkamp, Emma Theodora
dc.contributor.author Myburg, Alexander Andrew
dc.contributor.author Naidoo, Sanushka
dc.date.accessioned 2016-02-25T06:00:20Z
dc.date.available 2016-02-25T06:00:20Z
dc.date.issued 2015-12-12
dc.description Additional file 1: Table S1. EvidentialGene tr2aacds pipeline output summary. en_ZA
dc.description Additional file 2: Table S2. Assembly statistics for EvidentialGene tr2aacds pipeline merged assembly compared to average statistics for each assembler. en_ZA
dc.description Additional file 3: Table S3. Predicted species distribution for non-pine origin sequences removed from the Pinus patula v1.0 transcriptome. en_ZA
dc.description Additional file 4: Figure S1. Molecular function gene ontology distribution for the Pinus patula v1.0 transcriptome. en_ZA
dc.description Additional file 5: Table S4. Tribe-MCL gene families and annotations for all 15 species used. en_ZA
dc.description Additional file 6: Table S5. Conditional reciprocal best BLAST alignment results between full-length Sanger sequenced Pinus taeda cDNA and representative Pinus patula transcripts for each cDNA. en_ZA
dc.description Additional file 7: Figure S2. Summary statistics for alignment of Pinus taeda complete CDS sequences to assembled Pinus patula transcripts. Pita = P. taeda. The x-axis represents the query P. taeda cDNA sequence. The solid y-axis (left) illustrates: cDNA query sequence length (pink circle), P. patula subject sequence length (blue square), conditional reciprocal best BLAST alignment length (gold triangle). The dashed y-axis (right) depicts the: percentage identity between sequences (black line), percentage coverage of the P. taeda cDNA by the corresponding P. patula transcript (green cross) and vice versa (purple plus). en_ZA
dc.description Additional file 8: Table S6. EBSeq differential expression analysis results comparing expression between inoculated and mock-inoculated data. en_ZA
dc.description Additional file 9: Table S7. Summarized list of differentially expressed genes between inoculated and mock-inoculated data with annotations. en_ZA
dc.description.abstract BACKGROUND : Pines are the most important tree species to the international forestry industry, covering 42 % of the global industrial forest plantation area. One of the most pressing threats to cultivation of some pine species is the pitch canker fungus, Fusarium circinatum, which can have devastating effects in both the field and nursery. Investigation of the Pinus-F. circinatum host-pathogen interaction is crucial for development of effective disease management strategies. As with many non-model organisms, investigation of host-pathogen interactions in pine species is hampered by limited genomic resources. This was partially alleviated through release of the 22 Gbp Pinus taeda v1.01 genome sequence (http://pinegenome.org/pinerefseq/) in 2014. Despite the fact that the fragmented state of the genome may hamper comprehensive transcriptome analysis, it is possible to leverage the inherent redundancy resulting from deep RNA sequencing with Illumina short reads to assemble transcripts in the absence of a completed reference sequence. These data can then be integrated with available genomic data to produce a comprehensive transcriptome resource. The aim of this study was to provide a foundation for gene expression analysis of disease response mechanisms in Pinus patula through transcriptome assembly. RESULTS : Eighteen de novo and two reference based assemblies were produced for P. patula shoot tissue. For this purpose three transcriptome assemblers, Trinity, Velvet/OASES and SOAPdenovo-Trans, were used to maximise diversity and completeness of assembled transcripts. Redundancy in the assembly was reduced using the EvidentialGene pipeline. The resulting 52 Mb P. patula v1.0 shoot transcriptome consists of 52 112 unigenes, 60 % of which could be functionally annotated. CONCLUSIONS : The assembled transcriptome will serve as a major genomic resource for future investigation of P. patula and represents the largest gene catalogue produced to date for this species. Furthermore, this assembly can help detect gene-based genetic markers for P. patula and the comparative assembly workflow could be applied to generate similar resources for other non-model species. en_ZA
dc.description.librarian am2015 en_ZA
dc.description.sponsorship Forestry South Africa (for seed funding), the Genomics Research Institute (GRI) and the National Research Foundation’s (NRF) Bioinformatics and Functional Genomics Programme (NBFG, UID:71255) as well as Innovation, Thuthuka and THRIP grants (Grant numbers: 84951, 86936, 87912). en_ZA
dc.description.uri http://www.biomedcentral.com/bmcgenomics en_ZA
dc.identifier.citation Visser, EA, Wegrzyn, JL, Steenkamp, ET, Myburg, AA & Naidoo, S 2015, 'Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome', BMC Genomics, vol. 16, art. 1057, pp. 1-13. en_ZA
dc.identifier.issn 1471-2164
dc.identifier.other 10.1186/s12864-015-2277-7
dc.identifier.uri http://hdl.handle.net/2263/51538
dc.language.iso en en_ZA
dc.publisher BioMed Central en_ZA
dc.rights © 2015 Visser et al. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License. en_ZA
dc.subject Pinus patula en_ZA
dc.subject De novo transcriptome assembly en_ZA
dc.subject Genome guided transcriptome assembly en_ZA
dc.subject RNA-seq en_ZA
dc.title Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome en_ZA
dc.type Article en_ZA

