Jabba : hybrid error correction for long sequencing reads

dc.contributor.author Miclotte, Giles
dc.contributor.author Heydari, Mahdi
dc.contributor.author Demeester, Piet
dc.contributor.author Rombauts, Stephane
dc.contributor.author Van de Peer, Yves
dc.contributor.author Audenaert, Pieter
dc.contributor.author Fostier, Jan
dc.date.accessioned 2016-08-15T11:18:35Z
dc.date.available 2016-08-15T11:18:35Z
dc.date.issued 2016-05-03
dc.description.abstract BACKGROUND : Third generation sequencing platforms produce longer reads with higher error rates than second generation technologies. While the improved read length can provide useful information for downstream analysis, underlying algorithms are challenged by the high error rate. Error correction methods in which accurate short reads are used to correct noisy long reads appear to be attractive to generate high-quality long reads. Methods that align short reads to long reads do not optimally use the information contained in the second generation data, and suffer from large runtimes. Recently, a new hybrid error correcting method has been proposed, where the second generation data is first assembled into a de Bruijn graph, on which the long reads are then aligned. RESULTS : In this context we present Jabba, a hybrid method to correct long third generation reads by mapping them on a corrected de Bruijn graph that was constructed from second generation data. Unique to our method is the use of a pseudo alignment approach with a seed-and-extend methodology, using maximal exact matches (MEMs) as seeds. In addition to benchmark results, certain theoretical results concerning the possibilities and limitations of the use of MEMs in the context of third generation reads are presented. CONCLUSION : Jabba produces highly reliable corrected reads: almost all corrected reads align to the reference, and these alignments have a very high identity. Many of the aligned reads are error-free. Additionally, Jabba corrects reads using a very low amount of CPU time. From this we conclude that pseudo alignment with MEMs is a fast and reliable method to map long highly erroneous sequences on a de Bruijn graph. en_ZA
dc.description.department Genetics en_ZA
dc.description.librarian am2016 en_ZA
dc.description.sponsorship The Research Foundation - Flanders (FWO) (G0C3914N) en_ZA
dc.description.uri http://almob.biomedcentral.com en_ZA
dc.identifier.citation Miclotte, G, Heydari, M, Demeester, P, Rombauts, S, Van de Peer, Y, Audenaert, P & Fostier, J 2016, 'Jabba : hybrid error correction for long sequencing reads', Algorithms for Molecular Biology, vol. 11, art. #10, pp. 1-12. en_ZA
dc.identifier.issn 1748-7188
dc.identifier.other 10.1186/s13015-016-0075-7
dc.identifier.uri http://hdl.handle.net/2263/56296
dc.language.iso en en_ZA
dc.publisher BioMed Central en_ZA
dc.rights © 2016 Miclotte et al. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License. en_ZA
dc.subject Sequence analysis en_ZA
dc.subject Error correction en_ZA
dc.subject De Bruijn graph en_ZA
dc.subject Maximal exact matches en_ZA
dc.title Jabba : hybrid error correction for long sequencing reads en_ZA
dc.type Article en_ZA

