Evaluation of the impact of Illumina error correction tools on de novo genome assembly
Authors
Heydari, Mahdi
Miclotte, Giles
Demeester, Piet
Van de Peer, Yves
Fostier, Jan
Publisher
BioMed Central
Abstract
BACKGROUND: Recently, many standalone applications have been proposed to correct sequencing errors in Illumina
data. The key idea is that downstream analysis tools such as de novo genome assemblers benefit from a reduced error
rate in the input data. Surprisingly, a systematic validation of this assumption using state-of-the-art assembly methods
is lacking, even for recently published methods.
RESULTS: For twelve recent Illumina error correction tools (EC tools) we evaluated both their ability to correct
sequencing errors and their ability to improve de novo genome assembly in terms of contig size and accuracy.
CONCLUSIONS: We confirm that most EC tools reduce the number of errors in sequencing data without introducing
many new errors. However, we found that many EC tools suffer from poor performance in certain sequence contexts
such as regions with low coverage or regions that contain short repeated or low-complexity sequences. Reads
overlapping such regions are often ill-corrected in an inconsistent manner, leading to breakpoints in the resulting
assemblies that are not present in assemblies obtained from uncorrected data. Resolving this systematic flaw in future
EC tools could greatly improve the applicability of such tools.
Description
Additional file 1: Supplementary Data. Evaluation of the impact of
Illumina error correction tools on de novo genome assembly.
Keywords
Next-generation sequencing, Error correction, Illumina, Genome assembly, State of the art, Standalone applications, Sequencing errors, Poor performance, Analysis tools, Genes, Error correction tools (EC tools)
Citation
Heydari, M., Miclotte, G., Demeester, P., Van de Peer, Y. & Fostier, J. 2017, 'Evaluation of the impact of Illumina error correction tools on de novo genome assembly', BMC Bioinformatics, vol. 18, art. no. 374, pp. 1-13.