Abstract:
Next-generation sequencing technologies have redefined the way genome sequencing is
performed. They can produce tens of millions of short sequences (reads) in a single
experiment, at a much lower cost than was previously possible. Given the dramatic
increase in the amount of data generated, mapping (aligning) a set of reads to a
reference genome has become a challenging task. In this paper, we study a different version of
this problem: mapping these reads to a dynamically changing genomic sequence. We
propose a new practical algorithm, which employs a suitable data structure that takes
into account potential dynamic effects (replacements, insertions, deletions) on the genomic
sequence. Our experimental results demonstrate that the proposed approach can be
extended to address the problem of mapping short reads to multiple related
genomes.
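
To make the problem setting concrete, the following is a minimal sketch of exact read mapping against a sequence that changes over time. It is not the data structure proposed in this paper: the class name DynamicQGramIndex, its methods, and the choice of a hash-based q-gram index are all assumptions made for illustration. A replacement updates only the q-grams that overlap the edited position, while insertions and deletions, which shift all later positions, are handled here by a naive rebuild.

```python
from collections import defaultdict


class DynamicQGramIndex:
    """Toy q-gram index over a mutable sequence (hypothetical, illustrative only)."""

    def __init__(self, text, q):
        self.q = q
        self.text = list(text)
        self._rebuild()

    def _rebuild(self):
        # Map every q-gram to the set of positions where it starts.
        self.index = defaultdict(set)
        for i in range(len(self.text) - self.q + 1):
            self.index["".join(self.text[i:i + self.q])].add(i)

    def replace(self, pos, char):
        # Only q-grams that overlap pos change, so update those in place.
        lo = max(0, pos - self.q + 1)
        hi = min(pos, len(self.text) - self.q)
        for i in range(lo, hi + 1):
            self.index["".join(self.text[i:i + self.q])].discard(i)
        self.text[pos] = char
        for i in range(lo, hi + 1):
            self.index["".join(self.text[i:i + self.q])].add(i)

    def insert(self, pos, segment):
        # Later positions shift; this naive version rebuilds the whole index.
        self.text[pos:pos] = list(segment)
        self._rebuild()

    def delete(self, pos, length):
        # Same simplification as insert: edit the text, then rebuild.
        del self.text[pos:pos + length]
        self._rebuild()

    def map_read(self, read):
        # Exact matching: seed with the read's first q-gram, then verify.
        if len(read) < self.q:
            return []
        n = len(read)
        hits = self.index.get(read[:self.q], ())
        return sorted(i for i in hits if "".join(self.text[i:i + n]) == read)
```

For example, after building the index over "ACGTACGT" with q = 3, the read "CGT" maps to positions 1 and 5; replacing position 5 with "A" leaves only position 1. A practical method would avoid the full rebuild on insertions and deletions and would tolerate mismatches in the reads, neither of which this sketch attempts.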