Abstract:
Contamination in sequenced genomes is a relatively common problem, and several methods to remove non-target sequences have been devised. Typically, the target and contaminating organisms reside in different kingdoms, which simplifies their separation. The authors present the case of a genome for the ascomycete fungus Teratosphaeria eucalypti, contaminated by another ascomycete fungus and a bacterium. Approaching the problem as a low-complexity metagenomics project, the authors used two available software programs, BlobToolKit and anvi'o, to filter the contaminated genome. Both the de novo and reference-assisted approaches yielded a high-quality draft genome assembly for the target fungus. Incorporating reference sequences increased assembly completeness, and visualization elucidated previously unknown genome features. The authors suggest that visualization should be routine in any sequencing project, regardless of suspected contamination.
METHOD SUMMARY: Complementary use of the BlobToolKit and anvi'o programs made it possible to resolve DNA sequences originating from closely related organisms. The authors applied de novo and reference-assisted filtering of contaminated raw genomic reads and visualized the filtering process to distinguish between the genomic sequences of two ascomycetous fungi and a bacterium.
DATA DEPOSITION: The genome of Teratosphaeria eucalypti (isolate CMW54005) has been deposited in the National Center for Biotechnology Information (NCBI) genome repository under the accession number JAIZZA000000000. The reference-filtered assembly of CMW55930 has been submitted as a metagenome-assembled genome under the accession number JAJADS000000000.