Abstract:
The Kingdom Fungi adds substantially to the diversity of life, but because of the cryptic morphology and lifestyle of fungi, their
tremendous diversity, the paucity of formally described specimens, and the difficulty of isolating environmental strains into culture,
fungal communities are difficult to characterize. This is especially true for endophytic communities of fungi living in healthy plant
tissue. Developments in next-generation sequencing technologies are, however, starting to reveal the true extent of
fungal diversity. One of the promising new technologies, namely semiconductor sequencing, has thus far not been used in
fungal diversity assessments. In this study we sequenced the internal transcribed spacer 1 (ITS1) region of the nuclear-encoded
ribosomal RNA genes of the endophytic community of the economically important tree, Eucalyptus grandis, from South Africa using the Ion
Torrent Personal Genome Machine (PGM). We determined the impact of various analysis parameters on the interpretation of
the results, namely different sequence quality parameter settings, different sequence similarity cutoffs for clustering, and
filtering of databases to remove sequences with incomplete taxonomy. Sequence similarity cutoff values had only a
marginal effect on the number of families identified, whereas different sequence quality filters had a large effect (89 vs. 48
families between the least and most stringent filters). Database filtering had a small, but statistically significant, effect on the
assignment of sequences to reference sequences. The community was dominated by Ascomycota, and particularly by
families in the Dothideomycetes that harbor well-known plant pathogens. The study demonstrates that semiconductor
sequencing is an ideal strategy for environmental sequencing of fungal communities. It also highlights some potential
pitfalls in subsequent data analyses when using a technology with relatively short read lengths.
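
To make concrete the two analysis parameters varied above, the following minimal Python sketch shows how a read quality threshold and a clustering similarity cutoff each change the number of groups recovered from the same reads. It is an illustration only and not the pipeline used in the study: the function names, the toy reads, the mean-Phred filter, and the alignment-free identity measure are all assumptions introduced here.

```python
# Toy illustration (hypothetical helpers, not the study's pipeline):
# filter reads by mean Phred quality, then cluster greedily at a similarity cutoff.

def mean_phred(quals):
    """Mean Phred quality score of one read."""
    return sum(quals) / len(quals)

def quality_filter(reads, min_mean_q):
    """Keep sequences whose mean Phred score is at least min_mean_q."""
    return [seq for seq, quals in reads if mean_phred(quals) >= min_mean_q]

def identity(a, b):
    """Fraction of matching positions over the shorter sequence.
    Assumption: no alignment is performed; real tools align reads first."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n

def greedy_cluster(seqs, cutoff):
    """Assign each sequence to the first centroid it matches at >= cutoff,
    otherwise start a new cluster with that sequence as centroid."""
    centroids, clusters = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= cutoff:
                clusters[i].append(s)
                break
        else:
            centroids.append(s)
            clusters.append([s])
    return clusters

# Made-up reads: (sequence, per-base Phred scores).
reads = [
    ("ACGTACGTAC", [30] * 10),
    ("ACGTACGTAT", [30] * 10),
    ("ACGTTCGTAT", [25] * 10),
    ("TTTTTTTTTT", [10] * 10),
]

lenient = quality_filter(reads, 20)    # drops only the lowest-quality read
stringent = quality_filter(reads, 28)  # drops two reads

for cutoff in (0.97, 0.85, 0.75):
    print(cutoff, len(greedy_cluster(lenient, cutoff)), len(greedy_cluster(stringent, cutoff)))
```

In this toy data, both parameters alter the number of clusters; which one matters more in practice depends on the data set, as the results summarized above indicate for the Eucalyptus grandis community.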