Selection of marker genes for genetic barcoding of microorganisms and binning of metagenomic reads by Barcoder software tools

dc.contributor.author: Rotimi, A.M. (Adeola)
dc.contributor.author: Pierneef, Rian Ewald
dc.contributor.author: Reva, Oleg N.
dc.contributor.email: oleg.reva@up.ac.za [en_ZA]
dc.date.accessioned: 2018-11-29T06:24:46Z
dc.date.available: 2018-11-29T06:24:46Z
dc.date.issued: 2018-08-30
dc.description.abstract: BACKGROUND: Metagenomic approaches have revealed the complexity of environmental microbiomes, and advances in whole genome sequencing have shown a significant level of genetic heterogeneity at the species level. It has become apparent that studying patterns of superior bioactivity of bacteria applicable in biotechnology, as well as the enhanced virulence of pathogens, often requires distinguishing between closely related species or sub-species. Current methods for binning of metagenomic reads usually do not allow for identification below the genus level and generally stop at the family level. RESULTS: In this work, an attempt was made to improve metagenomic binning resolution by creating genome-specific barcodes based on the core and accessory genomes. This protocol was implemented in novel software tools available for use and download from http://bargene.bi.up.ac.za/. The most abundant barcode genes from the core genomes were found to encode ribosomal proteins, certain central metabolic genes and ABC transporters. Performance of metabarcode sequences created by this package was evaluated using artificially generated and publicly available metagenomic datasets. Furthermore, a program (Barcoder 2.0) was developed to align reads against barcode sequences and thereafter calculate various parameters to score the alignments and the individual barcodes. Taxonomic units were identified in metagenomic samples by comparing the calculated barcode scores to set cut-off values. In this study, it was found that varying the sample size (i.e. the number of reads in a metagenome) and the metabarcode length had no significant effect on the sensitivity and specificity of the algorithm. Receiver operating characteristic (ROC) curves were calculated for different taxonomic groups based on the results of identification of the corresponding genomes in artificial metagenomic datasets. The reliability of distinguishing between species of the same genus or family by the program was nearly perfect. CONCLUSIONS: The results showed that the novel online tool BarcodeGenerator (http://bargene.bi.up.ac.za/) is an efficient approach for generating barcode sequences from a set of complete genomes provided by users. Another program, Barcoder 2.0, is available from the same resource to enable an efficient and practical use of metabarcodes for visualization of the distribution of organisms of interest in environmental and clinical samples. [en_ZA]
dc.description.department: Biochemistry [en_ZA]
dc.description.librarian: am2018 [en_ZA]
dc.description.sponsorship: The South African National Research Foundation (NRF) grant #93664. [en_ZA]
dc.description.uri: https://bmcbioinformatics.biomedcentral.com [en_ZA]
dc.identifier.citation: Rotimi, A.M., Pierneef, R. & Reva, O.N. 2018, 'Selection of marker genes for genetic barcoding of microorganisms and binning of metagenomic reads by Barcoder software tools', BMC Bioinformatics, vol. 19, art. 309, pp. 1-11. [en_ZA]
dc.identifier.issn: 1471-2105 (online)
dc.identifier.other: 10.1186/s12859-018-2320-1
dc.identifier.uri: http://hdl.handle.net/2263/67400
dc.language.iso: en [en_ZA]
dc.publisher: BioMed Central [en_ZA]
dc.rights: © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License. [en_ZA]
dc.subject: Metabarcoding [en_ZA]
dc.subject: Metagenome [en_ZA]
dc.subject: Bacterial genome [en_ZA]
dc.subject: Software tool [en_ZA]
dc.subject: Receiver operating characteristics (ROC) [en_ZA]
dc.subject: Bacteria [en_ZA]
dc.subject: Barcode [en_ZA]
dc.subject: Next-generation sequencing (NGS) [en_ZA]
dc.subject: Computer aided software engineering [en_ZA]
dc.subject: Computer software [en_ZA]
dc.subject: Bacterial genomes [en_ZA]
dc.subject: Genetic heterogeneities [en_ZA]
dc.subject: Ribosomal proteins [en_ZA]
dc.subject: Sensitivity [en_ZA]
dc.subject: Specificity [en_ZA]
dc.subject: Genes [en_ZA]
dc.subject: Whole genome sequencing (WGS) [en_ZA]
dc.title: Selection of marker genes for genetic barcoding of microorganisms and binning of metagenomic reads by Barcoder software tools [en_ZA]
dc.type: Article [en_ZA]

Files

Original bundle

Name: Rotimi_Selection_2018.pdf
Size: 1.84 MB
Format: Adobe Portable Document Format
Description: Article

License bundle

Name: license.txt
Size: 1.75 KB
Format: Item-specific license agreed upon to submission
Description: