Abstract:
Gene and genome duplications are major evolutionary forces that shape the diversity and complexity of life. However, different duplication modes have distinct impacts on gene function, expression, and regulation. Existing tools for identifying and classifying duplicated genes are either outdated or not user-friendly. Here, we present doubletrouble, an R/Bioconductor package that provides a comprehensive and robust framework for analyzing duplicated genes from genomic data. doubletrouble can detect and classify gene pairs as derived from six duplication modes (segmental, tandem, proximal, retrotransposon-derived, DNA transposon-derived, and dispersed duplications), calculate substitution rates, detect signatures of putative whole-genome duplication events, and visualize results as publication-ready figures. We applied doubletrouble to classify the duplicated gene repertoire in 822 eukaryotic genomes, and results were made available through a user-friendly web interface.