Revisiting ancient polyploidy in leptosporangiate ferns

Authors

Chen, Hengchi
Fang, Yuhan
Zwaenepoel, Arthur
Huang, Sanwen
Van de Peer, Yves
Li, Zhen

Publisher

Wiley

Abstract

Ferns, and particularly homosporous ferns, have long been assumed to have experienced recurrent whole-genome duplication (WGD) events because of their substantially large genome sizes, surprisingly high chromosome numbers, and high degrees of polyploidy among many extant members. As the number of sequenced fern genomes is limited, recent studies have employed transcriptome data to find evidence for WGDs in ferns. However, they have reached conflicting results concerning the occurrence of ancient polyploidy, for instance, in the lineage of leptosporangiate ferns. Because identifying WGDs in a phylogenetic context is the foremost step in studying the contribution of ancient polyploidy to evolution, we here revisited earlier identified WGDs in leptosporangiate ferns, mainly the core leptosporangiate ferns, by building KS-age distributions and applying substitution rate corrections and by conducting statistical gene tree–species tree reconciliation analyses. Our integrative analyses not only identified four ancient WGDs in the sampled core leptosporangiate ferns but also identified false positives and false negatives for WGDs that recent studies have reported earlier. In conclusion, we underscore the significance of substitution rate corrections and uncertainties in gene tree–species tree reconciliations in calling WGD events and advance an exemplar workflow to overcome such often-overlooked issues.
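The substitution rate correction mentioned in the abstract can be illustrated with a small arithmetic sketch. It assumes a relative-rate style adjustment in which an outgroup species O is used to split the KS distance between a focal species A and a sister species B into two branch-specific parts, so that the divergence can be rescaled to the focal species' own rate. This is one common form of such a correction and is not necessarily the exact procedure used in the paper; the function names and input values are hypothetical.

```python
def branch_contributions(ks_ab, ks_ao, ks_bo):
    """Split the A-B KS distance into branch-specific parts using outgroup O.

    ks_ab: KS between focal species A and sister species B
    ks_ao: KS between A and outgroup O
    ks_bo: KS between B and outgroup O
    Returns (k_a, k_b), the KS accumulated on the A and B branches since
    their divergence, so that k_a + k_b == ks_ab.
    """
    k_a = (ks_ab + ks_ao - ks_bo) / 2.0
    k_b = (ks_ab + ks_bo - ks_ao) / 2.0
    return k_a, k_b


def adjusted_divergence_ks(ks_ab, ks_ao, ks_bo):
    """Rescale the A-B divergence to the substitution rate of species A.

    Paralog pairs within A accumulate substitutions on two copies that both
    evolve at A's rate, so the comparable divergence value is twice the
    A-branch contribution.
    """
    k_a, _ = branch_contributions(ks_ab, ks_ao, ks_bo)
    return 2.0 * k_a


# Hypothetical values for illustration only (not taken from the study):
# A evolves more slowly than B, so the raw A-B distance overstates the
# divergence on A's own KS scale.
raw_ks = 1.0
adjusted = adjusted_divergence_ks(ks_ab=raw_ks, ks_ao=1.5, ks_bo=2.0)
print(f"raw KS = {raw_ks}, adjusted to A's rate = {adjusted}")
```

Under this sketch, a WGD peak in the paralog KS distribution of species A would be compared against the adjusted value rather than the raw ortholog KS, which is why uncorrected comparisons can place a WGD on the wrong side of a speciation event when lineages differ in substitution rate.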

Description

DATA AVAILABILITY: The data that support the findings of this study are openly available as summarized in Table S1.
SUPPORTING INFORMATION:
FIG. S1 Number of genes in the transcriptome assemblies from the 1KP Initiative (2019) and Huang et al. (2020).
FIG. S2 BUSCO analysis for the transcriptome assemblies from the 1KP Initiative (2019).
FIG. S3 KS distributions for the whole paranomes in different species with the Gaussian mixture modeling analysis and the SiZer analysis.
FIG. S4 Bayesian information criterion scores in the Gaussian mixture modeling analysis for different species in Fig. S3.
FIG. S5 Analyses of Ksrates for different species.
FIG. S6 Time-calibrated species tree from TimeTree.
FIG. S7 Minimum effective sample size of tree length and the average standard deviation of split frequencies for the 1000 randomly selected gene families.
FIG. S8 KS distributions for anchor pairs identified in Azolla filiculoides, Salvinia cucullata, and Adiantum capillus-veneris.
FIG. S9 Box plots of the number of genes without tandem duplicates on scaffolds having anchor pairs with KS values < 0.1 and those having anchor pairs with KS values near a potential WGD peak in the three fern genomes.
FIG. S10 One-to-one orthologous KS-age distributions between Dipteris conjugata and species from Cyatheales, Salviniales, and Polypodiales.
FIG. S11 KS distribution for paranomes of Thyrsopteris elegans (upper) and Plagiogyria japonica (lower) within a KS range of (0, 1.0) and a binwidth of 0.05.
FIG. S12 Ratios of collinear blocks for pairwise intergenomic comparisons among the three genome-available ferns.
METHODS S1 Julia code for the Whale analyses with the critical and relaxed branch-specific DL + WGD models.
METHODS S2 Julia code for the Whale analysis of gene tree–species tree reconciliations.
TABLE S1 Taxonomy, number of genes/unigenes, and data source of fern species involved in this study.
TABLE S2 Mean, standard deviation, Monte Carlo standard error, effective sample size, and 95% uncertainty interval for parameters estimated under the critical branch-specific DL + WGD model.
TABLE S3 Mean, standard deviation, Monte Carlo standard error, effective sample size, and 95% uncertainty interval for parameters estimated under the relaxed branch-specific DL + WGD model.
TABLE S4 Mean, standard deviation, Monte Carlo standard error, effective sample size, and 95% uncertainty interval for parameters estimated under the critical branch-specific DL + WGD model for the randomly selected gene families.
TABLE S5 Mean, standard deviation, Monte Carlo standard error, effective sample size, and 95% uncertainty interval for parameters estimated under the relaxed branch-specific DL + WGD model for the randomly selected gene families.

Keywords

Ferns, Gene tree–species tree reconciliation, KS-age distribution, Phylogenomics, Polyploidy, Whole-genome duplication (WGD), SDG-15: Life on land

Sustainable Development Goals

SDG-15: Life on land

Citation

Chen, H.C., Fang, Y.H., Zwaenepoel, A. et al. 2023, 'Revisiting ancient polyploidy in leptosporangiate ferns', New Phytologist, vol. 237, no. 4, pp. 1405-1417, doi: 10.1111/nph.18607.