Abstract:
The recent sequencing of several gymnosperm genomes has greatly facilitated the study of the
evolution of their genes and gene families. Here, we examine the evidence for
expression-mediated selection in the first two fully sequenced representatives of the
gymnosperm plant clade (Picea abies and Picea glauca). We use genome-wide estimates of
gene expression (>50,000 expressed genes) to study the relationships among gene expression,
codon bias, rates of sequence divergence, protein length and gene duplication.
We found that gene expression is correlated with rates of sequence divergence and codon
bias, suggesting that natural selection is acting on Picea protein-coding genes for translational
efficiency. Gene expression, rates of sequence divergence and codon bias are correlated with
the size of gene families, with large multi-copy gene families having, on average, a lower
expression level and breadth, lower codon bias, and higher rates of sequence divergence than
single-copy gene families. Tissue-specific patterns of gene expression were more common in
large gene families with large gene expression divergence than in single-copy families.
Recent family expansions, combined with large gene expression variation among paralogs and increased
rates of sequence evolution, suggest that some Picea gene families are rapidly evolving to
cope with biotic and abiotic stress.
Our study highlights the importance of gene expression and natural selection in shaping the
evolution of protein-coding genes in Picea species and lays the groundwork for further studies
investigating the evolution of individual gene families in gymnosperms.