Abstract:
The woody biomass derived from tree species forms a vital part of the world’s economy, and a thorough understanding of carbon sequestration and carbohydrate metabolism in trees is paramount to ensuring efficient and sustainable use of this biomass. To date, much remains to be learnt about wood formation and polysaccharide deposition in plant cell walls. The enzymes responsible for the synthesis, degradation, and modification of polysaccharides and glycosidic bonds are known as Carbohydrate Active enZymes (CAZymes) and are organized into functional classes and families based on amino acid sequence. CAZymes in plant genomes can be analyzed at the level of their constituent functional protein domains to better understand the functional potential of the carbohydrate metabolism strategy employed by plants. The glycosyltransferase (GT) class of CAZyme domains is responsible for the synthesis of glycosidic bonds, whereas the glycoside hydrolase (GH), polysaccharide lyase (PL), and carbohydrate esterase (CE) classes degrade and modify these bonds. The final class of CAZyme domains, the non-enzymatic carbohydrate binding modules (CBMs), increases the activity of the enzymatic classes through specific binding to polysaccharides, disruption of the cell wall polysaccharide matrix, and proximity effects when appended to enzymatic CAZyme domains in complex CAZyme domain-containing proteins. In this project, we used comparative genomics and transcriptomics of CAZyme domains to analyze the functional building blocks of plant carbohydrate metabolism and gain insight into the process of wood formation, with a specific focus on the biomass feedstock crop Eucalyptus grandis.
The aim of this project was to compare CAZyme domain frequency, diversity, and complexity across plant genomes representative of the major land plant lineages and green algal species, in order to identify any distinguishing factors that contribute to wood formation in tree species. In addition, we analyzed the expression levels of CAZyme domains in the transcriptomes of source and sink tissues in E. grandis and Populus trichocarpa to better understand the expression investment in carbohydrate metabolism in different tissues of divergent tree species. The results show conservation of a fundamental functional strategy for carbohydrate metabolism across land plant evolution. The ratio of CAZyme domain class frequencies is maintained in land plants, with GTs contributing approximately 40% of the genomic CAZyme domain content, highlighting the importance of polysaccharide synthesis in plants. The diversity of CAZyme domain families within each class cannot be used to differentiate the genomes of the major land plant lineages (lycophytes and bryophytes, monocots, and dicots) from one another; however, species-specific differences in CAZyme domain family diversity are observed. Analysis of the complexity of CAZyme domain-containing proteins shows that CAZyme domains are not very promiscuous: repeated CAZyme domains within a protein are more common than unique combinations of different CAZyme domains, and these combinations are themselves conserved for the most part. The analysis of CAZyme domain expression in six tissues of E. grandis showed that in the wood-forming tissue, immature xylem, the GT domain families responsible for cellulose and hemicellulose biosynthesis accounted for the majority of the transcript abundance, a pattern not seen in the other tissues analyzed. This pattern was conserved in P. trichocarpa, highlighting the conserved mechanism of wood formation between divergent tree species.
The results of this study reveal the conservation of the fundamental functional machinery responsible for carbohydrate metabolism in land plants, and highlight the importance of differential regulation of this machinery for wood formation. The long-term goal of improving the production of lignocellulosic biomass from trees will depend on a full understanding of the regulatory mechanisms controlling the concerted expression of these CAZyme domain-containing genes.