Abstract:
A defining component of agroforestry parklands across Sahelo-Sudanian Africa (SSA),
the shea tree (Vitellaria paradoxa) is central to sustaining local livelihoods and the farming
environments of rural communities. Despite its economic and cultural value, however,
not to mention the ecological roles it plays as a dominant parkland species, shea
remains semi-domesticated with virtually no history of systematic genetic improvement.
In truth, shea’s extended juvenile period makes traditional breeding approaches
untenable; but the opportunity for genome-assisted breeding is immense, provided
the foundational resources are available. Here we report the development and public
release of such resources. Using the FALCON-Phase workflow, 162.6 Gb of long-read
PacBio sequence data were assembled into a 658.7 Mbp, chromosome-scale
reference genome annotated with 38,505 coding genes. Whole genome duplication
(WGD) analysis based on this gene space revealed clear signatures of two ancient
WGD events in shea’s evolutionary past, one prior to the Asterid-Rosid divergence
(116–126 Mya) and the other at the root of the order Ericales (65–90 Mya). In a
first genome-wide look at the suite of fatty acid (FA) biosynthesis genes that likely
govern stearin content, the primary determinant of shea butter quality, relatively high
copy numbers of six key enzymes were found (KASI, KASIII, FATB, FAD2, FAD3, and
FAX2), some likely originating in shea’s more recent WGD event. To help translate these
findings into practical tools for characterization, selection, and genome-wide association
studies (GWAS), resequencing data from a shea diversity panel were used to develop a
database of more than 3.5 million functionally annotated, physically anchored SNPs.
Two smaller, more curated sets of suggested SNPs, one for GWAS (104,211 SNPs) and
the other targeting FA biosynthesis genes (90 SNPs), are also presented. With these
resources, the hope is to support national programs across the shea belt in the strategic,
genome-enabled conservation and long-term improvement of the shea tree for SSA.