Abstract:
BACKGROUND: Malania oleifera, a member of the Olacaceae family, is an IUCN Red Listed tree endemic to the
Karst region of southwest China. This tree’s seed is valued for its high content of valuable fatty acids (especially nervonic
acid). However, studies on its genetic makeup and fatty acid biogenesis are severely hampered by a lack of molecular and
genetic tools. FINDINGS: We generated 51 Gb and 135 Gb of raw DNA sequences, using Pacific Biosciences (PacBio)
single-molecule real-time and 10× Genomics sequencing, respectively. A final genome assembly, with a scaffold N50 size of
4.65 Mb and a total length of 1.51 Gb, was obtained by primary assembly based on PacBio long reads plus scaffolding with
10× Genomics reads. Identified repeats constituted ∼82% of the genome, and 24,064 protein-coding genes were predicted
with high support. The genome has low heterozygosity and shows no evidence of a recent whole-genome duplication.
Metabolic pathway genes related to the accumulation of long-chain fatty acids were identified and studied in detail.
CONCLUSIONS: Here, we provide the first genome assembly and gene annotation for M. oleifera. The availability of these
resources will be of great importance for conservation biology and for the functional genomics of nervonic acid
biosynthesis.
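
For readers unfamiliar with the scaffold N50 statistic reported above, the following is a minimal sketch of how N50 is conventionally computed: it is the length of the scaffold at which the cumulative length, counted from the longest scaffold downward, first reaches half of the total assembly length. The function name `n50` and the toy lengths are hypothetical illustrations, not data or code from this study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    if not ordered:
        raise ValueError("at least one scaffold length is required")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length

    # Unreachable for non-empty input with positive lengths.
    raise ValueError("scaffold lengths must be positive")


# Toy example with hypothetical lengths (not from the M. oleifera assembly).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

A larger N50 indicates that more of the assembly is contained in long scaffolds, which is why it is commonly reported alongside total assembly length.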