First draft genome assembly of the Argane tree (Argania spinosa) [version 2; peer review: 2 approved]
Khayi, Slimane; Azza, Nour Elhouda; Gaboun, Fatima; Pirro, Stacy; Badad, Oussama; Lightfoot, David A.; Unver, Turgay; Chaouni, Bouchra; Merrouch, Redouane; Rahim, Bouchra; Essayeh, Soumaya; Ganoudi, Matika; Abdelwahd, Rabha; Diria, Ghizlane; Mdarhi, Meriem Alaoui; Labhilili, Mustapha; Iraqi, Driss; Claros, M Gonzalo; Mouhaddab, Jamila; Sedrati, Hayat; Memari, Majid; Hamamouch, Noureddine; De Dios Alche, Juan; Alche, Juan de Dios; Boukhatem, Noureddine; Mrabet, Rachid; Dahan, Rachid; Legssyer, Adelkhaleq; Khalfaoui, Mohamed; Badraoui, Mohamed; Van de Peer, Yves; Tatusova, Tatiana; El Mousadik, Abdelhamid; Mentag, Rachid; Ghazal, Hassan
Date:
2020
Abstract:
BACKGROUND: The Argane tree (Argania spinosa L. Skeels) is an endemic tree of mid-western Morocco that plays an important socioeconomic and ecological role for a dense human population in an arid zone. Several studies have confirmed the importance of this species as a food and feed source and as a resource for both pharmaceutical and cosmetic compounds. Unfortunately, the argane tree ecosystem is facing significant threats from environmental changes (global warming, over-population) and over-exploitation. However, limited research has been conducted on argane tree genetics and genomics, which hinders its conservation and genetic improvement. METHODS: Here, we present a draft genome assembly of A. spinosa. A reliable reference genome of A. spinosa was created using a hybrid de novo assembly approach combining short and long sequencing reads. RESULTS: In total, 144 Gb of Illumina HiSeq reads and 7.6 Gb of PacBio reads were produced and assembled. The final draft genome comprises 75 327 scaffolds totaling 671 Mb with an N50 of 49 916 kb. The draft assembly is close to the genome size estimated by k-mer distribution and covers 89% of complete and 4.3% of partial Arabidopsis orthologous groups in BUSCO. CONCLUSION: The A. spinosa genome will be useful for assessing biodiversity, leading to efficient conservation of this endangered endemic tree. Furthermore, the genome may enable genome-assisted cultivar breeding and provide a better understanding of important metabolic pathways and their underlying genes for both cosmetic and pharmacological purposes.
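The N50 quoted in the abstract is a contiguity statistic: the scaffold length L such that scaffolds of length L or longer together contain at least half of the assembled bases. The sketch below illustrates that definition on made-up toy lengths, not on the A. spinosa scaffolds; the function name `n50` is hypothetical and is not taken from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together account for at least half of the total bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (illustrative values only, arbitrary units).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

On real data, the lengths would typically be read from the scaffold FASTA file of the assembly before calling the function.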
Description:
DATA AVAILABILITY: All of the A. spinosa datasets can be retrieved under BioProject accession number PRJNA294096: http://identifiers.org/bioproject:PRJNA294096. The raw reads are available at the NCBI Sequence Read Archive under accession number SRP077839: http://identifiers.org/insdc.sra:SRP077839. The complete genome sequence assembly project has been deposited at GenBank under accession number QLOD00000000: http://identifiers.org/ncbigi/GI:1408199612. Data can also be retrieved via the International Argane Genome Consortium (IAGC) website: http://www.arganome.org.