Haplogenome assembly reveals structural variation in Eucalyptus interspecific hybrids

dc.contributor.author Lötter, Anneri
dc.contributor.author Duong, Tuan A.
dc.contributor.author Candotti, Julia
dc.contributor.author Mizrachi, Eshchar
dc.contributor.author Wegrzyn, Jill
dc.contributor.author Myburg, Alexander A.
dc.date.accessioned 2024-07-01T10:56:27Z
dc.date.available 2024-07-01T10:56:27Z
dc.date.issued 2023-08
dc.description DATA AVAILABILITY : Illumina DNA sequencing data were uploaded to the NCBI SRA under BioProject PRJNA885070. High-density genetic linkage maps are available on GitLab [78]. The haplogenome assemblies were uploaded to the NCBI database and can be accessed with accession nos. JAOPUP000000000 and JAOPUO000000000. All supporting data, such as repeat element libraries, genome annotation files and synteny analysis output files, are available in GigaDB [79]. en_US
dc.description ADDITIONAL FILES :
SUPPLEMENTARY FIGURE S1: Genome size estimates for (A) the E. urophylla, (B) the E. grandis and (C) the E. urophylla x E. grandis F1 hybrid genomes.
SUPPLEMENTARY FIGURE S2: Benchmarking Universal Single-Copy Orthologs (BUSCO) completeness scores for both haplogenome assemblies as well as the currently available E. grandis v2.0 reference genome.
SUPPLEMENTARY FIGURE S3: Alignment of placed haplogenome scaffolds to the E. grandis v2.0 reference genome.
SUPPLEMENTARY FIGURE S4: Alignment between the E. grandis and E. urophylla scaffolded haplogenome assemblies.
SUPPLEMENTARY FIGURE S5: Pseudochromosomes of the E. urophylla haplogenome, reconstructed from two genetic linkage input maps (uro.allmap and gra.allmap) with unequal weights (2 and 1, respectively).
SUPPLEMENTARY FIGURE S6: Pseudochromosomes of the E. grandis haplogenome, reconstructed from two genetic linkage input maps (gra.allmap and uro.allmap) with unequal weights (2 and 1, respectively).
SUPPLEMENTARY FIGURE S7: Scaffolded chromosome sizes of the E. grandis v2.0 and the scaffolded E. grandis and E. urophylla haplogenome assemblies.
SUPPLEMENTARY FIGURE S8: Alignment of unplaced E. grandis and E. urophylla haplogenome scaffolds to the E. grandis v2.0 reference genome.
SUPPLEMENTARY FIGURE S9: Syntenic and rearranged regions between the E. grandis v2.0, E. grandis and E. urophylla haplogenomes for all eleven chromosomes.
SUPPLEMENTARY FIGURE S10: Enriched gene ontology (GO) terms for inverted and translocated gene alignment blocks of the E. grandis haplogenome.
SUPPLEMENTARY FIGURE S11: Enriched gene ontology (GO) terms for inverted and translocated gene alignment blocks of the E. urophylla haplogenome.
SUPPLEMENTARY FIGURE S12: Enriched gene ontology (GO) terms for genes that did not have a pairwise alignment between the E. grandis and E. urophylla haplogenomes.
SUPPLEMENTARY FIGURE S13: Hap-mer blob plot of the E. grandis and E. urophylla haplogenome assemblies.
SUPPLEMENTARY FIGURE S14: Evaluation of haplotype phase blocks. All hap-mer information was generated with Merqury v1.1 [72].
SUPPLEMENTARY FIGURE S15: Genome coverage of the E. grandis v2.0 nuclear reference and plastid genomes.
SUPPLEMENTARY FIGURE S16: Summary of the total size and type of elements found in high genome coverage bins. Organellar introgression was identified through BLAST analysis to the E. grandis plastid genomes [77], while repeat elements were identified with RepeatMasker.
SUPPLEMENTARY NOTE 1: Hap-mer-based phasing completeness assessment.
SUPPLEMENTARY NOTE 2: Read and assembly alignment and validation of high peak content.
SUPPLEMENTARY TABLE S1: Illumina sequencing results.
SUPPLEMENTARY TABLE S2: Nanopore sequencing results for the F1 hybrid individual.
SUPPLEMENTARY TABLE S3: Summary statistics for long-read binning using the parental short reads.
SUPPLEMENTARY TABLE S4: Summary statistics of placed and unplaced contigs after scaffolding with ALLMAPS for the E. urophylla and E. grandis haplogenomes, respectively.
SUPPLEMENTARY TABLE S5: Repeat element content of assembled haplogenomes.
SUPPLEMENTARY TABLE S6: Haplogenome annotation statistics.
SUPPLEMENTARY TABLE S7: Number and total length of syntenic and rearranged regions in the E. grandis and E. urophylla haplogenomes.
SUPPLEMENTARY TABLE S8: Number and total length of local sequence variation in syntenic and rearranged regions in the E. grandis and E. urophylla haplogenomes.
SUPPLEMENTARY TABLE S9: Inversions larger than 50 kb between the E. grandis and E. urophylla haplogenomes.
SUPPLEMENTARY TABLE S10: Translocations larger than 50 kb between the E. grandis and E. urophylla haplogenomes.
SUPPLEMENTARY TABLE S11: KEGG pathway enrichment analyses for genes within inverted and translocated gene alignment blocks between the E. grandis and E. urophylla haplogenome assemblies.
SUPPLEMENTARY TABLE S12: KEGG pathway enrichment analyses for genes that do not have a pairwise alignment between the E. grandis (reference) and E. urophylla (test) haplogenome assemblies.
SUPPLEMENTARY TABLE S13: Altered position and length of genes with an in-frame stop codon.
SUPPLEMENTARY TABLE S14: Phase block statistics of the E. grandis and E. urophylla haplogenome assemblies.
SUPPLEMENTARY TABLE S15: E. grandis and E. urophylla high coverage bin content. en_US
dc.description.abstract BACKGROUND : De novo phased (haplo)genome assembly using long-read DNA sequencing data has improved the detection and characterization of structural variants (SVs) in plant and animal genomes. Because long reads can span haplotype-specific sequence, they allow phased haplogenome assembly in highly outbred organisms such as forest trees. Eucalyptus tree species and interspecific hybrids are the most widely planted hardwood trees, with F1 hybrids of Eucalyptus grandis and E. urophylla forming the bulk of fast-growing pulpwood plantations in subtropical regions. The extent of structural variation and its effect on interspecific hybridization is unknown in these trees. As a first step towards elucidating the extent of structural variation between the genomes of E. grandis and E. urophylla, we sequenced and assembled the haplogenomes contained in an F1 hybrid of the two species. FINDINGS : Using Nanopore sequencing and a trio-binning approach, we assembled the separate haplogenomes (566.7 Mb and 544.5 Mb) to 98.0% BUSCO completion. High-density SNP genetic linkage maps of both parents allowed scaffolding of 88.0% of the haplogenome contigs into 11 pseudo-chromosomes (scaffold N50 of 43.8 Mb and 42.5 Mb for the E. grandis and E. urophylla haplogenomes, respectively). We identified 48,729 SVs between the two haplogenomes, providing the first detailed insight into genome structural rearrangement in these species. The two haplogenomes have similar gene content, 35,572 and 33,915 functionally annotated genes, of which 34.7% are contained in genome rearrangements. CONCLUSIONS : Knowledge of SV and haplotype diversity in the two species will provide a foundation for understanding the genetic basis of hybrid superiority in these trees. en_US
dc.description.department Biochemistry en_US
dc.description.department Forestry and Agricultural Biotechnology Institute (FABI) en_US
dc.description.department Genetics en_US
dc.description.department Microbiology and Plant Pathology en_US
dc.description.librarian am2024 en_US
dc.description.sdg SDG-15: Life on land en_US
dc.description.sponsorship The Department of Science and Innovation (DSI) and Technology Innovation Agency (TIA) of South Africa, Sappi Southern Africa through the Forest Molecular Genetics (FMG) Industry Consortium at the University of Pretoria (UP), the National Research Foundation (NRF) of South Africa, and the UP Postgraduate Studies Abroad Programme. en_US
dc.description.uri https://academic.oup.com/gigascience en_US
dc.identifier.citation Lötter, A., Duong, T.A., Candotti, J. et al. 2023, 'Haplogenome assembly reveals structural variation in Eucalyptus interspecific hybrids', GigaScience, vol. 12, pp. 1-15. DOI: 10.1093/gigascience/giad064. en_US
dc.identifier.issn 2047-217X
dc.identifier.other 10.1093/gigascience/giad064
dc.identifier.uri http://hdl.handle.net/2263/96737
dc.language.iso en en_US
dc.publisher Oxford University Press en_US
dc.rights © The Author(s) 2023. This is an Open Access article distributed under the terms of the Creative Commons Attribution License en_US
dc.subject Eucalyptus en_US
dc.subject Trio-binning en_US
dc.subject Phased genome assembly en_US
dc.subject Nanopore en_US
dc.subject Structural variant en_US
dc.subject SDG-15: Life on land en_US
dc.title Haplogenome assembly reveals structural variation in Eucalyptus interspecific hybrids en_US
dc.type Article en_US

