DATA AVAILABILITY:
Raw sequence data linked to this study have been deposited in the NCBI SRA under accession number PRJNA894371. Supplemental materials are available at the following link: https://doi.org/10.6084/m9.figshare.24032697.
SUPPLEMENTARY TABLES:
SUPPLEMENTARY TABLE S1: Physical and chemical properties of the deep ocean samples from the South Indian Ocean sampling sites.
SUPPLEMENTARY TABLE S2: Output summary of results from CheckV. The table indicates the quality scores for viruses predicted using the VirFinder pipeline.
SUPPLEMENTARY TABLE S3: Output summary of results from CheckV. The table shows the quality estimates of the putative viruses predicted using the VirSorter2 pipeline.
SUPPLEMENTARY TABLE S4: IDs of the KOs associated with the 85 putative AMGs detected in dsDNA viruses from the SIO.
SUPPLEMENTARY TABLE S5: A list of pathways predicted to be associated with the KOs and putative AMGs detected in dsDNA viruses from the SIO.
SUPPLEMENTARY TABLE S6: Blastp verification of putative ssDNA viruses predicted using the VirSorter2 pipeline.
SUPPLEMENTARY TABLE S7: Blastp and HHpred verification of putative Cressdnaviricota-associated Rep proteins predicted using our HMM-based approach.
SUPPLEMENTARY TABLE S8: Blastp verification of putative Phixviricota-associated VP1 proteins predicted using our HMM-based approach.
SUPPLEMENTARY FIGURES:
SUPPLEMENTARY FIGURE S1: Summary statistics for our metagenomic data. The curves together with the dashed red lines show that our metagenomes had estimated coverage >95%.
SUPPLEMENTARY FIGURE S2: Bar plots indicating the distribution of Eukaryota associated with the metagenomic dataset.
SUPPLEMENTARY FIGURE S3: Sequence similarity networks, generated using an E-value threshold of 1e-60, showing major capsid protein sequences acquired from this study and NCBI GenBank.
SUPPLEMENTARY FIGURE S4: Sequence similarity networks of genes acquired from “dark matter”-associated circular genetic element (viral) contigs. Clusters are assigned colours based on structural predictions derived from HHpred. Clusters with >85% probability scores are indicated by different colours that distinguish them from hypothetical proteins.