DATA AVAILABILITY :
ampliconTraits trait sequence databases and files for database construction are available at https://erda.ku.dk/archives/f5d4b1d41f74ba3d6f73b212dbb11591/published-archive.html. Code to create the databases and documentation for ampliconTraits are hosted at https://github.com/jdonhauser/ampliconTraits. The R package MicEnvMod is available at https://github.com/jdonhauser/MicEnvMod. A markdown document covering all analyses in this manuscript is available in the supplementary information. Raw sequences were deposited in the NCBI Sequence Read Archive under the accession number PRJNA1073882.
SUPPLEMENTARY MATERIAL 1 : FIGURE S1: Overview of sites in Europe, Greenland and South Africa, and distribution of climatic, vegetation and soil parameters across the dataset. MAT = mean annual temperature, aw = water activity, MAP = mean annual precipitation, BIO5 = maximum temperature of the warmest month, BIO7 = annual temperature range, BIO15 = precipitation seasonality, WHC = water holding capacity, SOM = soil organic matter. FIGURE S2: Bootstrap values as a function of sequence identity to the top hit in the reference database, shown as a scatterplot (top) and as violin plots for 10 intervals of sequence identity (bottom). Intervals: [54.2,58.8], (58.8,63.4], (63.4,67.9], (67.9,72.5], (72.5,77.1], (77.1,81.7], (81.7,86.3], (86.3,90.8], (90.8,95.4], (95.4,100]. SUPPLEMENTARY METHODS.
SUPPLEMENTARY MATERIAL 2 : Code for cross-validation of the database.
SUPPLEMENTARY MATERIAL 3 : Code for analyses with environmental sequences.