Plant diversity data from modern sedimentary DNA of lakes in Siberia and China

Here we provide a large dataset on genetic plant diversity retrieved from surface sedimentary DNA (sedDNA) of lakes in Siberia and China spanning a large environmental gradient. Our dataset encompasses sedDNA sequence data of 244 surface lake sediments and 3 soil samples originating from Sibe...


Bibliographic Details
Main Authors: Stoof-Leichsenring, Kathleen R.; Liu, Sisi; Jia, Weihan; Li, Kai; Pestryakova, Luidmila A.; Mischke, Steffen; Cao, Xianyong; Liu, Xinqui; Ni, Jian; Neuhaus, Stefan; Herzschuh, Ulrike
Format: Dataset
Language: English
Published: Dryad 2020
Online Access: https://dx.doi.org/10.5061/dryad.k6djh9w4r
http://datadryad.org/stash/dataset/doi:10.5061/dryad.k6djh9w4r
id ftdatacite:10.5061/dryad.k6djh9w4r
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language English
description Here we provide a large dataset on genetic plant diversity retrieved from surface sedimentary DNA (sedDNA) of lakes in Siberia and China spanning a large environmental gradient. Our dataset encompasses sedDNA sequence data of 244 surface lake sediments and 3 soil samples originating from Siberian and Chinese lakes. We used a PCR-based metabarcoding approach combined with Next-Generation Sequencing to assess the modern and local plant diversity in and around the analysed lake localities. As a plant-specific metabarcode we applied the established chloroplast trnL P6 loop marker. PCR products were sequenced on four independent Illumina sequencing runs (ALRK-7, ALRK-3, AGAK-5 and HQD-2).

We extracted sedimentary DNA from lake surface samples using the DNeasy PowerMax Soil Kit and the PowerMax Soil DNA Isolation Kit. Plant DNA was then amplified from the sedimentary DNA extracts with the trnL P6 loop marker. PCRs were replicated for each sample, yielding a total of 688 PCR products, which were sequenced on the four Illumina runs named above.

The underlying data set consists of the raw R1.fastq and R2.fastq files of the four sequencing runs; two scripts that explain how to use the OBITools pipeline for data analyses and how to prepare taxonomic databases with EcoPCR and OBITools; four tagfiles needed for demultiplexing the raw sequence data into samples; three database files for taxonomic assignment; and eight final data files, two for each sequencing run.

For reanalysis of the data we provide the following files:

1. Illumina sequencing raw data of the four sequencing runs. Data files are compressed.
   ALRK-7: 190820_NB501473_A_L1-4_ALRK-7_R1.fastq.gz, 190820_NB501473_A_L1-4_ALRK-7_R2.fastq.gz
   ALRK-3: 190128_NB501850_A_L1-4_ALRK-3_R1.fastq.gz, 190128_NB501850_A_L1-4_ALRK-3_R2.fastq.gz
   AGAK-5: 180912_NB501850_A_L1-4_AGAK-5_R1.fastq.gz, 180912_NB501850_A_L1-4_AGAK-5_R2.fastq.gz
   HQD-2: 151111_SND104_A_L008_HQD-2_R1.fastq.gz, 151111_SND104_A_L008_HQD-2_R2.fastq.gz
2. Two scripts to run the OBITools pipeline, with a short description of each step.
   Data analyses with OBITools: Script_data_analyses_with_OBITools.txt
   Database creation with EcoPCR and OBITools: Script_Database_creation_for_OBITools.txt
3. Tagfiles needed for the OBITools pipeline. The sample name in each tagfile indicates the sequencing run and the sample batch number; each batch includes the samples and their corresponding controls (DNA extraction blank (BLANK) and PCR negative control (NTC)).
   ALRK-7: ALRK-7_tagfile.txt
   ALRK-3: ALRK-3_tagfile.txt
   AGAK-5: AGAK-5_tagfile.txt
   HQD-2: HQD-2_tagfile.txt
4. Taxonomic database files needed for the OBITools pipeline (see Script_Database_creation_for_OBITools.txt).
   EMBL database: g_h_embl138_final.uniqIDs.fasta
   Arctic database: arctborbryo_gh.fasta, ecochange.zip
5. Final data tables after bioinformatic analyses with OBITools. For each sequencing run we provide two data tables, one with the taxonomic assignment against the EMBL database and one with the taxonomic assignment against the Arctic database.
   ALRK-7: assigned_ALRK-7_unique_clean_embl138_anno.txt, assigned_ALRK-7_unique_clean_acrtborbryo_anno.txt
   ALRK-3: assigned_ALRK-3_unique_clean_embl138_anno.txt, assigned_ALRK-3_unique_clean_acrtborbryo_anno.txt
   AGAK-5: assigned_AGAK-5_unique_clean_embl138_anno.txt, assigned_AGAK-5_unique_clean_acrtborbryo_anno.txt
   HQD-2: assigned_HQD-2_unique_clean_embl138_anno.txt, assigned_HQD-2_unique_clean_acrtborbryo_anno.txt
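The per-run file naming described in this record is regular enough to check a local download programmatically. The following is a minimal sketch, not part of the dataset or its scripts: the function names `expected_files` and `missing_files` and the idea of a single flat download folder are assumptions made for illustration, while the file names themselves are copied from the record (including the spelling `acrtborbryo` used in the final data tables).

```python
from pathlib import Path

# Raw-read file prefixes per sequencing run, copied from the record's file list.
RAW_PREFIX = {
    "ALRK-7": "190820_NB501473_A_L1-4_ALRK-7",
    "ALRK-3": "190128_NB501850_A_L1-4_ALRK-3",
    "AGAK-5": "180912_NB501850_A_L1-4_AGAK-5",
    "HQD-2": "151111_SND104_A_L008_HQD-2",
}


def expected_files(run):
    """Return the five run-specific file names listed in the record."""
    prefix = RAW_PREFIX[run]
    return [
        f"{prefix}_R1.fastq.gz",  # forward reads
        f"{prefix}_R2.fastq.gz",  # reverse reads
        f"{run}_tagfile.txt",     # tagfile for demultiplexing
        f"assigned_{run}_unique_clean_embl138_anno.txt",      # EMBL assignment
        # Spelling "acrtborbryo" is kept exactly as given in the record.
        f"assigned_{run}_unique_clean_acrtborbryo_anno.txt",  # Arctic assignment
    ]


def missing_files(data_dir, run):
    """List expected files for `run` that are absent from `data_dir`.

    `data_dir` is a hypothetical local folder holding the downloaded files.
    """
    root = Path(data_dir)
    return [name for name in expected_files(run) if not (root / name).is_file()]
```

Calling `missing_files("some_folder", "HQD-2")` for each key of `RAW_PREFIX` gives a quick completeness check before starting a reanalysis; the shared scripts and database files are not run-specific and are left out of this sketch.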
format Dataset
author Stoof-Leichsenring, Kathleen R.
Liu, Sisi
Jia, Weihan
Li, Kai
Pestryakova, Luidmila A.
Mischke, Steffen
Cao, Xianyong
Liu, Xinqui
Ni, Jian
Neuhaus, Stefan
Herzschuh, Ulrike
author_sort Stoof-Leichsenring, Kathleen R.
title Plant diversity data from modern sedimentary DNA of lakes in Siberia and China
publisher Dryad
publishDate 2020
url https://dx.doi.org/10.5061/dryad.k6djh9w4r
http://datadryad.org/stash/dataset/doi:10.5061/dryad.k6djh9w4r
geographic Arctic
geographic_facet Arctic
genre Arctic
Siberia
genre_facet Arctic
Siberia
op_rights Creative Commons Zero v1.0 Universal
https://creativecommons.org/publicdomain/zero/1.0/legalcode
cc0-1.0
op_rightsnorm CC0
op_doi https://doi.org/10.5061/dryad.k6djh9w4r
_version_ 1766340420243030016
spelling ftdatacite:10.5061/dryad.k6djh9w4r 2023-05-15T15:09:12+02:00 2022-02-08T13:02:41Z