Metagenome assembly and protein detection for bacterial communities within supra- and sub-glacial meltwaters of Kangerlussauq, Greenland across the 2012 melting season

Bibliographic Details
Main Authors: Brook Nunn, Jason Gilmore, Karen Junge
Format: Dataset
Language: unknown
Published: Arctic Data Center 2017
Subjects:
Online Access: https://doi.org/10.18739/A21W0F
Description
Summary: The dataset includes both DNA sequencing data and mass spectrometry data for biological samples collected across the 2012 melt season at Kangerlussuaq, Greenland. Each individual collection is a filtered water sample (~10 L) with ultrafine sediment retained on a disposable filter. Sample volume per tube was approximately 5 mL; each sample was stored in a 15 mL tube at -80 °C until processing. The samples are classified as “Early” (May-June), “Mid” (June-July), or “Late” (September) season and as either “Supra”- or “Sub”-glacial according to the meltwater collected. Because of the low abundance of bacteria within each sample, four tubes for each condition pair were pooled, with an effort to minimize variation in collection time and location. Five pooled samples (“SupEarly”, “SubEarly”, “SupMid”, “SubMid”, and “SubLate”) were processed to extract both DNA and bacterial proteins for subsequent proteomic analysis. Pooled samples were rinsed with a sodium azide-containing solution to halt growth and centrifuged to produce muddy pellets. DNA was extracted from 0.25 g of this sediment using a PowerSoil DNA kit and sequenced on an Illumina NextSeq with 150 bp paired-end reads. Metagenomes were assembled with the freely available MOCAT (Metagenomics Assembly and Gene Prediction Toolkit) software. In parallel, “metapeptide” databases were generated using the freely available Sixgill software package. These metagenomes and metaproteomes are provided in FASTA format. The remaining pellets of each sample were processed to isolate bacterial cells (confirmed by cell counting with colorimetric staining), which were then lysed and digested with trypsin prior to mass spectrometry analysis. MS data were collected by data-independent acquisition (DIA) in duplicate, with four injections per sample covering a range of 400-900 mass-to-charge (m/z). Peptides were identified using PECAN and the assembled metagenomes/metaproteomes. Peptide detections along with quality metrics are provided.
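For readers who want to explore the FASTA files, the following is a minimal Python sketch of reading FASTA-formatted protein records and splitting each sequence into tryptic peptides, which is the kind of peptide the trypsin digestion described above would be expected to yield. It is an illustration only, not the procedure used by Sixgill or PECAN. The function names, the demo record, and the `min_length` cutoff are hypothetical, and the cleavage rule (cut after K or R unless the next residue is P, no missed cleavages) is the common textbook simplification, assumed here rather than taken from the dataset documentation. It requires Python 3.7 or later.

```python
import re
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} mapping."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # Sequence lines may wrap; concatenate them under the current header.
            records[header] += line
    return records


def tryptic_peptides(sequence: str, min_length: int = 6) -> List[str]:
    """Split after K or R unless followed by P (assumed rule, no missed cleavages)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # Drop short fragments (and the empty string left by a C-terminal K/R).
    return [p for p in pieces if len(p) >= min_length]


# Hypothetical demo record; it is not taken from the dataset.
DEMO_FASTA = """>demo_protein_1
MKTAYIAKQR
QISFVKSHFSRPLEER
"""

if __name__ == "__main__":
    for name, seq in parse_fasta(DEMO_FASTA).items():
        print(name, tryptic_peptides(seq))
```

To apply the sketch to one of the provided files, read the file contents into a string and pass it to `parse_fasta`. For large metagenome-derived databases, a streaming parser would be a better fit than loading the whole file into memory.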