Protistan metabolism across the western North Atlantic Ocean revealed through autonomous underwater profiling

Bibliographic Details
Main Authors: Cohen, Natalie; Krinos, Arianna; Alexander, Harriet; Saito, Mak
Format: Other/Unknown Material
Language: English
Published: Zenodo 2024
Online Access: https://doi.org/10.5281/zenodo.12789183
Description
Summary: Metatranscriptomic assembly, predicted open reading frames (ORFs), counts, and annotation files from seawater samples obtained in the western North Atlantic Ocean. GitHub notebooks are located at https://github.com/cnatalie/BATS and the assembly was created using the eukrhythmic pipeline (https://github.com/AlexanderLabWHOI/eukrhythmic).

merged_merged.fasta.gz = Final assembly, merged across 44 metatranscriptomes using 4 different assemblers
merged.fasta.transdecoder.pep.zip = Open reading frames of the final assembly, predicted by TransDecoder
merged.fasta.transdecoder-estimated-taxonomy.out.zip = EUKulele-derived taxonomic annotations of ORFs using a combined EukProt, PhyloDB, and RefSeq reference database
newtaxa.eukprot.merged.fasta.transdecoder-estimated-taxonomy.out.zip = Similar to the file above, but with manually curated mid-level taxonomy for supergroups of interest
eggnog.emapper.annotations.zip = eggnog-mapper annotations of ORFs
table.tab.zip = Counts associated with ORFs (merged.fasta.transdecoder.pep), generated with Salmon
TPM_table.tab.zip = Community-wide TPM (normalized) counts associated with ORFs (merged.fasta.transdecoder.pep), generated with Salmon
copiesperL_ORFs_FactorIncluded.csv.zip = Raw counts associated with ORFs (merged.fasta.transdecoder.pep), converted to copies per L taking into account the spiked-in RNA standard concentration (copies), standard reads mapped, volume of seawater filtered, and dilution factor used in library preparation
assembly.table.tab.zip = Counts associated with the final assembly (merged_merged.fasta), generated with Salmon
SamplesViewReportCLIO_AE1913merged_trans210506_updated220606exclusive.zip = Exclusive spectral counts associated with ORFs (merged.fasta.transdecoder.pep). Peptide-spectrum matches were performed using the Sequest algorithm within IseNode Proteome Discoverer 2.2.0.388 (Thermo Fisher Scientific). Scaffold 5.1.2 (Proteome Software) was used for protein grouping and exclusive spectral counting.
Note, the (+x) data has been removed from protein names, which indicates whether (and how many) ...
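
The copies-per-L conversion described for copiesperL_ORFs_FactorIncluded.csv.zip can be illustrated with a minimal Python sketch. The record lists the inputs (spiked-in standard copies, standard reads mapped, volume of seawater filtered, dilution factor) but not the exact formula, so the arithmetic below is an assumption based on a common internal-standard approach, and the function and parameter names are hypothetical rather than taken from the authors' notebooks.

```python
def copies_per_liter(orf_reads, standard_copies_added, standard_reads_mapped,
                     volume_filtered_l, dilution_factor=1.0):
    """Convert raw ORF read counts to transcript copies per liter of seawater.

    Assumed formula (not confirmed by the dataset record):
        copies/L = reads * (standard copies / standard reads) * dilution / volume
    """
    if standard_reads_mapped <= 0 or volume_filtered_l <= 0:
        raise ValueError("standard reads and filtered volume must be positive")
    # Copies represented by each mapped read, inferred from the spiked-in standard.
    copies_per_read = standard_copies_added / standard_reads_mapped
    # Scale the ORF reads, undo the library dilution, and normalize by volume.
    return orf_reads * copies_per_read * dilution_factor / volume_filtered_l


# Hypothetical values for illustration only, not taken from the dataset.
example = copies_per_liter(
    orf_reads=50,
    standard_copies_added=1e6,
    standard_reads_mapped=1000,
    volume_filtered_l=2.0,
    dilution_factor=4.0,
)
print(example)  # 100000.0
```

Whether the dilution factor multiplies or divides depends on how it was defined during library preparation; the GitHub notebooks linked above are the place to confirm the convention actually used.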