The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific

Bibliographic Details
Published in: PLoS Biology
Main Authors: Rusch, Douglas B, Halpern, Aaron L, Sutton, Granger, Heidelberg, Karla B, Williamson, Shannon, Yooseph, Shibu, Wu, Dongying, Eisen, Jonathan A, Hoffman, Jeff M, Remington, Karin, Beeson, Karen, Tran, Bao, Smith, Hamilton, Baden-Tillson, Holly, Stewart, Clare, Thorpe, Joyce, Freeman, Jason, Andrews-Pfannkoch, Cynthia, Venter, Joseph E, Li, Kelvin, Kravitz, Saul, Heidelberg, John F, Utterback, Terry, Rogers, Yu-Hui, Falcón, Luisa I, Souza, Valeria, Bonilla-Rosso, Germán, Eguiarte, Luis E, Karl, David M, Sathyendranath, Shubha, Platt, Trevor, Bermingham, Eldredge, Gallardo, Victor, Tamayo-Castillo, Giselle, Ferrari, Michael R, Strausberg, Robert L, Nealson, Kenneth, Friedman, Robert, Frazier, Marvin, Venter, J. Craig
Format: Text
Language: English
Published: Public Library of Science 2007
Online Access: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1821060
http://www.ncbi.nlm.nih.gov/pubmed/17355176
https://doi.org/10.1371/journal.pbio.0050077
Description
Summary: The world's oceans contain a complex mixture of micro-organisms that are, for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific, yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity, with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed “fragment recruitment,” addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed “extreme assembly,” made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content, some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference.
We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be ...