North Pacific harbor porpoise SNP and microhaplotype genotypes, mitochondrial control region haplotype sequences ...

Bibliographic Details
Main Authors: Morin, Phillip, Forester, Brenna, Forney, Karin, Crossman, Carla, Hancock-Hanser, Brittany, Robertson, Kelly, Barrett-Lennard, Lance, Baird, Robin, Calambokidis, John, Gearin, Pat, Hanson, Bradley, Schumacher, Cassie, Harkins, Timothy, Fontaine, Michael, Taylor, Barbara, Parsons, Kim
Format: Dataset
Language: English
Published: Dryad 2020
Online Access: https://dx.doi.org/10.5061/dryad.4tmpg4f6v
https://datadryad.org/stash/dataset/doi:10.5061/dryad.4tmpg4f6v
Description
Summary: Harbor porpoises in the North Pacific are found in coastal waters from southern California to Japan, but population structure is poorly known outside of a few local areas. We used multiplexed amplicon sequencing of 292 loci and genotyped clusters of SNPs as microhaplotypes (N=271 samples), in addition to mtDNA sequence data (N=413 samples), to examine genetic structure in samples collected along the Pacific coast and inland waterways from California to southern British Columbia. We confirmed an overall pattern of strong isolation-by-distance, suggesting that individual dispersal is restricted. We also found evidence of regions where genetic differences are larger than expected based on geographic distance alone, implying current or historical barriers to gene flow. In particular, the southernmost population in California is genetically distinct (FST = 0.02 (microhaplotypes); 0.31 (mtDNA)), with both reduced genetic variability and a high frequency of an otherwise rare mtDNA haplotype. At the northern end ... : Amplicon libraries were prepared following the GT-seq protocol, including the optional Exo-SAP pre-treatment of the samples (Campbell et al., 2015), and pooled libraries were sequenced on an Illumina NextSeq500 sequencer with 1x150 bp reads. Custom scripts for processing GT-seq data (Campbell et al., 2015) were used to demultiplex the sample files and conduct preliminary genotyping. Genotypes were quality checked for duplicate samples, percent missing genotypes per locus and per sample, and percent homozygosity using the strataG package in R. Microhaplotypes were generated for all loci using the R package MicrohaPlot (Baetscher et al., 2017). The MicrohaPlot algorithm inserts N’s for missing sequence data at SNPs within haplotypes, so we used custom R scripts (supplemental materials) to identify SNPs with >10% N’s. 
The identified SNPs were removed from the original VCF file using VCFtools, and MicrohaPlot was used to generate new microhaplotypes with the remaining variable SNP positions. The unfiltered ...
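
The SNP filtering step described above (flagging SNPs with >10% N's within microhaplotypes) can be sketched roughly as follows. This is an illustrative Python sketch only, not the authors' custom R scripts. The function name `snps_exceeding_n_fraction`, the input layout (pairs of locus name and haplotype string), and the use of 0-based positions within the haplotype are all assumptions made for the example.

```python
from collections import defaultdict


def snps_exceeding_n_fraction(haplotypes, threshold=0.10):
    """Return (locus, position) pairs where the fraction of N's exceeds threshold.

    `haplotypes` is an iterable of (locus, haplotype_string) pairs
    (a hypothetical layout). Haplotypes at a given locus are assumed to
    have the same length; positions are 0-based offsets in the string.
    """
    n_counts = defaultdict(lambda: defaultdict(int))  # locus -> position -> count of N
    totals = defaultdict(int)                         # locus -> number of haplotypes seen

    for locus, hap in haplotypes:
        totals[locus] += 1
        for pos, base in enumerate(hap):
            if base.upper() == "N":
                n_counts[locus][pos] += 1

    flagged = []
    for locus in sorted(totals):
        for pos in sorted(n_counts[locus]):
            # Strictly greater than the threshold, matching ">10% N's".
            if n_counts[locus][pos] / totals[locus] > threshold:
                flagged.append((locus, pos))
    return flagged


if __name__ == "__main__":
    example = [("loc1", "ACG")] * 8 + [("loc1", "ANN"), ("loc1", "ACN")]
    print(snps_exceeding_n_fraction(example))
```

In a workflow like the one described, the flagged positions would then be mapped back to their coordinates in the VCF file and excluded before regenerating microhaplotypes; that mapping depends on file details not given here.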