Data from: Number of alleles as a predictor of the relative assignment accuracy of STR and SNP baselines for chum salmon

Bibliographic Details
Main Authors: Smith, Christian T.; Seeb, Lisa W.
Format: Dataset
Language: English
Published: Dryad 2020
Subjects:
STR
SNP
geo
Online Access: https://doi.org/10.5061/dryad.db8hd
Description
Summary: Short tandem repeat (STR) markers, which exhibit many alleles per locus, are commonly used to assign fish to their populations of origin. Single nucleotide polymorphisms (SNPs), which have many technical advantages over STRs, typically exhibit only two alleles per locus. Simulation studies have indicated that number of independent alleles is a good predictor of accuracy of genetic markers for fishery applications. Extant STR baselines for salmon contain hundreds of alleles, and it has been extrapolated that hundreds of SNP markers need to be developed before SNP baselines will compare to these STR baselines. We compared 15 STRs exhibiting 349 independent alleles to 61 SNP assays exhibiting 66 independent alleles for accuracy in assigning to closely related populations of chum salmon. The SNP baseline yielded slightly higher mean accuracies for proportional assignment and comparable accuracies for individual assignment. Overall the SNP baseline performed considerably better, relative to the microsatellite baseline, than predicted based on the number of independent alleles in each baseline. We suggest that this discrepancy is due to the fact that the simulation studies do not capture the impacts of the different strategies commonly employed for discovering and selecting STR and SNP markers.

convert input3MSAT: Short tandem repeat (STR) genotype data in CONVERT format.
convert input3SNP: Single nucleotide polymorphism (SNP) genotype data in CONVERT format. Includes all loci in raw format.
snp4b convert input: Single nucleotide polymorphism (SNP) genotype data in CONVERT format. Linked SNPs converted to haplotypes and uninformative SNPs removed (as described in paper).