Data from: A draft fur seal genome provides insights into factors affecting SNP validation and how to mitigate them ...

Bibliographic Details
Main Authors: Humble, E., Martinez-Barrio, A., Forcada, J., Trathan, P. N., Thorne, M. A. S., Hoffmann, M., Wolf, J. B. W., Hoffman, J. I.
Format: Dataset
Language: English
Published: Dryad 2015
Subjects:
Online Access: https://dx.doi.org/10.5061/dryad.8kn8c
https://datadryad.org/stash/dataset/doi:10.5061/dryad.8kn8c
Description
Summary: Custom genotyping arrays provide a flexible and accurate means of genotyping single nucleotide polymorphisms (SNPs) in a large number of individuals of essentially any organism. However, validation rates, defined as the proportion of putative SNPs that are verified to be polymorphic in a population, are often very low. A number of potential causes of assay failure have been identified, but none have been explored systematically. In particular, as SNPs are often developed from transcriptomes, parameters relating to the genomic context are rarely taken into account. Here, we assembled a draft Antarctic fur seal (Arctocephalus gazella) genome (assembly size: 2.41 Gb; scaffold/contig N50: 3.1 Mb/27.5 kb). We then used this resource to map the probe sequences of 144 putative SNPs genotyped in 480 individuals. The number of probe-to-genome mappings and alignment length together explained almost a third of the variation in validation success, indicating that sequence uniqueness and proximity to intron-exon boundaries ...
submission.assembly.ArcGaz001_AP3.fasta: Draft fur seal genome v1.0
Seal_assay_SNPs.csv: List of pre-validated fur seal SNPs plus variables used for modeling SNP validation success.
crossvalidation.R: R script to perform the k-fold cross-validation. ...
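The summary defines the validation rate as the proportion of putative SNPs verified to be polymorphic, and the dataset includes a script for k-fold cross-validation. As a rough illustration of those two ideas only, here is a minimal Python sketch. It is not the contents of the dataset's R script: the function names, the toy outcomes, and the trivial "predict the training-set rate" model are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def validation_rate(polymorphic: Sequence[bool]) -> float:
    """Proportion of putative SNPs verified to be polymorphic."""
    if not polymorphic:
        raise ValueError("need at least one SNP")
    return sum(polymorphic) / len(polymorphic)


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k contiguous folds; return (train, test) pairs."""
    if not 1 < k <= n:
        raise ValueError("k must satisfy 1 < k <= n")
    base, extra = divmod(n, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds take one additional item each.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        folds.append((train, test))
        start += size
    return folds


# Hypothetical outcomes for 6 SNPs (True = validated as polymorphic).
outcomes = [True, False, True, True, False, True]
print("overall validation rate:", validation_rate(outcomes))

for train, test in k_fold_indices(len(outcomes), 3):
    # Placeholder model: predict the training-set rate for held-out SNPs.
    predicted = validation_rate([outcomes[i] for i in train])
    print("held-out SNPs", test, "predicted rate", round(predicted, 2))
```

In practice the indices would usually be shuffled before splitting, and the placeholder model would be replaced by a fitted model using predictors such as the number of probe-to-genome mappings and alignment length.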