Data from: A draft fur seal genome provides insights into factors affecting SNP validation and how to mitigate them
Custom genotyping arrays provide a flexible and accurate means of genotyping single nucleotide polymorphisms (SNPs) in a large number of individuals of essentially any organism. However, validation rates, defined as the proportion of putative SNPs that are verified to be polymorphic in a population,...
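Main Authors: | Humble, E., Martinez-Barrio, A., Forcada, J., Trathan, P.N., Thorne, M.A.S., Hoffmann, M., Wolf, J. B W., Hoffman, J.I. |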
---|---|
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | 2016 |
Subjects: | Antarctic fur seal draft genome Single nucleotide polymorphism (SNP) cross-validation SNP array |
Online Access: | http://hdl.handle.net/10255/dryad.103928 http://hdl.handle.net/10255/dryad.116332 https://doi.org/10.5061/dryad.8kn8c.2 |
id |
ftdryad:oai:v1.datadryad.org:10255/dryad.116332 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
Dryad Digital Repository (Duke University) |
op_collection_id |
ftdryad |
language |
unknown |
topic |
Antarctic fur seal draft genome Single nucleotide polymorphism (SNP) cross-validation SNP array |
description |
Custom genotyping arrays provide a flexible and accurate means of genotyping single nucleotide polymorphisms (SNPs) in a large number of individuals of essentially any organism. However, validation rates, defined as the proportion of putative SNPs that are verified to be polymorphic in a population, are often very low. A number of potential causes of assay failure have been identified, but none have been explored systematically. In particular, as SNPs are often developed from transcriptomes, parameters relating to the genomic context are rarely taken into account. Here, we assembled a draft Antarctic fur seal (Arctocephalus gazella) genome (assembly size: 2.41 Gb; scaffold/contig N50: 3.1 Mb/27.5 kb). We then used this resource to map the probe sequences of 144 putative SNPs genotyped in 480 individuals. The number of probe-to-genome mappings and alignment length together explained almost a third of the variation in validation success, indicating that sequence uniqueness and proximity to intron-exon boundaries play an important role. The same pattern was found after mapping the probe sequences to the walrus and Weddell seal genomes, suggesting that the genomes of species divergent by as much as 23 million years can hold information relevant to SNP validation outcomes. Additionally, re-analysis of genotyping data from seven previous studies found the same two variables to be significantly associated with SNP validation success across a variety of taxa. Finally, our study reveals considerable scope for validation rates to be improved, either by simply filtering for SNPs whose flanking sequences align uniquely and completely to a reference genome, or through predictive modeling. |
format |
Article in Journal/Newspaper |
author |
Humble, E. Martinez-Barrio, A. Forcada, J. Trathan, P.N. Thorne, M.A.S. Hoffmann, M. Wolf, J. B W. Hoffman, J.I. |
author_sort |
Humble, E. |
title |
Data from: A draft fur seal genome provides insights into factors affecting SNP validation and how to mitigate them |
publishDate |
2016 |
url |
http://hdl.handle.net/10255/dryad.103928 http://hdl.handle.net/10255/dryad.116332 https://doi.org/10.5061/dryad.8kn8c.2 |
geographic |
Antarctic Weddell |
genre |
Antarc* Antarctic Antarctic Fur Seal Arctocephalus gazella Weddell Seal walrus* |
op_relation |
doi:10.5061/dryad.8kn8c.2/1.2 doi:10.5061/dryad.8kn8c.2/2.2 doi:10.5061/dryad.8kn8c.2/3.2 doi:10.1111/1755-0998.12502 PMID:26683564 doi:10.5061/dryad.8kn8c.2 Humble E, Martinez-Barrio A, Forcada J, Trathan PN, Thorne MAS, Hoffmann M, Wolf JBW, Hoffman JI (2016) A draft fur seal genome provides insights into factors affecting SNP validation and how to mitigate them. Molecular Ecology Resources 16(4): 909-921. http://hdl.handle.net/10255/dryad.103928 http://hdl.handle.net/10255/dryad.116332 |
op_doi |
https://doi.org/10.5061/dryad.8kn8c.2 https://doi.org/10.5061/dryad.8kn8c.2/1.2 https://doi.org/10.5061/dryad.8kn8c.2/2.2 https://doi.org/10.5061/dryad.8kn8c.2/3.2 https://doi.org/10.1111/1755-0998.12502 |
_version_ |
1766130651103232000 |
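The description suggests that validation rates could be improved by keeping only SNPs whose flanking sequences align uniquely and completely to a reference genome. The following is a minimal sketch of such a filter, not the authors' pipeline: the `ProbeAlignment` record, its field names, the example probe IDs, and the 71-base probe length are all hypothetical, and the alignment summaries are assumed to have been produced beforehand by whatever aligner is in use.

```python
from dataclasses import dataclass


@dataclass
class ProbeAlignment:
    """Hypothetical per-probe summary of an alignment run (not from the dataset)."""
    probe_id: str
    n_mappings: int       # number of places the probe aligns in the reference genome
    aligned_length: int   # length of the best alignment, in bases
    probe_length: int     # full length of the probe (flanking) sequence, in bases


def passes_filter(aln: ProbeAlignment) -> bool:
    """Keep a probe only if it maps exactly once and along its full length."""
    unique = aln.n_mappings == 1
    complete = aln.aligned_length == aln.probe_length
    return unique and complete


def filter_probes(alignments):
    """Return the IDs of probes that align uniquely and completely."""
    return [a.probe_id for a in alignments if passes_filter(a)]


# Made-up example records; the values are illustrative only.
example = [
    ProbeAlignment("snp_a", 1, 71, 71),  # unique and complete: kept
    ProbeAlignment("snp_b", 3, 71, 71),  # multiple mappings: dropped
    ProbeAlignment("snp_c", 1, 40, 71),  # partial alignment: dropped
    ProbeAlignment("snp_d", 0, 0, 71),   # no mapping: dropped
]
kept = filter_probes(example)
print(kept)
```

A strict equality test on alignment length is the simplest reading of "completely"; a real analysis might instead allow a small tolerance, which would be a further assumption beyond what the description states.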