Transcriptomic SNP discovery for custom genotyping arrays: impacts of sequence data, SNP calling method and genotyping technology on the probability of validation success

Humble E, Thorne MAS, Forcada J, Hoffman J. Transcriptomic SNP discovery for custom genotyping arrays: impacts of sequence data, SNP calling method and genotyping technology on the probability of validation success. BMC Research Notes. 2016;9(1):418. Background: Single nucleotide polymorphism (SNP)...

Bibliographic Details
Published in: BMC Research Notes
Main Authors: Humble, Emily; Thorne, Michael A. S.; Forcada, Jaume; Hoffman, Joseph
Format: Article in Journal/Newspaper
Language: English
Published: Springer Nature 2016
Subjects: Transcriptome; Roche 454 sequencing; Illumina HiSeq sequencing; Single nucleotide polymorphism; Validation success; Marine mammal; Antarctic fur seal; Arctocephalus gazella
Online Access: https://nbn-resolving.org/urn:nbn:de:0070-pub-29058785
https://pub.uni-bielefeld.de/record/2905878
https://pub.uni-bielefeld.de/download/2905878/2905879
id ftubbiepub:oai:pub.uni-bielefeld.de:2905878
record_format openpolar
institution Open Polar
collection PUB - Publications at Bielefeld University
op_collection_id ftubbiepub
language English
topic Transcriptome; Roche 454 sequencing; Illumina HiSeq sequencing; Single nucleotide polymorphism; Validation success; Marine mammal; Antarctic fur seal; Arctocephalus gazella
ddc:590
description Humble E, Thorne MAS, Forcada J, Hoffman J. Transcriptomic SNP discovery for custom genotyping arrays: impacts of sequence data, SNP calling method and genotyping technology on the probability of validation success. BMC Research Notes. 2016;9(1):418.
Background: Single nucleotide polymorphism (SNP) discovery is an important goal of many studies. However, the number of ‘putative’ SNPs discovered from a sequence resource may not provide a reliable indication of the number that will successfully validate with a given genotyping technology. For this it may be necessary to account for factors such as the method used for SNP discovery and the type of sequence data from which it originates, suitability of the SNP flanking sequences for probe design, and genomic context. To explore the relative importance of these and other factors, we used Illumina sequencing to augment an existing Roche 454 transcriptome assembly for the Antarctic fur seal (Arctocephalus gazella). We then mapped the raw Illumina reads to the new hybrid transcriptome using BWA and BOWTIE2 before calling SNPs with GATK. The resulting markers were pooled with two existing sets of SNPs called from the original 454 assembly using NEWBLER and SWAP454. Finally, we explored the extent to which SNPs discovered using these four methods overlapped and predicted the corresponding validation outcomes for both Illumina Infinium iSelect HD and Affymetrix Axiom arrays.
Results: Collating markers across all discovery methods resulted in a global list of 34,718 SNPs. However, concordance between the methods was surprisingly poor, with only 51.0% of SNPs being discovered by more than one method and 13.5% being called from both the 454 and Illumina datasets. Using a predictive modeling approach, we could also show that SNPs called from the Illumina data were on average more likely to successfully validate, as were SNPs called by more than one method. Above and beyond this pattern, predicted validation outcomes were also consistently better for Affymetrix Axiom arrays.
Conclusions: Our results suggest that focusing on SNPs called by more than one method could potentially improve validation outcomes. They also highlight possible differences between alternative genotyping technologies that could be explored in future studies of non-model organisms.
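The two concordance figures quoted in the abstract (SNPs found by more than one method, and SNPs called from both the 454 and Illumina datasets) are set-overlap statistics. The following is a minimal sketch of how such percentages could be computed. It is not the authors' pipeline: the `concordance` function, the method labels `BWA_GATK` and `BOWTIE2_GATK`, the `(contig, position)` SNP key, and the toy data are all assumptions made for illustration. Only the grouping of methods by sequencing platform is taken from the abstract.

```python
from collections import Counter

# Sequencing dataset behind each discovery method, as described in the abstract.
# The labels BWA_GATK and BOWTIE2_GATK are hypothetical names for the two
# mapper-plus-GATK combinations.
PLATFORM = {
    "NEWBLER": "454",
    "SWAP454": "454",
    "BWA_GATK": "Illumina",
    "BOWTIE2_GATK": "Illumina",
}


def concordance(calls):
    """Summarise overlap between SNP call sets.

    `calls` maps a method name to a set of SNP keys. The key format used
    here, a (contig, position) tuple, is an assumption.
    """
    # Count how many methods reported each distinct SNP.
    method_counts = Counter(snp for snps in calls.values() for snp in snps)
    total = len(method_counts)
    if total == 0:
        return {"total": 0, "pct_multi_method": 0.0, "pct_both_platforms": 0.0}

    # Build the union of SNPs per sequencing platform.
    by_platform = {}
    for method, snps in calls.items():
        by_platform.setdefault(PLATFORM[method], set()).update(snps)

    multi_method = sum(1 for n in method_counts.values() if n > 1)
    both_platforms = len(
        by_platform.get("454", set()) & by_platform.get("Illumina", set())
    )
    return {
        "total": total,
        "pct_multi_method": 100.0 * multi_method / total,
        "pct_both_platforms": 100.0 * both_platforms / total,
    }


# Toy call sets, invented for illustration only (not data from the study).
toy_calls = {
    "NEWBLER": {("c1", 10), ("c1", 50), ("c2", 7)},
    "SWAP454": {("c1", 10), ("c3", 99)},
    "BWA_GATK": {("c1", 50), ("c4", 5), ("c4", 20)},
    "BOWTIE2_GATK": {("c4", 5), ("c5", 1)},
}
summary = concordance(toy_calls)
print(summary)
```

Note that a SNP can be found by more than one method without being shared across platforms (for example, called by both NEWBLER and SWAP454), which is why the two percentages are tracked separately.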
format Article in Journal/Newspaper
author Humble, Emily
Thorne, Michael A. S.
Forcada, Jaume
Hoffman, Joseph
title Transcriptomic SNP discovery for custom genotyping arrays: impacts of sequence data, SNP calling method and genotyping technology on the probability of validation success
publisher Springer Nature
publishDate 2016
long_lat ENVELOPE(-60.700,-60.700,-62.933,-62.933)
geographic Antarctic
The Antarctic
Thorne
genre Antarc*
Antarctic
Antarctic Fur Seal
Arctocephalus gazella
op_relation info:eu-repo/semantics/altIdentifier/doi/10.1186/s13104-016-2209-x
info:eu-repo/semantics/altIdentifier/issn/1756-0500
op_rights info:eu-repo/semantics/openAccess
https://rightsstatements.org/vocab/InC/1.0/
op_doi https://doi.org/10.1186/s13104-016-2209-x
container_title BMC Research Notes
container_volume 9
container_issue 1
_version_ 1766273019420868608