SNP detection exploiting multiple sources of redundancy in large EST collections improves validation rates
Motivation: Single nucleotide polymorphism (SNP) detection exploiting redundancy in expressed sequence tag (EST) collections that arises from the presence of transcripts of the same gene from different individuals has been used to generate large collections of SNPs for many species. A second source...
Published in: | Bioinformatics |
---|---|
Main Authors: | Hayes, Ben J.; Nilsen, Kjetil; Berg, Paul R.; Grindflek, Eli; Lien, Sigbjorn |
Format: | Conference Object |
Language: | English |
Published: | 2007 |
Subjects: | |
Online Access: | https://espace.library.uq.edu.au/view/UQ:398770 |
id |
ftunivqespace:oai:espace.library.uq.edu.au:UQ:398770 |
---|---|
record_format |
openpolar |
spelling |
ftunivqespace:oai:espace.library.uq.edu.au:UQ:398770 2023-05-15T15:32:29+02:00 SNP detection exploiting multiple sources of redundancy in large EST collections improves validation rates Hayes, Ben J. Nilsen, Kjetil Berg, Paul R. Grindflek, Eli Lien, Sigbjorn 2007-07-01 https://espace.library.uq.edu.au/view/UQ:398770 eng eng doi:10.1093/bioinformatics/btm154 issn:1367-4803 orcid:0000-0002-5606-3970 1303 Specialist Studies in Education 1308 Clinical Biochemistry 1312 Molecular Biology 1703 Computational Theory and Mathematics 1706 Computer Science Applications 2605 Computational Mathematics 2613 Statistics and Probability Conference Paper 2007 ftunivqespace https://doi.org/10.1093/bioinformatics/btm154 2020-08-18T02:46:27Z Motivation: Single nucleotide polymorphism (SNP) detection exploiting redundancy in expressed sequence tag (EST) collections that arises from the presence of transcripts of the same gene from different individuals has been used to generate large collections of SNPs for many species. A second source of redundancy, namely that EST collections can contain multiple transcripts of the same gene from the same individual, can be exploited to distinguish true SNPs from sequencing error. In this article, we demonstrate with Atlantic salmon and pig EST collections that splitting the EST collection in two, detecting SNPs in both subsets, then accepting only cross-validated SNPs increases validation rates. Results: In the pig data set, 676 cross-validated putative SNPs were detected in a collection of 160 689 ESTs. When validating a subset of these by genotyping on MassARRAY 85.1% of SNPs were polymorphic in successful assays. In the salmon data set, 856 cross-validated putative SNPs were detected in a collection of 243 674 ESTs. Validation by genotyping showed that 81.0% of the cross-validated putative SNPs were polymorphic in successful assays. Conference Object Atlantic salmon The University of Queensland: UQ eSpace Bioinformatics 23 13 1692 1693 |
institution |
Open Polar |
collection |
The University of Queensland: UQ eSpace |
op_collection_id |
ftunivqespace |
language |
English |
topic |
1303 Specialist Studies in Education; 1308 Clinical Biochemistry; 1312 Molecular Biology; 1703 Computational Theory and Mathematics; 1706 Computer Science Applications; 2605 Computational Mathematics; 2613 Statistics and Probability |
description |
Motivation: Single nucleotide polymorphism (SNP) detection exploiting redundancy in expressed sequence tag (EST) collections that arises from the presence of transcripts of the same gene from different individuals has been used to generate large collections of SNPs for many species. A second source of redundancy, namely that EST collections can contain multiple transcripts of the same gene from the same individual, can be exploited to distinguish true SNPs from sequencing error. In this article, we demonstrate with Atlantic salmon and pig EST collections that splitting the EST collection in two, detecting SNPs in both subsets, then accepting only cross-validated SNPs increases validation rates. Results: In the pig data set, 676 cross-validated putative SNPs were detected in a collection of 160 689 ESTs. When a subset of these was validated by genotyping on MassARRAY, 85.1% of SNPs were polymorphic in successful assays. In the salmon data set, 856 cross-validated putative SNPs were detected in a collection of 243 674 ESTs. Validation by genotyping showed that 81.0% of the cross-validated putative SNPs were polymorphic in successful assays. |
format |
Conference Object |
author |
Hayes, Ben J. Nilsen, Kjetil Berg, Paul R. Grindflek, Eli Lien, Sigbjorn |
author_sort |
Hayes, Ben J. |
title |
SNP detection exploiting multiple sources of redundancy in large EST collections improves validation rates |
title_sort |
snp detection exploiting multiple sources of redundancy in large est collections improves validation rates |
publishDate |
2007 |
url |
https://espace.library.uq.edu.au/view/UQ:398770 |
genre |
Atlantic salmon |
op_relation |
doi:10.1093/bioinformatics/btm154 issn:1367-4803 orcid:0000-0002-5606-3970 |
op_doi |
https://doi.org/10.1093/bioinformatics/btm154 |
container_title |
Bioinformatics |
container_volume |
23 |
container_issue |
13 |
container_start_page |
1692 |
op_container_end_page |
1693 |
_version_ |
1766362988761055232 |
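
The cross-validation idea in the abstract (split the EST collection in two, detect SNPs in each subset, keep only sites found in both) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the input layout (`reads` maps a read ID to `(contig, position, base)` observations), the alternating split of sorted read IDs, the `min_allele_count` threshold, and the function names `call_snps` and `cross_validated_snps`.

```python
from collections import Counter, defaultdict


def call_snps(observations, min_allele_count=2):
    """Return sites where at least two alleles each reach min_allele_count.

    observations: iterable of (contig, position, base) tuples.
    The calling rule is a hypothetical stand-in for a real SNP caller.
    """
    counts = defaultdict(Counter)
    for contig, position, base in observations:
        counts[(contig, position)][base] += 1
    return {
        site
        for site, alleles in counts.items()
        if sum(1 for n in alleles.values() if n >= min_allele_count) >= 2
    }


def cross_validated_snps(reads, min_allele_count=2):
    """Split reads in two, call SNPs in each half, keep sites called in both.

    reads: dict mapping read ID -> list of (contig, position, base).
    The split (alternating sorted read IDs) is an assumption chosen to keep
    the example deterministic; the paper's splitting rule may differ.
    """
    read_ids = sorted(reads)
    subsets = (read_ids[0::2], read_ids[1::2])
    calls = []
    for subset in subsets:
        observations = [obs for read_id in subset for obs in reads[read_id]]
        calls.append(call_snps(observations, min_allele_count))
    return calls[0] & calls[1]


if __name__ == "__main__":
    # Toy data: position 10 carries two well-supported alleles,
    # position 20 has a single discordant base (a likely sequencing error).
    bases_at_10 = "AAAAGGGG"
    bases_at_20 = "CCCCCCCT"
    reads = {
        f"r{i}": [("c1", 10, bases_at_10[i]), ("c1", 20, bases_at_20[i])]
        for i in range(8)
    }
    print(cross_validated_snps(reads))
```

In this toy setting, a site supported by a single discordant read fails the threshold in at least one half and is dropped, which is the intuition behind requiring agreement between the two subsets.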