PMERGE: Computational filtering of paralogous sequences from RAD‐seq data

Restriction‐site associated DNA sequencing (RAD‐seq) can identify and score thousands of genetic markers from a group of samples for population‐genetics studies. One challenge of de novo RAD‐seq analysis is to distinguish paralogous sequence variants (PSVs) from true single‐nucleotide polymorphisms...

Bibliographic Details
Published in: Ecology and Evolution
Main Authors: Nadukkalam Ravindran, Praveen; Bentzen, Paul; Bradbury, Ian R.; Beiko, Robert G.
Format: Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects:
Online Access: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6065343/
https://doi.org/10.1002/ece3.4219
id ftpubmed:oai:pubmedcentral.nih.gov:6065343
record_format openpolar
spelling ftpubmed:oai:pubmedcentral.nih.gov:6065343 2023-05-15T15:31:20+02:00 PMERGE: Computational filtering of paralogous sequences from RAD‐seq data Nadukkalam Ravindran, Praveen Bentzen, Paul Bradbury, Ian R. Beiko, Robert G. 2018-06-11 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6065343/ https://doi.org/10.1002/ece3.4219 en eng John Wiley and Sons Inc. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6065343/ http://dx.doi.org/10.1002/ece3.4219 © 2018 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. CC-BY Original Research Text 2018 ftpubmed https://doi.org/10.1002/ece3.4219 2018-08-05T00:35:23Z Text Atlantic salmon Newfoundland Salmo salar PubMed Central (PMC) Ecology and Evolution 8 14 7002 7013
institution Open Polar
collection PubMed Central (PMC)
op_collection_id ftpubmed
language English
topic Original Research
description Restriction‐site associated DNA sequencing (RAD‐seq) can identify and score thousands of genetic markers from a group of samples for population‐genetics studies. One challenge of de novo RAD‐seq analysis is to distinguish paralogous sequence variants (PSVs) from true single‐nucleotide polymorphisms (SNPs) associated with orthologous loci. In the absence of a reference genome, it is difficult to differentiate true SNPs from PSVs, and their impact on downstream analysis remains unclear. Here, we introduce a network‐based approach, PMERGE, that connects fragments based on their DNA sequence similarity to identify probable PSVs. Applying our method to de novo RAD‐seq data from 150 Atlantic salmon (Salmo salar) samples collected from 15 locations across the Southern Newfoundland coast allowed the identification of 87% of the total PSVs detected through alignment to the Atlantic salmon genome. Removal of these paralogs altered the inferred population structure, highlighting the potential impact of filtering in RAD‐seq analysis. PMERGE was also applied to a green crab (Carcinus maenas) data set consisting of 242 samples from 11 different locations, where it identified and removed the majority of paralogous loci (62%). The PMERGE software can be run as part of the widely used Stacks analysis package.
format Text
author Nadukkalam Ravindran, Praveen
Bentzen, Paul
Bradbury, Ian R.
Beiko, Robert G.
title PMERGE: Computational filtering of paralogous sequences from RAD‐seq data
publisher John Wiley and Sons Inc.
publishDate 2018
url http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6065343/
https://doi.org/10.1002/ece3.4219
genre Atlantic salmon
Newfoundland
Salmo salar
op_relation http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6065343/
http://dx.doi.org/10.1002/ece3.4219
op_rights © 2018 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd.
This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
op_rightsnorm CC-BY
op_doi https://doi.org/10.1002/ece3.4219
container_title Ecology and Evolution
container_volume 8
container_issue 14
container_start_page 7002
op_container_end_page 7013
_version_ 1766361824721108992
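
Illustrative note: the description in this record says that PMERGE connects fragments based on their DNA sequence similarity to identify probable PSVs. The record does not give the algorithm itself, so the following is only a minimal sketch of the general idea of a similarity network, not the PMERGE implementation. It assumes equal-length consensus sequences, uses Hamming distance as the similarity measure, and treats every connected group of two or more loci as a candidate paralog cluster. The names `hamming`, `similarity_clusters`, and `max_mismatches`, as well as the example sequences, are hypothetical.

```python
from __future__ import annotations

from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def similarity_clusters(loci: dict[str, str], max_mismatches: int = 2) -> list[set[str]]:
    """Return connected components (size > 1) of a sequence-similarity network.

    Assumption for illustration: two loci are linked when their consensus
    sequences differ at no more than `max_mismatches` positions.
    """
    # Union-find structure: each locus starts as its own component.
    parent = {name: name for name in loci}

    def find(x: str) -> str:
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Link every sufficiently similar pair (all-vs-all, so O(n^2) comparisons).
    for a, b in combinations(loci, 2):
        if hamming(loci[a], loci[b]) <= max_mismatches:
            parent[find(a)] = find(b)

    # Collect members per root and keep only multi-locus components.
    components: dict[str, set[str]] = {}
    for name in loci:
        components.setdefault(find(name), set()).add(name)
    return [members for members in components.values() if len(members) > 1]


if __name__ == "__main__":
    # Hypothetical toy catalog of consensus sequences.
    example = {
        "locus1": "ACGTACGTAC",
        "locus2": "ACGTACGTAA",
        "locus3": "ACGTACGTTA",
        "locus4": "TTTTGGGGCC",
    }
    print(similarity_clusters(example, max_mismatches=1))
```

In this toy input, locus1 and locus3 differ at two positions and are not linked directly, but both are within one mismatch of locus2, so all three fall into one component. That transitive grouping is the reason to use a network rather than pairwise filtering. A real catalog would need a faster similarity search than all-vs-all comparison.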