Data from: Genotyping-by-sequencing for estimating relatedness in non-model organisms: avoiding the trap of precise bias

There has been remarkably little attention to using the high resolution provided by genotyping-by-sequencing (i.e. RADseq and similar methods) datasets for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of d...


Bibliographic Details
Main Authors: Attard, Catherine R.M., Beheregaray, Luciano B., Moller, Luciana M.
Format: Dataset
Language: unknown
Published: 2017
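Subjects: pedigree, low coverage, relationships, Next-generation sequencing, Balaenoptera musculus, Holocene, double-digest restriction-site associated DNA (ddRAD)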
Online Access: https://zenodo.org/record/4996084
https://doi.org/10.5061/dryad.t8ph5
id ftzenodo:oai:zenodo.org:4996084
record_format openpolar
institution Open Polar
collection Zenodo
op_collection_id ftzenodo
language unknown
topic pedigree
low coverage
relationships
Next-generation sequencing
Balaenoptera musculus
Holocene
double-digest restriction-site associated DNA (ddRAD)
description There has been remarkably little attention to using the high resolution provided by genotyping-by-sequencing (i.e. RADseq and similar methods) datasets for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of dataset that could lead to downward-biased, yet precise, estimates of relatedness. Here we assess the applicability of genotyping-by-sequencing datasets for relatedness inferences given their relatively high genotyping error rates. Individuals of known relatedness were simulated under genotyping error, allelic dropout, and missing data scenarios based on an empirical ddRAD dataset, and their true relatedness was compared to that estimated by seven relatedness estimators. We found that an estimator chosen through such analyses can circumvent the influence of genotyping error, with the estimator of Ritland (1996) shown to be unaffected by allelic dropout and to be the most accurate when there is genotyping error. We also found that the choice of estimator should not rely solely on the strength of correlation between estimated and true relatedness as a strong correlation does not necessarily mean estimates are close to true relatedness. We also demonstrated how even a large SNP dataset with genotyping error (allelic dropout or otherwise) or missing data still performs better than a perfectly genotyped microsatellite dataset of tens of markers. The simulation-based approach used here can be easily implemented by others on their own genotyping-by-sequencing datasets to confirm the most appropriate and powerful estimator for their dataset.
SNP genotypes: Genotype data in COANCESTRY format for 8,294 SNPs (file: COANCESTRY_input.txt).
Microsatellite genotype: Genotype data for one individual at 20 microsatellites. The microsatellite genotypes for the remaining individuals are available in a previous Dryad entry, doi:10.5061/dryad.8m0t6. The format of the data in the current Dryad entry is the same as the previous entry, except in the ...
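The description's point that a strong correlation does not guarantee estimates close to true relatedness can be illustrated with a toy simulation. The sketch below is not the study's method and does not implement any of the seven estimators or COANCESTRY itself; the two "estimators", the 0.6 shrinkage factor, and the noise levels are hypothetical values chosen only to show the effect. One estimator is unbiased but noisy, the other is downward-biased but precise.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(true, est):
    """Root-mean-square error of estimates against true values."""
    return math.sqrt(sum((e - t) ** 2 for t, e in zip(true, est)) / len(true))


def mean_bias(true, est):
    """Average signed difference (estimate minus truth)."""
    return sum(e - t for t, e in zip(true, est)) / len(true)


def simulate(seed=1, pairs_per_class=300):
    """Toy pairs with known relatedness 0, 0.25 and 0.5.

    Hypothetical estimators (assumptions, not taken from the study):
      noisy  = truth + N(0, 0.08)        unbiased, imprecise
      shrunk = 0.6 * truth + N(0, 0.01)  downward-biased, precise
    """
    rng = random.Random(seed)
    true_r = [r for r in (0.0, 0.25, 0.5) for _ in range(pairs_per_class)]
    noisy = [t + rng.gauss(0, 0.08) for t in true_r]
    shrunk = [0.6 * t + rng.gauss(0, 0.01) for t in true_r]
    return true_r, noisy, shrunk


true_r, noisy, shrunk = simulate()
for name, est in (("noisy, unbiased", noisy), ("precise, biased", shrunk)):
    print(
        f"{name}: r={pearson(true_r, est):.3f} "
        f"RMSE={rmse(true_r, est):.3f} bias={mean_bias(true_r, est):+.3f}"
    )
```

Under these assumed settings the downward-biased estimator correlates more strongly with true relatedness yet sits further from it on average, which is why accuracy measures such as RMSE or mean bias are worth checking alongside correlation.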
format Dataset
author Attard, Catherine R.M.
Beheregaray, Luciano B.
Moller, Luciana M.
title Data from: Genotyping-by-sequencing for estimating relatedness in non-model organisms: avoiding the trap of precise bias
publishDate 2017
url https://zenodo.org/record/4996084
https://doi.org/10.5061/dryad.t8ph5
genre Balaenoptera musculus
op_relation doi:10.1111/1755-0998.12739
https://zenodo.org/communities/dryad
https://zenodo.org/record/4996084
https://doi.org/10.5061/dryad.t8ph5
oai:zenodo.org:4996084
op_rights info:eu-repo/semantics/openAccess
https://creativecommons.org/publicdomain/zero/1.0/legalcode
op_doi https://doi.org/10.5061/dryad.t8ph5
10.1111/1755-0998.12739