Evaluating the role of reference‐genome phylogenetic distance on evolutionary inference

Abstract When a high‐quality genome assembly of a target species is unavailable, an option to avoid the costly de novo assembly process is a mapping‐based assembly. However, mapping shotgun data to a distant relative may lead to biased or erroneous evolutionary inference. Here, we used short‐read da...

Bibliographic Details
Published in: Molecular Ecology Resources
Main Authors: Prasad, Aparna, Lorenzen, Eline D., Westbury, Michael V.
Other Authors: Det Frie Forskningsråd
Format: Article in Journal/Newspaper
Language: English
Published: Wiley 2021
Subjects: Genetics; Ecology, Evolution, Behavior and Systematics; Biotechnology
Online Access: http://dx.doi.org/10.1111/1755-0998.13457
https://onlinelibrary.wiley.com/doi/pdf/10.1111/1755-0998.13457
https://onlinelibrary.wiley.com/doi/full-xml/10.1111/1755-0998.13457
id crwiley:10.1111/1755-0998.13457
record_format openpolar
institution Open Polar
collection Wiley Online Library
op_collection_id crwiley
language English
topic Genetics
Ecology, Evolution, Behavior and Systematics
Biotechnology
description Abstract When a high‐quality genome assembly of a target species is unavailable, an option to avoid the costly de novo assembly process is a mapping‐based assembly. However, mapping shotgun data to a distant relative may lead to biased or erroneous evolutionary inference. Here, we used short‐read data from a mammal (beluga whale) and a bird species (rowi kiwi) to evaluate whether reference genome phylogenetic distance can impact downstream demographic (Pairwise Sequentially Markovian Coalescent) and genetic diversity (heterozygosity, runs of homozygosity) analyses. We mapped to assemblies of species of varying phylogenetic distance (from conspecific to genome‐wide divergence of >7%), and de novo assemblies created using cross‐species scaffolding. We show that while reference genome phylogenetic distance has an impact on demographic analyses, it is not pronounced until using a reference genome with >3% divergence from the target species. When mapping to cross‐species scaffolded assemblies, we are unable to replicate the original beluga demographic results, but are able with the rowi kiwi, presumably reflecting the more fragmented nature of the beluga assemblies. We find that increased phylogenetic distance has a pronounced impact on genetic diversity estimates; heterozygosity estimates deviate incrementally with increasing phylogenetic distance. Moreover, runs of homozygosity are largely undetectable when mapping to any nonconspecific assembly. However, these biases can be reduced when mapping to a cross‐species scaffolded assembly. Taken together, our results show that caution should be exercised when selecting reference genomes. Cross‐species scaffolding may offer a way to avoid a costly, traditional de novo assembly, while still producing robust, evolutionary inference.
author2 Det Frie Forskningsråd
format Article in Journal/Newspaper
author Prasad, Aparna
Lorenzen, Eline D.
Westbury, Michael V.
title Evaluating the role of reference‐genome phylogenetic distance on evolutionary inference
publisher Wiley
publishDate 2021
url http://dx.doi.org/10.1111/1755-0998.13457
https://onlinelibrary.wiley.com/doi/pdf/10.1111/1755-0998.13457
https://onlinelibrary.wiley.com/doi/full-xml/10.1111/1755-0998.13457
genre Beluga
Beluga whale
Beluga*
op_source Molecular Ecology Resources
volume 22, issue 1, page 45-55
ISSN 1755-098X 1755-0998
op_rights http://onlinelibrary.wiley.com/termsAndConditions#vor
op_doi https://doi.org/10.1111/1755-0998.13457
container_title Molecular Ecology Resources
container_volume 22
container_issue 1
container_start_page 45
op_container_end_page 55