WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events

BACKGROUND: Multiple processes impact the probability of retention of individual genes following whole genome duplication (WGD) events. In analyzing two consecutive whole genome duplication events that occurred in the lineage leading to Atlantic salmon, a new phylogenetic statistical analysis was de...

Bibliographic Details
Published in: BMC Bioinformatics
Main Authors: Henry, C. Nicholas, Piper, Kathryn, Wilson, Amanda E., Miraszek, John L., Probst, Claire S., Rong, Yuying, Liberles, David A.
Format: Text
Language: English
Published: BioMed Central 2022
Subjects: Software
Online Access: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9701042/
https://doi.org/10.1186/s12859-022-05042-w
id ftpubmed:oai:pubmedcentral.nih.gov:9701042
record_format openpolar
spelling ftpubmed:oai:pubmedcentral.nih.gov:9701042 2023-05-15T15:32:44+02:00 2022-11-24 2022-12-04T01:55:57Z
institution Open Polar
collection PubMed Central (PMC)
op_collection_id ftpubmed
language English
topic Software
description BACKGROUND: Multiple processes impact the probability of retention of individual genes following whole genome duplication (WGD) events. In analyzing two consecutive whole genome duplication events that occurred in the lineage leading to Atlantic salmon, a new phylogenetic statistical analysis was developed to examine the contingency of retention in one event based upon retention in a previous event. The analysis is intended to evaluate mechanisms of duplicate gene retention, and software is provided to generate the test statistic for any genome with pairs of WGDs in its history. RESULTS: Here, ‘WGDTree’, a software package written in Python for the analysis of duplicate gene retention following whole genome duplication events, is presented. Using gene tree-species tree reconciliation to label gene duplicate nodes and to differentiate between WGD and small-scale duplication (SSD) duplicates, the tool calculates a statistic based upon the conditional probability of a gene duplicate being retained after a second whole genome duplication, conditional on its retention status after the first event. The package also contains methods for the simulation of gene trees with WGD events. In simulations, the accuracy of the placement of events was found to be high. The conditional probability statistic has been calculated for Phalaenopsis equestris on a monocot species tree with a pair of consecutive WGD events on its lineage, showing the applicability of the method. CONCLUSIONS: A new software tool has been created for the analysis of duplicate genes in the examination of retention mechanisms. The software tool has been made available on the Python Package Index (PyPI), and the source code can be found on GitHub at https://github.com/cnickh/wgdtree. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05042-w.
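The conditional probability described in the abstract can be illustrated with a minimal sketch. This is not WGDTree's API or the authors' exact test statistic; the function name `conditional_retention`, the dictionary keys, and the toy data are all hypothetical. It assumes each gene family has already been reduced to two booleans (retained in duplicate after the first WGD, retained in duplicate after the second WGD), a simplification of what gene tree-species tree reconciliation would yield.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical input: one (retained_after_wgd1, retained_after_wgd2) pair per gene family.
Family = Tuple[bool, bool]


def conditional_retention(families: Iterable[Family]) -> Dict[str, float]:
    """Estimate P(retained after WGD2 | status after WGD1) from simple counts."""
    counts = Counter(families)
    n_retained_1 = counts[(True, True)] + counts[(True, False)]
    n_lost_1 = counts[(False, True)] + counts[(False, False)]
    nan = float("nan")
    return {
        # Fraction of WGD1-retained families that are also retained after WGD2.
        "p_r2_given_r1": counts[(True, True)] / n_retained_1 if n_retained_1 else nan,
        # Fraction of WGD1-lost families that are retained after WGD2.
        "p_r2_given_l1": counts[(False, True)] / n_lost_1 if n_lost_1 else nan,
    }


# Toy data (invented for illustration only, not taken from the paper).
toy = (
    [(True, True)] * 3
    + [(True, False)] * 1
    + [(False, True)] * 2
    + [(False, False)] * 4
)
result = conditional_retention(toy)
print(result)
```

Comparing the two estimates indicates whether retention after the second event appears contingent on retention after the first; a real analysis would also need to handle multiple post-WGD1 lineages per family and uncertainty in event placement.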
format Text
author Henry, C. Nicholas
Piper, Kathryn
Wilson, Amanda E.
Miraszek, John L.
Probst, Claire S.
Rong, Yuying
Liberles, David A.
title WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events
publisher BioMed Central
publishDate 2022
url http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9701042/
https://doi.org/10.1186/s12859-022-05042-w
genre Atlantic salmon
op_source BMC Bioinformatics
op_relation http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9701042/
http://dx.doi.org/10.1186/s12859-022-05042-w
op_rights © The Author(s) 2022
https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
op_rightsnorm CC0
PDM
CC-BY
op_doi https://doi.org/10.1186/s12859-022-05042-w
container_title BMC Bioinformatics
container_volume 23
container_issue 1
_version_ 1766363230582603776