WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events

Abstract Background Multiple processes impact the probability of retention of individual genes following whole genome duplication (WGD) events. In analyzing two consecutive whole genome duplication events that occurred in the lineage leading to Atlantic salmon, a new phylogenetic statistical analysi...


Bibliographic Details
Published in: BMC Bioinformatics
Main Authors: C. Nicholas Henry, Kathryn Piper, Amanda E. Wilson, John L. Miraszek, Claire S. Probst, Yuying Rong, David A. Liberles
Format: Article in Journal/Newspaper
Language: English
Published: BMC 2022
Subjects:
Online Access: https://doi.org/10.1186/s12859-022-05042-w
https://doaj.org/article/53894a2446ae4d7eb634b72e26f7697e
id ftdoajarticles:oai:doaj.org/article:53894a2446ae4d7eb634b72e26f7697e
record_format openpolar
spelling ftdoajarticles:oai:doaj.org/article:53894a2446ae4d7eb634b72e26f7697e 2023-05-15T15:32:48+02:00 2022-11-01T00:00:00Z 2022-12-30T22:34:59Z
institution Open Polar
collection Directory of Open Access Journals: DOAJ Articles
op_collection_id ftdoajarticles
language English
topic Whole genome duplication
Phylogenetic analysis
Gene duplicability
Mutational opportunity
Computer applications to medicine. Medical informatics
R858-859.7
Biology (General)
QH301-705.5
description Abstract. Background: Multiple processes affect the probability that individual genes are retained following whole genome duplication (WGD) events. In analyzing two consecutive WGD events that occurred in the lineage leading to Atlantic salmon, a new phylogenetic statistical analysis was developed to examine whether retention after one event is contingent upon retention after a previous event. The analysis is intended to evaluate mechanisms of duplicate gene retention, and software is provided to generate the test statistic for any genome with a pair of WGDs in its history. Results: A software package written in Python, ‘WGDTree’, is presented for the analysis of duplicate gene retention following WGD events. Using gene tree-species tree reconciliation to label gene duplication nodes and to differentiate between WGD and small-scale duplication (SSD) duplicates, the tool calculates a statistic based on the conditional probability that a gene duplicate is retained after a second WGD, given its retention status after the first event. The package also contains methods for simulating gene trees with WGD events. In simulations, the placement of events was found to be highly accurate. The conditional probability statistic was calculated for Phalaenopsis equestris on a monocot species tree with a pair of consecutive WGD events on its lineage, demonstrating the applicability of the method. Conclusions: A new software tool has been created for analyzing duplicate genes when examining retention mechanisms. The tool is available on the Python Package Index, and the source code can be found on GitHub: https://github.com/cnickh/wgdtree
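The kind of conditional probability described in the abstract can be illustrated with a minimal sketch. This is not the WGDTree API and not the paper's exact test statistic; the function name `conditional_retention` and the input layout (one pair of booleans per gene lineage, saying whether the duplicate was retained after the first and after the second WGD) are assumptions made only for illustration. In practice those labels would come from gene tree-species tree reconciliation.

```python
from collections import Counter


def conditional_retention(lineages):
    """Estimate P(retained after WGD2 | status after WGD1) from labeled lineages.

    `lineages` is an iterable of (retained_after_first, retained_after_second)
    boolean pairs. This layout is a hypothetical simplification, not the
    input format used by WGDTree.
    """
    counts = Counter((bool(first), bool(second)) for first, second in lineages)

    def prob_second_retained(first_status):
        # Lineages sharing the given first-event status form the denominator.
        total = counts[(first_status, True)] + counts[(first_status, False)]
        if total == 0:
            return None  # no lineages with this status, so the estimate is undefined
        return counts[(first_status, True)] / total

    return {
        "p_second_given_first_retained": prob_second_retained(True),
        "p_second_given_first_lost": prob_second_retained(False),
    }


# Toy data, invented purely to exercise the function.
toy = [
    (True, True), (True, False), (True, True),
    (False, True), (False, False), (False, False), (False, False),
]
print(conditional_retention(toy))
```

Comparing the two estimates gives a rough sense of whether retention after the second event depends on the outcome of the first; the published method should be consulted for the actual statistic and its interpretation.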
format Article in Journal/Newspaper
author C. Nicholas Henry
Kathryn Piper
Amanda E. Wilson
John L. Miraszek
Claire S. Probst
Yuying Rong
David A. Liberles
title WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events
publisher BMC
publishDate 2022
url https://doi.org/10.1186/s12859-022-05042-w
https://doaj.org/article/53894a2446ae4d7eb634b72e26f7697e
genre Atlantic salmon
op_source BMC Bioinformatics, Vol 23, Iss 1, Pp 1-15 (2022)
op_relation https://doi.org/10.1186/s12859-022-05042-w
https://doaj.org/toc/1471-2105
doi:10.1186/s12859-022-05042-w
1471-2105
https://doaj.org/article/53894a2446ae4d7eb634b72e26f7697e
op_doi https://doi.org/10.1186/s12859-022-05042-w
container_title BMC Bioinformatics
container_volume 23
container_issue 1
_version_ 1766363287215144960