WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events
Abstract Background Multiple processes impact the probability of retention of individual genes following whole genome duplication (WGD) events. In analyzing two consecutive whole genome duplication events that occurred in the lineage leading to Atlantic salmon, a new phylogenetic statistical analysi...
Published in: | BMC Bioinformatics |
---|---|
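Main Authors: | C. Nicholas Henry, Kathryn Piper, Amanda E. Wilson, John L. Miraszek, Claire S. Probst, Yuying Rong, David A. Liberles |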
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | BMC 2022 |
Subjects: | Whole genome duplication; Phylogenetic analysis; Gene duplicability; Mutational opportunity |
Online Access: | https://doi.org/10.1186/s12859-022-05042-w https://doaj.org/article/53894a2446ae4d7eb634b72e26f7697e |
id |
ftdoajarticles:oai:doaj.org/article:53894a2446ae4d7eb634b72e26f7697e |
---|---|
record_format |
openpolar |
spelling |
ftdoajarticles:oai:doaj.org/article:53894a2446ae4d7eb634b72e26f7697e 2023-05-15T15:32:48+02:00 WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events C. Nicholas Henry Kathryn Piper Amanda E. Wilson John L. Miraszek Claire S. Probst Yuying Rong David A. Liberles 2022-11-01T00:00:00Z https://doi.org/10.1186/s12859-022-05042-w https://doaj.org/article/53894a2446ae4d7eb634b72e26f7697e EN eng BMC https://doi.org/10.1186/s12859-022-05042-w https://doaj.org/toc/1471-2105 doi:10.1186/s12859-022-05042-w 1471-2105 https://doaj.org/article/53894a2446ae4d7eb634b72e26f7697e BMC Bioinformatics, Vol 23, Iss 1, Pp 1-15 (2022) Whole genome duplication Phylogenetic analysis Gene duplicability Mutational opportunity Computer applications to medicine. Medical informatics R858-859.7 Biology (General) QH301-705.5 article 2022 ftdoajarticles https://doi.org/10.1186/s12859-022-05042-w 2022-12-30T22:34:59Z Abstract Background Multiple processes impact the probability of retention of individual genes following whole genome duplication (WGD) events. In analyzing two consecutive whole genome duplication events that occurred in the lineage leading to Atlantic salmon, a new phylogenetic statistical analysis was developed to examine the contingency of retention in one event based upon retention in a previous event. This analysis is intended to evaluate mechanisms of duplicate gene retention and to provide software to generate the test statistic for any genome with pairs of WGDs in its history. Results Here a software package written in Python, ‘WGDTree’ for the analysis of duplicate gene retention following whole genome duplication events is presented. 
Using gene tree-species tree reconciliation to label gene duplicate nodes and differentiate between WGD and SSD duplicates, the tool calculates a statistic based upon the conditional probability of a gene duplicate being retained after a second whole genome duplication dependent upon the retention status after the first event. The package also contains methods for the simulation of gene trees with WGD events. After running simulations, the accuracy of the placement of events has been determined to be high. The conditional probability statistic has been calculated for Phalaenopsis equestris on a monocot species tree with a pair of consecutive WGD events on its lineage, showing the applicability of the method. Conclusions A new software tool has been created for the analysis of duplicate genes in examination of retention mechanisms. The software tool has been made available on the Python package index and the source code can be found on GitHub here: https://github.com/cnickh/wgdtree . Article in Journal/Newspaper Atlantic salmon Directory of Open Access Journals: DOAJ Articles BMC Bioinformatics 23 1 |
institution |
Open Polar |
collection |
Directory of Open Access Journals: DOAJ Articles |
op_collection_id |
ftdoajarticles |
language |
English |
topic |
Whole genome duplication Phylogenetic analysis Gene duplicability Mutational opportunity Computer applications to medicine. Medical informatics R858-859.7 Biology (General) QH301-705.5 |
spellingShingle |
Whole genome duplication Phylogenetic analysis Gene duplicability Mutational opportunity Computer applications to medicine. Medical informatics R858-859.7 Biology (General) QH301-705.5 C. Nicholas Henry Kathryn Piper Amanda E. Wilson John L. Miraszek Claire S. Probst Yuying Rong David A. Liberles WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events |
topic_facet |
Whole genome duplication Phylogenetic analysis Gene duplicability Mutational opportunity Computer applications to medicine. Medical informatics R858-859.7 Biology (General) QH301-705.5 |
description |
Abstract. Background: Multiple processes impact the probability of retention of individual genes following whole genome duplication (WGD) events. In analyzing two consecutive WGD events that occurred in the lineage leading to Atlantic salmon, a new phylogenetic statistical analysis was developed to examine the contingency of retention in one event based upon retention in a previous event. This analysis is intended to evaluate mechanisms of duplicate gene retention and to provide software to generate the test statistic for any genome with pairs of WGDs in its history. Results: A software package written in Python, 'WGDTree', is presented for the analysis of duplicate gene retention following WGD events. Using gene tree-species tree reconciliation to label gene duplicate nodes and to differentiate between WGD and small-scale duplication (SSD) duplicates, the tool calculates a statistic based upon the conditional probability of a gene duplicate being retained after a second WGD, dependent upon the retention status after the first event. The package also contains methods for the simulation of gene trees with WGD events. In simulations, the accuracy of the placement of events was determined to be high. The conditional probability statistic has been calculated for Phalaenopsis equestris on a monocot species tree with a pair of consecutive WGD events on its lineage, showing the applicability of the method. Conclusions: A new software tool has been created for the analysis of duplicate genes in the examination of retention mechanisms. The software tool is available on the Python Package Index, and the source code can be found on GitHub: https://github.com/cnickh/wgdtree |
format |
Article in Journal/Newspaper |
author |
C. Nicholas Henry Kathryn Piper Amanda E. Wilson John L. Miraszek Claire S. Probst Yuying Rong David A. Liberles |
author_facet |
C. Nicholas Henry Kathryn Piper Amanda E. Wilson John L. Miraszek Claire S. Probst Yuying Rong David A. Liberles |
author_sort |
C. Nicholas Henry |
title |
WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events |
title_short |
WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events |
title_full |
WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events |
title_fullStr |
WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events |
title_full_unstemmed |
WGDTree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events |
title_sort |
wgdtree: a phylogenetic software tool to examine conditional probabilities of retention following whole genome duplication events |
publisher |
BMC |
publishDate |
2022 |
url |
https://doi.org/10.1186/s12859-022-05042-w https://doaj.org/article/53894a2446ae4d7eb634b72e26f7697e |
genre |
Atlantic salmon |
genre_facet |
Atlantic salmon |
op_source |
BMC Bioinformatics, Vol 23, Iss 1, Pp 1-15 (2022) |
op_relation |
https://doi.org/10.1186/s12859-022-05042-w https://doaj.org/toc/1471-2105 doi:10.1186/s12859-022-05042-w 1471-2105 https://doaj.org/article/53894a2446ae4d7eb634b72e26f7697e |
op_doi |
https://doi.org/10.1186/s12859-022-05042-w |
container_title |
BMC Bioinformatics |
container_volume |
23 |
container_issue |
1 |
_version_ |
1766363287215144960 |
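
The abstract describes a statistic built on the conditional probability that a duplicate is retained after a second WGD, given its retention status after the first. The sketch below illustrates that idea only; it is not WGDTree's implementation or API, and the exact statistic defined in the article may differ. It assumes each gene family has already been reduced to a pair of booleans (retained after the first WGD, retained after the second), which in practice would come from reconciled gene trees. The function name `conditional_retention` and the toy data are hypothetical.

```python
from fractions import Fraction


def conditional_retention(families):
    """Estimate P(retained after WGD2 | retention status after WGD1).

    `families` is an iterable of (retained_wgd1, retained_wgd2) booleans,
    one pair per gene family (a hypothetical input format for illustration).
    Returns a dict keyed by the WGD1 status; a value is None when no family
    has that status, so the proportion is undefined.
    """
    # For each WGD1 status: [families also retained after WGD2, total families]
    counts = {True: [0, 0], False: [0, 0]}
    for first, second in families:
        counts[first][1] += 1
        if second:
            counts[first][0] += 1

    result = {}
    for first, (kept, total) in counts.items():
        result[first] = Fraction(kept, total) if total else None
    return result


# Toy data, invented purely for illustration.
toy = [
    (True, True), (True, False), (True, True), (True, True),
    (False, True), (False, False), (False, False), (False, False),
]
probs = conditional_retention(toy)
print("P(retained WGD2 | retained WGD1)     =", probs[True])
print("P(retained WGD2 | not retained WGD1) =", probs[False])
```

Comparing the two proportions indicates whether retention after the second event appears contingent on retention after the first; any formal test on top of these counts is left out here because the record does not specify one.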