Phase-type distributions in population genetics

Probability modelling for DNA sequence evolution is well established and provides a rich framework for understanding genetic variation between samples of individuals from one or more populations. We show that both classical and more recent models for coalescence (with or without recombination) can b...

Bibliographic Details
Main Authors: Hobolth, Asger, Siri-Jégousse, Arno, Bladt, Mogens
Format: Article in Journal/Newspaper
Language: unknown
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S0040580919300140
id ftrepec:oai:RePEc:eee:thpobi:v:127:y:2019:i:c:p:16-32
record_format openpolar
spelling ftrepec:oai:RePEc:eee:thpobi:v:127:y:2019:i:c:p:16-32 2024-04-14T08:08:56+00:00 Phase-type distributions in population genetics Hobolth, Asger Siri-Jégousse, Arno Bladt, Mogens http://www.sciencedirect.com/science/article/pii/S0040580919300140 unknown http://www.sciencedirect.com/science/article/pii/S0040580919300140 article ftrepec 2024-03-19T10:28:46Z Probability modelling for DNA sequence evolution is well established and provides a rich framework for understanding genetic variation between samples of individuals from one or more populations. We show that both classical and more recent models for coalescence (with or without recombination) can be described in terms of the so-called phase-type theory, where complicated and tedious calculations are circumvented by the use of matrix manipulations. The application of phase-type theory in population genetics consists of describing the biological system as a Markov model by appropriately setting up a state space and calculating the corresponding intensity and reward matrices. Formulae of interest are then expressed in terms of these aforementioned matrices. We illustrate this procedure by a number of examples: (a) Calculating the mean, (co)variance and even higher order moments of the site frequency spectrum in multiple merger coalescent models, (b) Analysing a sample of DNA sequences from the Atlantic Cod using the Beta-coalescent, and (c) Determining the correlation of the number of segregating sites for multiple samples in the two-locus ancestral recombination graph. We believe that phase-type theory has great potential as a tool for analysing probability models in population genetics. The compact matrix notation is useful for clarification of current models, and in particular their formal manipulation and calculations, but also for further development or extensions.
Coalescent theory; Multiple merger; Phase-type theory; Recombination; Segregating sites; Site frequency spectrum; Article in Journal/Newspaper atlantic cod RePEc (Research Papers in Economics)
institution Open Polar
collection RePEc (Research Papers in Economics)
op_collection_id ftrepec
language unknown
description Probability modelling for DNA sequence evolution is well established and provides a rich framework for understanding genetic variation between samples of individuals from one or more populations. We show that both classical and more recent models for coalescence (with or without recombination) can be described in terms of the so-called phase-type theory, where complicated and tedious calculations are circumvented by the use of matrix manipulations. The application of phase-type theory in population genetics consists of describing the biological system as a Markov model by appropriately setting up a state space and calculating the corresponding intensity and reward matrices. Formulae of interest are then expressed in terms of these aforementioned matrices. We illustrate this procedure by a number of examples: (a) Calculating the mean, (co)variance and even higher order moments of the site frequency spectrum in multiple merger coalescent models, (b) Analysing a sample of DNA sequences from the Atlantic Cod using the Beta-coalescent, and (c) Determining the correlation of the number of segregating sites for multiple samples in the two-locus ancestral recombination graph. We believe that phase-type theory has great potential as a tool for analysing probability models in population genetics. The compact matrix notation is useful for clarification of current models, and in particular their formal manipulation and calculations, but also for further development or extensions. Coalescent theory; Multiple merger; Phase-type theory; Recombination; Segregating sites; Site frequency spectrum;
format Article in Journal/Newspaper
author Hobolth, Asger
Siri-Jégousse, Arno
Bladt, Mogens
spellingShingle Hobolth, Asger
Siri-Jégousse, Arno
Bladt, Mogens
Phase-type distributions in population genetics
author_facet Hobolth, Asger
Siri-Jégousse, Arno
Bladt, Mogens
author_sort Hobolth, Asger
title Phase-type distributions in population genetics
title_short Phase-type distributions in population genetics
title_full Phase-type distributions in population genetics
title_fullStr Phase-type distributions in population genetics
title_full_unstemmed Phase-type distributions in population genetics
title_sort phase-type distributions in population genetics
url http://www.sciencedirect.com/science/article/pii/S0040580919300140
genre atlantic cod
genre_facet atlantic cod
op_relation http://www.sciencedirect.com/science/article/pii/S0040580919300140
_version_ 1796306382553612288