Genome-wide detection of transposable elements for mammalian phylogenomics


Bibliographic Details
Main Author: Lammers, Fritjof
Format: Doctoral or Postdoctoral Thesis
Language: English
Published: 2019
Online Access: http://publikationen.ub.uni-frankfurt.de/frontdoor/index/index/docId/51253
https://nbn-resolving.org/urn:nbn:de:hebis:30:3-512533
http://publikationen.ub.uni-frankfurt.de/files/51253/thesis-flammers.public.pdf
Description
Summary: Transposable elements (TEs) are replicating genetic elements that comprise up to 50% of mammalian genomes. A specific class of TEs are retrotransposons, which proliferate by transcription into an RNA intermediate followed by reintegration at another genomic locus (the so-called "copy & paste" mechanism). Because removal mechanisms are lacking and parallel insertions are very rare, the presence of a TE insertion at orthologous genomic loci in multiple taxa provides a virtually homoplasy-free phylogenetic marker. So far, developing phylogenetically informative markers from TE insertions has been a tedious process of testing hundreds of putative candidate loci in a trial-and-error approach with a low success rate. Hence, phylogenetic studies using TE insertions were often limited to a few dozen markers. Recently, genome sequencing of multiple species combined with reference mapping has allowed the identification of genome-scale datasets of TE insertions and made the ad hoc development of phylogenetically informative markers possible. However, genome-scale TE detection methods have rarely been applied to non-model organisms, for which data availability and quality are comparatively limited. In this thesis, I developed the TeddyPi pipeline (TE detection and discovery for phylogenetic inference), a software tool that makes it possible to obtain reliable genome-scale TE insertion data from low-coverage genomes. This was achieved by integrating the data from multiple TE and structural variation callers and by applying a stringent filtering pipeline to exclude low-quality insertion calls. Whole-genome sequencing datasets of bears (Ursidae) and baleen whales (Mysticeti) were used to apply TE-based phylogenetic inference and to evaluate the method in comparison with sequence-based phylogenomic analyses.
In the bear genomes, TeddyPi identified 150,513 high-quality transposable element (TE) insertions, which allowed me to reconstruct the evolutionary history of bears despite extensive phylogenetic conflict (Lammers et al., 2017). The large number of detected ...
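
The idea of integrating several callers and filtering out weakly supported insertion calls, as described in the summary, can be illustrated with a minimal Python sketch. This is not TeddyPi's actual code or interface: the `Call` record, the `consensus_insertions` function, the 100 bp clustering window, and the two-caller threshold are all assumptions made for illustration only.

```python
from collections import namedtuple

# Hypothetical record type for one insertion call; not TeddyPi's data model.
Call = namedtuple("Call", ["chrom", "pos", "caller"])


def _flush(cluster, min_callers, kept):
    """Keep a cluster only if enough distinct callers support it."""
    callers = {c.caller for c in cluster}
    if len(callers) >= min_callers:
        positions = sorted(c.pos for c in cluster)
        middle = positions[len(positions) // 2]  # a middle position of the cluster
        kept.append((cluster[0].chrom, middle, sorted(callers)))


def consensus_insertions(calls, window=100, min_callers=2):
    """Cluster nearby calls and keep clusters backed by several callers.

    `window` and `min_callers` are illustrative defaults (assumptions).
    """
    kept = []
    cluster = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.pos)):
        # Start a new cluster when the chromosome changes or the gap
        # to the previous call exceeds the window.
        if cluster and (call.chrom != cluster[-1].chrom
                        or call.pos - cluster[-1].pos > window):
            _flush(cluster, min_callers, kept)
            cluster = []
        cluster.append(call)
    if cluster:
        _flush(cluster, min_callers, kept)
    return kept


if __name__ == "__main__":
    # Made-up example calls; caller names are placeholders.
    calls = [
        Call("chr1", 1000, "callerA"),
        Call("chr1", 1040, "callerB"),
        Call("chr1", 5000, "callerA"),
        Call("chr2", 1010, "callerB"),
    ]
    for locus in consensus_insertions(calls):
        print(locus)
```

In this toy input only the chr1 locus near position 1000 is reported by two callers, so it is the only one retained; calls supported by a single caller are discarded. A real pipeline would additionally apply the kind of quality filters the summary mentions, which are not modeled here.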