Data from: Phylogenomics from whole genome sequences using aTRAM
Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical for org...
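Main Authors: | Allen, Julie M.; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I.; Grady, Patrick G. S.; Bell, Kayce C.; Cronk, Quentin C.B.; Mugisha, Lawrence; Pittendrigh, Barry R.; Soledad Leonardi, M.; Reed, David L.; Johnson, Kevin P. |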
---|---|
Format: | Other/Unknown Material |
Language: | unknown |
Published: | Zenodo, 2016 |
Subjects: | |
Online Access: | https://doi.org/10.5061/dryad.26j38 |
id |
ftzenodo:oai:zenodo.org:4992900 |
---|---|
record_format |
openpolar |
spelling |
ftzenodo:oai:zenodo.org:4992900 2024-09-15T17:45:20+00:00 Data from: Phylogenomics from whole genome sequences using aTRAM Allen, Julie M. Boyd, Bret Nguyen, Nam-Phuong Vachaspati, Pranjal Warnow, Tandy Huang, Daisie I. Grady, Patrick G. S. Bell, Kayce C. Cronk, Quentin C.B. Mugisha, Lawrence Pittendrigh, Barry R. Soledad Leonardi, M. Reed, David L. Johnson, Kevin P. 2016-11-08 https://doi.org/10.5061/dryad.26j38 unknown Zenodo https://doi.org/10.1093/sysbio/syw105 https://zenodo.org/communities/dryad https://doi.org/10.5061/dryad.26j38 oai:zenodo.org:4992900 info:eu-repo/semantics/openAccess Creative Commons Zero v1.0 Universal https://creativecommons.org/publicdomain/zero/1.0/legalcode Bureelia antiqua Osborniella crotophagae Pedicinus badii gene assembly Haematopinus eurysternus Degeeriella rufa Pthirus gorillae Neohaematopinus pacificus Genome sequencing Linognathus spicatus Pthirus pubis Pediculus humanus Pediculus schaeffi Proechinopthirus fluctus aTRAM Hoplopleura arboricola Stimulopalpus japonicus present day Antarctopthirus microchir Bothriometopus macrocnemus Holocene info:eu-repo/semantics/other 2016 ftzenodo https://doi.org/10.5061/dryad.26j38 10.1093/sysbio/syw105 2024-07-25T15:43:53Z Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical for organisms with large genomes, they reduce the genomic coverage and thereby the long-term utility of the data. Currently, for organisms with moderate to small genomes (<1000 Mbp) it is feasible to sequence the entire genome at modest coverage (10−30×). Computational challenges for handling these large data sets can be alleviated by assembling targeted reads, rather than assembling the entire genome, to produce a phylogenomic data matrix.
Here we demonstrate the use of the automated Target Restricted Assembly Method (aTRAM) to assemble 1107 single-copy ortholog genes from whole genome sequencing of sucking lice (Anoplura) and out-groups. We developed a pipeline to extract exon sequences from the aTRAM assemblies by annotating them with respect to the original target protein. We aligned these protein sequences with the inferred amino acids and then performed phylogenetic analyses on both the concatenated matrix of genes and on each gene separately in a coalescent analysis. Finally, we tested the limits of successful assembly in aTRAM by assembling 100 genes from close- to distantly related taxa at high to low levels of coverage. Concatenated alignment and tree Alignment and phylogenetic tree of the concatenated 1,101 exon DNA alignment from 15 louse taxa. Genes were assembled from raw genomic DNA with aTRAM, and exons were extracted and stitched together. The third codon position was removed due to base composition bias, and the tree was built in RAxML. Dataset_1.zip Individual Gene Trees and Alignments All 1,101 gene trees and alignments for the 15 taxon dataset. Each gene was aligned using PASTA and UPP for fragmentary sequences. Each gene tree was built using ASTRAL. Dataset_2.zip SupplementaryTable DNA extraction, and quality clean up ... Other/Unknown Material Antarc* Zenodo |
institution |
Open Polar |
collection |
Zenodo |
op_collection_id |
ftzenodo |
language |
unknown |
topic |
Bureelia antiqua; Osborniella crotophagae; Pedicinus badii; gene assembly; Haematopinus eurysternus; Degeeriella rufa; Pthirus gorillae; Neohaematopinus pacificus; Genome sequencing; Linognathus spicatus; Pthirus pubis; Pediculus humanus; Pediculus schaeffi; Proechinopthirus fluctus; aTRAM; Hoplopleura arboricola; Stimulopalpus japonicus; present day; Antarctopthirus microchir; Bothriometopus macrocnemus; Holocene |
description |
Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical for organisms with large genomes, they reduce the genomic coverage and thereby the long-term utility of the data. Currently, for organisms with moderate to small genomes (<1000 Mbp) it is feasible to sequence the entire genome at modest coverage (10−30×). Computational challenges for handling these large data sets can be alleviated by assembling targeted reads, rather than assembling the entire genome, to produce a phylogenomic data matrix. Here we demonstrate the use of the automated Target Restricted Assembly Method (aTRAM) to assemble 1107 single-copy ortholog genes from whole genome sequencing of sucking lice (Anoplura) and out-groups. We developed a pipeline to extract exon sequences from the aTRAM assemblies by annotating them with respect to the original target protein. We aligned these protein sequences with the inferred amino acids and then performed phylogenetic analyses on both the concatenated matrix of genes and on each gene separately in a coalescent analysis. Finally, we tested the limits of successful assembly in aTRAM by assembling 100 genes from close- to distantly related taxa at high to low levels of coverage.
Concatenated alignment and tree: Alignment and phylogenetic tree of the concatenated 1,101 exon DNA alignment from 15 louse taxa. Genes were assembled from raw genomic DNA with aTRAM, and exons were extracted and stitched together. The third codon position was removed due to base composition bias, and the tree was built in RAxML. Dataset_1.zip
Individual Gene Trees and Alignments: All 1,101 gene trees and alignments for the 15 taxon dataset. Each gene was aligned using PASTA and UPP for fragmentary sequences. Each gene tree was built using ASTRAL. Dataset_2.zip
SupplementaryTable: DNA extraction, and quality clean up ... |
format |
Other/Unknown Material |
author |
Allen, Julie M.; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I.; Grady, Patrick G. S.; Bell, Kayce C.; Cronk, Quentin C.B.; Mugisha, Lawrence; Pittendrigh, Barry R.; Soledad Leonardi, M.; Reed, David L.; Johnson, Kevin P. |
author_sort |
Allen, Julie M. |
title |
Data from: Phylogenomics from whole genome sequences using aTRAM |
title_sort |
data from: phylogenomics from whole genome sequences using atram |
publisher |
Zenodo |
publishDate |
2016 |
url |
https://doi.org/10.5061/dryad.26j38 |
genre |
Antarc* |
op_relation |
https://doi.org/10.1093/sysbio/syw105 https://zenodo.org/communities/dryad https://doi.org/10.5061/dryad.26j38 oai:zenodo.org:4992900 |
op_rights |
info:eu-repo/semantics/openAccess Creative Commons Zero v1.0 Universal https://creativecommons.org/publicdomain/zero/1.0/legalcode |
op_doi |
https://doi.org/10.5061/dryad.26j38 10.1093/sysbio/syw105 |
_version_ |
1810493105730748416 |
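
The description mentions that exons were stitched together into a concatenated matrix and that the third codon position was removed before tree building. The following is a minimal sketch of that idea in plain Python, not the authors' actual scripts. It assumes each per-gene alignment is in frame (column 1 is codon position 1 and the length is a multiple of three); the function names `drop_third_codon_positions` and `concatenate_genes`, and the toy taxon labels, are hypothetical.

```python
def drop_third_codon_positions(aligned_seq: str) -> str:
    """Keep codon positions 1 and 2 of an in-frame aligned sequence."""
    # Index 2, 5, 8, ... (0-based) are third codon positions.
    return "".join(base for i, base in enumerate(aligned_seq) if i % 3 != 2)


def concatenate_genes(gene_alignments, taxa):
    """Concatenate per-gene alignments into one matrix.

    gene_alignments: list of dicts mapping taxon name -> aligned sequence.
    taxa: ordered list of all taxon names expected in the matrix.
    A taxon missing from a gene is filled with gap characters.
    """
    matrix = {taxon: [] for taxon in taxa}
    for alignment in gene_alignments:
        # All sequences within one gene alignment share the same length.
        length = len(next(iter(alignment.values())))
        for taxon in taxa:
            seq = alignment.get(taxon, "-" * length)
            matrix[taxon].append(drop_third_codon_positions(seq))
    return {taxon: "".join(parts) for taxon, parts in matrix.items()}


# Toy example with two genes and two hypothetical taxa.
genes = [
    {"taxon_a": "ATGGCC", "taxon_b": "ATGGCA"},
    {"taxon_a": "TTT"},  # taxon_b is missing for this gene
]
concatenated = concatenate_genes(genes, ["taxon_a", "taxon_b"])
```

Stripping the third position gene by gene before concatenation keeps the reading frame of each gene independent, so a gene of any in-frame length can be appended without shifting the codon positions of the next one.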