Data from: Phylogenomics from whole genome sequences using aTRAM
Abstract: Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical...
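Main Authors: | Allen, Julie M.; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I.; Grady, Patrick G. S.; Bell, Kayce C.; Cronk, Quentin C.B.; Mugisha, Lawrence; Pittendrigh, Barry R.; Soledad Leonardi, M.; Reed, David L.; Johnson, Kevin P. |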
---|---|
Format: | Dataset |
Language: | unknown |
Published: | 2021 |
Subjects: | |
Online Access: | https://search.dataone.org/view/sha256:55413755041fd224e33d6d01a7d9cd488a23e4b76f94878fd3f75e1f42b7c34a |
id |
dataone:sha256:55413755041fd224e33d6d01a7d9cd488a23e4b76f94878fd3f75e1f42b7c34a |
---|---|
record_format |
openpolar |
spelling |
dataone:sha256:55413755041fd224e33d6d01a7d9cd488a23e4b76f94878fd3f75e1f42b7c34a; 2024-06-03T18:46:23+00:00; Data from: Phylogenomics from whole genome sequences using aTRAM; Allen, Julie M.; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I.; Grady, Patrick G. S.; Bell, Kayce C.; Cronk, Quentin C.B.; Mugisha, Lawrence; Pittendrigh, Barry R.; Soledad Leonardi, M.; Reed, David L.; Johnson, Kevin P.; 2021-05-19T00:00:00Z; https://search.dataone.org/view/sha256:55413755041fd224e33d6d01a7d9cd488a23e4b76f94878fd3f75e1f42b7c34a; unknown; Hoplopleura arboricola; Pthirus pubis; Holocene; Haematopinus eurysternus; Proechinopthirus fluctus; Bothriometopus macrocnemus; Pthirus gorillae; Pediculus schaeffi; Genome sequencing; Stimulopalpus japonicus; Pediculus humanus; Pedicinus badii; Osborniella crotophagae; Linognathus spicatus; gene assembly; present day; Other; Pedicinus badii; aTRAM; Haematopinus eurysternus; Bureelia antiqua; Neohaematopinus pacificus; Degeeriella rufa; Antarctopthirus microchir; Dataset; 2021; dataone:urn:node:BOREALIS; 2024-06-03T18:17:48Z; Abstract: Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical for organisms with large genomes, they reduce the genomic coverage and thereby the long-term utility of the data. Currently, for organisms with moderate to small genomes (<1000 Mbp) it is feasible to sequence the entire genome at modest coverage (10−30×). Computational challenges for handling these large data sets can be alleviated by assembling targeted reads, rather than assembling the entire genome, to produce a phylogenomic data matrix. Here we demonstrate the use of automated Target Restricted Assembly Method (aTRAM) to assemble 1107 single-copy ortholog genes from whole genome sequencing of sucking lice (Anoplura) and out-groups. We developed a pipeline to extract exon sequences from the aTRAM assemblies by annotating them with respect to the original target protein. We aligned these protein sequences with the inferred amino acids and then performed phylogenetic analyses on both the concatenated matrix of genes and on each gene separately in a coalescent analysis. Finally, we tested the limits of successful assembly in aTRAM by assembling 100 genes from close- to distantly related taxa at high to low levels of coverage.; Dataset; Antarc*; Unknown |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
dataone:urn:node:BOREALIS |
language |
unknown |
topic |
Hoplopleura arboricola; Pthirus pubis; Holocene; Haematopinus eurysternus; Proechinopthirus fluctus; Bothriometopus macrocnemus; Pthirus gorillae; Pediculus schaeffi; Genome sequencing; Stimulopalpus japonicus; Pediculus humanus; Pedicinus badii; Osborniella crotophagae; Linognathus spicatus; gene assembly; present day; Other; Pedicinus badii; aTRAM; Haematopinus eurysternus; Bureelia antiqua; Neohaematopinus pacificus; Degeeriella rufa; Antarctopthirus microchir |
spellingShingle |
Hoplopleura arboricola; Pthirus pubis; Holocene; Haematopinus eurysternus; Proechinopthirus fluctus; Bothriometopus macrocnemus; Pthirus gorillae; Pediculus schaeffi; Genome sequencing; Stimulopalpus japonicus; Pediculus humanus; Pedicinus badii; Osborniella crotophagae; Linognathus spicatus; gene assembly; present day; Other; Pedicinus badii; aTRAM; Haematopinus eurysternus; Bureelia antiqua; Neohaematopinus pacificus; Degeeriella rufa; Antarctopthirus microchir; Allen, Julie M.; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I.; Grady, Patrick G. S.; Bell, Kayce C.; Cronk, Quentin C.B.; Mugisha, Lawrence; Pittendrigh, Barry R.; Soledad Leonardi, M.; Reed, David L.; Johnson, Kevin P.; Data from: Phylogenomics from whole genome sequences using aTRAM |
topic_facet |
Hoplopleura arboricola; Pthirus pubis; Holocene; Haematopinus eurysternus; Proechinopthirus fluctus; Bothriometopus macrocnemus; Pthirus gorillae; Pediculus schaeffi; Genome sequencing; Stimulopalpus japonicus; Pediculus humanus; Pedicinus badii; Osborniella crotophagae; Linognathus spicatus; gene assembly; present day; Other; Pedicinus badii; aTRAM; Haematopinus eurysternus; Bureelia antiqua; Neohaematopinus pacificus; Degeeriella rufa; Antarctopthirus microchir |
description |
Abstract: Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical for organisms with large genomes, they reduce the genomic coverage and thereby the long-term utility of the data. Currently, for organisms with moderate to small genomes (<1000 Mbp) it is feasible to sequence the entire genome at modest coverage (10−30×). Computational challenges for handling these large data sets can be alleviated by assembling targeted reads, rather than assembling the entire genome, to produce a phylogenomic data matrix. Here we demonstrate the use of automated Target Restricted Assembly Method (aTRAM) to assemble 1107 single-copy ortholog genes from whole genome sequencing of sucking lice (Anoplura) and out-groups. We developed a pipeline to extract exon sequences from the aTRAM assemblies by annotating them with respect to the original target protein. We aligned these protein sequences with the inferred amino acids and then performed phylogenetic analyses on both the concatenated matrix of genes and on each gene separately in a coalescent analysis. Finally, we tested the limits of successful assembly in aTRAM by assembling 100 genes from close- to distantly related taxa at high to low levels of coverage. |
format |
Dataset |
author |
Allen, Julie M.; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I.; Grady, Patrick G. S.; Bell, Kayce C.; Cronk, Quentin C.B.; Mugisha, Lawrence; Pittendrigh, Barry R.; Soledad Leonardi, M.; Reed, David L.; Johnson, Kevin P. |
author_facet |
Allen, Julie M.; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I.; Grady, Patrick G. S.; Bell, Kayce C.; Cronk, Quentin C.B.; Mugisha, Lawrence; Pittendrigh, Barry R.; Soledad Leonardi, M.; Reed, David L.; Johnson, Kevin P. |
author_sort |
Allen, Julie M. |
title |
Data from: Phylogenomics from whole genome sequences using aTRAM |
title_short |
Data from: Phylogenomics from whole genome sequences using aTRAM |
title_full |
Data from: Phylogenomics from whole genome sequences using aTRAM |
title_fullStr |
Data from: Phylogenomics from whole genome sequences using aTRAM |
title_full_unstemmed |
Data from: Phylogenomics from whole genome sequences using aTRAM |
title_sort |
data from: phylogenomics from whole genome sequences using atram |
publishDate |
2021 |
url |
https://search.dataone.org/view/sha256:55413755041fd224e33d6d01a7d9cd488a23e4b76f94878fd3f75e1f42b7c34a |
genre |
Antarc* |
genre_facet |
Antarc* |
_version_ |
1800871931980808192 |