A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA
Abstract Summary: In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs ∼10 000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profile...
Published in: | Bioinformatics |
---|---|
Main Authors: | Huson, Daniel H., Xie, Chao |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Oxford University Press (OUP) 2013 |
Subjects: | |
Online Access: | http://dx.doi.org/10.1093/bioinformatics/btt254 https://academic.oup.com/bioinformatics/article-pdf/30/1/38/48913198/bioinformatics_30_1_38.pdf |
id |
croxfordunivpr:10.1093/bioinformatics/btt254 |
---|---|
record_format |
openpolar |
spelling |
croxfordunivpr:10.1093/bioinformatics/btt254 2024-10-29T17:46:56+00:00 A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA Huson, Daniel H. Xie, Chao 2013 http://dx.doi.org/10.1093/bioinformatics/btt254 https://academic.oup.com/bioinformatics/article-pdf/30/1/38/48913198/bioinformatics_30_1_38.pdf en eng Oxford University Press (OUP) http://creativecommons.org/licenses/by/3.0/ Bioinformatics volume 30, issue 1, page 38-39 ISSN 1367-4811 1367-4803 journal-article 2013 croxfordunivpr https://doi.org/10.1093/bioinformatics/btt254 2024-10-08T04:05:23Z Abstract Summary: In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs ∼10 000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profiles that are highly correlated to those obtained with BLASTX. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil for which a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800 000 CPU hours, leading to the same clustering of samples by functional profiles. Availability: PAUDA is freely available from: http://ab.inf.uni-tuebingen.de/software/pauda. Also supplementary method details are available from this website. Contact: daniel.huson@uni-tuebingen.de or xiechao@bic.nus.edu.sg Article in Journal/Newspaper permafrost Oxford University Press Bioinformatics 30 1 38 39 |
institution |
Open Polar |
collection |
Oxford University Press |
op_collection_id |
croxfordunivpr |
language |
English |
description |
Abstract Summary: In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs ∼10 000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profiles that are highly correlated to those obtained with BLASTX. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil for which a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800 000 CPU hours, leading to the same clustering of samples by functional profiles. Availability: PAUDA is freely available from: http://ab.inf.uni-tuebingen.de/software/pauda. Also supplementary method details are available from this website. Contact: daniel.huson@uni-tuebingen.de or xiechao@bic.nus.edu.sg |
format |
Article in Journal/Newspaper |
author |
Huson, Daniel H. Xie, Chao |
author_facet |
Huson, Daniel H. Xie, Chao |
author_sort |
Huson, Daniel H. |
title |
A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA |
title_sort |
poor man’s blastx—high-throughput metagenomic protein database search using pauda |
publisher |
Oxford University Press (OUP) |
publishDate |
2013 |
url |
http://dx.doi.org/10.1093/bioinformatics/btt254 https://academic.oup.com/bioinformatics/article-pdf/30/1/38/48913198/bioinformatics_30_1_38.pdf |
genre |
permafrost |
genre_facet |
permafrost |
op_source |
Bioinformatics volume 30, issue 1, page 38-39 ISSN 1367-4811 1367-4803 |
op_rights |
http://creativecommons.org/licenses/by/3.0/ |
op_doi |
https://doi.org/10.1093/bioinformatics/btt254 |
container_title |
Bioinformatics |
container_volume |
30 |
container_issue |
1 |
container_start_page |
38 |
op_container_end_page |
39 |
_version_ |
1814276437568389120 |
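
The speed-up quoted in the description can be checked against the CPU-hour figures given there. The sketch below is only an arithmetic illustration, not part of the record or of PAUDA: the variable and function names are hypothetical, and the PAUDA figure of "<80 CPU hours" is treated as exactly 80, so the resulting ratios are lower bounds. It also normalizes by read count, since the BLASTX run covered a 176-million-read subset while the PAUDA run covered all 246 million reads.

```python
# Figures as quoted in the abstract. PAUDA's "<80 CPU hours" is an upper
# bound, so using 80.0 here makes every speed-up below a lower bound.
PAUDA_CPU_HOURS = 80.0
PAUDA_READS = 246e6        # full permafrost dataset
BLASTX_CPU_HOURS = 800_000.0
BLASTX_READS = 176e6       # subset analyzed in the earlier BLASTX study


def cpu_hours_per_million_reads(cpu_hours: float, reads: float) -> float:
    """Return the CPU hours spent per one million reads (hypothetical helper)."""
    return cpu_hours / (reads / 1e6)


# Raw ratio of total CPU hours, ignoring the difference in dataset size.
raw_ratio = BLASTX_CPU_HOURS / PAUDA_CPU_HOURS

# Ratio after normalizing each run by the number of reads it processed.
blastx_rate = cpu_hours_per_million_reads(BLASTX_CPU_HOURS, BLASTX_READS)
pauda_rate = cpu_hours_per_million_reads(PAUDA_CPU_HOURS, PAUDA_READS)
per_read_ratio = blastx_rate / pauda_rate

print(f"raw CPU-hour ratio:      {raw_ratio:,.0f}x")
print(f"per-read CPU-hour ratio: {per_read_ratio:,.0f}x")
```

The raw ratio matches the "∼10 000 times faster" figure in the abstract; the per-read ratio is somewhat higher because PAUDA processed the larger dataset.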