A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA

Abstract Summary: In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs ∼10 000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profile...

Bibliographic Details
Published in: Bioinformatics
Main Authors: Huson, Daniel H., Xie, Chao
Format: Article in Journal/Newspaper
Language: English
Published: Oxford University Press (OUP) 2013
Online Access: http://dx.doi.org/10.1093/bioinformatics/btt254
https://academic.oup.com/bioinformatics/article-pdf/30/1/38/48913198/bioinformatics_30_1_38.pdf
id croxfordunivpr:10.1093/bioinformatics/btt254
record_format openpolar
institution Open Polar
collection Oxford University Press
op_collection_id croxfordunivpr
language English
description Abstract Summary: In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs ∼10 000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profiles that are highly correlated to those obtained with BLASTX. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil for which a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800 000 CPU hours, leading to the same clustering of samples by functional profiles. Availability: PAUDA is freely available from: http://ab.inf.uni-tuebingen.de/software/pauda. Also supplementary method details are available from this website. Contact: daniel.huson@uni-tuebingen.de or xiechao@bic.nus.edu.sg
format Article in Journal/Newspaper
author Huson, Daniel H.
Xie, Chao
title A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA
publisher Oxford University Press (OUP)
publishDate 2013
url http://dx.doi.org/10.1093/bioinformatics/btt254
https://academic.oup.com/bioinformatics/article-pdf/30/1/38/48913198/bioinformatics_30_1_38.pdf
genre permafrost
op_source Bioinformatics
volume 30, issue 1, page 38-39
ISSN 1367-4811 1367-4803
op_rights http://creativecommons.org/licenses/by/3.0/
op_doi https://doi.org/10.1093/bioinformatics/btt254
container_title Bioinformatics
container_volume 30
container_issue 1
container_start_page 38
op_container_end_page 39
_version_ 1814276437568389120
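
The speed-up quoted in the abstract can be sanity-checked from the CPU-hour figures it reports. The sketch below is only a back-of-the-envelope calculation, not part of PAUDA: it normalizes both runs to CPU hours per million reads, because the BLASTX run covered a 176 million read subset while the PAUDA run covered all 246 million reads. The variable and function names are illustrative, and the PAUDA figure is treated as exactly 80 CPU hours even though the abstract gives it as an upper bound ("<80"), so the resulting ratio should be read as a lower bound.

```python
# Figures as quoted in the abstract.
BLASTX_CPU_HOURS = 800_000       # "reportedly required 800 000 CPU hours"
BLASTX_READS = 176_000_000       # subset analyzed with BLASTX
PAUDA_CPU_HOURS = 80             # assumption: upper bound "<80" taken as 80
PAUDA_READS = 246_000_000        # full permafrost dataset


def cpu_hours_per_million_reads(cpu_hours: float, reads: int) -> float:
    """Normalize a run's cost so datasets of different size are comparable."""
    return cpu_hours / (reads / 1_000_000)


blastx_rate = cpu_hours_per_million_reads(BLASTX_CPU_HOURS, BLASTX_READS)
pauda_rate = cpu_hours_per_million_reads(PAUDA_CPU_HOURS, PAUDA_READS)

# Ratio of per-read costs; a lower bound because PAUDA needed fewer than 80 hours.
speedup = blastx_rate / pauda_rate

print(f"BLASTX: {blastx_rate:.1f} CPU hours per million reads")
print(f"PAUDA:  {pauda_rate:.3f} CPU hours per million reads")
print(f"Per-read speed-up: at least {speedup:,.0f}x")
```

Under these assumptions the per-read ratio comes out at roughly 14 000, which agrees in order of magnitude with the "∼10 000 times faster" stated in the abstract.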