A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA

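Summary: In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs ∼10 000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profiles that are highly correlated to those obtained with BLASTX. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil for which a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800 000 CPU hours, leading to the same clustering of samples by functional profiles.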
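
As a rough check on the speed-up quoted in the summary, the short Python sketch below normalises the reported CPU hours by the number of reads analysed. The figures (<80 CPU hours for 246 million reads with PAUDA; 800 000 CPU hours for a 176 million read subset with BLASTX) are taken from the article's abstract as given in this record. The function and variable names are illustrative only, and the comparison assumes the two runs used comparable hardware, which the record does not state.

```python
# Figures quoted in the abstract; the per-read normalisation is an
# illustration added here, not a calculation from the paper itself.
BLASTX_CPU_HOURS = 800_000   # reported for the earlier BLASTX analysis
BLASTX_READS = 176_000_000   # subset of reads analysed with BLASTX
PAUDA_CPU_HOURS = 80         # upper bound: the abstract says "<80 CPU hours"
PAUDA_READS = 246_000_000    # full permafrost dataset analysed with PAUDA


def cpu_hours_per_million_reads(cpu_hours: float, reads: int) -> float:
    """Return CPU hours spent per one million reads (hypothetical helper)."""
    return cpu_hours / (reads / 1_000_000)


blastx_rate = cpu_hours_per_million_reads(BLASTX_CPU_HOURS, BLASTX_READS)
pauda_rate = cpu_hours_per_million_reads(PAUDA_CPU_HOURS, PAUDA_READS)

# Ratio of total CPU time, ignoring the different dataset sizes.
raw_ratio = BLASTX_CPU_HOURS / PAUDA_CPU_HOURS

# Ratio of CPU time per read, which accounts for PAUDA processing more reads.
per_read_ratio = blastx_rate / pauda_rate

print(f"BLASTX: {blastx_rate:.1f} CPU hours per million reads")
print(f"PAUDA:  {pauda_rate:.3f} CPU hours per million reads")
print(f"Speed-up on total CPU time: {raw_ratio:.0f}x")
print(f"Speed-up per read:          {per_read_ratio:.0f}x")
```

Because 80 CPU hours is an upper bound for PAUDA, both ratios are lower bounds: roughly 10 000 on total CPU time, and roughly 14 000 once the difference in dataset size is taken into account.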


Bibliographic Details
Published in: Bioinformatics
Main Authors: Huson, Daniel H., Xie, Chao
Format: Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online Access: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3866550
http://www.ncbi.nlm.nih.gov/pubmed/23658416
https://doi.org/10.1093/bioinformatics/btt254
id ftpubmed:oai:pubmedcentral.nih.gov:3866550
record_format openpolar
institution Open Polar
collection PubMed Central (PMC)
op_collection_id ftpubmed
language English
topic Hitseq Papers
spellingShingle Hitseq Papers
Huson, Daniel H.
Xie, Chao
A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA
topic_facet Hitseq Papers
description Summary: In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs ∼10 000 times faster than BLASTX, while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and producing gene and taxon abundance profiles that are highly correlated to those obtained with BLASTX. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil for which a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800 000 CPU hours, leading to the same clustering of samples by functional profiles.
format Text
author Huson, Daniel H.
Xie, Chao
author_facet Huson, Daniel H.
Xie, Chao
author_sort Huson, Daniel H.
title A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA
title_short A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA
title_full A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA
title_fullStr A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA
title_full_unstemmed A poor man’s BLASTX—high-throughput metagenomic protein database search using PAUDA
title_sort poor man’s blastx—high-throughput metagenomic protein database search using pauda
publisher Oxford University Press
publishDate 2014
url http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3866550
http://www.ncbi.nlm.nih.gov/pubmed/23658416
https://doi.org/10.1093/bioinformatics/btt254
genre permafrost
genre_facet permafrost
op_relation http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3866550
http://www.ncbi.nlm.nih.gov/pubmed/23658416
http://dx.doi.org/10.1093/bioinformatics/btt254
op_rights © The Author(s) 2013. Published by Oxford University Press.
http://creativecommons.org/licenses/by/3.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
op_rightsnorm CC-BY
op_doi https://doi.org/10.1093/bioinformatics/btt254
container_title Bioinformatics
container_volume 30
container_issue 1
container_start_page 38
op_container_end_page 39
_version_ 1766165953146519552