BLANT: fast graphlet sampling tool.


Bibliographic Details
Main Authors: Maharaj, Sridevi; Tracy, Brennan; Hayes, Wayne B
Format: Article in Journal/Newspaper
Language: unknown
Published: eScholarship, University of California 2019
Online Access: https://escholarship.org/uc/item/2746n7s0
Description
Summary: BLAST creates local sequence alignments by first building a database of small k-letter sub-sequences called k-mers. Identical k-mers from different regions provide 'seeds' for longer local alignments. This seed-and-extend heuristic makes BLAST extremely fast and has led to its almost exclusive use despite the existence of more accurate, but slower, algorithms. In this paper, we introduce the Basic Local Alignment for Networks Tool (BLANT). BLANT is the analog of BLAST, but for networks: given an input graph, it samples small, induced, k-node sub-graphs called k-graphlets. Graphlets have been used to classify networks, quantify structure, align networks both locally and globally, identify topology-function relationships and build taxonomic trees without the use of sequences. Given an input network, BLANT produces millions of graphlet samples in seconds, orders of magnitude faster than existing methods. BLANT offers sampled graphlets in various forms: distributions of graphlets or their orbits; graphlet degree or graphlet orbit degree vectors, the latter being compatible with ORCA; or an index to be used as the basis for seed-and-extend local alignments. We demonstrate BLANT's usefulness by using its indexing mode to find functional similarity between yeast and human PPI networks.
Availability and implementation: BLANT is written in C and is available at https://github.com/waynebhayes/BLANT/releases.
Supplementary information: Supplementary data are available at Bioinformatics online.
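
The idea of sampling small, induced, k-node sub-graphs and tallying them into a distribution can be illustrated with a minimal Python sketch. This is not BLANT's code or its sampling algorithm: the random-expansion scheme, the function names (`sample_graphlet`, `induced_signature`, `graphlet_distribution`) and the adjacency-dictionary input format are all assumptions made for illustration. The sampler is also biased rather than uniform over graphlets, and the signature used to label each sample (edge count plus sorted degree sequence) only separates the connected graphlet types for k = 3 and k = 4, not for larger k.

```python
import random
from collections import Counter


def sample_graphlet(adj, k, rng):
    """Grow one connected k-node set by random expansion from a random start node.

    adj maps each node to the set of its neighbors (hypothetical input format).
    Returns a frozenset of k nodes, or None if the start node's connected
    component has fewer than k nodes.
    """
    start = rng.choice(sorted(adj))
    nodes = {start}
    while len(nodes) < k:
        # Frontier: neighbors of the current node set that are not yet in it.
        frontier = sorted({v for u in nodes for v in adj[u]} - nodes)
        if not frontier:
            return None
        nodes.add(rng.choice(frontier))
    return frozenset(nodes)


def induced_signature(adj, nodes):
    """Label the induced sub-graph by (edge count, sorted degree sequence).

    This is a crude stand-in for a real graphlet identifier: it distinguishes
    the connected graphlets for k = 3 and k = 4 only.
    """
    degrees = sorted(len(adj[u] & nodes) for u in nodes)
    return (sum(degrees) // 2, tuple(degrees))


def graphlet_distribution(adj, k, n_samples, seed=0):
    """Tally the signatures of n_samples sampled k-node induced sub-graphs."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        nodes = sample_graphlet(adj, k, rng)
        if nodes is not None:
            counts[induced_signature(adj, nodes)] += 1
    return counts


if __name__ == "__main__":
    # Toy network: a triangle (0, 1, 2) with one extra node attached to node 2.
    toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    for signature, count in sorted(graphlet_distribution(toy, 3, 1000).items()):
        print(signature, count)
```

On the toy network, every 3-node sample is either a triangle, with signature `(3, (2, 2, 2))`, or a 3-node path, with signature `(2, (1, 1, 2))`. A production tool would replace the signature with a canonical graphlet identifier and correct for the sampling bias.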