Counting small patterns in networks

Networks are a widely used tool for visualizing and analyzing binary relationships: entities are represented as a set of nodes and the relations between them as edges in the network. One type of relation in the field of bioinformatics that is often modeled by networks is interactio...

Bibliographic Details
Main Author: Hočevar, Tomaž
Format: Thesis
Language: unknown
Published: 2017
Subjects:
Online Access: http://eprints.fri.uni-lj.si/4034/
http://eprints.fri.uni-lj.si/4034/1/Ho%C4%8Devar_%2D_doktorska_disertacija.pdf
id ftllubljana:oai:generic.eprints.org:4034
record_format openpolar
spelling ftllubljana:oai:generic.eprints.org:4034 2023-05-15T17:53:33+02:00 Counting small patterns in networks Hočevar, Tomaž 2017-12-22 application/pdf http://eprints.fri.uni-lj.si/4034/ http://eprints.fri.uni-lj.si/4034/1/Ho%C4%8Devar_%2D_doktorska_disertacija.pdf unknown Tomaž Hočevar (2017) Counting small patterns in networks. PhD thesis. Computer and Information Science Thesis NonPeerReviewed 2017 ftllubljana 2022-12-09T10:05:22Z Thesis Orca University of Ljubljana, Faculty of Computer and Information Science: ePrints.FRI
institution Open Polar
collection University of Ljubljana, Faculty of Computer and Information Science: ePrints.FRI
op_collection_id ftllubljana
language unknown
topic Computer and Information Science
spellingShingle Computer and Information Science
Hočevar, Tomaž
Counting small patterns in networks
topic_facet Computer and Information Science
description Networks are a widely used tool for visualizing and analyzing binary relationships: entities are represented as a set of nodes and the relations between them as edges in the network. In bioinformatics, one type of relation often modeled this way is the interaction between pairs of proteins. Recent studies have focused on analyzing the local structure of such networks by observing small connected patterns consisting of 4 or 5 nodes, which are also known as graphlets. The nodes of a graphlet are further divided into orbits according to their "roles", or symmetries. The number of times a node of the network participates in each orbit forms a signature of that node's local network topology. Working under the assumption that a node's local topology is correlated with its function in the network, researchers have successfully used graphlets to predict new protein functions. The bottleneck of graphlet-based approaches is usually the time required to count the graphlets, and this restriction is becoming more pronounced as the amount of available data grows. This dissertation focuses on improving existing graphlet counting techniques, which are based on simple exhaustive enumeration. We present the algorithm Orca, which counts graphlets and their orbits instead of enumerating them. It exploits relations between orbit counts to construct a system of equations that can be set up efficiently. Orca achieves this by enumerating (k-1)-node graphlets to count k-node graphlets, effectively obtaining a speed-up by a factor proportional to the maximum degree of a node in the network. In practical terms, it counts graphlets in larger protein-protein interaction networks about 50-100 times faster. Orca was designed for counting graphlets with 4 and 5 nodes. However, we adapt the approach to counting edge-orbits in addition to the original node-orbits, with the same gains in run time. We also show that this approach can be generalized to graphlets of arbitrary size by identifying the necessary conditions and ...
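The idea of counting orbits through relations between orbit counts, rather than by enumeration, can be illustrated on the much simpler 3-node case. The sketch below is not Orca itself and does not reproduce the thesis's equations for 4- and 5-node graphlets; it only assumes a simple undirected graph stored as a dictionary of neighbour sets, and the function names `orbit_counts_3` and `brute_force_3` are hypothetical. For each node v with degree d(v) it uses two elementary identities: every pair of neighbours of v forms either a triangle or a path centred at v, and every two-step walk from v either closes a triangle (seen twice) or is a path ending at v. Only edges and common neighbours are inspected, so 3-node patterns are counted from 2-node ones.

```python
from itertools import combinations


def orbit_counts_3(adj):
    """Per-node counts (path end, path middle, triangle) from local equations.

    adj maps each node to the set of its neighbours (simple undirected graph).
    """
    counts = {}
    for v, nbrs in adj.items():
        d = len(nbrs)
        # Each triangle at v is found once from each of its two other corners.
        tri = sum(len(nbrs & adj[u]) for u in nbrs) // 2
        # Pairs of neighbours of v: either a triangle or a path centred at v.
        path_mid = d * (d - 1) // 2 - tri
        # Two-step walks v-u-w with w != v: a triangle (counted twice)
        # or a path that has v as an end node.
        path_end = sum(len(adj[u]) - 1 for u in nbrs) - 2 * tri
        counts[v] = (path_end, path_mid, tri)
    return counts


def brute_force_3(adj):
    """Reference implementation: exhaustively enumerate all node triples."""
    counts = {v: [0, 0, 0] for v in adj}
    for triple in combinations(adj, 3):
        # Degree of each node inside the subgraph induced by the triple.
        inner = {x: sum(1 for y in triple if y in adj[x]) for x in triple}
        edges = sum(inner.values()) // 2
        if edges == 3:  # triangle
            for x in triple:
                counts[x][2] += 1
        elif edges == 2:  # induced path: one middle node, two end nodes
            for x in triple:
                counts[x][1 if inner[x] == 2 else 0] += 1
    return {v: tuple(c) for v, c in counts.items()}


# A triangle 0-1-2 with a pendant node 3 attached to node 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
assert orbit_counts_3(example) == brute_force_3(example)
```

The equation-based version touches only each node's neighbourhood, while the brute-force version inspects every triple of nodes; the check on the last line confirms that both agree on the small example graph.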
format Thesis
author Hočevar, Tomaž
author_facet Hočevar, Tomaž
author_sort Hočevar, Tomaž
title Counting small patterns in networks
title_short Counting small patterns in networks
title_full Counting small patterns in networks
title_fullStr Counting small patterns in networks
title_full_unstemmed Counting small patterns in networks
title_sort counting small patterns in networks
publishDate 2017
url http://eprints.fri.uni-lj.si/4034/
http://eprints.fri.uni-lj.si/4034/1/Ho%C4%8Devar_%2D_doktorska_disertacija.pdf
genre Orca
genre_facet Orca
op_relation http://eprints.fri.uni-lj.si/4034/1/Ho%C4%8Devar_%2D_doktorska_disertacija.pdf
Tomaž Hočevar (2017) Counting small patterns in networks. PhD thesis.
_version_ 1766161251984998400