qflow: a fast customer-oriented NetFlow database for accounting and data retention

Bibliographic Details
Main Author: Hallgrímur H. Gunnarsson 1983-
Other Authors: Háskóli Íslands
Format: Thesis
Language: English
Published: 2014
Online Access: http://hdl.handle.net/1946/19868
Description
Summary: Internet service providers in Iceland must manage large databases of network flow data in order to charge customers and comply with data retention laws. The databases need to efficiently handle large volumes of data, often billions or trillions of records, and they must support fast queries of traffic volume per customer over time and extraction of raw flow data for given customers. Popular open-source tools for storing flow data, such as nfdump and flow-tools, are backed by flat binary files. They do not provide any type of indexing or summaries of customer traffic. As a result, flow queries for a given customer need to linearly scan through all the flow records in a given time period. We present a high-performance customer-oriented flow database that provides fast customer queries and compressed flow storage. The database is backed by indexed flow tablets that allow for fast extraction of customer flows and traffic volume per customer.
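
The contrast the summary draws between a linear scan over flat files and an indexed tablet can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Flow` record, the `Tablet` class, and all method names below are hypothetical, and the in-memory layout (records sorted by customer, plus a per-customer slice index and byte total) is only one assumed way such an index could work. Compression is left out.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    customer: str   # hypothetical customer identifier
    timestamp: int  # seconds since epoch
    nbytes: int     # bytes transferred in this flow


def linear_scan(flows, customer):
    """Flat-file style lookup: every record in the period is examined."""
    return [f for f in flows if f.customer == customer]


class Tablet:
    """Hypothetical indexed tablet covering one time period (illustration only)."""

    def __init__(self, flows):
        # Sort so that each customer's records are contiguous.
        self.records = sorted(flows, key=lambda f: (f.customer, f.timestamp))
        self.index = {}   # customer -> (start, end) bounds into self.records
        self.volume = {}  # customer -> total bytes (precomputed summary)
        for pos, f in enumerate(self.records):
            start, _ = self.index.get(f.customer, (pos, pos))
            self.index[f.customer] = (start, pos + 1)
            self.volume[f.customer] = self.volume.get(f.customer, 0) + f.nbytes

    def flows_for(self, customer):
        """Return one customer's raw flows without touching other records."""
        start, end = self.index.get(customer, (0, 0))
        return self.records[start:end]

    def bytes_for(self, customer):
        """Return one customer's traffic volume from the summary."""
        return self.volume.get(customer, 0)
```

In this sketch, `flows_for` and `bytes_for` each cost one dictionary lookup (plus copying the matching slice), whereas `linear_scan` visits every record in the period regardless of how few belong to the customer.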