Hippo: A Fast, yet Scalable, Database Indexing Approach

Even though existing database indexes (e.g., B+-Tree) speed up the query execution, they suffer from two main drawbacks: (1) A database index usually yields 5% to 15% additional storage overhead which results in non-ignorable dollar cost in big data scenarios especially when deployed on modern stora...

Full description

Bibliographic Details
Main Authors: Yu, Jia, Sarwat, Mohamed
Format: Report
Language:unknown
Published: arXiv 2016
Subjects:
Online Access:https://dx.doi.org/10.48550/arxiv.1604.03234
https://arxiv.org/abs/1604.03234
id ftdatacite:10.48550/arxiv.1604.03234
record_format openpolar
spelling ftdatacite:10.48550/arxiv.1604.03234 2023-05-15T18:32:42+02:00 Hippo: A Fast, yet Scalable, Database Indexing Approach Yu, Jia Sarwat, Mohamed 2016 https://dx.doi.org/10.48550/arxiv.1604.03234 https://arxiv.org/abs/1604.03234 unknown arXiv arXiv.org perpetual, non-exclusive license http://arxiv.org/licenses/nonexclusive-distrib/1.0/ Databases cs.DB FOS Computer and information sciences H.2.2 Preprint Article article CreativeWork 2016 ftdatacite https://doi.org/10.48550/arxiv.1604.03234 2022-04-01T11:34:18Z Even though existing database indexes (e.g., B+-Tree) speed up the query execution, they suffer from two main drawbacks: (1) A database index usually yields 5% to 15% additional storage overhead which results in non-ignorable dollar cost in big data scenarios especially when deployed on modern storage devices like Solid State Disk (SSD) or Non-Volatile Memory (NVM). (2) Maintaining a database index incurs high latency because the DBMS has to find and update those index pages affected by the underlying table changes. This paper proposes Hippo a fast, yet scalable, database indexing approach. Hippo only stores the pointers of disk pages along with light weight histogram-based summaries. The proposed structure significantly shrinks index storage and maintenance overhead without compromising much on query execution performance. Experiments, based on real Hippo implementation inside PostgreSQL 9.5, using the TPC-H benchmark show that Hippo achieves up to two orders of magnitude less storage space and up to three orders of magnitude less maintenance overhead than traditional database indexes, i.e., B+-Tree. Furthermore, the experiments also show that Hippo achieves comparable query execution performance to that of the B+-Tree for various selectivity factors. : 12 pages, 10 figures, conference Report The Pointers DataCite Metadata Store (German National Library of Science and Technology)
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Databases cs.DB
FOS Computer and information sciences
H.2.2
spellingShingle Databases cs.DB
FOS Computer and information sciences
H.2.2
Yu, Jia
Sarwat, Mohamed
Hippo: A Fast, yet Scalable, Database Indexing Approach
topic_facet Databases cs.DB
FOS Computer and information sciences
H.2.2
description Even though existing database indexes (e.g., B+-Tree) speed up the query execution, they suffer from two main drawbacks: (1) A database index usually yields 5% to 15% additional storage overhead which results in non-ignorable dollar cost in big data scenarios especially when deployed on modern storage devices like Solid State Disk (SSD) or Non-Volatile Memory (NVM). (2) Maintaining a database index incurs high latency because the DBMS has to find and update those index pages affected by the underlying table changes. This paper proposes Hippo a fast, yet scalable, database indexing approach. Hippo only stores the pointers of disk pages along with light weight histogram-based summaries. The proposed structure significantly shrinks index storage and maintenance overhead without compromising much on query execution performance. Experiments, based on real Hippo implementation inside PostgreSQL 9.5, using the TPC-H benchmark show that Hippo achieves up to two orders of magnitude less storage space and up to three orders of magnitude less maintenance overhead than traditional database indexes, i.e., B+-Tree. Furthermore, the experiments also show that Hippo achieves comparable query execution performance to that of the B+-Tree for various selectivity factors. : 12 pages, 10 figures, conference
format Report
author Yu, Jia
Sarwat, Mohamed
author_facet Yu, Jia
Sarwat, Mohamed
author_sort Yu, Jia
title Hippo: A Fast, yet Scalable, Database Indexing Approach
title_short Hippo: A Fast, yet Scalable, Database Indexing Approach
title_full Hippo: A Fast, yet Scalable, Database Indexing Approach
title_fullStr Hippo: A Fast, yet Scalable, Database Indexing Approach
title_full_unstemmed Hippo: A Fast, yet Scalable, Database Indexing Approach
title_sort hippo: a fast, yet scalable, database indexing approach
publisher arXiv
publishDate 2016
url https://dx.doi.org/10.48550/arxiv.1604.03234
https://arxiv.org/abs/1604.03234
genre The Pointers
genre_facet The Pointers
op_rights arXiv.org perpetual, non-exclusive license
http://arxiv.org/licenses/nonexclusive-distrib/1.0/
op_doi https://doi.org/10.48550/arxiv.1604.03234
_version_ 1766216889565970432