Graph Databases


Bibliographic Details
Main Authors: Adrian Silvescu, Doina Caragea, Anna Atramentov
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Subjects:
DML
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.3805
http://www.cs.iastate.edu/~silvescu/papers/gdb/report.pdf
Description
Summary: Gathering huge amounts of complex information (data and knowledge) is very common nowadays. This creates a need to represent, store, and manipulate complex information (e.g., to detect correlations and patterns, discover explanations, and construct predictive models). Furthermore, because data is autonomously maintained, it can change over time or even change its underlying structure, making it difficult for representation systems to accommodate these changes. Current representation and storage systems are not very flexible in dealing with large changes, nor are they designed to perform complex data manipulations of the sort mentioned above. Data manipulation systems, on the other hand, cannot easily work with structural or relational data, only with flat data representations. We want to bridge the gap between the two by introducing a new type of database structure, called Graph Databases (GDB), based on a natural graph representation. Our Graph Databases can represent any kind of information as graphs, naturally accommodate changes in data, and make it easier for Machine Learning methods to use the stored information. In this paper we are mainly concerned with the design and implementation of GDB. We define the Data Definition Language (DDL), which contains extensional as well as intensional definitions, and the Data Manipulation Language (DML), which we use to pose queries. We then show conceptually how to perform updates, how to accommodate changes, and how to define new concepts or ask higher-order queries that are not easily answered in SQL. We also show how to transform a relational database into a GDB. Although we show how all of these things can be done using our graph databases, we do not implement all of them, since that would be a very laborious project that goes well beyond our class project goals. We do implement the graph databases, show how to query them, and show how to transform relational databases into graph databases in order to be able to reuse ...