Wikibooks: Data Management in Bioinformatics/Data Querying

Data Management in Bioinformatics/ChapNav cleanup = Data Querying = = Query Languages = SQL (Structured Query Language ). SQL is a Data Definition and Manipulation Language (both DDL and DML) RA (Relational Algebra) Datalog = Creating Tables = Command CREATE TABLE For the relations Gene( gid name an...

Full description

Bibliographic Details
Format: Book
Language:English
Subjects:
DML
Online Access:https://en.wikibooks.org/wiki/Data_Management_in_Bioinformatics/Data_Querying
Description
Summary:Data Management in Bioinformatics/ChapNav cleanup = Data Querying = = Query Languages = SQL (Structured Query Language ). SQL is a Data Definition and Manipulation Language (both DDL and DML) RA (Relational Algebra) Datalog = Creating Tables = Command CREATE TABLE For the relations Gene( gid name annotation) Experiment( eid name date) Microarray( gid eid exprlevel) Tables can be created using the following set of commands in SQL CREATE TABLE Gene( gid INTEGER name VARCHAR(20) annotation VARCHAR(50) PRIMARY KEY (gid)) CREATE TABLE Experiment(eid INTEGER name VARCHAR(10) date DATE PRIMARY KEY (eid)) CREATE TABLE Microarray (a INTEGER b INTEGER exprlevel REAL PRIMARY KEY (gid eid)) A primary key is used to uniquely identify each row in a table. It can either be part of the actual record itself or it can be an artificial field. SQL will automatically index the table based on the primary key. Indexing with other keys (other than the primary key) is also possible it can be done by using the CREATE INDEX command. Note that the primary key for the Microarray table is composed of keys from the other two tables i.e. Gene and Experiment. These fields are called Foreign Keys. It is important to reference where the table gets its primary key from. The syntax is as follows. CREATE TABLE Microarray (a INTEGER b INTEGER exprlevel REAL PRIMARY KEY (gid eid) FOREIGN KEY (a) REFERENCES Gene (gid) FOREIGN KEY (b) Experiment (eid) ) These types of constraints are often called Referential Integrity Constraints. = Storing Data = Command INSERT INTO Example INSERT INTO Gene VALUES(1 1433E enzyme binding ) Most database management systems have the capability to load bulk data at once. = Querying = = Projection = To display all the data stored in the Gene table SQL SELECT FROM Gene RA Gene Datalog Gene(x y z) To display the names of all genes in the Gene table SQL SELECT name FROM Gene RA \pi {name}(Gene) Datalog Answer(x) \leftarrow Gene(y x z) Note to list out only distinct values use SELECT DISTINCT in SQL. RA and Datalog return ...