Supporting Remote User Defined Functions in Heterogeneous Biological Databases


Bibliographic Details
Main Authors: Liangyou Chen, Hasan M. Jamil
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 2001
Subjects:
DML
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.1142
http://www.cs.msstate.edu/~jamil/my-pub-papers/chen-bibe-2001.ps.gz
Description
Summary: Similar to most scientific studies, biological analyses demand a great deal of computations and simulations involving sophisticated tools that are often found geographically distributed over the Internet. A worldwide effort in genomics research has resulted in a powerful collection of publicly available sequence analysis tools. These tools often require specialized local services and domain knowledge to function correctly, rendering them unlikely candidates for integration into remote database applications. Thus, integration of heterogeneous "functions" still remains an open problem. Providing a reasonable framework for seamless integration of these tools with database query engines will enable application developers to exploit and harness the power of these effective analysis tools. In this paper, we present an integration framework for such tools by enabling access to them in a user transparent way as part of database queries. In our system, such online tools are abstracted as remote user defined functions (RUDF). An extended SQL DDL language, called the Internet Function Definition Language (IFDL), is presented for the specification and definition of RUDFs. The interface between database system and the Internet is implemented using a layer based on a language called the Hyper Text Query Language (HTQL). The separation of IFDL, DDL, HTQL and SQL DML offers several optimization opportunities and makes it possible to develop an architecture for interoperability of heterogeneous databases with RUDFs in more simple and efficient ways.