Friday, August 11, 2006

Update: Automated Database import into RDF/OWL

A few days ago, I reported on the integration of Chris Bizer's D2RQ library into the ontology development toolkit TopBraid Composer. Encouraged by a lot of positive feedback, we have significantly improved this integration for version 1.1.6. The round-tripping between relational databases and OWL ontologies is now better automated, and database browsing performance has vastly improved.

Here is how to import any relational database into your OWL/RDF model. Install and start TopBraid Composer (TBC), then open the D2RQ import wizard.

Specify your DB connection settings, enter a file name and namespaces, and press Finish. This launches the new mapping generator in the background, a Java engine that analyzes the database metadata to find suitable mappings between the database tables and OWL classes, properties and individuals. When it is done, the wizard reports the result.

You can now open the test file to browse your database, as if it were an OWL model - all conversions of database rows into individuals are done on the fly.
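To give a rough idea of what a metadata-driven mapping generator with on-the-fly conversion does, here is a minimal Python sketch. It is not the D2RQ or TopBraid implementation: it assumes a SQLite database, a made-up namespace (BASE), a simple one-class-per-table rule, and hypothetical helper names (generate_mapping, individuals). It also assumes every table has a primary key.

```python
import sqlite3

BASE = "http://example.org/db#"  # hypothetical namespace, not from the post


def generate_mapping(conn):
    """Read table metadata and map each table to a class, each column to a property."""
    mapping = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        mapping[table] = {
            "class": BASE + table,
            "key": next((c[1] for c in info if c[5]), None),  # primary key column
            "properties": {c[1]: f"{BASE}{table}_{c[1]}" for c in info},
        }
    return mapping


def individuals(conn, mapping, table):
    """Lazily yield (uri, class, properties) for each row; nothing is copied up front."""
    m = mapping[table]
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    names = [d[0] for d in cursor.description]
    for row in cursor:
        values = dict(zip(names, row))
        uri = f"{BASE}{table}/{values[m['key']]}"
        yield uri, m["class"], {m["properties"][k]: v for k, v in values.items()}


# Small demo database (invented for illustration)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
INSERT INTO person VALUES (1, 'Ada', 'Lovelace');
""")
mapping = generate_mapping(conn)
first = next(individuals(conn, mapping, "person"))
print(first[0])
```

Because individuals is a generator, rows are only read from the database when something actually asks for them, which is the general idea behind browsing a database as if it were an OWL model.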

You can then also look at the class structures that have been generated from the database. In my first design, there was exactly one OWL class for each database table.

In a sense this means that TopBraid can be used as a database browser. The current version applies more intelligence: link tables are converted into a pair of inverse object properties.
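The link-table conversion can be sketched roughly as follows. This is only an illustration under my own assumptions, not the rule the mapping generator actually applies: a table counts as a link table here if it has exactly two foreign keys and no other columns. The helper names (find_link_tables, inverse_property_pair) and the demo schema are made up.

```python
import sqlite3


def find_link_tables(conn):
    """Return {link_table: (table_a, table_b)} for pure many-to-many tables."""
    result = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
    for table in tables:
        # Column names in declaration order
        columns = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        target_by_column = {fk[3]: fk[2] for fk in fks}
        # Heuristic: two foreign keys, and every column is one of them
        if len(fks) == 2 and set(target_by_column) == set(columns):
            result[table] = tuple(target_by_column[c] for c in columns)
    return result


def inverse_property_pair(link_table, domain, range_):
    """Describe the two object properties that replace a link table."""
    forward = {"name": link_table, "domain": domain, "range": range_}
    backward = {"name": link_table + "_inverse", "domain": range_, "range": domain}
    return forward, backward


# Small demo schema (invented for illustration)
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE person_project (
    person_id INTEGER REFERENCES person(id),
    project_id INTEGER REFERENCES project(id)
);
""")
links = find_link_tables(conn)
forward, backward = inverse_property_pair("person_project", *links["person_project"])
print(forward, backward)
```

With this heuristic, person_project no longer becomes a class of its own. Instead it yields one property from person to project and an inverse one from project to person.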

You can now do whatever you want with this new OWL/RDF model, except for changing the database contents. You can, however, modify the automatically generated files if you don't like the long stupid default names, or if you want to create a person's URI by combining first name and last name. Thanks to TopBraid's global refactoring support, you can edit both the mapping file and the schema at the same time. You can also add description logics semantics to your classes, or add rules to the ontology, to perform more interesting tasks on your database than you could with conventional database technology.
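As an illustration of the URI idea, here is a small Python sketch that builds a person's URI from first and last name. The function name, the underscore separator and the example base URI are my own assumptions. In TopBraid you would express this in the mapping file rather than in code.

```python
from urllib.parse import quote


def person_uri(base, first_name, last_name):
    """Combine two name columns into one URI, escaping unsafe characters."""
    local_name = f"{first_name.strip()}_{last_name.strip()}"
    # safe="" percent-encodes spaces, slashes, apostrophes and similar characters
    return base + quote(local_name, safe="")


print(person_uri("http://example.org/people/", "Ada", "Lovelace"))
print(person_uri("http://example.org/people/", "Mary Ann", "O'Neil"))
```

Keep in mind that name-based URIs are only a good idea if the combination of first and last name is unique in your table; otherwise two rows would collapse into one individual.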

Finally, if you are happy with your new ontology, you can convert the whole thing into an OWL file or RDF Schema, using the export/merge/convert wizard.

Here you can, among other options, choose to stream the virtual instances from the database together with the classes from the schema into a single file.

We have tried this new feature on a couple of customer databases and it appears to work very smoothly and with high runtime performance. One of the D2RQ developers, Richard Cyganiak, even reports that the database performance often exceeds the speed of a Jena-based triple store! I am eager to receive more benchmarks from our customers, so that I can annoy our friends in Berlin with additional feature requests.

Since a lot of our users have some really good use cases for this feature, I decided to publish the new version only a few days after the previous build. The ability to visually link, query and perform reasoning over existing databases from within Eclipse will hopefully make it easier to develop semantic applications, and to perform Semantic Data Mining.


At 7:53 AM, Anonymous said...

Just wondering how you specify a local jar file for the Driver Jar URL? Being inside a firewall, I don't have access from Eclipse to any outside websites.


At 9:38 AM, Blogger Holger Knublauch said...

If you have a local copy of the driver's jar file, then you can point to it using a URL such as file:/C:/mypath/driver.jar

PS: if you have more questions please follow up on - thanks!

