Friday, August 21, 2009

Ontology Mapping with SPIN Templates

The question of how to transform data from one ontology to another comes up again and again, most recently in a question on the W3C Semantic Web list. The requirement is very real: for example, assume you have a class Person (with firstName/lastName) and a class Member (with fullName), and you want to construct one Member for each Person, so that the fullName is derived by concatenating firstName + " " + lastName. So basically you want to transform some (legacy) data into a format that some other application can understand.

Ideally, there should be a reusable standard mapping ontology for this purpose, which is also executable and user-friendly in visual editing tools. I am not aware of such a standard ontology, but I know how it could be built. Clearly, the typical complexity of such mapping tasks goes beyond what is provided by modeling languages like OWL. A graph matching language like SPARQL with rich built-in functions will be better suited. SPARQL CONSTRUCT queries can be used to define such mappings, as described on this blog three years ago.

The SPARQL Inferencing Notation (SPIN) provides a framework for organizing such SPARQL CONSTRUCT queries in a way that is easy to maintain and efficient to execute. In the following example I will walk through the steps needed to create a generic mapping ontology for tasks such as the one above, using SPIN Templates. The example is intentionally kept very simple. The resulting file can be downloaded here and you can use TopBraid Composer (even the Free Edition) to execute it.

Let's assume we have two ontologies, person and member, with the following classes:

An example instance of the source ontology may look like the following, with values for firstName and lastName filled in:

SPARQL can be used to create a mapping so that all instances of Person become Members, with a fullName derived from firstName and lastName. We would need two CONSTRUCT queries: one that adds the rdf:type triple to make the Persons also Members, and one that concatenates the firstName and lastName values into the fullName. You could attach those CONSTRUCT queries as SPIN rules to the classes as shown below. Note that the variable ?this is bound to each instance of the class the rule is attached to, so here it means "for every instance of the class Person".
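To make the effect of the two rules concrete outside of any tool, here is a minimal sketch in plain Python. It is not SPIN and not SPARQL: it just models a graph as a set of (subject, predicate, object) tuples and applies both rules to every instance of Person. The prefixes person: and member:, the instance person:JohnDoe, its literal values and all function names are assumptions made up for this illustration.

```python
# Illustrative sketch only: plain Python, not the SPIN rule engine.
# A graph is a set of (subject, predicate, object) tuples.
# Prefixes, the instance name and the literal values are hypothetical.
RDF_TYPE = "rdf:type"

graph = {
    ("person:JohnDoe", RDF_TYPE, "person:Person"),
    ("person:JohnDoe", "person:firstName", "John"),
    ("person:JohnDoe", "person:lastName", "Doe"),
}


def instances_of(graph, cls):
    """Return every subject that has rdf:type cls (the bindings of ?this)."""
    return sorted(s for (s, p, o) in graph if p == RDF_TYPE and o == cls)


def rule_make_member(graph, this):
    """Rule 1: every Person is also a Member."""
    return {(this, RDF_TYPE, "member:Member")}


def rule_full_name(graph, this):
    """Rule 2: fullName is firstName + " " + lastName."""
    firsts = [o for (s, p, o) in graph if s == this and p == "person:firstName"]
    lasts = [o for (s, p, o) in graph if s == this and p == "person:lastName"]
    return {(this, "member:fullName", f"{f} {l}") for f in firsts for l in lasts}


def run_rules(graph, cls, rules):
    """Apply each rule once per instance of cls; return only the new triples."""
    inferred = set()
    for this in instances_of(graph, cls):
        for rule in rules:
            inferred |= rule(graph, this)
    return inferred - graph


inferred = run_rules(graph, "person:Person", [rule_make_member, rule_full_name])
for triple in sorted(inferred):
    print(triple)
```

Each rule only looks at the triples of the current ?this binding and returns new triples, which is roughly the role a CONSTRUCT query plays when it is attached to a class.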

Attaching the CONSTRUCT queries as SPIN rules works fine: we can press the inferences button to run the SPIN rule engine, and it will create the new RDF triples:

We can see that the Person is now also a Member with a fullName:

However, the solution above requires that the person creating the mapping is familiar with SPARQL. Additionally, the transformations cannot easily be reused: similar SPARQL queries need to be entered again the next time a string concatenation is required.

SPIN Templates can be used to encapsulate SPARQL queries so that they can be reused and edited easily. In the screenshot below I have replaced the hard-coded SPARQL queries with two SPIN template calls, which do exactly the same thing but in a much nicer way:

Another way of visualizing these is using TopBraid's Diagram facility:


Let's look behind the scenes. The two entries under the spin:rule property are now SPIN Template Calls. A Template Call is an instance of a SPIN Template, but with arguments filled in. Here is the definition of the first SPIN Template, the concatenation rule:

The SPIN Template above is wrapping a SPARQL CONSTRUCT query (under spin:body). Templates can take arguments (under spin:constraint), which define how the template can be invoked. The values of the arguments will be "inserted" as variable bindings into the SPARQL query. In the example above, there are three arguments (sourceProperty1, sourceProperty2 and targetProperty) which are referenced in the body query as variables ?sourceProperty1 etc. In order to use such a template, the user simply needs to select the source class, go to "Create from SPIN Template..." under spin:rule, and fill in the arguments, as shown below.

The resulting Template Call will be associated with the class Person as a spin:rule, so that the SPIN (mapping) engine will infer the same new triples. The main achievement though is that the string concatenation module has now been generalized and could be reused in other ontologies. Since SPIN Templates are represented entirely in RDF, they can be shared on the web. Creating a library of such mapping modules would be a great topic for a Master's Thesis...
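The idea of a template whose arguments are bound into a rule body can also be sketched in plain Python. Again, this is only an illustration and not how SPIN is implemented: the template is modeled as a function that takes the three arguments (sourceProperty1, sourceProperty2, targetProperty) and returns a rule, which plays the role of a Template Call. The org: ontology, the instance names and the helper functions are hypothetical and only there to show the same template being reused with different arguments.

```python
# Illustrative sketch only: plain Python, not the SPIN API.
# A graph is a set of (subject, predicate, object) tuples.
RDF_TYPE = "rdf:type"


def concat_template(source_property1, source_property2, target_property):
    """Template: return a rule with the three arguments already bound,
    roughly what a Template Call is."""

    def rule(graph, this):
        firsts = [o for (s, p, o) in graph if s == this and p == source_property1]
        seconds = [o for (s, p, o) in graph if s == this and p == source_property2]
        return {(this, target_property, f"{a} {b}") for a in firsts for b in seconds}

    return rule


def apply_rule(graph, cls, rule):
    """Run the rule with ?this bound to every instance of cls."""
    inferred = set()
    for (s, p, o) in graph:
        if p == RDF_TYPE and o == cls:
            inferred |= rule(graph, s)
    return inferred


graph = {
    ("person:JohnDoe", RDF_TYPE, "person:Person"),
    ("person:JohnDoe", "person:firstName", "John"),
    ("person:JohnDoe", "person:lastName", "Doe"),
    # Hypothetical second ontology, only here to show reuse.
    ("org:JaneRoe", RDF_TYPE, "org:Employee"),
    ("org:JaneRoe", "org:givenName", "Jane"),
    ("org:JaneRoe", "org:surname", "Roe"),
}

# Two "template calls": same template, different arguments.
person_call = concat_template("person:firstName", "person:lastName", "member:fullName")
employee_call = concat_template("org:givenName", "org:surname", "org:displayName")

person_triples = apply_rule(graph, "person:Person", person_call)
employee_triples = apply_rule(graph, "org:Employee", employee_call)
print(person_triples)
print(employee_triples)
```

The person filling in the arguments never touches the body of the rule, which is the point of the template approach: the concatenation logic is written once and the mapping author only picks properties.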
