Friday, October 30, 2009

Fixing Constraint Violations with spin:fix

Constraint checking is a popular feature of many Semantic Web tools to ensure that instance data meets the design objectives attached to classes and properties in an ontology. Data entry tools like TopBraid Composer and Ensemble use the SPARQL-based constraint checking language SPIN to warn users when the data they are entering violates constraints. If a violation is reported, the user reads the violation message and then changes the data on the form.

One of the new features of SPIN 1.1 is the property spin:fix, which can be used to let the system suggest operations that would repair a constraint violation automatically. The basic idea is that spin:fix can be attached to the spin:ConstraintViolation produced by a spin:constraint query to link the violation with one or more SPARQL Update requests. Those SPARQL Update requests may INSERT or DELETE triples to create a state in which the constraint is no longer violated. If fixes are attached via spin:fix, the user interface may offer them to the user so that each one can be applied with a single click. In the TopBraid Composer 3.2 screenshot below, the resource InvalidSquare violates the constraint that instances of Square must have equal width and height:

The following spin:constraint (attached to the Square class) implements the suggestions above:

You can see that the constraint creates a spin:ConstraintViolation that is attached to two instances of :SetObject via spin:fix. The class :SetObject is a SPIN template with the following definition:

The SPIN templates suggested as fixes could come from a library of re-usable building blocks. I assume that most constraint violations can be repaired by replacing, adding or deleting certain triples, and these cases can be generalized easily. Furthermore, the spin:constraints themselves can be generalized into templates, so that the ontology designer in the end just needs to pick the correct template to get a really powerful and convenient constraint checking mechanism.
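
To make the idea of a violation carrying its own repair suggestions more concrete, here is a minimal Python sketch. It is only an analogy and not SPIN syntax: the graph is a plain set of triples, the SetObject class is a hypothetical stand-in for the :SetObject template (its real definition is not reproduced here), and the width and height values are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SetObject:
    """Hypothetical fix: replace all values of (subject, predicate) with new_object."""
    subject: str
    predicate: str
    new_object: object

    def apply(self, graph: set) -> None:
        # Equivalent in spirit to a DELETE of the old values followed by an INSERT.
        old = {t for t in graph if t[0] == self.subject and t[1] == self.predicate}
        graph.difference_update(old)
        graph.add((self.subject, self.predicate, self.new_object))


@dataclass
class ConstraintViolation:
    root: str
    message: str
    fixes: list = field(default_factory=list)  # plays the role of spin:fix


def value(graph, subject, predicate):
    """Return one value of subject/predicate, or None if there is none."""
    for s, p, o in graph:
        if s == subject and p == predicate:
            return o
    return None


def check_square(graph, instance):
    """Return a violation with two suggested fixes if width != height, else None."""
    w = value(graph, instance, "width")
    h = value(graph, instance, "height")
    if w is None or h is None or w == h:
        return None
    return ConstraintViolation(
        root=instance,
        message="Width and height of a Square must be equal",
        fixes=[SetObject(instance, "width", h), SetObject(instance, "height", w)],
    )


# Illustrative data only; the numbers are made up.
graph = {("InvalidSquare", "width", 10), ("InvalidSquare", "height", 20)}
violation = check_square(graph, "InvalidSquare")
violation.fixes[0].apply(graph)  # the user picks the first suggestion
```

After the chosen fix has been applied, running check_square again on the same resource returns None. Because the fix is a small data object rather than hard-coded logic, the same class could be reused by many different constraints, which is the point of keeping such fixes in a shared library.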

Also note that the constraint fixing mechanism above could be useful beyond user interfaces, for example to automatically repair incoming data streams in web service calls.

Tuesday, October 27, 2009

OWL 2 Support in TopBraid Composer

Following today's announcement that OWL 2 has become an official W3C recommendation, I am pleased to announce that TopBraid Composer 3.2 (to be published by the end of this week) has comprehensive OWL 2 support. Here is a sampler of some of the new capabilities.

Property Chain Axioms can be used to define relationships between multiple properties, for example to define that an uncle is the brother of a parent. In OWL 2 mode, TopBraid's Properties form contains a new widget for entering such property chains using owl:propertyChainAxiom. Another new kind of property axiom, owl:propertyDisjointWith, can be edited on the same page.
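
For readers who have not met property chains before, the following Python sketch shows what the uncle example entails. It is not how TopBraid or any OWL reasoner is implemented; the function name apply_property_chain, the property names hasParent, hasBrother and hasUncle, and the sample individuals are all assumptions made for illustration.

```python
def apply_property_chain(triples, chain, inferred_property):
    """Return the triples entailed by the chain p1 o p2 o ... -> inferred_property."""
    # Start with every (x, y) pair linked by the first property of the chain.
    pairs = {(s, o) for s, p, o in triples if p == chain[0]}
    # Extend each pair one property at a time by joining on the middle node.
    for prop in chain[1:]:
        step = {(s, o) for s, p, o in triples if p == prop}
        pairs = {(x, z) for x, y in pairs for y2, z in step if y == y2}
    return {(x, inferred_property, z) for x, z in pairs}


# Made-up family data.
family = {
    ("alice", "hasParent", "bob"),
    ("bob", "hasBrother", "carl"),
    ("alice", "hasParent", "dana"),
}
uncles = apply_property_chain(family, ["hasParent", "hasBrother"], "hasUncle")
```

Here uncles contains the single triple stating that alice hasUncle carl. The sketch makes one pass over the asserted triples only; a real inference engine would also have to handle chains that depend on previously inferred triples.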

User-Defined Datatypes are a mechanism for narrowing down datatype properties to specific value ranges, such as integer > 0. In typical cases, such datatypes are entered as allValuesFrom restrictions on the class form. We use the Manchester Syntax for that purpose:

In addition to class axioms, user-defined datatypes can also be used as global rdfs:ranges:

OWL 2 Class Axioms including qualified cardinality restrictions and all other features supported by the Manchester Syntax can be entered on the class forms:

All other OWL 2 extensions such as new property meta-classes, keys and syntactic sugar can be edited through the generic RDF editing capabilities of TopBraid - the extended OWL 2 system vocabulary has been very helpful for this. Of course, TopBraid Composer can load and save any OWL 2 file in formats such as RDF/XML or Turtle.

At the time of writing this, I am not aware of any OWL 2 compliant inference engines that we could freely distribute with TopBraid Composer. Currently available options include OWL RL engines such as SPIN or Oracle 11g RDF. I am sure more will follow, and anyone in the community is invited to contribute plug-ins, as separate downloads, for those inference engines that we cannot legally ship with our platform.

The new OWL 2 support is available in all editions of TopBraid Composer, including the Free Edition.

Monday, October 19, 2009

RDFex: Partial ontology imports

One of the overall design goals of Linked Data and the Semantic Web is vocabulary re-use. Instead of having thousands of "Person" classes, new ontologies should attempt to re-use existing Person definitions, such as those found in the FOAF namespace. This schema re-use makes it easier for Semantic Web agents to link data together, and potentially reduces the maintenance costs as it becomes possible to benefit from the whole infrastructure and community around those shared ontologies.

However, there are a couple of well-known reasons why such a re-use is not always feasible or desirable, leading to situations in which developers feel they need to reinvent the wheel. One particular problem is that the OWL construct for linking vocabularies (owl:imports) has all-or-nothing semantics: if my ontology owl:imports the FOAF namespace, then I would suck all definitions of FOAF into my own model, even though I just care about one or two concepts. The result is that in Semantic Web inference engines, browsers or editors, my ontology will be full of definitions that are just distracting, or unnecessarily increase the complexity of my model. Since owl:imports is not the ideal mechanism, people sometimes simply extract term definitions from other files and paste them into their own files - look for example at the bottom of this file. This, of course, leads to other maintenance problems and is generally not a clean approach.

For an internal project in which I wanted to re-use parts of the SIOC namespace, I have implemented a new web service called RDFex. RDFex is a very simple, yet IMHO elegant, mechanism for using owl:imports to import snippets of other namespaces without having to copy and paste them. The basic idea is that the server can be used as a proxy for various popular ontologies (including DC, FOAF and SIOC), so that users can specify which classes, properties and individuals from those namespaces they would like to import (using owl:imports). For example, the proxy ontology,firstName represents all triples defining the class foaf:Person and the property foaf:firstName, including their rdf:type, rdfs:labels, rdfs:comments and any relationships between those terms (such as rdfs:domain and rdfs:range). Any combination of those resources is available because the result will be dynamically assembled at request time.
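
The extraction described above can be sketched in a few lines of Python. This is my reading of the description, not the actual RDFex implementation: the function name extract_terms is hypothetical, the selection rule (keep rdf:type, labels and comments of the selected terms, plus any triple that links two selected terms) is an assumption, and the sample triples are a toy vocabulary rather than a faithful copy of FOAF.

```python
RDF_TYPE = "rdf:type"
ANNOTATIONS = {"rdfs:label", "rdfs:comment"}


def extract_terms(triples, selected):
    """Return the subset of triples that defines the selected terms.

    Assumed rule: a triple is kept if its subject is a selected term and it is
    either a type/label/comment statement or its object is also selected.
    """
    selected = set(selected)
    result = set()
    for s, p, o in triples:
        if s not in selected:
            continue
        if p == RDF_TYPE or p in ANNOTATIONS or o in selected:
            result.add((s, p, o))
    return result


# Toy vocabulary for illustration only.
vocabulary = {
    ("foaf:Person", "rdf:type", "owl:Class"),
    ("foaf:Person", "rdfs:label", "Person"),
    ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
    ("foaf:firstName", "rdf:type", "owl:DatatypeProperty"),
    ("foaf:firstName", "rdfs:domain", "foaf:Person"),
    ("foaf:knows", "rdfs:domain", "foaf:Person"),
}
snippet = extract_terms(vocabulary, ["foaf:Person", "foaf:firstName"])
```

With this rule, the snippet keeps the domain statement that links foaf:firstName to foaf:Person but drops everything about foaf:knows and the reference to the unselected foaf:Agent. Whether the real service treats references to unselected terms the same way is not something this sketch can answer.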

The upcoming release of the TopBraid platform 3.2 also has native support for those RDFex imports, so that the system can do this extraction from local copies instead of having to go to the web. TopQuadrant is committed to supporting this service in the future, so please feel free to use it if you find it useful.