Sunday, January 18, 2009

The Object-Oriented Semantic Web with SPIN

We have recently introduced the SPARQL Inferencing Notation (SPIN) as a mechanism that uses SPARQL to formalize the meaning and behavior of Semantic Web concepts. SPIN provides a light-weight RDF vocabulary that can be used to attach rules (SPARQL CONSTRUCT queries) and constraint checks (ASK queries) to class definitions.

The design of SPIN was driven by requirements that we (at TopQuadrant) have collected from customers from "real-world" semantic technology projects. For many of those customers, the current stack of Semantic Web languages is simply not enough. OWL has been designed for certain use cases, but it fails to meet other requirements due to its open-world assumption. As Stefan Decker points out in the entertaining "An OWL 2 Far?" panel at the 2008 ISWC conference, OWL cannot even be used to check whether an instance of a class meets the cardinality restrictions. In the same panel, Tim Finin argues that OWL does not need to be the only language on the Semantic Web stack, and that instead families of languages are needed depending on the use case. So while OWL is great for some modeling tasks, there is room (and demand) for other modeling paradigms.

In this article I will introduce SPIN from another angle to illustrate what makes it different, and why I believe that it provides good solutions to many real-world requirements. My main point is that SPIN is borrowing good practices from object-oriented programming and modeling languages and integrates object-oriented techniques with the flexible architecture of Semantic Web to produce a new way of working with linked data.

As a "Hello World" example let's start with a simple ontology about geometric shapes. It defines a class Rectangle with the following characteristics (screenshot from TopBraid Composer 3):

In this RDF/SPIN ontology, a Rectangle is a class that has three properties. The values of width and height are specified by the user, and a SPIN rule is used to compute the value of the area property by multiplying width and height. Let's have a look at the rule first. A SPIN rule is a SPARQL CONSTRUCT query that has been attached to a class using the spin:rule property. The CONSTRUCT query can use whatever feature of SPARQL it wants, for example to do mathematical calculations.

A key contribution of SPIN is to introduce a mechanism that allows users to organize those SPARQL queries in a natural, object-oriented way. SPIN rules are not just plain lists of rules like in comparable rule languages (SWRL etc). Instead, the "procedural attachment" of SPIN means that you can arrange the rules in the class hierarchy where they belong. This follows the OO principles of abstraction and encapsulation. Since the rules (and constraints) are attached to classes, any human or agent who looks at the ontology can quickly understand the meaning of the classes and properties. Furthermore, the rules are "scoped" so that tools are better guided when they need to execute the rules and constraints. For example, TopBraid Composer's SPIN engine comes with a mode in which it does incremental inferencing. This means that whenever someone changes the values of width or height, then the value of area will update automatically, as shown with the example instance below:

Now let's extend the example and introduce a sub-class of Rectangle called Square. Squares inherit all characteristics of Rectangles, but they are constrained in so far that width and height of a Square must be equal. This fact is expressed using a SPIN constraint as shown below:

The constraint is an ASK query that must evaluate to false for all instances of the associated class (Square). Again, any SPARQL graph matching pattern can be tested here, and the dedicated variable ?this is used to access the current instance as a starting point. Here is an example instance of the class Square:

You can see that editing tools can use the constraint definitions to verify user input, and to provide warnings (yellow markers) if values violate a constraint. Since the SPIN constraints are scoped to the class, these tests can be performed very efficiently, only on the instance that the user is currently looking at. In TopBraid Composer, constraints can be incrementally tested after each editing step.

The image above also illustrates another object-oriented aspect of SPIN, namely inheritance: The class Square is a sub-class of Rectangle and thus also inherits the rule that calculates the area from width and height. The same inheritance mechanism applies to constraints, and sub-classes can further constrain or specialize the meaning of its parents.

Now let's look at the meta-modeling capabilities of SPIN, and its closed-world semantics. Previous articles on this blog introduced SPIN Functions and SPIN Templates. These are really powerful mechanisms that allow anyone to create their own modeling vocabulary. SPIN Functions are SPARQL functions that can be used in FILTER or LET statements. SPIN Templates are re-usable SPARQL queries that can be instantiated with parameters. In particular you can use pre-defined templates instead of typing in SPARQL queries by hand. SPIN comes with a library of such pre-defined functions and templates to support common modeling patterns. One of the templates from this library, called spl:Attribute, can be used to link a class with a property, specifying min/max cardinality, value type and default value all in a single place. Here is an example attribute definition from the Rectangle class:

This attribute definition is an instance of the spl:Attribute template, attached to Rectangle via spin:constraint. Its equivalent in OWL would be a collection of three owl:Restrictions attached to the class via rdfs:subClassOf. However, if we look at the definition of the spl:Attribute template, we can see that the semantics are different:

The query above is the body of the spl:Attribute template. When used as a spin:constraint, the system will report constraint violations whenever an instance is found that violates one of the three conditions specified in the query. Among then, is the test for the minimum and maximum number of values: if a Rectangle has less than one or more than one value for width or height, then the system will report an error. If a SPIN template is found, the engine will substitute the occurrences of the argument variables (such as ?predicate and ?minCount) with the provided arguments. In other words, instead of using the template spl:Attribute, the class Rectangle could also have the following constraint, encoded directly as a SPARQL ASK query: 

However, such SPARQL queries are complicating and hard to maintain, so that wrapping those queries into re-usable SPIN templates is a far better approach. Again, the principle of object-oriented encapsulation is put to use in SPIN. The user of a template does not need to understand all the low-level details of the underlying SPARQL. Similarly, a user does not need to understand the detailed definitions of SPIN functions (such as spl:objectCount and spl:instanceOf shown above), but instead he or she can simply pass in the arguments and leave the details to the engine. Using the SPIN templates mechanism, even the source code becomes easy to read:

I have uploaded the complete example source code of the spinsquare.n3 file. You will notice that SPIN uses a triple-based RDF representation of SPARQL to overcome the limitations of a pure textual syntax.

To summarize, SPIN applies concepts from object-oriented languages to the Semantic Web. The meaning and behavior of classes can be formalized in a machine-executable way using rules and constraints, and those rules and constraints are scoped locally to aid the execution engine in achieving best performance. Furthermore, SPIN can be used to define new modeling vocabularies, such as attribute definitions with closed-world semantics. If other semantics are needed, then they can be represented as well, as I have shown in a previous posting on expressing OWL 2 RL in SPIN.

The expressive power of SPARQL and its wide adoption in industry means that languages like SPIN could go a long way to extend the reach of Semantic Web technology into areas where other languages fail. SPIN combines a powerful rule language with meta-modeling mechanisms including templates and functions. Instead of limiting ourselves to a single modeling paradigm or language, SPIN makes it possible to create task-specific new languages, or domain-specific services such as unit conversion. All those SPIN-based languages are completely self-describing and thus only require a single unified execution engine. Thinking about this on the Semantic Web scale means that linked data can be published and used in much more powerful ways than currently envisioned.


Post a Comment

<< Home