Monday, March 16, 2009

Event-Condition-Action Rules with SPARQLMotion

Several customers have requested a mechanism to trigger certain actions in response to changes to an RDF source. This can be used, for example, to update entries in some database when new triples come in, or send a message to someone to notify her about changes, or simply to synchronize dependencies among multiple data spaces. For the new beta 2 release of TopBraid, I have added a mechanism that can execute arbitrary SPARQLMotion scripts in response to changes to a model.

Here is an example of how it works. Download 3.0 and drop this file into your workspace. The file contains a SPARQLMotion script. Its name ends with .sms.n3, which instructs TopBraid to load the file into memory when the system starts up. The script (shown below) contains a module of type sml:TrackChanges. Any script with such a module is executed whenever a change is made to the current RDF model in TopBraid.

To play with this script, open the kennedys.owl ontology from TopBraid Composer's standard examples folder. Now, change the firstName or lastName of any Person in the model: you will see that the (full) name of the person also changes. This change is performed through SPARQL DELETE and INSERT calls at the end of the script above. This is similar to the SPIN-based mechanism for updating display labels that I described earlier today, but the changes are asserted and not only inferred. 

The basic mechanism is as follows: The TrackChanges module produces instances of the class Change from a simple change ontology. Each Change points to reified triples (instances of rdf:Statement) via change:added or change:deleted. The modules that follow the TrackChanges module can now query the Change objects and decide what to do with them. In the example above, the script iterates over all subjects that have changed their firstName or lastName and then runs the aforementioned DELETE and INSERT calls. Any other module could be inserted instead, for example to send an email, write something to a log, or do whatever else SPARQLMotion offers.
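The processing that follows the TrackChanges module can be pictured in plain JavaScript. This is only a sketch under assumptions: the object shapes, the kennedys: property names as written here, and the helpers subjectsWithNameChanges and refreshNames are hypothetical stand-ins, not the SPARQLMotion or TopBraid API. Each change object holds reified statements under added and deleted, mirroring change:added and change:deleted.

```javascript
// Hypothetical in-memory stand-in for the Change instances produced by sml:TrackChanges.
// Each statement mirrors a reified rdf:Statement (subject, predicate, object).
const NAME_PARTS = new Set(["kennedys:firstName", "kennedys:lastName"]);

// Collect every subject for which a firstName or lastName triple was added or deleted.
function subjectsWithNameChanges(changes) {
  const subjects = new Set();
  for (const change of changes) {
    for (const st of [...change.added, ...change.deleted]) {
      if (NAME_PARTS.has(st.predicate)) subjects.add(st.subject);
    }
  }
  return [...subjects];
}

// Apply the equivalent of the DELETE and INSERT steps to a plain triple array.
// Returns a new array; the input array is left untouched.
function refreshNames(triples, subjects) {
  let result = triples;
  for (const s of subjects) {
    const valueOf = (p) => {
      const match = result.find((x) => x.subject === s && x.predicate === p);
      return match ? match.object : "";
    };
    const full = `${valueOf("kennedys:firstName")} ${valueOf("kennedys:lastName")}`.trim();
    // DELETE the old name, then INSERT the recomputed one.
    result = result.filter((x) => !(x.subject === s && x.predicate === "kennedys:name"));
    result.push({ subject: s, predicate: "kennedys:name", object: full });
  }
  return result;
}
```

In the real script the two steps are SPARQL DELETE and INSERT calls against the model; the array operations above only imitate their effect.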

For the next release I plan to complete the integration of this mechanism with TopBraid Ensemble. Since Ensemble applications store their internal state in RDF, this would make it possible to perform changes to the user interface (alerts, status changes, updating charts) as a result of changes to the database. This will make it possible to implement knowledge-based stock tickers, new forms of social networks, intelligent collaboration platforms, pro-active agents etc - all with a declarative engine and no low-level coding involved apart from SPARQL. Stay tuned...

Using JavaScript to Define SPARQL Functions

Many real-world SPARQL queries make heavy use of built-in functions for tasks such as string processing and mathematical calculations. SPARQL comes with a pre-defined set of such built-in functions. However, in practice, these built-in libraries are frequently extended to solve specific problems that have not been anticipated by the language designers. Such extensions are typically implemented natively for a specific SPARQL execution engine, for example in Java. Needless to say, this is not a solution in the spirit of the (Semantic) Web, because it leads to a Tower of Babel with all kinds of dialects and platform-specific extensions.

We have proposed SPIN Functions as one possible extension mechanism for SPARQL that allows anyone to derive new SPARQL functions by combining other SPARQL functions and query templates. In general, SPIN functions are Semantic Web resources that can be referenced by their URI to get a description of the function's arguments, return value and executable body. However, even this approach does not cover all possible use cases, because it is still limited to what the lower-level SPARQL operations and functions can express. Many problems can only be solved with a general-purpose programming language.

TopBraid 3.0.0 beta 2 now introduces an extension of the SPIN functions mechanism that can be used to define new SPARQL functions with the help of JavaScript. In a nutshell, a SPINx function is a SPIN function that points to a snippet of JavaScript, or a JavaScript file, which will be executed whenever the function is run. The arguments of the SPARQL function call are made available as arguments to the corresponding JavaScript function. Here is a simple example:
ex:square
      a spin:Function ;
      rdfs:subClassOf spin:Functions ;
      spin:constraint
              [ a spl:Argument ;
                rdfs:comment "The value to compute the square of" ;
                spl:predicate sp:arg1 ;
                spl:valueType xsd:float
              ] ;
      spin:returnType xsd:float ;
      spinx:javaScriptCode "return arg1 * arg1;" .
The function above can be called, for example, as LET (?sq := ex:square(4)), to calculate the square of the argument. As an alternative to inline code via the spinx:javaScriptCode property, there is an option to simply link to a .js file that contains a function with the same name.
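For the linked-file variant, the .js file would presumably hold an ordinary function whose name matches the local name of the SPIN function. A minimal sketch, assuming the argument arrives as a JavaScript number (how TopBraid converts RDF literals into JavaScript values is not covered here):

```javascript
// Hypothetical contents of a .js file linked from ex:square.
// The function name matches the SPIN function's local name, and arg1
// corresponds to the sp:arg1 argument declared on that function.
function square(arg1) {
  return arg1 * arg1;
}
```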

This simple approach (a variation of which had been proposed by Gregory Williams) greatly extends the expressive power of SPARQL through a relatively platform-independent mechanism: JavaScript interpreters such as Mozilla Rhino are widely available on all major platforms. We have selected JavaScript for three major reasons: First, JavaScript (standardized as ECMAScript) is a well-established interpreted language that many users are already familiar with. Second, JavaScript is web-friendly: you can reference scripts via URLs and scripts may import other scripts from the web as well. Finally, JavaScript has an attractive security model that will make it more difficult to create malicious SPARQL extensions (of course, details would need to be fleshed out).

The expressivity of JavaScript is great for many problems that are beyond the capabilities of SPARQL and its built-in features. You can freely express if-then-else conditions, loops, sub-procedure calls etc, and thus gradually build up your own function libraries in a portable fashion. A limitation right now is that the JavaScript-based SPARQL functions (in TopBraid) do not yet have any mechanism to access the current RDF graph at execution time. This would be needed to implement things like walking an rdf:List or computing average values; in short, any use case that requires more background knowledge than what has been explicitly passed into the function as arguments. However, this limitation can be fixed, for example by defining a small collection of built-in callback functions such as find(S,P,O) in JavaScript. There is a large body of related work on JavaScript RDF APIs that may also be leveraged for that purpose.
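To make the find(S,P,O) idea more concrete, here is a sketch of how such a callback might be used to walk an rdf:List and compute an average. Everything in it is an assumption for illustration: the callback does not exist in TopBraid yet, its signature (null as wildcard, triples returned as arrays) is invented here, and the toy graph array stands in for the current RDF model.

```javascript
// Toy in-memory graph; in a real engine the triples would come from the current RDF model.
const graph = [
  ["ex:list1", "rdf:first", 10],
  ["ex:list1", "rdf:rest", "ex:list2"],
  ["ex:list2", "rdf:first", 20],
  ["ex:list2", "rdf:rest", "rdf:nil"],
];

// Hypothetical callback: null acts as a wildcard for S, P or O.
function find(s, p, o) {
  return graph.filter(
    ([ts, tp, to]) =>
      (s === null || ts === s) && (p === null || tp === p) && (o === null || to === o)
  );
}

// Walk an rdf:List from its head node and collect the rdf:first values.
function listMembers(head) {
  const members = [];
  const seen = new Set(); // guards against malformed, cyclic lists
  let node = head;
  while (node !== "rdf:nil" && !seen.has(node)) {
    seen.add(node);
    const first = find(node, "rdf:first", null);
    if (first.length === 0) break; // no value on this node: stop
    members.push(first[0][2]);
    const rest = find(node, "rdf:rest", null);
    if (rest.length === 0) break; // list is not terminated: stop
    node = rest[0][2];
  }
  return members;
}

// Average of the numeric members, or null for an empty list.
function listAverage(head) {
  const values = listMembers(head);
  if (values.length === 0) return null;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}
```

A single pattern-matching callback is enough for both use cases mentioned above, which is one argument for keeping the set of built-ins small.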

Generalizing Graphs into SPARQL Queries

The new TopBraid Composer 3.0 beta 2 now includes a simple visual SPARQL query editor, suggested by my colleague Dean Allemang. This query editor is a feature of Composer's graph browser. Dean found that he uses the graph browser very frequently to explore complex relationships between RDF resources, and asked me: wouldn't it be cool if I could take a snapshot of this particular graph and generalize it so that I can find all sub-graphs of the model with similar structure? Take the example below:

Here, the user has started to explore the RDF graph at a specific instance of Person (Alina Mojica). Opening up the has gender relationship displays the link to the resource female. In the graph above, this link to female as well as the year of birth 1965 have been "fixed", while all other values remain variable. Pressing the star button above the graph now creates a SPARQL query:

Executing this SPARQL query (with another mouse click on the same button) gives us all female Persons that were also born in 1965. The same approach can be used to turn more complex graph patterns into templates - just fix those values that you want to keep and leave the rest as variables.
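The generalization step can be sketched in a few lines of JavaScript. This is an illustration only, not Composer's implementation: the generalize helper, the edge objects and the property names kennedys:gender, kennedys:birthYear and kennedys:name are assumptions, and the query that Composer actually generates may look different.

```javascript
// Turn the explored edges of a node into a SPARQL SELECT query.
// Edges marked as fixed keep their value; all others become variables.
function generalize(rootType, edges) {
  const lines = [`  ?person a ${rootType} .`];
  edges.forEach((edge, i) => {
    const object = edge.fixed ? edge.value : `?v${i}`;
    lines.push(`  ?person ${edge.property} ${object} .`);
  });
  return `SELECT ?person\nWHERE {\n${lines.join("\n")}\n}`;
}

// Example: gender and year of birth are fixed, the name stays variable.
const query = generalize("kennedys:Person", [
  { property: "kennedys:gender", value: "kennedys:female", fixed: true },
  { property: "kennedys:birthYear", value: "1965", fixed: true },
  { property: "kennedys:name", value: '"Alina Mojica"', fixed: false },
]);
```

Fixing more values narrows the pattern, while leaving more of them variable widens the set of matching sub-graphs.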

A very similar graphical SPARQL editor, executing in a rich internet application based on Flex, is now also part of the new TopBraid Ensemble.

Deriving human-readable labels with SPIN

I know, it took me ages to get there, but TopBraid Composer 3.0.0 beta 2 finally introduces the ability to switch the whole user interface to display human-readable labels instead of resource identifiers such as qnames. There is a button in the tool bar that instructs TopBraid to use any values of rdfs:label or sub-properties thereof, and otherwise fall back to qnames. For example, in the following screen shot, Alfred Tucker's wife is displayed as "Kym Smith" instead of "kennedys:KymSmith".

I believe the lack of this (trivial looking) feature has been a major disappointment for people coming from Protege or comparable tools :) Anyway, now that we have this feature in, let's do something cool with it. You may notice that the Kennedy ontology above contains three naming properties: first name, last name and name, where name can basically be derived by concatenating first and last names. In the kennedys.owl that is found in the examples folder of TBC, the name property has been asserted and manually maintained by the editors. A smarter way is to use a SPIN rule to do this maintenance work for us:

Here, the spin:rule gets the current values of firstName and lastName for the given instance of Person (?this), creates a full string (?name) using a SPARQL built-in function call, and then infers this ?name as value for the name property. Turn on incremental SPIN inferencing and the value will update after each change.

The beauty of this approach is that you can now use arbitrary SPARQL expressions to derive labels. And, as we all know, SPARQL is very expressive and becomes even more expressive every day...

Rapid Semantic Web Application Development with TopBraid Ensemble

TopQuadrant's product family, called TopBraid Suite, consists of the semantic modeling tool TopBraid Composer, a semantic server called TopBraid Live, and a client-side framework called TopBraid Ensemble. In the new Composer 3.0.0 beta 2 release we have finally aligned all these products into a single development and testing environment: When you download the Maestro edition of beta 2, you also get a personal edition of TopBraid Ensemble, ready to use without further download or installation needs!

Let me show you how this works in practice. Launch TBC-ME 3.0.0 beta 2, and open a web browser. Then go to the URL http://localhost:8083/tbl. The following screen will show up:

The TopBraid Ensemble team has put great effort into refactoring and generalizing the Ensemble framework, so that it is now a comprehensive development framework for dynamic business applications. Ensemble applications consist of configurable rich-client components such as trees, forms and graphs (all based on Adobe Flex). These components can be re-arranged and re-wired in many ways, to customize an application's appearance and behavior for specific needs.

A freshly installed Ensemble comes with a default application that contains a pre-defined configuration of all built-in components. You can select this default application and run it on any RDF/OWL model from your workspace. The following screen shows the default Ensemble application on the geotravel.owl example ontology:

Ensemble is entirely driven by RDF/OWL models. Not only are the user interface components such as trees and forms driven by the underlying ontology, but the state of the whole application is represented by a flexible RDF-based data model. For example, if the selection of the Tree component (on the left) changes, then some RDF triples in the application's data model will change. Other components can react to those changes and update their own internal state. If the selection of the tree changes, then the Results Grid (upper right) will display the instances of the selected class. This behavior is "soft-coded" though, and can be changed with a few mouse clicks. Below is a screenshot of a configuration dialog that can be used to customize and re-wire the components:

As shown above, it is possible to change certain properties for each component. For example, the root object and the transitive property of the tree component can be changed dynamically. Similarly, components can listen to events or publish events for other components to consume. This makes it very easy to create custom applications, for example, an application that displays a hierarchy of SKOS terms using the skos:broader property instead of a class tree. Needless to say, various user interface settings such as the choice of available buttons and style sheets are configurable as well. Any end-user with sufficient privileges can make those adjustments and then only needs to press the Save Application button to store this configuration and share it with her co-workers.
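The wiring between components can be pictured with a small publish/subscribe sketch. It is a deliberately simplified illustration and not the Ensemble API: the AppState class, the tree:selection key and the geotravel:City and geotravel:Country class names are all hypothetical, and a plain map stands in for the RDF-based application state.

```javascript
// Hypothetical application state: a property map stands in for the RDF-based data model.
class AppState {
  constructor() {
    this.values = new Map();
    this.listeners = new Map();
  }

  // Register a callback that fires whenever the given property changes.
  on(property, callback) {
    if (!this.listeners.has(property)) this.listeners.set(property, []);
    this.listeners.get(property).push(callback);
  }

  // Change a value and notify the components wired to that property.
  set(property, value) {
    this.values.set(property, value);
    for (const callback of this.listeners.get(property) || []) callback(value);
  }
}

const state = new AppState();
const resultsGrid = { shownClass: null };

// "Wire" the grid to the tree selection; re-wiring means registering a different callback.
state.on("tree:selection", (selectedClass) => {
  resultsGrid.shownClass = selectedClass;
});

// Simulate the user selecting a class in the tree.
state.set("tree:selection", "geotravel:City");
```

Because the grid only knows about the state change and not about the tree itself, either component can be swapped or re-configured without touching the other.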

Like the rest of the TopBraid Suite, Ensemble is currently in beta stage and is very much a work in progress. Now that the generic architecture and framework have been implemented, the team will focus on fleshing out the many details to make each component as flexible, powerful and visually appealing as possible. Expect significant improvements over the next few months; your feedback is appreciated in the meantime. Our customers are not limited to the built-in components, though. We provide a documented client-side API that allows anyone to add new Flex/ActionScript components. Furthermore, it is possible to drive parts of the application using server-side SPARQLMotion scripts.

Using Ensemble as the foundation for custom rich internet applications means that you can re-use large chunks of client-side components as well as server-side infrastructure. This infrastructure is well aligned with Composer and its features, such as the various database back-ends. Instead of re-inventing the wheel, you can focus on building the ontologies for your particular domain, and put them to work in your team.

Saturday, March 07, 2009

TopBraid SPIN API now Open Source

Since the release of SPIN a few weeks ago, we have already received several requests from people who would like to integrate SPIN functionality (SPARQL-based constraint checking, inferencing, user-defined functions) into their own applications. To encourage wider adoption of SPIN in the community, we have therefore decided to make the key features of our SPIN implementation open source.

The TopBraid SPIN API is built on Jena and provides the following features:

  • Converters between textual SPARQL syntax and the SPIN RDF Vocabulary
  • A SPIN-based constraint checking engine (via spin:constraint)
  • A SPIN-based inferencing engine (via spin:rule and spin:constructor)
  • Support to execute user-defined SPIN functions using Jena/ARQ
  • Support to execute user-defined SPIN templates

The license has been selected to allow open source projects (from universities etc) to use SPIN without further complications. For closed source users, we offer a commercial license that also provides business users assurance and support. With this policy we hope to encourage researchers to provide their (open source) implementations back to the community to help the SPIN community grow.

The SPIN API is currently in beta, and the official 1.0 release is scheduled in conjunction with TopBraid 3.0 in the next few weeks. We appreciate your feedback in the meantime.