Wednesday, March 31, 2010

UISPIN: Creating HTML and SVG Documents with SPARQL

UISPIN is a new framework that can be used to link RDF resources with user interface descriptions that can be rendered as HTML or Scalable Vector Graphics (SVG) pages. Based on SPIN, UISPIN makes it easy to embed SPARQL expressions into XML snippets, so that the content of those snippets can be dynamically assembled into complete XHTML documents and SVG graphics. Here is an example of a typical UISPIN snippet (for rendering SKOS concepts):

The UISPIN snippet above would be rendered in HTML as follows:

UISPIN is comparable to template languages such as JSP and PHP, but is natively optimized for the Semantic Web and Linked Data. Among other things, UISPIN makes it possible to define new XML tags that encapsulate reusable (HTML/SVG) building blocks, and to share those building blocks on the Web in RDF.
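To make the templating idea a bit more concrete, here is a rough Python analogy. It is not UISPIN syntax and does not use any UISPIN API: the `GRAPH` dictionary, the `ex:` resources, and the `label` and `render_concept` helpers are all made up for illustration. It only shows the general pattern of a reusable building block that looks up data about a resource and assembles an XHTML fragment from it.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an RDF graph: resource -> property -> values.
GRAPH = {
    "ex:Dog": {"skos:prefLabel": ["Dog"], "skos:broader": ["ex:Mammal"]},
    "ex:Mammal": {"skos:prefLabel": ["Mammal"], "skos:broader": []},
}


def label(resource):
    """Return the first preferred label of a resource."""
    return GRAPH[resource]["skos:prefLabel"][0]


def render_concept(resource):
    """Assemble an XHTML fragment for one concept from the graph data."""
    div = ET.Element("div")
    heading = ET.SubElement(div, "h1")
    heading.text = label(resource)
    items = ET.SubElement(div, "ul")
    for broader in GRAPH[resource]["skos:broader"]:
        item = ET.SubElement(items, "li")
        item.text = "Broader: " + label(broader)
    return ET.tostring(div, encoding="unicode")


html = render_concept("ex:Dog")
print(html)
```

In UISPIN itself, the lookups would be SPARQL expressions embedded in the XML snippet rather than Python function calls.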

TopBraid Composer ME 3.3.0 is the first tool that provides comprehensive support for editing, debugging and browsing UISPIN documents. UISPIN itself and its implementation are still in beta stage, and we are publishing them to encourage early adopters to play with this new technology and provide feedback. We will move quickly to bring UISPIN to a stable state in the coming months.

There is already a large body of documentation and examples online, and I will report on more case studies and examples on this blog in the coming weeks. To get an overview, I recommend walking through the mini tutorial on uispin.com. This will give you a good idea of the basic concepts and might even be sufficient to get you started. For details and the official specification, please check uispin.org.

Tuesday, March 30, 2010

Converting UML Files to RDF via Java/EMF Objects and SPARQLMotion

TopBraid 3.3 comes with two new SPARQLMotion modules that can be used to convert arbitrary EMF files (including UML and XSD) to RDF. The underlying mechanism is very simple and extremely flexible:
  1. Eclipse is able to load .uml files into a MOF-based Java object model called EMF.
  2. SPARQLMotion provides a new module sml:ImportJavaObjectsFromEMFFile that takes a UML file as argument and loads its EMF Java objects into a SPARQLMotion variable.
  3. SPARQLMotion provides a new module sml:ConvertJavaObjectsToRDF that can traverse arbitrary Java object graphs to create RDF objects with the same structure.
  4. Now that we have the UML file represented in RDF, we can apply arbitrary transformations, in particular SPIN rules (SPARQL CONSTRUCTs) to create whatever ontology we like.
In an example SPARQLMotion script, this process looks as follows:

In the import step, we have to specify the file name (here: an example UML file) and the name of an output variable that allows subsequent steps to access the EMF objects:

Next, we instruct the SPARQLMotion engine to convert the Java objects to RDF. This will start at the object specified by the variable uml, and then call all public getXY and isXY methods to collect the properties of the objects. The algorithm will include adjacent Java objects, as long as they are from a list of classes and packages specified under sml:javaClass:

As the output of this step, the module has produced an RDF graph containing instances as well as class and property definitions, all derived from the Java objects using reflection. All Java classes are mapped into a default namespace (abbreviated with the prefix "uml" in the screenshot below):

A TopBraid Composer class diagram shows that the generated classes also have RDF properties associated with them, all automatically derived from the Java objects:

Now that we have those RDF/OWL classes, we can attach SPARQL CONSTRUCT queries to them as SPIN rules. (Here, we use the property umlspin:mappingRule, a subproperty of spin:rule with its max iteration count set to 1 to speed up the mapping):

These two example SPIN rules traverse the generated UML classes starting at the current class ?this, and create corresponding OWL classes and rdfs:subClassOf relationships between them. Of course, additional mapping rules are needed to represent the other UML elements such as associations and attributes. In our example SPARQLMotion script above, those SPIN rules are loaded from a generic mapping file umlspin.ttl. Note that the rules above call a user-defined SPIN function umlspin:getOWLClass that creates a URI resource from a namespace and a UML name.
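The two rules described above can be approximated in plain Python to show their logic. This is only a sketch under stated assumptions, not the contents of umlspin.ttl: the triples are simple tuples, the property names `name` and `superClasses` are hypothetical, the namespace is an example value, and `get_owl_class` merely imitates what umlspin:getOWLClass is described as doing (building a URI from a namespace and a UML name).

```python
NS = "http://example.org/model#"  # hypothetical target namespace


def get_owl_class(namespace, uml_name):
    """Stand-in for umlspin:getOWLClass: namespace + UML name -> class URI."""
    return namespace + uml_name


def class_rule(uml_triples, namespace):
    """Create an owl:Class for every named UML class."""
    return {
        (get_owl_class(namespace, o), "rdf:type", "owl:Class")
        for s, p, o in uml_triples
        if p == "name"
    }


def subclass_rule(uml_triples, namespace):
    """Create rdfs:subClassOf links from UML generalizations."""
    names = {s: o for s, p, o in uml_triples if p == "name"}
    return {
        (
            get_owl_class(namespace, names[s]),
            "rdfs:subClassOf",
            get_owl_class(namespace, names[o]),
        )
        for s, p, o in uml_triples
        if p == "superClasses"
    }


def run_rules_once(uml_triples, rules, namespace):
    """Apply every rule exactly once (no fixpoint iteration)."""
    inferred = set()
    for rule in rules:
        inferred |= rule(uml_triples, namespace)
    return inferred


uml_triples = [
    ("node1", "name", "Car"),
    ("node2", "name", "Vehicle"),
    ("node1", "superClasses", "node2"),
]
owl_triples = run_rules_once(uml_triples, [class_rule, subclass_rule], NS)
```

Running each rule a single time mirrors the max iteration count of 1: the mapping rules only read the UML triples and never depend on each other's output, so a second pass would add nothing.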

With those rules in place, we just need to run the SPIN rules engine (TopSPIN) and get an OWL class hierarchy from whatever has been defined in the UML file:

The approach presented here is not completely out-of-the-box yet, but you can see that it offers a great deal of flexibility because developers can add or change any number of SPIN rules to fine-tune the mapping. For example, in one case you may want to create owl:ObjectProperties from UML associations, while in other cases a reified object might be a better choice.
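To illustrate that choice, here is a small Python sketch of the two alternative mappings for a single UML association. Everything in it is hypothetical: the dictionary layout of the association, the example namespace, and the `ex:Association`, `ex:source` and `ex:target` terms are invented for this example and are not part of umlspin.ttl.

```python
NS = "http://example.org/model#"  # hypothetical target namespace

association = {"name": "owns", "source": "Person", "target": "Car"}


def as_object_property(assoc, namespace):
    """Variant 1: the association becomes an owl:ObjectProperty."""
    prop = namespace + assoc["name"]
    return [
        (prop, "rdf:type", "owl:ObjectProperty"),
        (prop, "rdfs:domain", namespace + assoc["source"]),
        (prop, "rdfs:range", namespace + assoc["target"]),
    ]


def as_reified_object(assoc, namespace):
    """Variant 2: the association becomes a resource of its own."""
    node = namespace + "Association_" + assoc["name"]
    return [
        (node, "rdf:type", "ex:Association"),
        (node, "rdfs:label", assoc["name"]),
        (node, "ex:source", namespace + assoc["source"]),
        (node, "ex:target", namespace + assoc["target"]),
    ]


property_triples = as_object_property(association, NS)
reified_triples = as_reified_object(association, NS)
```

The first variant is compact and works directly with OWL tooling, while the second leaves room to attach further details (such as multiplicities) to the association resource. In the SPIN-based importer, switching between the two would mean swapping one mapping rule for another.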

The SPIN-based importer raises the level of abstraction and makes it possible to define complex mappings without having to work with a programming language like Java. This empowers a larger group of people to contribute mapping rules. The UML/MOF world consists of many related standards, and the SPIN/SPARQLMotion-based mapping enables the domain experts of those standards (e.g. mechanical engineers) to fine-tune the importers.

Finally, other EMF-based importers exist for formats such as XML Schema, and the approach presented above would of course also work for any other Java object model, as long as it provides access to its properties using getXY and isXY methods. You can basically connect your own native Java objects into an RDF triple store by providing a bit of glue code (as a SPARQLMotion module similar to the EMF loader above).
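The reflection-based traversal is easy to picture with a toy model. The following Python sketch is only an approximation of the idea, not the implementation of sml:ConvertJavaObjectsToRDF: it uses Python objects instead of Java objects, the `Element` class and the `node1`, `node2` labels are made up, and the `allowed_classes` argument plays the role that sml:javaClass is described as playing above.

```python
class Element:
    """Hypothetical model class exposing getXY/isXY accessors."""

    def __init__(self, name, abstract=False, supers=()):
        self._name = name
        self._abstract = abstract
        self._supers = list(supers)

    def getName(self):
        return self._name

    def isAbstract(self):
        return self._abstract

    def getSuperClasses(self):
        return self._supers


def property_name(method_name):
    """Map getFoo/isFoo to 'foo'; return None for any other method name."""
    for prefix in ("get", "is"):
        rest = method_name[len(prefix):]
        if method_name.startswith(prefix) and rest[:1].isupper():
            return rest[0].lower() + rest[1:]
    return None


def objects_to_triples(root, allowed_classes):
    """Walk the object graph from root and emit (subject, property, value)."""
    triples, node_ids, visited, stack = [], {}, set(), [root]

    def node(obj):
        # Assign a stable label to each object on first sight.
        if id(obj) not in node_ids:
            node_ids[id(obj)] = "node%d" % (len(node_ids) + 1)
        return node_ids[id(obj)]

    while stack:
        obj = stack.pop()
        if id(obj) in visited:
            continue
        visited.add(id(obj))
        subject = node(obj)
        triples.append((subject, "rdf:type", type(obj).__name__))
        for method_name in sorted(dir(obj)):
            prop = property_name(method_name)
            method = getattr(obj, method_name)
            if prop is None or not callable(method):
                continue
            value = method()
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                if isinstance(v, allowed_classes):
                    # Adjacent object of an allowed class: link and follow it.
                    triples.append((subject, prop, node(v)))
                    stack.append(v)
                elif isinstance(v, (str, bool, int, float)):
                    triples.append((subject, prop, v))
    return triples


vehicle = Element("Vehicle", abstract=True)
car = Element("Car", supers=[vehicle])
triples = objects_to_triples(car, (Element,))
```

Objects whose class is not in the allowed list are simply not followed, which is what keeps the traversal from wandering into unrelated parts of the object model.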

Thursday, March 25, 2010

The SPARQLMotion Debugger

TopQuadrant's visual data processing language SPARQLMotion is now in routine use in many projects around the world. Its graphical notation and rich set of features make it a powerful language to import, process and export linked data from almost any format, using Semantic Web technologies. Over the last two years, TopBraid's SPARQLMotion engine has become increasingly robust and it is now fair to say that it is becoming a de facto standard language for linked data (at least in the professional world where the TBC-ME price tag is not an obstacle).

An important part of turning a (formerly) experimental language into an industrial-strength solution is to provide tools that aid developers in the construction and testing phases of their project. In this spirit, we have added a graphical SPARQLMotion Debugger to TopBraid Composer 3.3. This debugger makes it possible to interrupt an executing SPARQLMotion script, to inspect variable bindings and arguments, to run test queries against the current state, and to display the current RDF graph.

Let's look into debugging the SPARQLMotion script shown below.

There are now two execution buttons at the top of the SPARQLMotion graph editor. The green arrow executes the script normally. If a module has been selected, then the script will only execute up to that module. The green bug button will execute the script but immediately open the debugger window. The debugger will also be shown whenever a module is reached that has a breakpoint attached to it. To set a breakpoint, select a module and then press the button with the small blue dot. In either case, the debugger will show up as shown below.

The left part of the debugger displays the currently executed modules, including those from nested sub-scripts (shown indented under their parents). You can set additional breakpoints there to make sure you don't miss anything important. The main area of the debugger can be switched between three different tabs, including the Variables tab shown above. This tab displays all current input variable bindings, allowing you to see the exact state of the engine when it enters the current module. Below the variables, you can see the input arguments to the current module. The values of those arguments are shown in exactly the same way as the engine will interpret them, e.g. they will have string templates applied to them already.
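As a small illustration of what "string templates applied" means, here is a Python sketch that substitutes variable bindings into an argument value. It assumes a `{?varName}` placeholder style and a plain dictionary of bindings; the `apply_template` function and the example file path are invented for this sketch and are not part of the SPARQLMotion engine.

```python
import re

# Assumed placeholder style: {?varName}
PLACEHOLDER = re.compile(r"\{\?(\w+)\}")


def apply_template(text, bindings):
    """Replace each {?var} with its bound value; leave unbound ones as-is."""
    return PLACEHOLDER.sub(
        lambda match: str(bindings.get(match.group(1), match.group(0))),
        text,
    )


bindings = {"fileName": "example"}
resolved = apply_template("/data/{?fileName}.uml", bindings)
unresolved = apply_template("/data/{?other}.uml", bindings)
```

The debugger shows the equivalent of `resolved` rather than the raw template, so what you see is the value the module will actually work with.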

The next tab can be used to run arbitrary SPARQL queries to further explore what the current module will see when it executes. If the module takes a SPARQL query as an input argument, then this will be suggested as the initial query.

Finally, if you wonder about the structure of the input RDF graphs, you can switch to the third tab (shown below). The tree structure of the graphs represents the imports closure (sub-graphs). Having this view will be helpful if you wonder why certain queries don't work as expected - they may operate on a different set of input graphs than you may believe.
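The imports closure itself is a simple graph traversal. The following Python sketch shows how such a tree could be computed, assuming the import relationships are available as a dictionary from graph name to imported graph names; the file names and both helper functions are hypothetical and only illustrate the idea.

```python
def imports_closure(graph, imports):
    """Return the graph plus everything it imports, directly or indirectly."""
    seen = []

    def visit(name):
        if name in seen:
            return
        seen.append(name)
        for child in imports.get(name, []):
            visit(child)

    visit(graph)
    return seen


def tree_lines(graph, imports, depth=0, path=()):
    """Render the import hierarchy as indented lines, guarding against cycles."""
    lines = ["  " * depth + graph]
    if graph in path:
        return lines  # already on the current branch: do not expand again
    for child in imports.get(graph, []):
        lines.extend(tree_lines(child, imports, depth + 1, path + (graph,)))
    return lines


imports = {
    "app.ttl": ["core.ttl", "skos.ttl"],
    "core.ttl": ["skos.ttl"],
}
closure = imports_closure("app.ttl", imports)
lines = tree_lines("app.ttl", imports)
```

A query that unexpectedly returns nothing is often explained by a graph that is missing from this closure, or by one that is present only further down the tree than assumed.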

When all is said and done, you can use the buttons at the bottom to either continue the execution up to the next breakpoint, or to step into the next executing module. Note that the debugger will also show up when you run web services and TopBraid Ensemble callbacks.
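The execution modes described in this post can be summarized in a toy model. The Python sketch below is not how the SPARQLMotion engine works internally: it assumes a script is a simple linear list of module names, and it assumes that running "up to" a selected module includes that module. The `execute` function and the module names are invented for illustration.

```python
def execute(modules, selected=None, breakpoints=frozenset(), debug=False):
    """Return the sequence of ("pause", name) and ("run", name) events."""
    events = []
    for index, name in enumerate(modules):
        # Debug mode pauses right away; breakpoints pause before their module.
        if (debug and index == 0) or name in breakpoints:
            events.append(("pause", name))
        events.append(("run", name))
        if name == selected:
            break  # assumption: the selected module is the last one executed
    return events


script = ["Import", "Convert", "Export"]
normal_run = execute(script)
partial_run = execute(script, selected="Convert")
with_breakpoint = execute(script, breakpoints={"Convert"})
debug_run = execute(script, debug=True)
```

Each "pause" event corresponds to a point where the debugger window would open, and continuing from there resumes the loop until the next pause.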

The new SPARQLMotion Debugger addresses one major challenge that many users have: understanding exactly what's happening. SPARQLMotion is no longer a black box, and there is no longer any need to add artificial debugging output to trace what's happening behind the scenes. I am confident this debugger will significantly flatten the learning curve and create many happy SPARQLMotion users.