Friday, June 27, 2008

SPARQL Functions in Motion

SPARQL is well established as the standard query language for the Semantic Web. Comparable to SQL, SPARQL provides the SELECT keyword to extract information from an RDF/OWL repository. SPARQL also provides the CONSTRUCT keyword to build new triples from existing ones, which makes it an attractive way to define ontology mappings or rule bases.

As so often with W3C standards, the official specifications take you 80% of the way to where you really want to be, while the remaining 20% often comes from non-standard extensions that make the technology really useful in real-world applications. In the case of SPARQL, many implementations already support some form of SPARQL update language with keywords such as INSERT and DELETE, leading to de facto standards that will hopefully be folded into the official standard in the next iterations. Another extremely useful extension has recently been implemented by Andy Seaborne in the Jena ARQ SPARQL engine: LET assignments. Here is an example derived from Andy's blog:

SELECT ?area
WHERE {
    ?x rdf:type :Rectangle ;
       :height ?h ;
       :width ?w .
    LET (?area := ?h * ?w) .
}

The LET keyword computes new values from existing ones. The right-hand side of the assignment offers the same expressivity as FILTER expressions, which are well covered by the standard. What makes LET so attractive is that it greatly extends the expressiveness of SPARQL, especially in combination with the CONSTRUCT or INSERT keywords. We can slightly modify the example above to define a rule that automatically infers an area triple from height and width:

CONSTRUCT {
    ?x :area ?area .
}
WHERE {
    ?x rdf:type :Rectangle ;
       :height ?h ;
       :width ?w .
    LET (?area := ?h * ?w) .
}
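
The same rule can presumably be written in update form, so that the inferred triples are stored directly instead of being returned. The following is only a sketch: it assumes an engine that accepts both the INSERT ... WHERE form of the proposed SPARQL update language and LET inside the WHERE clause. Neither is part of the official standard, so the exact syntax may differ between implementations. The example namespace is hypothetical.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX :    <http://example.org/shapes#>   # hypothetical namespace for illustration

# Sketch only: relies on non-standard INSERT and LET support.
INSERT {
    ?x :area ?area .
}
WHERE {
    ?x rdf:type :Rectangle ;
       :height ?h ;
       :width ?w .
    LET (?area := ?h * ?w) .
}
```

Only the first keyword changes compared to the CONSTRUCT version; the WHERE clause that computes ?area stays the same.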


In addition to simple arithmetic expressions such as the one above, SPARQL also defines a collection of built-in functions such as bound, isBlank, lang, and str. The Jena ARQ library adds many more, including string functions.
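
As a rough illustration of how these functions combine with LET, here is a minimal sketch. It assumes that the rectangles carry an rdfs:label (not part of the earlier example), uses a hypothetical example namespace, and calls afn:localname, one of the ARQ extension functions, to take the local part of a URI.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX afn:  <http://jena.hpl.hp.com/ARQ/function#>
PREFIX :     <http://example.org/shapes#>   # hypothetical namespace for illustration

SELECT ?x ?localName ?labelText
WHERE {
    ?x rdf:type :Rectangle ;
       rdfs:label ?label .                 # assumed: each rectangle has a label
    # Standard built-ins: skip blank nodes, keep only English labels.
    FILTER (!isBlank(?x) && lang(?label) = "en") .
    # ARQ extension function: local name of the resource's URI.
    LET (?localName := afn:localname(?x)) .
    # Standard built-in: the label's text without its language tag.
    LET (?labelText := str(?label)) .
}
```

The FILTER uses only functions from the standard, while the two LET assignments turn function results into new variables that can be selected or used in a CONSTRUCT template.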

As of version 2.6.0, our RDF/OWL development platform TopBraid Suite, which is based on Jena, supports LET assignments and greatly extends this mechanism. We have added a comprehensive library of additional SPARQL functions: the SPARQLMotion Functions. Among other things, these functions can be used to build URIs from other names, cast values between datatypes, analyze the class structure, extract sub-strings, and convert resources into human-readable names. Many more such functions will be added in future versions, in response to the use cases that we encounter in practice. TopBraid Composer provides a convenient auto-complete and context help feature for using these functions, as shown in the next screenshot.



Java programmers can use the Jena API to add new functions if the provided functions are insufficient. TopBraid Composer users can also use SPARQLMotion to define new functions and make them available to any SPARQL query. I have recently uploaded an example SPARQLMotion function definition. The following screenshot, taken from this example, shows that the new function takes an input string and extracts the text in parentheses:


Using the declarative visual scripting language SPARQLMotion, RDF/OWL experts can tailor SPARQL to their individual needs without having to work with Java.