Monday, March 16, 2009

Using JavaScript to Define SPARQL Functions

Many real-world SPARQL queries make heavy use of built-in functions for tasks such as string processing and mathematical calculations. SPARQL comes with a pre-defined set of such built-in functions. However, in practice, these built-in libraries are frequently extended to solve specific problems that have not been anticipated by the language designers. Such extensions are typically implemented natively for a specific SPARQL execution engine, for example in Java. Needless to say, this is not a solution in the spirit of the (Semantic) Web, because it leads to a Tower of Babel with all kinds of dialects and platform-specific extensions.

We have proposed SPIN Functions as one possible extension mechanism for SPARQL, that allows anyone to derive new SPARQL functions by combining other SPARQL functions and query templates. In general, SPIN functions are Semantic Web resources that can be referenced by their URI to get a description of the function's arguments, return value and executable body. However, even this approach does not cover all possible use cases, because it is still limited by the lower-level SPARQL operations and functions. Many problems can only be solved with a general-purpose programming language.

TopBraid 3.0.0 beta 2 now introduces an extension of the SPIN functions mechanism that can be used to define new SPARQL functions with the help of JavaScript. In a nutshell, a SPINx function is a SPIN function that points to a snippet of JavaScript, or a JavaScript file, which will be executed whenever the function is run. The arguments of the SPARQL function call are made available as arguments to the corresponding JavaScript function. Here is a simple example:
    ex:square
a spin:Function ;
rdfs:subClassOf spin:Functions ;
spin:constraint
[ a spl:Argument ;
rdfs:comment "The value to compute the square of" ;
spl:predicate sp:arg1 ;
spl:valueType xsd:float
] ;
spin:returnType xsd:float ;
spinx:javaScriptCode "return arg1 * arg1;" .
The function above can be called such as LET (?sq := ex:square(4)) to calculate the square value of the argument. In addition to having inline code via the spinx:javaScriptCode property, there is an option to simply link to a .js file that contains a function with the same name.

This simple approach (a variation of which had been proposed by Gregory Williams) greatly extends the expressive power of SPARQL through a relatively platform independent mechanism - JavaScript interpreters such as Mozilla Rhino are widely available on all major platforms. We have selected JavaScript for three major reasons: First, JavaScript (known as ECMAScript) is a well-established interpretable language that many users are already familar with. Second, JavaScript is web-friendly: you can reference scripts via URLs and scripts may import other scripts from the web as well. Finally, JavaScript has an attractive security model that will make it more difficult to create malicious SPARQL extensions (of course, details would need to be fleshed out).

The expressivity of JavaScript is great for many problems that are beyond the capabilities of SPARQL and its built-in features. You can freely express if-then-else conditions, loops, sub-procedure calls etc, and thus gradually build up your own function libraries in a portable fashion. A limitation right now is that the JavaScript-based SPARQL functions (in TopBraid) do not have any mechanism yet to access the current RDF graph at execution time. This would be needed to implement things like walking an rdf:List or computing average values etc; in short any use case that requires more background knowledge than what has been explicitly passed into the function as arguments. However, this is limitation can be fixed, for example by defining a small collection of built-in call back functions such as find(S,P,O) in JavaScript. There is a large body of related work on JavaScript RDF APIs that may also be leveraged for that purpose.

0 Comments:

Post a Comment

<< Home