Friday, January 12, 2007

Real-time Data Browsing with SPARQL

In many cases, good ideas for new features come directly from our end-users. A few days ago, a customer asked for the ability to automatically refresh the SPARQL view whenever the user selects a new resource. This was easy enough to implement and is now available in TopBraid Composer 1.5.1. Look at the following screenshot to see what it does.

Whenever the user selects a different Suburb, the SPARQL window will automatically run the given query and display the matches. This mechanism is triggered if the current query contains the variable ?this. The trick is to automatically bind the ?this variable to the currently selected resource.
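To make the idea concrete, here is a minimal Python sketch of the trigger logic. It is not TopBraid's actual implementation: the function names, the `run_query` callback and the example URIs are all hypothetical, and the binding is done by plain text substitution, whereas a real SPARQL engine would pre-bind the variable through its own API. As a consequence, this sketch only works for queries that use ?this inside the WHERE clause.

```python
import re

# Matches the variable ?this (or $this), but not longer names such as ?thisYear.
# Assumption: the query does not contain "?this" inside a string literal.
THIS_VAR = re.compile(r"[?$]this\b")


def uses_this(query: str) -> bool:
    """Return True if the query mentions the ?this variable."""
    return THIS_VAR.search(query) is not None


def bind_this(query: str, resource_uri: str) -> str:
    """Replace every occurrence of ?this with the selected resource's URI."""
    return THIS_VAR.sub(lambda match: f"<{resource_uri}>", query)


def on_selection_changed(query: str, resource_uri: str, run_query):
    """Re-run the query for the new selection, but only if it uses ?this.

    run_query is a hypothetical callback that executes a SPARQL string.
    """
    if not uses_this(query):
        return None
    return run_query(bind_this(query, resource_uri))
```

With a query such as `SELECT ?school WHERE { ?school ex:suburb ?this . }`, every selection change would produce a fresh query in which ?this has been replaced by the selected Suburb.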

This useful mechanism can be used to quickly browse instances by arbitrary queries, e.g. to see how the system would behave during ontology design. Thanks a lot, Geoff, for this excellent idea! We appreciate more input like this...

Saturday, January 06, 2007

More on Mashups: Linear Charts

In my previous postings, I wrote about TopBraid's new Calendar mashup feature as well as our RDFa editor. Version 1.5 introduces yet another feature called the Linear Chart View. This view allows users to display numeric values along a linear chart with X and/or Y axes. If you have an ontology about real estate properties, and want to get a chart to compare the prices of all relevant properties, then you can display them along the X axis:

Alternatively, you can use the Y Axis:

To make things more interesting (also in the sense of mashing up data from multiple sources), select the price along the X axis and the number of bedrooms along the Y axis:

You can use the mouse to drag the chart in all directions or to change the zooming level. The chart is also clickable and has tool tip texts to display details. It is easy to imagine that we could have similar viewers for other types of diagrams such as pie charts.

TopBraid is increasingly becoming an information integration hub that puts a lot of power into the hands of ontology designers and users. For example, you could create ontologies that serve as a view on a relational database or lift Excel spreadsheets into the Semantic Web. Then combine these data sources with other ontologies and finally display them on a chart. In the near future you will also be able to publish the very same models and views on the Web so that end users can analyze them with a thin Ajax-based client. There is quite a bit happening with TopBraid these days. I wonder where this journey is taking us...

Calendar Mashup in TopBraid Composer

Mashups are applications that combine content from multiple sources into a single view. For example, if one web page lists the dates and addresses of certain conferences and another web service knows how to convert addresses into geographical coordinates, then it would be helpful to combine both pieces of information into an integrated experience to help with travel arrangements. The open architecture of Semantic Web languages like RDF Schema and OWL makes it relatively easy to bring together information from various sources. Once the data structures are brought together and aligned, they can be displayed and analyzed.

We are incrementally extending our RDF/OWL development tool TopBraid Composer with new forms of visualization. This allows ontology designers and data architects to directly experience what they can actually do with the ontologies. TopBraid already has support for geographical information with its Google Maps interface, but version 1.5 now also introduces a Calendar view.

The Calendar View enables users to visualize xsd:date and xsd:dateTime values from an RDF/OWL model on a month-by-month basis. Users select which properties to show and the system then renders them into their corresponding day boxes. In the following screenshot, the Calendar is shown alongside the geographical viewer (click for a larger image).
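The grouping step behind such a month view can be sketched in a few lines of Python. This is only an illustration, not the Calendar View's code: the triples are plain tuples of strings, the names `to_day` and `day_boxes` are made up for this example, and the date values are assumed to carry no time zone suffix.

```python
from collections import defaultdict
from datetime import date, datetime


def to_day(lexical: str) -> date:
    """Parse an xsd:date or xsd:dateTime lexical form into a calendar day.

    Assumption: the value has no time zone suffix.
    """
    if "T" in lexical:
        return datetime.fromisoformat(lexical).date()
    return date.fromisoformat(lexical)


def day_boxes(triples, properties, year, month):
    """Group (subject, property) pairs by day of month for the chosen properties."""
    boxes = defaultdict(list)
    for subject, prop, value in triples:
        if prop in properties:
            day = to_day(value)
            if (day.year, day.month) == (year, month):
                boxes[day.day].append((subject, prop))
    return dict(boxes)
```

Each key of the returned dictionary corresponds to one day box of the selected month, and its list holds the resources to render in that box.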

As you can see in these images, some temporal values are displayed across multiple days, i.e. they represent a "time span". But there is no standard way to formalize a "time span" in RDF, so you may wonder how the system knows that some properties shall be interpreted as the start and end points of a time interval. The flexibility of RDF Schema enables a quite elegant solution to this problem. The Calendar is connected to a small Calendar ontology which defines two properties, calendar:startTime and calendar:endTime. If your domain has a pair of properties travel:departureDate and travel:returnDate, then you just need to make these properties subproperties of the calendar properties and run the classifier. This tells the system that all departureDate values shall be interpreted as startTime values and all associated returnDate values as endTime values. Instead of having to change the code or configuration of the Calendar itself, we just change the model to drive the existing viewer.
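The subproperty trick can be illustrated with a small Python sketch. It is not the classifier that TopBraid runs, just the RDFS rule "if p is a subproperty of q and (s, p, o) holds, then (s, q, o) holds" applied to triples stored as plain tuples. The resource travel:Trip1, its dates and the function names are invented for this example.

```python
SUB_PROPERTY_OF = "rdfs:subPropertyOf"


def infer_subproperty_triples(triples):
    """Return the input triples plus those entailed by rdfs:subPropertyOf."""
    triples = set(triples)

    # Map each property to its directly declared super-properties.
    supers = {}
    for s, p, o in triples:
        if p == SUB_PROPERTY_OF:
            supers.setdefault(s, set()).add(o)

    def all_supers(prop):
        """Collect direct and indirect super-properties of prop."""
        seen, stack = set(), [prop]
        while stack:
            for parent in supers.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    inferred = set(triples)
    for s, p, o in triples:
        for parent in all_supers(p):
            inferred.add((s, parent, o))
    return inferred


def time_spans(triples):
    """Return {resource: (start, end)} using only the calendar properties."""
    inferred = infer_subproperty_triples(triples)
    starts = {s: o for s, p, o in inferred if p == "calendar:startTime"}
    ends = {s: o for s, p, o in inferred if p == "calendar:endTime"}
    return {s: (starts[s], ends[s]) for s in starts if s in ends}


# Hypothetical domain model: only the two subPropertyOf statements are needed
# to make the travel dates visible to a viewer that knows calendar:* only.
model = [
    ("travel:departureDate", SUB_PROPERTY_OF, "calendar:startTime"),
    ("travel:returnDate", SUB_PROPERTY_OF, "calendar:endTime"),
    ("travel:Trip1", "travel:departureDate", "2007-01-08"),
    ("travel:Trip1", "travel:returnDate", "2007-01-12"),
]
spans = time_spans(model)
```

Note that `time_spans` never mentions the travel namespace: the viewer side stays unchanged, and the two subproperty statements in the model do all the work.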

Friday, January 05, 2007

TopBraid is now also an RDFa Editor

A few months back I wrote about RDFa support in our ontology editor TopBraid Composer. RDFa is a set of simple XHTML attributes being proposed by the W3C, allowing web developers to add RDF/OWL metadata to existing web pages. For example, RDFa can be used to state that a certain string on a page actually is the first name of a person, and that another section on that page contains the person's date of birth. Such RDFa annotations are one way of shrinking the Meaningless Web.

Although RDFa consists of little more than a handful of attributes, it is not always simple to combine conventional HTML content with semantic markup. RDF, as well as its embedding into HTML with RDFa, requires some training and should be supported by tools comparable to professional IDEs. We have been thinking about how to provide tool support that helps Web page designers work with RDFa in conjunction with HTML and JSP. Our first result is the new RDFa editing support in TopBraid. This support currently includes:
  • Syntax highlighting of the RDFa keywords
  • Auto-completion of property and resource names (with CTRL-space)
  • Mouse-driven navigation from an RDFa element to the corresponding resource in the ontology (CTRL-mouse over)
  • A button to quickly extract and view the triples encoded inside the RDFa document
  • Error checking at edit-time so that unknown resource names are marked as warnings
  • A library of typical RDFa source pattern snippets
  • Drag and drop from TopBraid's ontology windows into the RDFa text area

The following image shows auto completion with geographical markup.

One mode of working with this support is to open an ontology that imports the relevant namespaces (such as FOAF or SIOC) and also includes the RDFa data source. Users can then switch between editing and browsing the ontology, the HTML/RDFa source code and the resulting HTML page (click on the image for a larger version):

Needless to say, people editing RDFa this way can at any time use the TopBraid infrastructure to run reasoners, query engines or all kinds of visualizations and mashups to test how other RDFa-aware applications may work with the page. For example, if an RDFa page defines an entity with geographical coordinates, then these coordinates can be directly displayed on an embedded Google map.

This integrated development environment for RDFa is enabled by the power of the Eclipse platform. Eclipse not only provides a windowing framework for TopBraid Composer's windows, but also comes with many reusable components. In the case of the RDFa editor, we stand on the shoulders of giants and exploit the infrastructure of the Eclipse Web Standard Tools subproject that provides editing support for HTML, and the Eclipse J2EE Standard Tools project that adds Java Server Pages support. We have layered our RDFa support on these plugins, allowing Web developers to take incremental steps into the RDFa world from the tools they are familiar with.

AllegroGraph and TopBraid 1.5

Scalability is one of the most pressing requirements of almost all the Semantic Web projects that I have been involved in with TopQuadrant. Realistic applications require the ability to work with millions of RDF triples. These triples may include data and metadata about engineering drawings, multimedia, business knowledge, mathematical simulations, search thesauri etc. The designers and programmers of these systems ask us whether we think that Semantic Web technology will scale and perform well under these heavy loads. After all, everything in RDF (and OWL) is stored as triples, comparable to a single large database table with three columns for subjects, predicates and objects.
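To picture the "single large table" comparison, here is a deliberately naive Python sketch that stores triples in one SQLite table with three columns. It does not show how AllegroGraph, Sesame or Jena store their data. The table name, the index and the sample resources are assumptions made for this example only.

```python
import sqlite3


def create_store():
    """Create an in-memory triple table with one column per triple position."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples ("
        "subject TEXT NOT NULL, predicate TEXT NOT NULL, object TEXT NOT NULL)"
    )
    # One extra index so lookups by predicate/object avoid a full table scan.
    conn.execute("CREATE INDEX idx_pos ON triples (predicate, object, subject)")
    return conn


def add_triples(conn, triples):
    """Insert an iterable of (subject, predicate, object) tuples."""
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)


def match(conn, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    clauses, params = [], []
    for column, value in (("subject", subject), ("predicate", predicate), ("object", obj)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    sql = "SELECT subject, predicate, object FROM triples"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY subject, predicate, object"
    return conn.execute(sql, params).fetchall()
```

Every query against such a store is a combination of pattern matches like `match(conn, predicate="rdf:type")`, which is why performance with millions of rows depends so heavily on indexing and on how the patterns are joined.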

Fortunately, a lot of smart people around the world are putting their energy into developing scalable triple stores. In the Java open-source world, Sesame and Jena are the best known choices, and Sesame's database support in particular is known to have excellent performance characteristics. The Jena folks are working on optimizations. Another scalable open-source RDF database is Mulgara, formerly known as Kowari, which I haven't used myself yet.

It is a good sign of a healthy software market that more and more commercial triple stores are appearing as well. While open source is great, many customers prefer to purchase a professional product license so that they have someone to hold accountable and to get help from. Franz, Inc. has been in the software business for quite a while, and is particularly well known as a world-leading provider of Common Lisp products. While Lisp always seems to give the impression of being an academic language that never really made it into the mass market, Franz have optimized their Lisp compiler to astonishing performance. More recently they have started to use their Lisp platform to develop Semantic Web technology solutions. AllegroGraph is one of their Semantic technology flagship products, and they have made great progress with it in recent months. From what I have seen, AllegroGraph has really good performance and is now (as far as I know) the best professional RDF triple store on the market. They even have a free entry-level version of AllegroGraph that scales up to 50 million triples.

The new TopBraid Composer 1.5 has full support for AllegroGraph through an optimized Java bridge. This enables TopBraid users to build very large models, and to convert data from other sources into triple stores. "Large" ontologies such as the infamous NCI ontology do not even come close to the orders of magnitude that these guys are working on. While the price tag of both products may not make it an option for everyone, AllegroGraph is certainly a tool to watch, especially if you are interested in an integrated solution that combines some of the best-of-breed solutions from ontology design to deployment.