Thursday, August 17, 2006

Lifting Excel into the Semantic Web

While the overall Semantic Web vision is fairly, well, visionary, we shouldn't forget that a lot of people out there use plain old office documents such as Excel spreadsheets to manage their data, knowledge and processes. Especially in engineering domains, Excel is used for more than entering tabular data: enriched with Visual Basic scripts and macros, it is becoming almost a general-purpose platform.

A lot of this Excel-based data could be mined into a Semantic Web-compliant format for further processing and mapping. In addition, technologies such as OWL can help to improve and simplify document-driven work processes.

For a customer project, we therefore implemented a bridge to import Excel files into an ontology. The ontology itself is simple and essentially contains concepts for workbooks, sheets and cells.
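
To make the idea concrete, here is a rough sketch of what such an import could produce. It is not the actual bridge: the namespace, the class names (Workbook, Sheet, Cell), the property names and the URI scheme are all assumptions made up for illustration, and the workbook is just an in-memory dictionary rather than a real Excel file.

```python
from urllib.parse import quote

# Hypothetical namespace and term names, chosen for illustration only.
SS = "http://example.org/spreadsheet#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def column_letters(index):
    """Convert a zero-based column index to Excel letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def workbook_to_triples(workbook_uri, sheets):
    """Return (subject, predicate, object) triples describing a workbook.

    `sheets` maps a sheet name to a list of rows, each row a list of values.
    Empty cells (None) are skipped.
    """
    triples = [(workbook_uri, RDF_TYPE, SS + "Workbook")]
    for sheet_name, rows in sheets.items():
        sheet_uri = workbook_uri + "#" + quote(sheet_name)
        triples.append((sheet_uri, RDF_TYPE, SS + "Sheet"))
        triples.append((workbook_uri, SS + "hasSheet", sheet_uri))
        for row_index, row in enumerate(rows):
            for col_index, value in enumerate(row):
                if value is None:
                    continue  # nothing to say about an empty cell
                address = column_letters(col_index) + str(row_index + 1)
                cell_uri = sheet_uri + "!" + address  # assumed URI scheme
                triples.append((cell_uri, RDF_TYPE, SS + "Cell"))
                triples.append((sheet_uri, SS + "hasCell", cell_uri))
                triples.append((cell_uri, SS + "address", address))
                triples.append((cell_uri, SS + "value", value))
    return triples


# Example: one sheet with a header row and one data row.
example_sheets = {"Costs": [["Item", "Price"], ["Bolt", 0.25]]}
for triple in workbook_to_triples("http://example.org/budget.xls", example_sheets):
    print(triple)
```

The point of the sketch is only that each cell ends up with its own URI plus its location and value, which is what makes the later processing steps possible.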

Generating instances of these spreadsheet classes is simple, but it is just the first step. Once every cell is accessible with a unique URI and location, it can be processed further. For example, it can be automatically transformed into something else using a SPARQL query. Or it can be mapped into another ontology. Using some mapping rules, the cell values can be inserted into other OWL instance documents to serve as input for other tools, including web services. Finally, these other tools can return values, which are put back into the spreadsheet instances, and a reverse generator can then serialize them back to Excel. With some clever semantic mediation service in the middle, this approach can serve as a way to get a handle on complex tool interoperability.
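
As a rough illustration of the mapping and round-trip idea, the sketch below uses a tiny rule table that ties cell locations to properties of instances in a target vocabulary. Everything in it is hypothetical: the `ENG` namespace, the rule format, the instance and property names, and the function names are invented for this example and are not the mapping engine mentioned in this post.

```python
# Hypothetical target vocabulary, invented for this example.
ENG = "http://example.org/engineering#"

# Each rule says: the value of this cell is this property of this instance.
MAPPING_RULES = [
    {"sheet": "Design", "cell": "B2",
     "subject": ENG + "beam1", "property": ENG + "lengthInMeters"},
    {"sheet": "Design", "cell": "B3",
     "subject": ENG + "beam1", "property": ENG + "maxLoadInNewtons"},
]


def cells_to_triples(cells, rules):
    """Forward direction: turn mapped cell values into triples."""
    triples = []
    for rule in rules:
        key = (rule["sheet"], rule["cell"])
        if key in cells:
            triples.append((rule["subject"], rule["property"], cells[key]))
    return triples


def triples_to_cells(cells, rules, triples):
    """Reverse direction: copy returned values back into a copy of the cells."""
    lookup = {(s, p): o for s, p, o in triples}
    updated = dict(cells)
    for rule in rules:
        key = (rule["subject"], rule["property"])
        if key in lookup:
            updated[(rule["sheet"], rule["cell"])] = lookup[key]
    return updated


# Cells keyed by (sheet name, cell address); unmapped cells are left alone.
cells = {("Design", "B2"): 4.5, ("Design", "B3"): 1200, ("Design", "A1"): "Beam"}

exported = cells_to_triples(cells, MAPPING_RULES)

# Pretend an external tool recomputed the maximum load and returned it.
returned = [(ENG + "beam1", ENG + "maxLoadInNewtons", 1350)]

print(triples_to_cells(cells, MAPPING_RULES, returned))
```

Using the same rule table in both directions keeps the export and the write-back consistent, which is the part a reverse generator would rely on.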

The base building blocks for this, including a mapping engine, will be available in the coming 1.2 version of TopBraid Composer. Stay tuned.


At 11:12 AM, Blogger Unknown said...

This is a very good idea. I've worked on many software projects with engineers, and Excel is the tool they use to model their problems. I've worked with Protege and other semantic web tools to reproduce this functionality in an open-standards way. Automating this would be a great advantage.

