Nodalities

From Semantic Web to Web of Data

First impressions of “Semantic Web for the Working Ontologist”

Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL is a new book authored by Dean Allemang and Jim Hendler. I can offer a quick summary by teasing apart the title. For starters, the Semantic Web it discusses is generally in line with the current consensus view of the developer community, though with a lean in the direction of the ‘O’ word. The emphasis in the book is very much on Working and Modeling. It is practically oriented, and while it covers most of the technologies associated with the Semantic Web, its focus is on how to describe things using RDF, RDFS and OWL.

There’s a serious shortage of approachable books in the Semantic Web space - if you check the ESW Wiki list, there are only a handful that aren’t heavy-duty academic works. Aside from the issue of convincing publishers there’s a market for such material (a problem that’s no doubt evaporating), there’s the difficult problem of what to write about. In the 2003 book Practical RDF, Shelley Powers used the parable of the Blind Men and an Elephant to suggest how RDF has many different aspects and can mean different things to different people - and RDF is just one Semantic Web technology (though arguably the most important). What’s more, the elephant changes over time and is lavishly decorated: while the core standards solidified in 2004, since then we’ve seen various auxiliary specifications come along, including the SPARQL query language, Turtle/N3 syntax, RDFa and GRDDL. Ideas on best practices have also developed considerably over the years. This book is scoped to modeling with RDF, RDFS and OWL, and covers that ground admirably.

Allemang and Hendler are known experts, well-versed in the subject matter, but what’s more, they have spent considerable time teaching courses on the Semantic Web, and this experience shows. The writing is clear and the book’s full of well-illustrated examples, along with a short but very handy FAQ at the end. The practical side is hinted at in their decision to devote significant space to the SKOS (Simple Knowledge Organization System) and FOAF (Friend of a Friend) vocabularies. The syntax used throughout is N3/Turtle, which makes a refreshing change from the eyestrain of RDF/XML.

There aren’t any programming (as in running code) examples, and the coverage of things like HTTP and the use of these technologies on the Web is really confined to illustrated prose. I must admit I was disappointed by the limited coverage of SPARQL, as I do think this has relevance to modeling decisions. Given the rise of Linked Data in the wild, I would also have expected maybe a chapter devoted specifically to this approach (the ideas are all there in the text, but they don’t jump out).

On the other hand, the coverage of reasoning with Semantic Web languages is excellent: material that can be very hard to get a handle on is here presented in an easily digestible form. Similarly, the fundamental theory is explained in simple terms without recourse to arcane notation, and common misconceptions around the Semantic Web are disposed of without malice.

Contents

  1. What is the Semantic Web?
  2. Semantic Modeling
  3. RDF – the Basis of the Semantic Web
  4. Semantic Web Application Architecture
  5. RDF and Inferencing
  6. RDF Schema Language
  7. RDFS-Plus
  8. Using RDFS-Plus in the Wild
  9. Basic OWL
  10. Counting and Sets in OWL
  11. Using OWL in the Wild
  12. Good and Bad Modeling Practices
  13. OWL Levels and Logic
  14. Conclusions
  15. Frequently Asked Questions

RDFS-Plus is RDFS with the addition of some handy bits of OWL (IFPs etc).

In conclusion, this is an approachable book for anyone with an interest in the field, and it gives excellent coverage of the Semantic side of the Semantic Web, as it pertains to modeling the real world. With the caveat that this is the scope of this book, I’d personally strongly recommend it. I do intend to read this book thoroughly, cover to cover. It is insightful writing, and as an occasional OWL user I’ll be keeping it on hand for the recipes.

See also: Henry’s [p]review

Lessons for Ontology Writers

I’m in a session called Taming the Open World, being run by Tim Swanson of Semantic Arts. I’m particularly interested in understanding how we can develop open world applications, and issue 2 of Nodalities contains an article by Nadeem Shabir on just this issue. Since I have power and wifi, which are both in scarce supply, I thought I’d take the opportunity to liveblog the session. Looks like it’s going to be a contrast to the Metaweb/Freebase tutorial earlier since we’re straight into OWL ontologies, running through the notation he’s going to be using. There are no standard notations for diagramming ontologies, which is pretty surprising considering the wealth of research activity in this area. It’s looking rather interesting… must see if I can find some examples on the web.

First example is of two classes: Contractor and Employee and a single instance “Joe” who is an Employee. A reasoner will actually create two interpretations of these three facts: one where Joe is an Employee but not a Contractor and one where Joe is both. As more assertions are added, exponentially more interpretations are generated by the reasoner for all possible combinations. The reasoner is looking at all possible solutions and will assume that any fact is potentially true unless it has been explicitly told otherwise. This is the open world assumption - anything unknown could be true or false and a reasoner has to consider both possibilities.
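To make the counting concrete, here is a minimal Python sketch of the Joe example. It is a toy brute-force enumeration, not how a real OWL reasoner works internally; the function name `interpretations` and the second individual "Ann" are my own inventions for illustration.

```python
from itertools import product

# Classes and the single asserted fact from the session's example.
CLASSES = ["Employee", "Contractor"]
ASSERTED = {("Joe", "Employee")}


def interpretations(individuals, classes, asserted):
    """Yield every true/false assignment of class memberships
    that agrees with the asserted facts (toy model only)."""
    atoms = [(i, c) for i in individuals for c in classes]
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        # Open world: anything not asserted may be true or false.
        if all(world[fact] for fact in asserted):
            yield world


joe_only = list(interpretations(["Joe"], CLASSES, ASSERTED))
# "Ann" is a hypothetical second individual with nothing asserted about her.
joe_and_ann = list(interpretations(["Joe", "Ann"], CLASSES, ASSERTED))

print(len(joe_only))     # 2: Joe is or is not also a Contractor
print(len(joe_and_ann))  # 8: each unknown membership doubles the count
```

Every membership that nothing has been said about doubles the number of interpretations, which is where the exponential growth comes from.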

A fact is provable if it is true in every possible interpretation. It is satisfiable if it is true in at least one interpretation. These are the two main uses of a reasoner: to prove a statement or to discover if a statement is possible. However, the huge number of possible interpretations massively complicates the problem. To make reasoning problems tractable we have to clean up the open world by removing facts that cannot possibly be true.
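The two definitions translate almost directly into code. The sketch below is a brute-force illustration under my own assumptions (the atom names and the helpers `worlds`, `provable` and `satisfiable` are hypothetical); real reasoners avoid enumerating interpretations like this.

```python
from itertools import product


def worlds(atoms, constraints):
    """Yield every truth assignment over atoms that meets all constraints."""
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(constraint(world) for constraint in constraints):
            yield world


def provable(statement, atoms, constraints):
    """True if the statement holds in every possible interpretation."""
    return all(statement(w) for w in worlds(atoms, constraints))


def satisfiable(statement, atoms, constraints):
    """True if the statement holds in at least one interpretation."""
    return any(statement(w) for w in worlds(atoms, constraints))


# Illustrative atoms: the only asserted fact is that Joe is an Employee.
ATOMS = ["Joe:Employee", "Joe:Contractor"]
FACTS = [lambda w: w["Joe:Employee"]]


def is_employee(w):
    return w["Joe:Employee"]


def is_contractor(w):
    return w["Joe:Contractor"]


print(provable(is_employee, ATOMS, FACTS))       # True
print(provable(is_contractor, ATOMS, FACTS))     # False
print(satisfiable(is_contractor, ATOMS, FACTS))  # True
```

Adding a constraint such as "nothing is both an Employee and a Contractor" to `FACTS` would make the last call return False, which is the kind of clean-up the session is about.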

Some techniques that can be used when writing ontologies:

  • Class disjointness - saying that two classes have no members in common such as a Living Thing and a Scheduled Event. This is especially useful with deep ontologies where the root classes are declared disjoint. This disjointness then cascades down the ontology tree to the more specific classes at the bottom eliminating many possible interpretations.
  • Domain and range - this makes the disjointness more effective by adding more information about instance types.
  • Individual differentness - OWL provides a differentFrom predicate, but you have to state pairwise that each individual is different from every other. Some of this can be inferred using functional properties: if two individuals have different values for a functional property then they can be inferred to be different individuals. We can also use inverse functional properties, but this is not possible with datatype properties, e.g. social security numbers. A workaround is to create a URI scheme for the value.
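Two of the techniques above can be sketched in a few lines of Python. This is a toy illustration, not a reasoner: Living Thing and Scheduled Event come from the talk, while the subclasses Person and Meeting, the `birth_year` data and both function names are hypothetical.

```python
from itertools import combinations, product

CLASSES = ["LivingThing", "ScheduledEvent", "Person", "Meeting"]
# Hypothetical subclasses hanging off the two root classes.
SUBCLASS_OF = {"Person": "LivingThing", "Meeting": "ScheduledEvent"}
DISJOINT = [("LivingThing", "ScheduledEvent")]


def memberships(disjoint):
    """Return every set of classes one individual could belong to."""
    results = []
    for values in product([True, False], repeat=len(CLASSES)):
        member = dict(zip(CLASSES, values))
        # A member of a subclass must also be a member of its superclass.
        subclass_ok = all(
            member[sup] or not member[sub] for sub, sup in SUBCLASS_OF.items()
        )
        # Disjoint classes can never share a member.
        disjoint_ok = all(not (member[a] and member[b]) for a, b in disjoint)
        if subclass_ok and disjoint_ok:
            results.append({c for c in CLASSES if member[c]})
    return results


open_world = memberships(disjoint=[])
pruned = memberships(disjoint=DISJOINT)
print(len(open_world), len(pruned))  # 9 5


def inferred_different(functional_values):
    """Pairs of individuals that must differ because they have
    distinct literal values for the same functional property."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(functional_values), 2)
        if functional_values[a] != functional_values[b]
    }


# Hypothetical data: one value per individual, as a functional property requires.
birth_year = {"joe": 1970, "ann": 1982, "joseph": 1970}
print(inferred_different(birth_year))
```

Declaring only the two root classes disjoint is enough to rule out anything being both a Person and a Meeting, which is the cascade described in the first bullet. Note that joe and joseph share a value, so nothing can be concluded about whether they are the same individual.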

Some more advanced techniques include stating that an individual does not have a particular property. To do this you have to create a class for the individual resource and define that class as the complement of things that have the property in question. You have to do that for every individual, a massive explosion of triples, but a corresponding reduction in possible interpretations.
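As a rough sketch of why this explodes, the Python below generates Turtle for one plausible encoding of the pattern: a per-individual class defined as the complement of "things with some value for the property". The encoding, the `ex:` namespace, the `manager` property and the function name `lacks_property` are all my assumptions, not something shown in the session.

```python
PREFIXES = """@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex: <http://example.org/> .
"""


def lacks_property(individual, prop):
    """Return Turtle stating that ex:<individual> has no value for ex:<prop>.

    Assumed encoding: a dedicated class that is the complement of an
    owl:someValuesFrom owl:Thing restriction on the property.
    """
    cls = f"ex:No_{prop}_{individual}"
    return (
        f"{cls} a owl:Class ;\n"
        f"    owl:complementOf [\n"
        f"        a owl:Restriction ;\n"
        f"        owl:onProperty ex:{prop} ;\n"
        f"        owl:someValuesFrom owl:Thing\n"
        f"    ] .\n"
        f"ex:{individual} a {cls} .\n"
    )


# Hypothetical individuals; every one of them needs its own block of triples.
individuals = ["joe", "ann"]
document = PREFIXES + "".join(lacks_property(i, "manager") for i in individuals)
print(document)
```

Each individual costs several triples for a single negative fact, so the output grows with the number of individuals times the number of properties you want to rule out.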

In the discussion after the session a few reasoner implementors were discussing some of these ideas. I learnt that a tableau reasoner will take all the URIs in a graph and combine them all to create all possible triples and then start eliminating them using the OWL constraints! I wonder what implications that has for Linked Data’s assignment of URIs to everything?

Working in the open world enforces a different kind of discipline in data modelling. You need to define what is not true as well as what is true. It’s best to work at the highest level possible which ends up being a supporting case for upper ontologies.