Nodalities

From Semantic Web to Web of Data

Cindy Ché and other interesting people

Last Friday a few of us went down to the HP Labs offices in Bristol for a great free event hosted by Andy Seaborne - including free lunch. Free food always seems to make a difference, or is that just me?

Nadeem’s blogged some of the sessions in detail over at Virtual Chaos and Ian Davis over at Internet Alchemy gives his own perspective. Other people covering off much of the detail is one of the great benefits of leaving the blogging for a few days after the event ;-)

A few of us are down at HP Labs in Bristol. Andy Seaborne is hosting a great free event for those interested in Semantic Web developments in the UK.

After Andy’s welcome, Ian opened the day with a presentation on where we’ve got to with the Talis Platform - over the past 3 years we’ve come a very long way, as you can see from Ian’s Slides and Nadeem’s Summary. Our platform is an example of PaaS (Platform as a Service) - that is, we hope to do the heavy lifting of managing large volumes of data, indexing it, making sure it’s backed up and so on, so you can concentrate on building applications. That’s a message that seemed to go down really well, with lots of people grabbing us for more information during breaks and lunch.

For the rest of the day there were a good handful of very interesting sessions from a whole host of people trying to do real, practical things with semantic web technologies.

There were a few things that seemed to stand out as threads through the day. A lot of people are using Jena; Redland got a couple of mentions, but mostly it was Jena. I had a great chat with Chris Dollin and it’s obvious that they take great pride in Jena, not only in the codebase and what it can do but also in the developer and user community that has formed around it. There was also a lot of interest in ontologies, with people focussing on the use of ontologies to assist in user-interactions and various people mapping overlapping ontologies to allow semantic relationships to be recognised between disparate datasets.

In essence this was about people starting to do very real things, a point emphasised by Alberto Reggiori of @semantics when in one slide he announced that RDF is dead, only to have it resurrected 3 days later, complete with a slide featuring the risen Christ - award for best laugh of the day goes to Alberto. To hear more from Alberto, listen to the podcast he just did with us.

Most worthy project of the day has to go to Health-e-Child, a project that is helping paediatric medical research by providing federated search services across medical data at several participating European hospitals. The hospitals have to keep their own data, due to confidentiality concerns, and this data is in any number of different schemas with varying vocabularies. Ontologies feature heavily in what Peter Bloodsworth has been doing and it will be interesting to see how this project progresses. It’s great to hear more from him in his podcast with us.

The back-channel chat on IRC (#swig on irc.freenode.net) was, as usual, a light-hearted and useful tool, with people sharing the links from the presentations in almost real-time. It even resulted in Sir Tim pitching in with a correction for Ordnance Survey’s site:

14:40:24 <timbl> ooops http://www.ordnancesurvey.co.uk/oswebsite/ontology/SpatialRelations.owl is Content-Type: application/octet-stream
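
The problem timbl is flagging is that an OWL file served as application/octet-stream looks like an opaque binary download to an RDF client. As a rough illustration only, here is a small Python sketch of the kind of check a client might make on the header. The helper name `looks_like_rdf` and the short list of media types are my own assumptions, not anything shown on the day, and the list is not exhaustive.

```python
# Media types commonly used for RDF serialisations.
# Assumption: this short list is illustrative, not complete.
RDF_MEDIA_TYPES = {
    "application/rdf+xml",
    "text/turtle",
    "application/n-triples",
    "text/n3",
}


def looks_like_rdf(content_type_header):
    """Return True if a Content-Type header value names an RDF media type.

    Parameters after ';' (such as charset) are ignored, and the
    comparison is case-insensitive.
    """
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type in RDF_MEDIA_TYPES


print(looks_like_rdf("application/octet-stream"))            # False
print(looks_like_rdf("application/rdf+xml; charset=utf-8"))  # True
```

The real fix is on the server side, of course: configure it to send an RDF media type for .owl files rather than the generic default.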

IMG_0621.JPG

Catherine Dolbear did a great job of describing the ways in which OS are playing with small RDF datasets. With Ordnance Survey’s current business model (which she stated was unlikely to change without Government changing it for them) the data is their crown jewels, so you don’t get to play with new technologies with it, especially when they’d be talking about more than ten billion triples. They have published some ontologies, see timbl quote above, but unfortunately these have been released under a cc-nc-sa license, making it hard for them to be widely adopted. In questions later she told me that wasn’t something they would change. Unwittingly, Catherine also provided us with the great slide on the right.

The point of the slide was to indicate the complexity of some of the queries that geo-data requires to be useful, things like “inside” - it just made me laugh inside that “Every island is a kind of land that is surrounded by water” constituted a complex statement. That little laugh, of course, belies much of the problem we have developing the semantic web - stating the bleeding obvious in ways that are complete and unambiguous.
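
To see why such an obvious sentence counts as complex, it helps to spell out what a machine has to be told. The following is a toy Python sketch, not the Ordnance Survey ontology: the names `is_a`, `surroundedBy`, `Land` and `Water`, and the handful of facts, are all made up for illustration. It splits the sentence into its two conditions and checks things against them.

```python
# "Every island is a kind of land that is surrounded by water."
# Toy (subject, predicate, object) statements; every name is illustrative.
facts = {
    ("Lundy", "is_a", "Land"),
    ("Lundy", "surroundedBy", "BristolChannel"),
    ("BristolChannel", "is_a", "Water"),
    ("Dartmoor", "is_a", "Land"),
}


def has_type(thing, type_name):
    """True if the facts state that `thing` is of type `type_name`."""
    return (thing, "is_a", type_name) in facts


def meets_island_conditions(thing):
    """Check the two conditions the sentence requires of every island:
    it is land, and it is surrounded by something that is water."""
    if not has_type(thing, "Land"):
        return False
    return any(
        s == thing and p == "surroundedBy" and has_type(o, "Water")
        for (s, p, o) in facts
    )


print(meets_island_conditions("Lundy"))     # True
print(meets_island_conditions("Dartmoor"))  # False
```

Even this tiny version has to make choices the English sentence glosses over, such as whether being surrounded by one body of water is enough, which is exactly the "complete and unambiguous" problem.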

As a nice counter-point to Catherine’s presentation, Richard Cyganiak presented on Sindice (or Cindy Ché as it came up on the #swig back-channel). Sindice is pulling in data from Linked Data sources such as DBpedia, GeoNames and everyone’s FOAF files and indexing them in a semantic search engine offering. What makes the nice contrast with Catherine’s presentation is the scale: 20M+ documents, 80M+ URIs, 4M+ IFPs, 2B+ triples - that’s 2 billion triples. Indexing uses Solr, and there’s some Hadoop in there for parallel data processing.

It’s great to see the UK semweb community thriving like this. Get-togethers are so important in allowing people the time to do show ‘n’ tell to their friends and peers. Perhaps we should organise another one soon, making sure to find a good caterer for lunch of course.
