
11 May 2007

WWW2007 - Linked Data once again

Posted by Paul Miller at May 11, 2007 06:59 PM

[Photo: the view from Rob Styles' bedroom window in Banff]

It's Friday morning here in Banff, and almost the end of the line for Rob and me as we jump on the bus for Calgary, a flight home, a weekend of washing, and then the trip to Paris for XTech.

But first, one more session on a topic that's been something of a theme for the conference since Tim Berners-Lee's keynote two days ago; a theme that looks likely to carry through to XTech in my paper and a whole track-full of others.

The topic? Linked Data, of course.

“Friday, May 11, 2007
Linked Data (Coleman, 10:30am-noon)

Session Chair: Danny Ayers (Independent)

* Tim Berners-Lee (W3C): Tabulator: A Semantic Web Browser (25 mins)
* Christian Bizer (Freie University Berlin): Querying Wikipedia Like a Database (25 mins)
* Tom Heath (KMi, The Open University): How to Combine the Best of Web2.0 and a Semantic Web: Examples from Revyu.com (25 mins)”

“Independent”? Sounds like Danny's standing for Parliament! Still, there is a vacancy coming up...

First up, Tim Berners-Lee shows Tabulator. He polls the room to start, and finds that about 50% of this packed room considers itself 'familiar' with the Semantic Web and RDF. Fewer have used or seen Tabulator.

Some of his presentation is here.

Tim demonstrates using Tabulator to view and navigate the relationships between nuggets of data stored in Linked Data-friendly repositories such as DBpedia. Interestingly - and importantly - Tabulator displays the provenance of the individual data assertions, backing up the point from his keynote that an RDF triple is 'actually a quad', with the fourth element - provenance - being absolutely essential in building a trustworthy Data Web. This point came up several times in our session yesterday, too, as people grappled with issues of trust and authority in a linked network of assertions.
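To make the 'quad' idea concrete, here is a minimal Python sketch of assertions that carry their source alongside subject, predicate and object. It is not how Tabulator is implemented; the `Quad` type, the `values_by_source` helper, every `example.org` URI and the population figures are hypothetical, made up purely for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Quad(NamedTuple):
    """An RDF-style statement plus the document it was read from."""
    subject: str
    predicate: str
    obj: str
    source: str  # provenance: where this assertion came from


def values_by_source(quads, subject, predicate):
    """Group the values asserted for (subject, predicate) by their source."""
    grouped = defaultdict(set)
    for quad in quads:
        if quad.subject == subject and quad.predicate == predicate:
            grouped[quad.source].add(quad.obj)
    return dict(grouped)


# Hypothetical data: two made-up sources disagree about the same property.
TOWN = "http://example.org/resource/SomeTown"
POPULATION = "http://example.org/property/population"
QUADS = [
    Quad(TOWN, POPULATION, "6700", "http://example.org/source-a"),
    Quad(TOWN, POPULATION, "8300", "http://example.org/source-b"),
    Quad(TOWN, "http://example.org/property/country", "Canada",
         "http://example.org/source-a"),
]

# Show each source next to the value it asserts.
for source, values in sorted(values_by_source(QUADS, TOWN, POPULATION).items()):
    print(source, sorted(values))
```

With only the triple, the two population values would simply conflict. With the fourth element, a reader (or a browser) can see who said what and decide whom to trust.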

Tim “explains that the fundamental value is in RDF being a language for talking about data whereas XML is just a syntax for structuring documents” (thanks, Rob!).

Next up, Chris Bizer talks about DBpedia (PDF of a slightly older version of the presentation here). DBpedia is a community effort to extract structured information from Wikipedia, and make it available for linking across the web in RDF, under an open licence. The database currently contains 1,600,000 concepts, including almost 60,000 people, 70,000 places, etc. These concepts are described by 91,000,000 RDF triples, using more than 8,000 properties.

DBpedia is made available via a SPARQL endpoint, with a Linked Data interface that Semantic Web browsers such as Tabulator can interact with, and via various RDF data dumps that developers can download and host for themselves. The DBpedia data is licensed under the GNU Free Documentation License. We really need to crack this licensing of data thing, because people aren't doing it right, and it's only going to get worse.
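As a rough illustration of what talking to a SPARQL endpoint involves, the Python sketch below builds the GET URL for a small query without sending it. The endpoint address, the `build_sparql_url` helper and the resource URI in the query are my assumptions rather than anything shown in the talk; only the use of a `query` parameter comes from the SPARQL protocol itself.

```python
from urllib.parse import urlencode

# Assumed endpoint address; check the DBpedia site for the current one.
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"

# Illustrative query: the resource URI is a guess at how a Wikipedia
# article title would be mapped, not something taken from the talk.
LABEL_QUERY = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Banff,_Alberta> rdfs:label ?label .
}
LIMIT 10
"""


def build_sparql_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Print the URL rather than fetching it, so the sketch runs offline.
print(build_sparql_url(DBPEDIA_ENDPOINT, LABEL_QUERY))
```

Passing the resulting URL to an HTTP client would run the query; the result format the server returns depends on the `Accept` header the client sends.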

Chris points to the SWEO Linking Open Data project, suggesting that over 600,000,000 triples are already available via various activities of this project. He reckons there will be 30-40 billion triples within a few months, in DBpedia and in a wide range of other linked data projects capable of interoperating.

Freebase came up, and there was some discussion of the ways in which various versions of 'truth' can be meshed together between the growing number of large RDF data stores. This one is going to run and run, as various activities attempt to represent canonical views of a 'resource', and link it to facets drawn from a variety of third parties.

Finally, Tom Heath shows Revyu (presentation here). He's talking about combining the ease of participation offered by 'Web 2.0' with the linking opportunities offered by Semantic Web data. Unlike many existing review sites, Revyu seeks to reach out across the web, and aggregate data from third parties rather than forcing reviewers to review on the site, where the reviews remain locked away. Revyu offers an easy interface to encourage creation of reviews, and exposes the resulting data as open RDF for machine linking and reuse. The intention is to Creative Commons-license the content (licensing is currently implicit rather than explicit), but Tom referred to Rob's presentation yesterday, and some of the issues we've grappled with in developing the Talis Community License.

In Q&A, Kingsley suggests examining Revyu.com data in an RDF browser, rather than using the site's human interface; for example, by putting the URI for Tom's on-site profile into the OpenLink RDF browser's “Data Source URI” box.
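I don't know how the OpenLink browser works internally, but Linked Data clients commonly fetch a URI while asking for RDF through HTTP content negotiation. The Python sketch below builds such a request without sending it; the `rdf_request` helper and the `example.org` profile URI are hypothetical stand-ins, and the choice of `application/rdf+xml` is an assumption about what the server offers.

```python
from urllib.request import Request


def rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) a GET request that asks for RDF."""
    return Request(uri, headers={"Accept": media_type})


# Hypothetical profile URI, standing in for a real one.
request = rdf_request("http://example.org/people/some-reviewer")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Handing the request to `urllib.request.urlopen` would send it; a server that supports content negotiation would then return RDF rather than the HTML page a person sees.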

The time really is right for the community to get on and build the Semantic Web - the Data Web - for real: by exposing open - linkable - data to the Web, by facing up to the realities of licensing these data appropriately (the 'Talis Community License' is our contribution to that particular debate), and by taking existing tools forward to the point at which they will do more good than harm when exposed to an audience that isn't dominated by academic researchers studying the theoretical constructs behind the Semantic Web.

Now for lunch... and that bus.

The picture is another one of Rob Styles', showing the view from his bedroom window. Shunt your perspective two windows to the right to imagine the view I've had...


Comments

That is an awesome view from a window (or more).

Lots of info here, thx.

Posted by: BillyG at May 11, 2007 08:31 PM

Hi Paul,

Nice coverage of the session, thanks! Just a word of clarification:

"Revyu seeks to reach out across the web, and aggregate data from third parties rather than forcing reviewers to review on the site, where the reviews remain locked away"

Right now we don't actually aggregate review data from elsewhere, although this is in the pipeline. What we do do is provide a number of mechanisms (JS, RSS, crawlable RDF, SPARQL) by which people can reuse reviews they provide on Revyu, so we provide the reviewing interface and the storage, and people can resyndicate their reviews in reusable formats for use in their own sites/apps. Of course we also provide dereferenceable URIs for People, Reviews, Things, and Tags, meaning people can contribute to the Linked Data/Semantic Web by simply filling in a web form.

Posted by: Tom Heath at May 11, 2007 09:28 PM

Tom

clarification noted; thanks.

Posted by: Paul Miller at May 12, 2007 11:59 AM
