Nodalities

From Semantic Web to Web of Data

License

Creative Commons License

Archive for the 'Trend' Category

A New Revolution

A colleague sharing their experience of visiting Ironbridge, promoted as “The Birthplace of the Industrial Revolution” helped clarify some thoughts I have been brewing to help convey where the current Linked Data enthusiasms and initiatives may lead us.

The famous Iron Bridge, opened in 1781, spans the River Severn in Shropshire, England. To quote Wikipedia: “It was the first arch bridge in the world to be made out of cast iron, a material which was previously far too expensive to use for large structures. However, a new blast furnace nearby lowered the cost and so encouraged local engineers and architects to solve a long-standing problem of a crossing over the river.” The raw materials of iron ore and coal had been known for a long time, but it took the building of a nearby furnace, using the innovation of coke as a fuel, to enable the local community to invest in the construction. The outcome not only stimulated the local commercial and administrative economy; the bridge also became an 18th-century tourist attraction, which it continues to be today.

All very interesting, but what has this to do with Linked Data and its future?

The impact of Linked Data and the Web of Data it enables, on the way we interact and do business, will be greater than that of the World Wide Web that it builds upon.

When you make statements like that, you are often asked to justify yourself. As you may know, I like to use analogies to help clarify things, and I believe the Industrial Revolution is a good one in the case of the future for Linked Data and associated techniques. I am also very aware that analogies tend to fall apart if you pick at the detail too much, so please bear with me on this one.

Like the Industrial Revolution, Linked Data is building on what went before. Before the Iron Bridge, there were other bridges, roads, and uses of iron. Before Linked Data there was, and still is, the Web: a globally distributed web of linked human-readable web pages, upon which are surfaced words and images for our information, entertainment and commercial desires. Data of course plays its part, often powering the websites that we all consume.

So what is so special about a Web of Data? The data comes out from behind those websites to be linked with other data across the web, or maybe an intranet. Using the same technique that links pages together (the URL), data items are given URIs as identifiers. This means that a piece of data is given an identifier that is addressable across the web and therefore linkable with other data identified in a similar way.
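
To make that a little more concrete, here is a minimal sketch of two sources describing the same thing under one shared URI. Everything in it is an assumption for illustration: the example.org URIs, the property names and the `merge` helper are hypothetical, and the statements are made up rather than taken from any real dataset.

```python
# Hypothetical URIs under example.org; no real dataset is being quoted.
OTTER = "http://example.org/id/european-otter"
DESCRIPTION = "http://example.org/terms/description"
HABITAT = "http://example.org/terms/habitat"

# Source A: an encyclopaedia-style publisher describing the animal.
encyclopaedia = [
    (OTTER, DESCRIPTION, "A semiaquatic mammal."),
]

# Source B: a broadcaster-style publisher using the same identifier.
broadcaster = [
    (OTTER, HABITAT, "Rivers"),
    (OTTER, HABITAT, "Wetlands"),
]


def merge(*datasets):
    """Group (subject, predicate, object) statements by subject URI."""
    merged = {}
    for dataset in datasets:
        for subject, predicate, obj in dataset:
            merged.setdefault(subject, {}).setdefault(predicate, []).append(obj)
    return merged


# Because both sources use the same URI, their statements line up
# without any matching on names or spellings.
linked = merge(encyclopaedia, broadcaster)
```

The point of the sketch is only that the shared identifier does the joining; neither publisher needs to know about the other in advance.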

So where does the Industrial Revolution analogy kick in? Well, once data are identifiable in a globally distributed context, they can be linked, mixed, mashed, and generally used to add value to each other. Your data can become the raw material for someone else’s process: your Wikipedia comment about an animal can become the description on a data-powered BBC page about that species. As with coal, which after some refinement can become coke to be used to add value to the iron smelting process, any published data can be the raw material for value-adding or combining processes. The processor uses their knowledge, skills, and experience to produce an alloy of data, the combination of which is greater than the sum of its parts.

In the same way that some freely available elements, such as the air pumped in to that blast furnace, were needed to get the process going, freely and openly available data, such as governments and the media are publishing, are priming the pumps of a data revolution.

Whenever there is value to be added in a process there is both community and commercial opportunity.  Once people start using their skill and understanding of a facet of knowledge, to link data from one free, or commercial, source with more free or commercial data they can produce either a saleable result, and/or an enhancement to their own services.  The output of one value-add process can then become one of the sources for yet another, and so on.

To finally stretch my analogy just a little further: looking back to those early days in the Severn Valley, it is possible to identify the building-blocks that led to commercial steel production, the age of steam, the automobile industry, and space flight, most of which would have been unthinkable to those early pioneers. Pre-1994, could we have predicted the growth of Google, YouTube, Wikipedia, and Twitter? In 2010 can we identify the building-blocks of a data revolution? I think maybe we can.

So how will such a revolution, underpinned by Linked Data, change the way we interact and do business, more fundamentally than the Web has? -  By creating whole new communities and industries to connect, supply, trade, enhance, distribute, interpret, and build services and applications upon a supporting web of globally available data elements and alloys.

We’re excited

The Talis offices, for the past few weeks, have been awash with geeky excitement: that kind of near-giddy excitement that comes with eager expectation. We’ve all been waiting for something important.

For some, this was no doubt augmented with the announcement of Steve’s new iPad; but that’s not what’s gotten us all worked up.

For months, we’ve been looking forward to the launch of data.gov.uk; and last week, the wraps finally came off. The official press release put it:

A major new website has been launched to the public which gives anyone who wants to use it unprecedented and free access to government data in one place.

This doesn’t quite capture the coolness of the launch, for me. Yes, it’s a major new website, and its point is to publish information. But the exciting thing is that this information is being published as data: data that can be used, reused, remixed and enriched. Sir Tim Berners-Lee’s perspective was more exciting:

Making public data available for re-use is about increasing accountability and transparency and letting people create new, innovative ways of using it. Government data should be a public resource. By releasing it, we can unlock new ideas for delivering public services, help communities and society work better, and let talented entrepreneurs and engineers create new businesses and services.

The point is that this public resource is finally getting a home on the web, and an infrastructure to make it not just available, but useful.

The exceptional team behind data.gov.uk have striven to adhere to web standards in its production: including Linked Data as a priority, as Professor Nigel Shadbolt explained:

We are also going to increase the use of ‘Linked Data’ standards, which allows people to provide data in a way that is as flexible and easy-to-use as possible.

Back in November, Leigh Dodds wrote a post explaining how we’ve been involved, and there’s an official Talis Platform press release too. Basically, we’ve been working with the data.gov.uk team to help with the Linked Data part of the site—hosting the SPARQL endpoints and providing consultancy and training, for example.

I can confidently say that we’re very proud of data.gov.uk, the team behind it, and our involvement with it. We’re excited by the prospect of this data being used as raw material for clever people to make interesting, useful, even world-changing things with it. We’ve seen the beginnings and proof-of-concept projects already.

Now comes the really exciting stuff. What are you going to build?

Image: “Yay for happy days!” by le vent le cri via flickr (CC: By)

Postcode Paper: What you can do with the right data.

Last week, I met up with some folks who are building some amazing things with public data. After seeing their Postcode Paper project, I was left with the lasting impression that given the raw materials, there is very little hindrance to what can be built.

In the Postcode Paper, Tom Taylor, Gavin Bell, and Dan Catt brought together data from a whole bunch of online sources into a single resource which can be easily distributed to residents of a local neighbourhood. Also, because this proof-of-concept is a real newspaper (i.e. made from paper and everything), it bridges any digital divide and gives access to people who might not otherwise find such information online.

To me, the real brilliance behind the paper is the context it provides for your location. Through its simple newspaper metaphor of headings and sections, one can very quickly find something exactly, or absorb trivia by browsing the headings. So, for example, there is a section for “Healthcare” which provides a list of dentists, GPs, A&E services and the name of the Primary Care Trusts serving your area. Combining this kind of immediately useful information with some general facts about an area (crime rates and trends, green-space, recycling centres and even allotment information) gives a profoundly well-informed picture of a given neighbourhood.

In a stroke of genius, the lads have added travel times from that postcode to a series of important destinations. So, from E5 0JA to Oxford Circus takes 4 minutes via bicycle and 42 on public transport; and it’s 3 hours to Paris or Bristol on the train. There’s even a route-map for local buses and Underground transport.

Part of the thinking behind it is that local authorities could print these every few months or so to send to newcomers. Imagine finding such a rich resource in your post after paying council tax for the first time! I’ve lived in my current town for 2 years now, and I don’t know about half the information this contains. It’s presented extremely clearly and in a very familiar format, so there is very little problem communicating across generational, cultural or potentially even linguistic divides. (Much of the information, such as journey times, doctors’ surgeries and crime rates, would need little translation.) It also doesn’t take much imagination to see additional features or benefits spinning off from this kind of service.

Put the paper online, with live updates of the information and widgets for transport. Add in some basic demographics (gender, age bracket, long-time resident or visitor), and you’ve got hugely flexible possibilities for providing an extremely clear UI to your community’s site. Tailoring some specifics, such as age, might let you see more information about local schools, for example, than about old-age care. With print-on-demand kiosks in local libraries and post offices, you could have an up-to-date snapshot of your neighbourhood whenever you need. This could be used by school children for local projects (and if they can tailor the paper themselves, how much more exciting!). It could be an aid to public transparency with clearly presented statistics like crime and school standards rates. The list of ideas is endless, really.

That’s the vision, anyway. In reality, there are still some huge hurdles to cross before this kind of service could even begin to become a reality. This project took only a few days to put together, but the supremely brainy folk behind it have years of data management skills behind them. They focused on a single postcode, and much of the data needed had to be hand-scraped from various sites and files. The work needed to launch an on-demand service would be daunting indeed, because no local authority provides a unified point of access for this kind of information. Currently, if a council wanted to provide this kind of resource, researchers would have to go out and find the facts and figures from across the web (NHS sites, central and local government sites, education and reporting services, etc.), compile them and produce an individual layout for each individual postcode. And, if you’re an organisation interested in this, you would potentially be required to pay £1000s to access the basic building-blocks: postcode lookups and survey boundary data. Needless to say, local authorities and businesses would be hard-pressed to find the time to build such papers to such a fine level of localisation.

Any startup, application, or service wanting to offer localised information is up against some severe inclines. It takes little imagination to see this paper and similar applications taking off and providing huge benefit to where people spend most of their time: at home. However, I fear many of these innovations will remain in imaginations as long as so much of the material needed to build them remains locked away.

Open and Closed Case

So, we’ve been banging on about opening up access to public data for a while. Talis has put its money where its mouth is and helped to fund the PDDL to give organisations a legal framework for dedicating their data to the public domain. (We’ll even host open data for free on the Platform under the Connected Commons.) We see the benefit of open data as shared innovation, and many projects exist which make use of this data for scientific, journalistic, entertaining and just plain useful purposes. We’ve been seeing a strong trend towards removing silos and encouraging reuse of information resources, to the point that we’ve begun to create our own jargon around open access. This is great, and even governments are beginning to see the benefit of this with projects like data.gov and Sir Tim Berners-Lee’s advisory appointment in the UK.

But there is another side to this story of opening up and sharing our data. Where there is open, there is an implied “closed” too. Some closed data is absolutely necessary: you wouldn’t argue that your recent financial transactions are data I should have a right to pry into, for an obvious example. There is a lot of hidden data necessary to run applications and to make a profit, and it is entirely right that this should be the case.

But a recent case here in the UK has illustrated the point that if open data encourages innovation, closing down data can quash it. The Royal Mail recently sent cease and desist letters to the directors of ernestmarples.com, who had been providing online services with a set of APIs to turn UK postcodes into location information. This provision had enabled the building of services which, for example, let people look for jobs in their area, and monitor and map political leaflet claims. The Royal Mail charges £4000 to make use of its official list of postcodes, and wasn’t happy with ernestmarples.com providing postcode data for free. (ernestmarples.com did not license the data, but scraped it from other sites, apparently.) As soon as ernestmarples.com stopped providing the lookup, all the services built on the data were stopped too. So, in effect, the data was enclosed again behind the barrier of a steep paywall and legal action.

There is a lot of discussion about whether the UK postcode data should be free anyway. It was funded by public funds, for one thing, and it only generates around £11 million annually for Royal Mail. The subscription rate is high for startups or non-profits, especially when compared with the Zip-code data in the United States, which I found out only costs $500/year to purchase. {1} It could also be argued that the steep pricing is an archaic throw-back to a time when such services cost a lot to provide, so prices needed to be high in order to recover costs. But this reverse peppercorn rent can no longer be valid, and £4000 must certainly be an order of magnitude (or two) higher than the cost of provision.

There is a lot to discuss about specific datasets like this, and they may need to be tried legally and publicly before all the details are sorted out, but this case is about as illustrative as possible of the principle of encouraging innovation. A single, simple and non-charging service provided a framework for thousands of users for mostly socially beneficial aims. Imagine the impact if hundreds of source-services had access to postcode data. Perhaps tens of thousands of users could look for employment, or track their local governmental organisations’ progress. Who knows what else might have been developed? It doesn’t take a huge leap of imagination to envision services tailored to your very local locality, does it? Just as easily, though, the enclosure of a single database has cut off a huge network of potential innovation.

The Guardian has covered the story too, if you want more details.

Photo: “Open/Close” by mag3737 via flickr, Creative Commons License

{1} I’m not entirely sure about the licensing of the Zip-code data, but the representative I spoke with at USPS said you can purchase the 5-digit Zipcode product for $500/a.

Interesting semantic web stuff

By Tom Scott
This guest post originally appeared on Tom Scott’s blog; republished under Creative Commons License, and with kind permission of the author.

It’s starting to feel like the world has suddenly woken up to the whole Linked Data thing, and that’s clearly a very, very good thing. Not only are Google (and Yahoo!) now using RDFa, but a whole bunch of other things are going on, all rather exciting. Below is a round-up of some of the best. But if you don’t know what I’m talking about you might like to start off with TimBL’s talk at TED.

TimBL is working with the UK Cabinet Office (as an advisor) to make our information more open and accessible on the web [cabinetoffice.gov.uk]
The blog states that he’s working on:

  • overseeing the creation of a single online point of access and work with departments to make this part of their routine operations.
  • helping to select and implement common standards for the release of public data
  • developing Crown Copyright and ‘Crown Commons’ licenses and extending these to the wider public sector
  • driving the use of the internet to improve consultation processes.
  • working with the Government to engage with the leading experts internationally working on public data and standards

The Guardian has an article on the appointment.

Closer to home there have been a few interesting developments

Media Meets Semantic Web – How the BBC Uses DBpedia and Linked Data to Make Connections [pdf]
Our paper at this year’s European Semantic Web Conference (ESWC2009) looks at how the BBC has adopted semantic web technologies, including DBpedia, to help provide a better, more coherent user experience. We won best paper of the in-use track for it; congratulations to Silver and Georgie.

The BBC has announced a couple of SPARQL endpoints, hosted by Talis and OpenLink [welcomebackstage.com]
Both platforms allow you to search and query the BBC data in a number of different ways, including SPARQL — the standard query language for semantic web data. If you’re not familiar with SPARQL, the Talis folk have published a tutorial that uses some NASA data.
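
For readers who have never seen one, here is a rough sketch of how a SPARQL query is typically sent to an endpoint over HTTP: the query text travels in a `query` parameter, as the SPARQL protocol describes. The endpoint address below is a placeholder on example.org (an assumption, not one of the BBC endpoints), the query is a deliberately generic "show me ten statements", and `build_request_url` is a hypothetical helper name. Nothing is actually fetched.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "http://example.org/sparql"

# A generic query: return any ten subject/predicate/object statements.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_request_url(endpoint, query):
    """Return a GET URL carrying the query text in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

Fetching `url` with any HTTP client would then return the results, in whatever formats the particular endpoint supports.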

A social semantic BBC? [slideshare]
Nice presentation from Simon and Ben on how social discovery of content could work… “show me the radio programmes my friends have listened to, show me the stuff my friends like that I’ve not seen” all built on people’s existing social graph. People meet content via activity.

PricewaterhouseCoopers’ spring technology forecast focuses on Linked Data [pwc.com]
“Linked Data is all about supply and demand. On the demand side, you gain access to the comprehensive data you need to make decisions. On the supply side, you share more of your internal data with partners, suppliers, and—yes—even the public in ways they can take the best advantage of. The Linked Data approach is about confronting your data silos and turning your information management efforts in a different direction for the sake of scalability. It is a component of the information mediation layer enterprises must create to bridge the gap between strategy and operations… The term “Semantic Web” says more about how the technology works than what it is. The goal is a data Web, a Web where not only documents but also individual data elements are linked.”
Including an interview with me!

You should also check out…

sameas.org a service to help link up equivalent URIs
It helps you to find co-references between different data sets. Interestingly it’s also licensed under CC0 which means all copyright and related or neighboring rights are waived.
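
As a rough illustration of what finding co-references involves, the sketch below groups URIs into bundles of equivalents from a list of "these two are the same thing" statements, using a small union-find. This is not how sameas.org itself is implemented (the post doesn't say); the `bundles` helper and the example.org, example.com and example.net URIs are made up for the purpose.

```python
def bundles(pairs):
    """Group URIs into sets of equivalents from (uri_a, uri_b) statements."""
    parent = {}

    def find(uri):
        # Follow parent links to the representative URI of the group.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # shorten the path as we go
            uri = parent[uri]
        return uri

    # Each equivalence statement merges two groups into one.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect every known URI under its group's representative.
    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


# Hypothetical equivalence statements between three made-up datasets.
statements = [
    ("http://example.org/a/london", "http://example.com/b/London"),
    ("http://example.com/b/London", "http://example.net/c/2643743"),
    ("http://example.org/a/paris", "http://example.com/b/Paris"),
]
result = bundles(statements)
```

Note that the first and third London URIs end up in the same bundle even though no statement links them directly, which is much of the appeal of a co-reference service.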


Image: “Semantic Web Rubik’s Cube” by dullhunk, CC License, via flickr