Nodalities

From Semantic Web to Web of Data

Postcode Paper: What you can do with the right data.

Last week, I met up with some folks who are building some amazing things with public data. After seeing their Postcode Paper project, I was left with the lasting impression that given the raw materials, there is very little hindrance to what can be built.

In the Postcode Paper, Tom Taylor, Gavin Bell, and Dan Catt brought together data from a whole bunch of online sources into a single resource which can be easily distributed to residents of a local neighbourhood. Also, because this proof-of-concept is a real newspaper (i.e. made from paper and everything), it bridges any digital divide and gives access to people who might not otherwise find such information online.

To me, the real brilliance behind the paper is the context it provides for your location. Through its simple newspaper metaphor of headings and sections, one can very quickly find a specific item, or absorb trivia by browsing the headings. So, for example, there is a section for “Healthcare” which provides a list of dentists, GPs, A&E services and the name of the Primary Care Trusts serving your area. Combining this kind of immediately useful information with some general facts about an area (crime rates and trends, green space, recycling centres and even allotment information) gives a profoundly well-informed picture of a given neighbourhood.

In a stroke of genius, the lads have added travel times from that postcode to a series of important destinations. So, the trip from E5 0JA to Oxford Circus takes 4 minutes via bicycle and 42 on public transport; and it’s 3 hours to Paris or Bristol on the train. There’s even a route map for local buses and Underground transport.

Part of the thinking behind it is that local authorities could print these every few months or so to send to newcomers. Imagine finding such a rich resource in your post after paying council tax for the first time! I’ve lived in my current town for 2 years now, and I don’t know about half the information this contains. It’s presented extremely clearly and in a very familiar format, so there is very little problem communicating across generational, cultural or potentially even linguistic divides. (Much of the information, such as journey times, doctors’ surgeries and crime rates, would need little translation.) It also doesn’t take much imagination to see additional features or benefits spinning off from this kind of service.

Put the paper online, with live updates of the information and widgets for transport. Add in some basic demographics (gender, age bracket, long-time resident or visitor), and you’ve got hugely flexible possibilities for providing an extremely clear UI to your community’s site. Tailoring some specifics, such as age, might let you see more information about local schools, for example, than about old-age care. With print-on-demand kiosks in local libraries and post offices, you could have an up-to-date snapshot of your neighbourhood whenever you need it. This could be used by school children for local projects (and if they can tailor the paper themselves, how much more exciting!). It could be an aid to public transparency, with clearly presented statistics such as crime rates and school standards. The list of ideas is endless, really.

That’s the vision, anyway. In reality, there are still some huge hurdles to cross before this kind of service could even begin to become a reality. This project took only a few days to put together, but the supremely brainy folk behind it have years of data management skills behind them. They focused on a single postcode, and much of the data needed had to be hand-scraped from various sites and files. The work needed to launch an on-demand service would be daunting indeed, because no local authority provides a unified point of access for this kind of information. Currently, if a council wanted to provide this kind of resource, researchers would have to go out and find the facts and figures from across the web (NHS sites, central and local government sites, education and reporting services, etc.), compile them and produce an individual layout for each postcode. And, if you’re an organisation interested in this, you would potentially be required to pay £1000s to access the basic building blocks: postcode lookups and survey boundary data. Needless to say, local authorities and businesses would be hard-pressed to find the time to build such papers to such a fine level of localisation.
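The compile step described above, gathering records from separate sources and grouping them under one postcode, might be sketched roughly as follows. Everything here is an assumption for illustration: the record layout, the function names `normalise_postcode` and `compile_paper`, and the sample entries are placeholders, not the Postcode Paper team’s actual method or data.

```python
from collections import defaultdict

# Placeholder records standing in for separately scraped sources.
# None of these entries are real data.
healthcare = [
    {"postcode": "E5 0JA", "section": "Healthcare", "item": "Example Dental Practice"},
    {"postcode": "e5 0ja", "section": "Healthcare", "item": "Example GP Surgery"},
]
recycling = [
    {"postcode": "E50JA", "section": "Recycling", "item": "Example Recycling Centre"},
]


def normalise_postcode(raw):
    """Upper-case a UK postcode and put one space before its last three characters."""
    compact = "".join(raw.upper().split())
    return compact[:-3] + " " + compact[-3:]


def compile_paper(*sources):
    """Group items from every source by postcode, then by newspaper section."""
    paper = defaultdict(lambda: defaultdict(list))
    for source in sources:
        for record in source:
            postcode = normalise_postcode(record["postcode"])
            paper[postcode][record["section"]].append(record["item"])
    # Return plain dictionaries so the result is easy to inspect or serialise.
    return {postcode: dict(sections) for postcode, sections in paper.items()}


paper = compile_paper(healthcare, recycling)
for section, items in paper["E5 0JA"].items():
    print(section, "-", ", ".join(items))
```

Even in this toy form, the sources disagree on how a postcode is written, which hints at why doing this by hand for every postcode would be so laborious.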

Any startup, application, or service wanting to offer localised information is up against some severe inclines. It takes little imagination to see this paper and similar applications taking off and providing huge benefit to where people spend most of their time: at home. However, I fear many of these innovations will remain in imaginations as long as so much of the material needed to build them remains locked away.

Trends and Barriers

This article first appeared in Nodalities Magazine, Issue 7

If you follow the Nodalities blog, you may have read some of my recent posts discussing the trends boiling up around Web 3.0 (other buzzwords are available). Three very broad trends are seeing a lot of media attention at present: the Mobile Web and upgraded connectivity in general; the rise of ubiquitous computing from chips in every product imaginable; and Linked Data and the “Semantic Web” as an organising platform for this rising tide of data. From where I’m standing, I tend to see the next great turning point of the Web as a convergence of some of these trends, and see it as a rise in the importance of and reliance upon data itself and data tools generally.

The mobile web is bringing new sorts of information to people, and they can make use of this info wherever they happen to be because of advances in devices and connectivity. As phones and web-enabled devices get better, so too do the chips we seem to have embedded all over the place, and we can now begin to have a clearer picture of what we do through the information we gather from our heaters, cars, and pedometers. Also, as more objects become connected, the grunt-work of number-crunching and storage is becoming commoditised into big, efficient, utility-like cloud services, which host and work with our collected information much more effectively than the gadget in your hand could ever hope to do. Others, like us here at Talis, talk about the Semantic Web, which allows for an evolution from a bunch of connected documents to the explicit connections between bits of information.

Also fermenting in this mix is a strengthening trend of political transparency and a public, shared ownership of social data. Barack Obama’s new administration has clearly made this a priority with the launch and work around data.gov; and in the UK, Sir Tim Berners-Lee himself has been appointed to a Parliamentary advisory role. There is growing pressure to be able to have access to public data, and to see it as belonging to the nation’s people rather than as something that may legitimately be filed away in the great, locked bureau of the capitols.

So, picking up two fairly obvious trends here, Social, Public Data and Linked Data, it would seem to follow that people would begin to have access to previously unavailable information in usable, linked forms. And it’s certainly beginning, as articles elsewhere in this magazine have illustrated. But, what about other chunks of public data? What about when data comes from universities, institutions, scientific foundations and NGOs? What about charities monitoring crime, CO2 emissions and family histories? Wouldn’t these make a useful piece in the web of social data? What resources have the governments themselves got, if they want to make their public-owned data available in a useful format?

These questions form a major part of the thinking behind Talis’ Connected Commons initiative (talis.com/cc). Basically, Talis has made its Semantic Web platform (including data hosting and access tools) available free of charge for any datasets made available to the public. In doing so, we’re hoping to remove the barrier of cost entirely to publishing interesting data in a Linked Data way. One major reason for this is to promote reuse and mashups of this interesting data, and for people to be able to “follow their noses” to the data that completes their projects. But, from a publisher’s perspective, this is important too, because it removes a major reason not to bother with making data useful as well as public. So, with this, data can be made public and usable, and the developers and users get the benefit of public SPARQL endpoints and API access to interesting data.
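To give a flavour of what a public SPARQL endpoint offers a developer, here is a minimal sketch of preparing a query request. It assumes only the standard SPARQL protocol convention of passing the query text in a `query` URL parameter; the endpoint address and the helper name `build_sparql_url` are hypothetical, and nothing here reflects the actual addresses or API of the Talis Platform.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address, used purely for illustration.
ENDPOINT = "https://example.org/store/services/sparql"

# A generic query: fetch any ten subject/predicate/object statements.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
""".strip()


def build_sparql_url(endpoint, query):
    """Return a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_url(ENDPOINT, QUERY)
print(url)
```

Fetching that URL from a real endpoint with any HTTP client would return the matching statements, which is the starting point for “following your nose” from one dataset to the next.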

To keep the data open and public, datasets need to make use of either the Public Domain Dedication and License (PDDL) or Creative Commons’ CC0 license. Ian Davis, in his article in this magazine, explains more about waivers and the Connected Commons, and there is a lot more about this particular initiative over on the Talis site (talis.com/platform/cc/faqs/).

In a recent interview with the BBC, Sir Tim said: “This is our data. This is our taxpayers’ money which has created this data, so I would like to be able to see it, please.” I wonder if initiatives such as Connected Commons will begin to remove excuses, hindrances, and obstacles? As public awareness of the importance of access grows, this might become a political issue, as well as a pragmatic one. I hope that in the rush to publish data, and in the ensuing discussion and debate, the users, hackers and developers don’t get sidelined. I think the world is ready for its data back.