Nodalities

From Semantic Web to Web of Data
License

Creative Commons License

Author Archive

JISC calls for Linked Data projects… Talis can help

JISC calls for Linked Data projects…

Back in December, I met up with the Semantic Technologies Working Group at JISC to talk a bit about the rise of Linked Data and have a high-level look at who’s been doing what. It was a great talk, from my perspective, because I was speaking to a room full of folk who knew EXACTLY what I meant. Instead of stumbling over explaining basic principles, we were all able to have a pretty healthy discussion about the big picture—looking at companies and organisations who’ve told their stories in Nodalities Magazine, for example. I left with the impression that JISC certainly has its eye on the Linked Data ball, as it were.

My impression has been strengthened recently, as I read their commissioned Linked Data “horizon scan” paper published by Paul Miller over on Cloud of Data. The Horizon Scan makes several recommendations for further investigation into Linked Data for Higher Education, the gist of which is that JISC should keep an eye out for good use cases and engage the Linked Data community where it can to learn more.

Then, as a couple hundred folk gathered at the second Linked Data Meetup in London, JISC announced that it’s putting £750,000 up: “…for projects to make content available on the Web working using linked data approaches.” JISC is calling for Higher Education projects to build Linked Data!

Talis can help

So, it looks like there is some alignment with our purpose here at Talis, then. We often talk about building the web of Linked Data, and we’ve been pushing projects and building stuff to make that happen. Now, it’s your turn…

The deadline for receipt of proposals in response to this call is 12 noon UK time on Tuesday 20 April 2010.

So, there isn’t much time to get proposals in. One way we can help is to host any Linked Data needed for a project on the Platform through the Connected Commons initiative. As we’ve reviewed here before, Talis will host any public data as Linked Data in the Platform. By public, we simply mean rights-waived (using PDDL or CC0) so it can be reused. The Platform hosts data online, and will also give you a SPARQL endpoint and RESTful API for rapid development on top of your new Linked Data.
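To give a flavour of what "rapid development" against a SPARQL endpoint can look like, here is a minimal sketch in Python. The endpoint URL and store name are made up for illustration (check the developers’ wiki for the real address of your store); the only thing assumed about the service is that it accepts a query in the standard `query` parameter of a GET request, as the SPARQL Protocol describes.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL, for illustration only; substitute your store's real one.
ENDPOINT = "http://example.org/stores/my-store/services/sparql"

# A deliberately generic query: the first ten triples in the store.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the standard `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

Fetching that URL with any HTTP client should return the results. The sketch stops at building the URL so that it runs without a network connection or a real store.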

The other thing we can do is to provide free developer licenses for working with the Platform and the API. There is also an extensive archive of documentation over on the developers’ wiki. Let us know what you’d like to build, and we’ll see if we can help. We’re keen to see more projects surfacing Linked Data, and it’s exciting to see what you will be building!

Finally, I’d love to hear about your projects. I can tell your Linked Data story in Nodalities Magazine, or perhaps as a podcast—whether you’re using the Platform or not. It’s great to share success stories with the wider community, and this should provide many good stories!

JISC funds Higher Education projects in the UK, and their full eligibility criteria are up on their call post.

Open… and Mobile?

I know what you’re thinking: “He’s going to say Data!”

Well, I might do at some point, but I was going to say “Days”. Last month, Talis flung open its doors to 30 or so folk who were interested in SPARQL, the Semantic Web and Linked … er, Data. The idea was to host an informal event for folks to learn about much of what we’ve been talking about for the past few years. We planned some talks on what it means to join up your data, what this Platform is about, and a detailed introduction to SPARQL. With the launch of data.gov.uk and many of the stories covered over in the Magazine, it seemed possible that people were starting to get interested in this whole Linked Data scene.

So, we sent out some invites and tweeted a bit, and soon had to cap the registration numbers. We filled up spaces in the January day not long after New Year, and the February day not long after the January one. March is quickly filling up too (hint). I have to admit, I wasn’t expecting this many people to express an interest so soon. Not only did people sign up, but they travelled to Birmingham through adverse weather to come and take part at both ‘Days, and we’ve had a lot of fun.

One thing that seemed to be a good idea was to ask for feedback before the event. It sounds wrong, but the point of an Open Day is to cover things that YOU’re interested in learning or exploring. So, when people registered, they were asked for their expectations and what they’d like to take away with them from such an event—aside from a T-shirt and SPARQL mug, obviously. It made it much easier to work out what we should cover, and I hope it meant that we were able to talk about the things most relevant to the people who came along.

I’d like to do it again, but slightly differently. Instead of hosting an Open Day here at Talis HQ, what if we came to you? Would you be interested in attending a Talis Platform Roadshow? What would you want us to cover? More importantly, where would you like us to go?

Comments below, or email me or tweet me.

Martin Belam Talks with Talis

In this Nodalities Podcast, I talk with blogger and Guardian information architect Martin Belam. I’ve run into Martin at a few Linked Data events where the news and media industries have had a high profile (including the recent News Media Summit, and News Innovation conference last year). Martin has an interest in Linked Data, and an interesting perspective on where it fits in with News, both as a tool for journalism and research and as a resource for the industry.

Also mentioned:
Guardian Open Platform

We’re excited

The Talis offices, for the past few weeks, have been awash with geeky excitement: that kind of near giddy excitement that comes with eager expectation. We’ve all been waiting for something important.

For some, this was no doubt augmented with the announcement of Steve’s new iPad; but that’s not what’s gotten us all worked up.

For months, we’ve been looking forward to the launch of data.gov.uk; and last week, the wraps finally came off. The official press release put it:

A major new website has been launched to the public which gives anyone who wants to use it unprecedented and free access to government data in one place.

This doesn’t quite capture the coolness of the launch, for me. Yes, it’s a major new website, and its point is to publish information. But, the exciting thing is that this information is being published as data: data that can be used, reused, remixed and enriched. Sir Tim Berners-Lee’s perspective was more exciting:

Making public data available for re-use is about increasing accountability and transparency and letting people create new, innovative ways of using it. Government data should be a public resource. By releasing it, we can unlock new ideas for delivering public services, help communities and society work better, and let talented entrepreneurs and engineers create new businesses and services.

The point is that this public resource is finally getting a home on the web, and an infrastructure to make it not just available, but useful.

The exceptional team behind data.gov.uk have striven to adhere to web standards in its production: including Linked Data as a priority, as Professor Nigel Shadbolt explained:

We are also going to increase the use of ‘Linked Data’ standards, which allows people to provide data in a way that is as flexible and easy-to-use as possible.

Back in November, Leigh Dodds wrote a post explaining how we’ve been involved, and there’s an official Talis Platform press release too. Basically, we’ve been working with the data.gov.uk team to help with the Linked Data part of the site—hosting the SPARQL endpoints and providing consultancy and training, for example.

I can confidently say that we’re very proud of data.gov.uk, the team behind it, and our involvement with it. We’re excited by the prospect of this data being used as raw material for clever people to make interesting, useful, even world-changing things with it. We’ve seen the beginnings and proof-of-concept projects already.

Now comes the really exciting stuff. What are you going to build?

Image: “Yay for happy days!” by le vent le cri via flickr (CC: By)

Stuart Harrison Talks with Talis about Lichfield and Public Data

In a first for the Platform, we’ve done a video podcast. In it, I talk with Stuart Harrison (@pezholio), webmaster at Lichfield District Council.

We talk about what local authorities here in the UK can do with open data, how Linked Data plays an important role, and what more is needed for local government to provide better web-based services.

This conversation was recorded on 27th October.
For other Talis podcasts in this Nodalities series, see here

Postcode Paper: What you can do with the right data.

Last week, I met up with some folks who are building some amazing things with public data. After seeing their Postcode Paper project, I was left with the lasting impression that given the raw materials, there is very little hindrance to what can be built.

In the Postcode Paper, Tom Taylor, Gavin Bell, and Dan Catt brought together data from a whole bunch of online sources into a single resource which can be easily distributed to residents of a local neighbourhood. Also, because this proof-of-concept is a real newspaper (i.e. made from paper and everything), it bridges any digital divide and gives access to people who might not otherwise find such information online.

To me, the real brilliance behind the paper is the context it provides for your location. Through its simple newspaper metaphor of headings and sections, one can very quickly find something exactly, or absorb trivia by browsing the headings. So, for example, there is a section for “Healthcare” which provides a list of dentists, GPs, A&E services and the name of the Primary Care Trusts serving your area. Combining this kind of immediately useful information with some general facts about an area (crime rates and trends, green-space, recycling centres and even allotment information) gives a profoundly well-informed picture of a given neighbourhood.

In a stroke of genius, the lads have added travel times from that postcode to a series of important destinations. So, from E5 0JA to Oxford Circus takes 4 minutes via bicycle and 42 on public transport; and it’s 3 hours to Paris or Bristol on the train. There’s even a route-map for local buses and Underground transport.

Part of the thinking behind it is that local authorities could print these every few months or so to send to newcomers. Imagine finding such a rich resource in your post after paying council tax for the first time! I’ve lived in my current town for 2 years now, and I don’t know about half the information this contains. It’s presented extremely clearly and in a very familiar format, so there is very little problem communicating across generational, cultural or potentially even linguistic divides. (Much of the information, such as journey times, doctors’ surgeries and crime rates, would need little translation.) It also doesn’t take much imagination to see additional features or benefits spinning off from this kind of service.

Put the paper online, with live-updates of the information and widgets for transport. Add in some basic demographics (gender, age bracket, long-time resident or visitor), and you’ve got hugely-flexible possibilities for providing an extremely clear UI to your community’s site. Tailoring some specifics, such as age, might let you see more information about local schools, for example, than about old-age care. With print-on-demand kiosks in local libraries and post-offices, you could have an up-to-date snapshot of your neighbourhood whenever you need. This could be used by school children for local projects (and if they can tailor the paper themselves, how much more exciting!). It could be an aid to public transparency with clearly-presented statistics like crime and school standards rates. The list of ideas is endless, really.

That’s the vision, anyway. In reality, there are still some huge hurdles to cross before this kind of service could even begin to become a reality. This project took only a few days to put together, but the supremely brainy folk behind it have years of data management skills behind them. They focused on a single postcode, and much of the data needed had to be hand-scraped from various sites and files. The work needed to launch an on-demand service would be daunting indeed, because no local authority would provide a unified point of access for this kind of information. Currently, if a council wanted to provide this kind of resource, researchers would have to go out and find the facts and figures from across the web (NHS sites, central and local government sites, education and reporting services, etc), compile them and produce an individual layout for each individual postcode. And, if you’re an organisation interested in this, you would potentially be required to pay £1000s to access the basic building-blocks: postcode lookups and survey boundary data. Needless to say, local authorities and businesses would be hard-pressed to find the time to build such papers to such a fine level of localisation.

Any startup, application, or service wanting to offer localised information is up against some severe inclines. It takes little imagination to see this paper and similar applications taking off and providing huge benefit to where people spend most of their time: at home. However, I fear many of these innovations will remain in imaginations as long as so much of the material needed to build them remains locked away.

Open and Closed Case

So, we’ve been banging on about opening up access to public data for a while. Talis has put its money where its mouth is and helped to fund the PDDL to give organisations a legal framework for dedicating their data to the public domain. (We’ll even host open data for free on the Platform under the Connected Commons.) We see the benefits of open data being shared innovation, and many projects exist which make use of this data for scientific, journalistic, entertaining and just plain useful purposes. We’ve been seeing a strong trend towards removing siloes and encouraging reuse of information resources to the point that we’ve begun to create our own jargon around open access. This is great, and even governments are beginning to see the benefit of this with projects like data.gov and Sir Tim Berners-Lee’s advisory appointment in the UK.

But there is another side to this story of opening up and sharing our data. Where there is open, there is an implied “closed” too. Some closed data is absolutely necessary; you wouldn’t argue that your recent financial transactions are data I should have a right to pry into, to take an obvious example. There is a lot of hidden data necessary to run applications and to make a profit, and it is entirely right that this should be the case.

But a recent case here in the UK has illustrated the point that if open data encourages innovation, closing down data can quash it. The Royal Mail recently sent cease and desist letters to the directors of ernestmarples.com, who had been providing online services with a set of APIs to turn UK postcodes into location information. This provision had enabled the building of services which, for example, let people look for jobs in their area, and monitor and map political leaflet claims. The Royal Mail charges £4000 to make use of its official list of postcodes, and wasn’t happy with ernestmarples.com providing postcode data for free. (ernestmarples.com did not license the data, but scraped it from other sites, apparently.) As soon as ernestmarples.com stopped providing the lookup, all the services built on the data were stopped too. So, in effect, the data was enclosed again behind a barrier of a steep paywall and legal action.

There is a lot of discussion about whether the UK postcode data should be free anyway. It was funded by public funds, for one thing, and it only generates around £11 million annually for Royal Mail. The subscription rate is high for startups or non-profits, especially when compared with the Zip-code data in the United States, which I found out only costs $500/year to purchase. {1} It could also be argued that the steep pricing is an archaic throw-back to a time when such services cost a lot to provide, so prices needed to be high in order to recover costs. But this reverse peppercorn rent can no longer be valid, and £4000 must certainly be an order of magnitude (or two) higher than the cost of provision.

There is a lot to discuss about specific datasets like this, and they may need to be tried legally and publicly before all the details are sorted out, but this case is about as illustrative as possible of the principle of encouraging innovation. A single, simple and non-charging service provided a framework for thousands of users for mostly socially-beneficial aims. Imagine the impact if hundreds of source-services had access to postcode data? Perhaps tens of thousands of users could look for employment, or track their local governmental organisations’ progress. Who knows what else might have been developed? It doesn’t take a huge leap of imagination to envision services tailored to your very local locality, does it? Just as easily, though, the enclosure of a single database has cut off a huge network of potential innovation.

The Guardian has covered the story, if you want more details too.

Photo: “Open/Close” by mag3737 via flickr, Creative Commons License

{1} I’m not entirely sure about the licensing of the Zip-code data, but the representative I spoke with at USPS said you can purchase the 5-digit Zipcode product for $500/a.

Linked Data Meetup

On Wednesday, I had the privilege to attend the first Linked Data Meetup down in Hammersmith. The day was a storming success, with talks and presentations from all over the Linked Data community: from academia to startups. I think the organisers were slightly overwhelmed, because in the end there were nearly 200 people there, making use of the Talis-sponsored bar well into the evening. Apart from being a good opportunity to catch up with people, this meetup had the feeling of a guild-meet of Linked Data professionals—with lots of different perspectives over similar problems.

The two panel discussions gave the opportunity for quite a range of different views and topics to be covered, and seemed to go well. The first was about Government Data and was chaired by Carol Tullo from the Office of Public Sector Information (OPSI) and included Sir Tim Berners-Lee on a panel of five. The topics covered a swathe of issues with public data, licensing, rights and infrastructure. This panel had a certain gravitas I wasn’t expecting from a semi-formal “meetup”, probably because it was representing the UK’s actual public sector data workers. After much discussion about what it means to “link data” and what counts as “Linked Data”, I was left with the important point from the discussion: there are important and well-placed people currently working to make public data public, and I look forward to the potential benefits this will have.

The second panel covered a topic which has become very important to me, and which is strongly tied up with the first: the Future of Journalism. Although I was unable to hear much of this discussion (there were a fair few of us in that hall!), I certainly found the questions asked of the panel particularly acute. There was a particular emphasis on advertising and the future of revenue for news media in an online world. From this panel, I took the view that Journalists report on the public happenings of their nations and worlds, and often what they’re working with is made available by the very institutions “making the news”. So, the work on public data has a strong bearing on journalism and on citizens’ collective knowledge of what’s going on in their worlds. Paul Bradshaw, who chaired this panel, published his notes from the session, which will give a good overview of the topics there!

I won’t report on every talk that happened here, though the programme is still available on the Meetup site, and if anyone has any links to slides or photos they’d like to share, just ping them in the comments. I had a great time, and I left feeling hugely excited by many of the projects and trends discussed there.

Trends and Barriers

This article first appeared in Nodalities Magazine, Issue 7

For anyone following the Nodalities blog, you may have read some of my recent posts discussing the trends boiling up around Web 3.0 (other buzzwords are available). The Mobile Web and upgraded connectivity in general; the rise of ubiquitous computing from chips in every product imaginable; Linked Data and the “Semantic Web” as an organising platform for this rising tide of data—these are three very broad trends seeing a lot of media attention presently. From where I’m standing, I tend to see the next great turning point of the Web as a convergence of some of these trends, and see it as a rise in the importance of and reliance upon data itself and data tools generally.

The mobile web is bringing new sorts of information to people, and they can make use of this info wherever they happen to be because of advances in devices and connectivity. As phones and web-enabled devices get better, so too do the chips we seem to have embedded all over the place, and we can now begin to have a clearer picture of what we do through the information we gather from our heaters, cars, and pedometers. Also, as more objects become connected, the grunt-work of number-crunching and storage is becoming commoditised into big, efficient, utility-like cloud services, which host and work with our collected information much more effectively than the gadget in your hand could ever hope to do. Others, like us here at Talis, talk about the Semantic Web, which allows for an evolution from a bunch of connected documents to the explicit connections between bits of information.

Also fermenting in this mix is a strengthening trend of political transparency and a public, shared ownership of social data. Barack Obama’s new administration has clearly made this a priority with the launch and work around data.gov; and in the UK, Sir Tim Berners-Lee himself has been appointed to a Parliamentary advisory role. There is growing pressure to be able to have access to public data, and to see it as belonging to the nation’s people rather than allowed to be legitimately filed away in the great, locked bureau of the capitols.

So, picking up two fairly obvious trends here (Social, Public Data and Linked Data), it would seem to follow that people would begin to have access to previously unavailable information in usable, linked forms. And it’s certainly beginning, as articles elsewhere in this magazine have illustrated. But, what about other chunks of public data? What about when data comes from universities, institutions, scientific foundations and NGOs? What about charities monitoring crime, CO2 emissions and family histories? Wouldn’t these make a useful piece in the web of social data? What resources have the governments themselves got, if they want to make their public-owned data available in a useful format?

These questions form a major part of the thinking behind Talis’ Connected Commons initiative (talis.com/cc). Basically, Talis has made its Semantic Web platform (including data hosting and access tools) available free of charge for any datasets made available to the public. In doing so, we’re hoping to remove the barrier of cost entirely to publishing interesting data in a Linked Data way. One major reason for this is to promote reuse and mashups of this interesting data, and for people to be able to “follow their noses” to the data that completes their projects. But, from a publisher’s perspective, this is also important, because it removes a major reason not to bother with making data useful, and not merely public. So, with this, data can be made public and useable and the developers and users get the benefit of public SPARQL endpoints and API access to interesting data.

To keep the data open and public, datasets need to make use of either the Public Domain Dedication and License (PDDL) or Creative Commons’ CC0 license. Ian Davis, in his article in this magazine, explains more about waivers and the Connected Commons, and there is a lot more about this particular initiative over on the Talis site (talis.com/platform/cc/faqs/).

In a recent interview with the BBC, Sir Tim said: “This is our data. This is our taxpayers’ money which has created this data, so I would like to be able to see it, please.” I wonder if initiatives such as Connected Commons will begin to remove excuses, hindrances, and obstacles? As public awareness of the importance of access grows, this might become a political issue, as well as a pragmatic one. I hope that in the rush to publish data, and in the ensuing discussion and debate, the users, hackers and developers don’t get sidelined. I think the world is ready for its data back.

Jeni Tennison on the Talis Platform

Over on her site, Jeni Tennison has been blogging about publishing data from the Platform. So far, the three-part series has included generic publishing (by converting CSV files to RDF/XML and getting them into a Platform store), using the Platform to back-end Linked Data, and exposing the published data for reuse.

Jeni’s done a great job of making the process understandable, and has included code snippets to illustrate too. You can follow the series on her site’s Talis category.
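For a rough idea of what the first step (CSV to RDF/XML) involves, here is a small sketch using only Python’s standard library. It is not Jeni’s code and doesn’t touch the Platform API; the sample rows, the `ex:` vocabulary and the base URI are all invented for illustration, and it assumes the CSV column names are valid XML element names (no spaces).

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/schema/"  # hypothetical vocabulary for the columns
BASE = "http://example.org/id/"       # hypothetical base URI for each row

# Ask ElementTree to use readable prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ex", EX_NS)


def csv_to_rdfxml(csv_text: str, key_column: str) -> str:
    """Turn each CSV row into an rdf:Description; other columns become properties."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The key column supplies the last segment of the resource's URI.
        description = ET.SubElement(
            root,
            f"{{{RDF_NS}}}Description",
            {f"{{{RDF_NS}}}about": BASE + row[key_column]},
        )
        for column, value in row.items():
            if column != key_column:
                # Each remaining column becomes a literal-valued property.
                ET.SubElement(description, f"{{{EX_NS}}}{column}").text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, purely for illustration.
SAMPLE = "id,name,town\n1,Central Library,Lichfield\n2,Market Hall,Birmingham\n"
rdf_xml = csv_to_rdfxml(SAMPLE, "id")
print(rdf_xml)
```

A real conversion would pick an existing vocabulary rather than an invented one and would need to decide which values are links to other resources rather than plain literals. Those choices are where most of the modelling work lies.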

If you’ve followed Jeni’s series, or you’ve also published data from the Platform, make sure to drop us a line (platform@talis.com) to let us know what you’ve done and how you’ve done it.