Nodalities

From Semantic Web to Web of Data

Are We Getting A Right to Data?

Friday night – nothing on the TV – I know! I’ll browse through the Protection of Freedoms Bill, currently passing through the UK Parliament. Sad I know, but interesting.

Let’s scroll back in time a bit to November 19th 2010 and a government press conference introduced by a video from Prime Minister David Cameron. The headline story was about the publishing of government spending and contract data, but towards the end of this 109 second short he said the following:

… the most exciting is a new right to data. Which will let people request streams of government information and use it for social or commercial purposes. Take all this together and we really can make this one of the most open, accountable and transparent governments there is. Let me end by saying this. You are going to have so much information about what we do, how much of your money we spend doing it, and what the outcome is. So use it, exploit it, hold us to account. Together we can set a great example of what a modern democracy ought to look like. (my emphasis)

Obviously to realise this Right to Data there needs to be some legislation, which brings me to the Protection of Freedoms Bill. This is one of those bills which covers all sorts of issues, from rules for the destruction of fingerprints and DNA profiles, CCTV camera regulations and the detention of terrorist suspects, to freedom of information and data protection. Zooming in on the bits on the topic of the release and publication of datasets held by public authorities, we find a set of clauses that amend the Freedom of Information Act 2000.

Re-use

After some amendments which allow for datasets and provision in electronic form we get this: “the public authority must, so far as reasonably practicable, provide the information to the applicant in an electronic form which is capable of re-use.” Unfortunately there is no definition of the term re-use. It could be argued that a PDF of some tables in an MS Word document could be re-used, whereas I believe the spirit of the legislation should be made more explicit by identifying non-proprietary data formats. I know this would be a tricky job for the parliamentary draftsmen: we would not want to restrict it to things, such as XML and CSV, that could age and be replaced by something better, which then could not be used because it had not been mentioned in the legislation. Even so, I believe that just using the term ‘re-use’ is far too woolly and open to [mis]interpretation.
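To illustrate what “capable of re-use” could mean in practice, here is a minimal Python sketch. It assumes a dataset released as plain CSV; the column names and rows are invented purely for illustration and are not real figures. The point is only that a non-proprietary format can be read and re-purposed with nothing more than a standard library.

```python
import csv
import io
import json

# Invented sample rows standing in for a dataset released as plain CSV.
# Column names and values are illustrative assumptions, not real figures.
SAMPLE_CSV = """month,department,amount_gbp
2011-01,Highways,30000
2011-01,Libraries,12000
2011-02,Highways,27500
"""


def load_rows(text):
    """Parse CSV text into a list of dicts using only the standard library."""
    return list(csv.DictReader(io.StringIO(text)))


def total_by_department(rows):
    """Sum the amount column for each department."""
    totals = {}
    for row in rows:
        department = row["department"]
        totals[department] = totals.get(department, 0) + int(row["amount_gbp"])
    return totals


rows = load_rows(SAMPLE_CSV)
totals = total_by_department(rows)

# Re-publish the derived figures in another open format.
print(json.dumps(totals, sort_keys=True))
```

Doing the same with tables locked inside a PDF would mean extracting and cleaning the text first, which is the practical difference between data that is merely published and data that is re-usable.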

What is [not] a dataset

This is one of the areas that raises most concern for me. Check out this wording from the Bill: [image: extract from the Bill]

I am OK with (a) – data collected as part of an authority doing its job – and (c) – don’t change the data you have collected – publishing that raw data is important. However (b) specifically excludes data that is the product of analysis. Presumably analysis of collected data is one significant way that an authority measures the outcomes of its efforts. Understanding that analysis will help us understand the subsequent decisions and actions they make and take. I assume that there may be some specific reasons that underpin this blanket exclusion of analysis data. If there are, they should be identified, instead of generally throttling the output of useful data that will go a long way to helping with Mr Cameron’s stated ambition for us to be able to see “what the outcome is” of the spending of public money.

Release of datasets for re-use

This is a whole new section (11A) to be added to the 2000 Act to cover the release of datasets. It covers ownership, copyright, and/or database right of the information to be published and states that it should be published under “the licence specified by the Secretary of State in a code of practice issued under section 45”. Section 45 basically puts into the hands of the Secretary of State the definition of the licence(s) data should be published under. As of today the Open Government Licence for public sector information is what is wanted to keep the publishing of information open. However, what is there to stop a future Secretary of State, who has a less open outlook, from replacing it with far more restrictive licences? Do we not need some form of presumption of openness attached to the Secretary of State’s powers as part of this change in legislation?

On the topic of presumptions of openness, the wording of this bill contains phrases such as “unless the authority is satisfied that it is not appropriate for the dataset to be published” and “where reasonably practicable”. It is clear that many in the public sector are not as enthusiastic about publishing data as the current government position, and such vague phrases as these may well be unreasonably used by some to justify throttling the stream of information. They could easily be used to build in a bureaucratic decision hurdle for each dataset to have to jump, proving its appropriateness and practicality, before publication. I am sure that it would not be beyond a parliamentary draftsman’s skill to produce wording that means that all will be published, unless a specific objection is raised for an individual dataset, for reasons of excessive effort or data protection.

Up-dated data

Data published by an authority should be published under a scheme, and the following applies here: [image: extract from the Protection of Freedoms Bill (HC Bill 146)]

How should we interpret “any up-dated version held by the authority of such a dataset”? My interpretation is that once a dataset has been published it shall continue to be published as it changes. The precedent for this is spending data – having published authority spending for January 2011, authorities should be automatically publishing it for February and following months. But what if, in response to a request, an authority publishes the contents of a spreadsheet used to track the amount of salt applied to roads in its area during winter 2010-11 and then uses a different spreadsheet for the following winter? Does the output of that new spreadsheet constitute a new dataset, or an up-date to its predecessor? From the wording in the Bill it is not clear.

Who does it cover?

I probably need a bit of help here from those that understand the public sector better than I do, but I am suspicious that references to the organisations listed in Schedule 1 and “the wider public sector”, do not take the net wide enough to cover some of the data that is relevant to our daily lives but is delivered on behalf of some authorities by third parties.  For example I am aware that recently a large city was not able to inform citizens of their rubbish collection schedules because that data was considered as commercially restricted by their service provider.

 

So in summary, I welcome the commitment to a right to data being realised by streams of government information about what we do, how much of our money is spent doing it, and what the outcomes are. However, I am sceptical as to how effective the measures in the current Protection of Freedoms Bill will be in delivering them, especially in the light of very recent comments made by the Prime Minister highlighting the "enemies of enterprise" in Whitehall and town halls across the country and attacking what he called the "mad" bureaucracy that holds back entrepreneurs. Those enemies are just the people who might take the wording of this bill as ammunition in their cause.

Whilst being concerned about this topic, I have been wondering why few are commenting on it. Are the majority just taking the press conference statements by David Cameron, and his fellow Ministers, as indications of a battle won, or am I missing something? I promote Sir Tim Berners-Lee’s 5 Star Data as the steps towards a Web of Linked Data – if we don’t get the publishing of public sector data to at least 3 star standard (Available as machine-readable structured data – in non-proprietary format), many of the current ambitions may remain just that, ambitions. That would be a massive missed opportunity.
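As a rough sketch of how the 5 Star Data steps stack up, the Python below rates a published file. The star criteria follow Sir Tim Berners-Lee’s scheme (open licence on the web; structured; non-proprietary; uses URIs; links to other data). The function name, its parameters and the grouping of file formats are my own illustrative assumptions, not part of the scheme or of the Bill.

```python
# Format groupings are an illustrative assumption, not an official list.
STRUCTURED_FORMATS = {"xls", "csv", "xml", "json", "rdf"}
NON_PROPRIETARY_FORMATS = {"csv", "xml", "json", "rdf"}


def star_rating(fmt, open_licence, uses_uris=False, links_to_other_data=False):
    """Return a 0-5 star rating; each star requires all the ones before it."""
    fmt = fmt.lower()
    if not open_licence:
        return 0  # not openly licensed, so no stars at all
    if fmt not in STRUCTURED_FORMATS:
        return 1  # on the web under an open licence, e.g. a PDF or scanned image
    if fmt not in NON_PROPRIETARY_FORMATS:
        return 2  # machine-readable but proprietary, e.g. an Excel spreadsheet
    if not uses_uris:
        return 3  # machine-readable and non-proprietary, e.g. CSV
    if not links_to_other_data:
        return 4  # uses URIs to identify things
    return 5      # also links to other people's data


def meets_three_star(fmt, open_licence):
    """True when a release reaches the 3 star minimum argued for above."""
    return star_rating(fmt, open_licence) >= 3


for fmt in ("pdf", "xls", "csv"):
    print(fmt, star_rating(fmt, open_licence=True))
```

On this reading, a PDF released under an open licence earns one star and an Excel file two, so neither reaches the 3 star level, while the same tables released as CSV would.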

So are we getting a right to data? – or just some provisions to extend the Freedom of Information Act a bit further in the dataset direction?  I’m not sure.

Personal note: As you may tell from the above, I am no expert on the interpretation of parliamentary legislation, and I have left several unanswered questions hanging in this post.  Any help in clarifying my thinking, confirming or disproving my assumptions, or answering some of those questions, will be gratefully received in comments to this post or your own posted thoughts.

A Year of Open Government Data: Transparency, but also Innovation

Towards the end of 2010, Wikileaks generated many headlines as it published information on the web, causing controversy and leading to talk about politicians hiding information from the public. Reporters and commentators expressed shock or admiration when telling the story of a rogue organisation making governmental information public. What has had less mainstream attention is that for the past year or more, governments around the world have been doing something very similar themselves: publishing information online.

Big names like President Obama, Sir Tim Berners-Lee and the headliners at big events like the International Open Government Data Conference favour publishing public data for transparency and benefits to society. This all finally began to take off in 2010. Governments from around the world have been developing their public information strategies, with the launches of data.gov and data.gov.uk and data.govt.nz.

This is all taking place at a time of economic restraint. Dr Martin Read from the UK Cabinet Office’s Efficiency Reform Board explained in a recent interview: “If you are going to improve the efficiency of something, making that change involves risk and innovation  … If they get it wrong, they’re hauled up in front of a committee for interrogation.” (moderngov, November 2010) It may seem tricky to justify the expense of big projects like data.gov.uk, and there certainly seems to be a huge amount of pressure.

Nevertheless, governments are proving themselves committed to prioritising data publishing. Towards the end of last year, the UK Prime Minister announced that every item of governmental spending over £25,000 will be published online, and updated monthly. He emphasised the importance of this publication in terms of transparency, inviting the public to scrutinise the data. Interestingly, he also said: “This scrutiny will act as a powerful straightjacket on spending, saving us a lot of money.” So, not only is data publishing seen as a benefit to democracy, but also as a useful way to “flag up waste”.

While that press conference was taking place, developers and civil servants were gathered together elsewhere at the Open Government Data Camp (disclosure: Talis was a sponsor). At the event, much was made of the modelling and tools which have been developed with open data in mind: particularly the Linked Data API, which allows developers from just about any web background to work with data.gov.uk’s data very quickly. Visualisations demonstrated what can be done with well-structured data.

One of the things this high-level data publishing has done is raise the standard for what can be published and developed. Last year, we built a proof-of-concept app for the Department for Business, Innovation and Skills (BIS) to illustrate the potential of applications of this data. A few minutes spent on DEFRA’s UK Climate Projections site shows what can happen when raw data is matched with a plan, and is designed with a citizen in mind. Anyone can check the primary source for their government’s climate policy, and it doesn’t take a climatologist to understand it. A little further development allows fully-fledged applications to be built that are instantly useful: one available on the front page of data.gov.uk lets me download an app that helps me plan my cycle route!

Open government data is probably good for transparency. But it also has plenty of potential to seed ideas that add value to this information. Innovators know that there are more people with better ideas outside our organisations than could possibly be in them, so sharing data means that those ideas can be developed into products and services that benefit everyone. The web industry routinely works with open-source software that’s been at least partly built by others, and this open-source mentality might just be an incredibly useful piece in the public-sector machinery. Open business models work very well with ideas.

2011 promises to be the year when all this data gets put to use. I was recently invited to a press conference at which the Deputy Prime Minister confirmed the UK’s commitment to published data as a priority and even a recognised civil liberty. The story will shift to more local applications of big public data tools. January will see the publication of local authorities’ spending data, and public bodies will be looking to add value to this data, bringing the headlines of open data to life in the places we live.

With a bit of thought into how data is published in the first place, and a plan for encouraging people with good ideas to work with this information, this investment in data publishing could be more than just a tick-box exercise for a political transparency agenda. I hope that this year, it won’t be Wikileaks-level events that get people talking about open data publishing. We should notice it improving services we use, and see whole new applications for the bits and pieces of information that make up our public lives.

Linked Data Meetup

On Wednesday, I had the privilege to attend the first Linked Data Meetup down in Hammersmith. The day was a storming success, with talks and presentations from all over the Linked Data community: from academia to startups. I think the organisers were slightly overwhelmed, because in the end there were nearly 200 people there, making use of the Talis-sponsored bar well into the evening. Apart from being a good opportunity to catch up with people, this meetup had the feeling of a guild-meet of Linked Data professionals—with lots of different perspectives over similar problems.

The two panel discussions gave the opportunity for quite a range of different views and topics to be covered, and seemed to go well. The first was about Government Data and was chaired by Carol Tullo from the Office of Public Sector Information (OPSI) and included Sir Tim Berners-Lee on a panel of five. The topics covered a swathe of issues with public data, licensing, rights and infrastructure. This panel had a certain gravitas I wasn’t expecting from a semi-formal “meetup”, probably because it was representing the UK’s actual public sector data workers. After much discussion about what it means to “link data” and what counts as “Linked Data”, I was left with the important point from the discussion: there are important and well-placed people currently working to make public data public, and I look forward to the potential benefits this will have.

The second panel covered a topic which has become very important to me, and which is strongly tied up with the first: the Future of Journalism. Although I was unable to hear much of this discussion (there were a fair few of us in that hall!), I certainly found the questions asked of the panel particularly acute. There was a particular emphasis on advertising and the future of revenue for news media in an online world. From this panel, I took the view that journalists report on the public happenings of their nations and worlds, and often what they’re working with is made available by the very institutions “making the news”. So, the work on public data has a strong bearing on journalism and on citizens’ collective knowledge of what’s going on in their worlds. Paul Bradshaw, who chaired this panel, published his notes from the session, which will give a good overview of the topics there!

I won’t report on every talk that happened here, though the programme is still available on the Meetup site, and if anyone has any links to slides or photos they’d like to share, just ping them in the comments. I had a great time, and I left feeling hugely excited by many of the projects and trends discussed there.