
Open Data consultations

by Wendy M Grossman | posted on 03 October 2011

The opening last month of twin consultations on open data - Making Open Data Real and Public Data Corporation - is spurring a number of meetings to discuss how to respond. Today it was the Open Rights Group's turn to organize a brainstorming session.

Why two consultations? The first covers open data itself – how it should be defined, how it should be licensed, who gets to use it. The second covers the creation of a Public Data Corporation along the lines of the 50 or 60 other public corporations in the UK (of which the BBC is the best-known).

The hidden point of the PDC is to resolve an internal government conflict. On the one side, you have the Transparency Board within the Cabinet Office. On the other, you have the Treasury department, which derives revenues from selling the more obviously commercial datasets (Met Office data, Ordnance Survey, and the like). The situation there is additionally complicated by the fact that taxpayers' money only partially subsidizes the collection and sale of such data; trading funds (which I'd never heard of before) sell these datasets, and they create a wrinkle: if the datasets are now to be given away, where is the money to come from to pay for distributing them?

In the digital world, the cost structures of the physical world are inverted. In the physical world, the more data you give someone, the more it costs you: photocopying, printing, paper, postage. In the digital world, less costs more: the more selective you have to be about which data to hand over, the more it costs you in time. In the digital world the cheapest thing is to hand over everything. (Which is precisely how the affair of the HMRC discs happened.) But that is not to say there is no cost: data must be maintained, cleaned, updated. What happens when someone finds an error in a dataset? How is that feedback loop managed and who pays to fix it?

Almost everybody seemed to hate almost everything about the PDC proposals other than the name.

The discussion about open data itself was more complex. Some basic principles seem clear: that open data should not be confused with personal data. That no one wants a situation where only big, rich corporations can afford to buy it, so that instead of having a single monopolist (government) we have an oligopoly that can even more effectively disempower citizens and smaller organizations. That one of the complex issues is that the government, like anyone who's been using computers for a long time, has no clear idea of what data it has: it needs an inventory.

There are all sorts of other issues surrounding privacy (although open data is not personal data), anonymization (which most experts think is impossible to guarantee), what kind of framework might be needed (since the existing data protection laws are designed to protect citizens, not to deal with collective data), and so on. Kieron O’Hara’s report, Transparent Government, Not Transparent Citizens (PDF), is highly recommended for background reading.

The deadline for submissions is October 27, 2011. ORG's Jim Killock sets out ORG's initial views here, hoping for feedback.


Wendy M. Grossman’s Web site has an extensive archive of her books, articles, and music, and an archive of all the earlier columns in this series. Readers are welcome to post here, at net.wars home, follow on Twitter or send email to netwars(at) skeptic.demon.co.uk (but please turn off HTML).