Nodalities

From Semantic Web to Web of Data

Data, MetaData and Content

I’ll be over in Banff next week for WWW2007 with Paul, standing in for Ian Davis on a panel about Open Data. I figured I’d better jot some thoughts down before going so I don’t forget what to say…

Open Data is great. Lots of useful, valuable, interesting data, just as free as the air we breathe. Or is it?

There’s a discussion happening in the library sector about the specifics of catalog copyright and how that affects libraries’ ability to share information about their collections. Some of the points are really interesting, and I’ll blog some more about that soon.

What I wanted to talk about now is an aside to what I presented a few weeks back on the subject of Open Data, the web and sharing over at EUSIDIC in Copenhagen (video). I was congratulated by several audience members afterwards for giving a brave message. And I hadn’t said half of what I wanted to. One of the things I did say is that if selling data is not your core mission then you need to think about whether it helps or hinders you; and that depends on what kind of data you have.

First of all we have content; original creative work that is consumed for its own sake. A good book, the new album from your favourite artist, a beautiful painting or a blog you follow. We consume content for what it gives us directly.

Then we have data. We consume data typically by aggregating it and looking at the trends, mining it for information. We consume it for what it can tell us more-or-less directly.

Then we have metadata. Metadata is different from content and data because most of the time we don’t actually want it. We step over it, often not noticing, on our way to the content we really wanted. The track listing that gets us to the song, the catalogue that gets us to the book, the tag that helps us find the photo. All just stepping stones along the way to the thing we really wanted.

This is a worthwhile distinction to make as it helps us to understand how we might license things.

For content, we have Copyright law, and useful simplifications available to everyone through Creative Commons. Creative Commons is great because it gives everyone a simple, clear way to say “Some Rights Reserved” rather than leaving things as “All Rights Reserved”.

For data the situation is more complex. In Europe we have a Database Right which protects databases purely on the basis of the investment it took to create them. There is no equivalent protection in the US. However, both jurisdictions have a notion of Compilation Copyright that protects the selection and arrangement of content; things like compilation CDs or collections of short stories are protected. Working out whether your data is a compilation (which requires creativity in the act of selection) or a database protected by Database Right seems to be quite tricky. The Ordnance Survey came under scrutiny recently when Charlotte Waelde reported that geospatial data may not be protected by copyright at all.

But what of metadata? In some cases it will be possible to protect it using Database Right, in some cases it won’t. But that’s not the important decision. The trend right now seems to be that metadata, the data needed to get where you want to go, is becoming more open, more quickly than the other two.

In many cases this is because the place you want to get to is where the business model is; you don’t pay to search iTunes, you pay for the tunes once you’ve found them. Where businesses have built revenue models on charging for access to metadata, communities are bypassing them and building their own repositories: FreeDB, Open Street Map, ISBNdb. That means that if you have a pile of metadata you might want to think about how you can give it away rather than how you can keep it locked away.

Giving it away doesn’t mean leaving it unprotected though. What Creative Commons and software licenses like the GPL have shown us is that protection of content, data and metadata is as important to keeping it open and free as people think it is for keeping it closed. That’s why I’m hoping to write some more on licensing shortly.

 
