Panlibus

The Open Library, and keeping it open

A couple of days ago Richard blogged about data licensing on The Open Library.

Aaron came back to comment:

Our position is that the actual catalog data on Open Library consists of uncopyrightable facts and thus is public domain. We certainly aren’t going to assert a copyright on it. The real open question is what copyright to use for descriptions and bios and other longer textual material — should we use GFDL, like Wikipedia, or some more reasonable license?

Certainly I agree with him that the factual data cannot be protected by Copyright. Facts, titles, names, short phrases, single words; none of these can be Copyrighted in the sense of a Creative Work, but there is more to Copyright than that.

A few weeks back I was talking in Banff and then in Paris about the need to license data, not to keep it closed, but to keep it open. In that discussion I broke the world into three parts, Data, Metadata and Content. Aaron is doing the same kind of split - bibliographic data and review content. That’s the right distinction to be making.

For the creative aspects there is Copyright protection, and various licenses extend this in different ways, CC, GFDL and others. The Open Library should pick whichever is the closest match to what they want to achieve. I suspect a CC-BY license would be closest, but that’s a decision for them and the community.

But what about the data? The question isn’t “can it be protected?” but “how does it need protecting and what from?”.

Now, I trust the Internet Archive. They’re probably the only people on the internet to have a wholly untarnished halo, and that’s a very good thing. But things can change. There are direct parallels between what The Open Library is doing and what CDDB did back in the ’90s. The Fez Guys have a great write-up of what happened to CDDB/Escient/Gracenote, but to summarize: a large, community-generated database of music metadata got locked away by a corporate body. It didn’t happen because CDDB planned all along to dupe their community; it didn’t happen because anyone was ‘evil’. It happened because a commercial organization needed to make money and the community had no protection from that.

An alternative service did spring up, which is what you want to happen in that circumstance. FreeDB.org was set up using the CDDB software, and someone among Gracenote’s extended staff leaked them a copy of the database. With the correct licensing in place - a data equivalent of the GPL - FreeDB could simply have requested a copy. The community would have been protected.

I don’t want what happened with CDDB to happen with The Open Library, and to stop it requires a clear license that protects the community from The Open Library as well as The Open Library from anyone else.

This is the area we developed the Talis Community License to cover (and yes, the name is draft too; it will change). We’ve been using it to protect contributions to our platform data services for over a year. It protects contributors from us, as it prevents us, or anyone else, from locking the community’s data away at a later date. It’s an Open License; anyone can use it to protect their users’ contributions in the same way.

The Open Library - Open for Business

From Aaron Swartz:

… today I’m extraordinarily proud to announce the Open Library project. Our goal is to build the world’s greatest library, then put it up on the Internet free for all to use and edit. Books are the place you go when you have something you want to share with the world — our planet’s cultural legacy. And never has there been a bigger attempt to bring them all together.

There have been rumors about it for a while now; it’s great to see it open for business, even if it is only in demo mode for now.

Backed by the Internet Archive and the Open Content Alliance, judging by the links in the footer, this ambitious project is looking good.

It is based on the wiki approach, with every page editable, both for the interface and for the bibliographic records.  I can imagine great fun in the future comparing the revision history of a contentious record as passionate cataloguers fight in virtual space over the correct form.

Visually it has a certain almost antique charm about it; nevertheless, I like it.

Creating a site that promises “For the first time, we’ll have an open, public, curated, universal catalog of all books” presents several challenges, not least which cataloguing schema to use.  Open Library are creating their own, called ‘futurelib’.

Like the MARC format, we’ll want our schema to contain all the important bibliographic information that librarians want to collect about books. But we’ll also want to take advantage of all the things we’ve learned since MARC. We’ll also want to store some information that’s of less importance to librarians, but of more importance to publishers (like the ONIX format stores) and arbitrary users. And we’ll have to figure out how to present all this data in a way that makes sense to relatively untutored users.

For those interested, here is a draft schema.

Not satisfied with proposing a new cataloging schema, they are also proposing “a new, universal book identification scheme - OLN” - the kind of stuff to keep the library mailing lists excited for months!

The only thing I cannot find in my brief browse around the site is any reference to licensing for use of the data contributed to the Open Library - if I’ve missed it, someone please point me at it.  For the Open Library to be truly open, little things like licensing need to be clarified.  The Talis Community Licence is an obvious candidate for addressing this aspect of openly sharing freely contributed data. I will be more than happy to discuss it with the people behind the Open Library, as we are with several other organizations and communities interested in Open Data and commons licensing.

I welcome the project and compliment the people behind it; they have done a great job in a short time.  I shall be watching it closely as it develops - and I bet I won’t be the only one.

One Response

  1. Aaron Swartz Says:

    Thanks for the kind words. The bottom of every page says: “Some information provided for promotional purposes by the publisher. Additional information and edits added by users. All contributions are in the public domain. For more information about our data, see how you can help.” We’re hoping to decide on licensing terms in more detail with the help of the community on the lists.