Code4lib 2008 - A Wedding of Ideas
I knew we were in for a great conference when I saw that we were sharing the excellent conference hotel with the Association of Bridal Consultants.
Writing at the end of the first day, that first impression definitely seems to be on the money. The conference opened with a keynote from Brewster Kahle of the Internet Archive, who talked about the Archive in general and the Open Library in particular.
My notes from Brewster’s presentation included the following:
- Texts - 26m books in LC - 26 TB - for the cost of a house $60,000 to host.
Need a UI - “one Web Page for Every Book Ever Published”
Started with the Million Book project - getting costs down to 10c per page
Scan 1M pages a month out of one of their 9 centres
15,000 books/month - Libraries should have a scan on demand button on their catalogues $30/book - same price as an ILL
- Scan all Microfilm - Internet Archive loan a microfilm scanner for free if you can keep it running full time.
- Selection - Need to build critical mass for books in each area - need help from libraries.
- Build a Catalogue
- Talis contributing 10 million records - Hooray for Talis!
Brewster appealed for help from the library community with time/code/digital materials/catalog records/selection help/labour to digitise microfilm/links on sites to openlibrary.org.
Working together we can build a great library!
Talis’ own Rob Styles followed Brewster with an excellent presentation on Finding Relationships in MARC. His really cool slides took us through extracting data from the fields of a MARC record so that things like authors and subjects can become ‘first class citizens’ in the data. This work is the basis of the paper Rob and his colleagues Nadeem Shabir & Danny Ayers have written, Semantic Marc, MARC21 and The Semantic Web [pdf], which will be presented at the Linked Data on the Web workshop at WWW2008 in Beijing in April.
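To give a flavour of what treating authors and subjects as ‘first class citizens’ can mean, here is a minimal sketch of my own, not Rob's actual method. It assumes records are held as simple Python dictionaries rather than real MARC files, and it uses the standard MARC21 tags 100/700 for personal names and 650 for topical subjects. The sample records and the `extract_entities` helper are invented for illustration.

```python
from collections import defaultdict

# Toy stand-ins for MARC records: each field is (tag, {subfield_code: value}).
# The data below is made up purely for illustration.
RECORDS = [
    {"id": "rec1", "fields": [
        ("100", {"a": "Austen, Jane"}),
        ("245", {"a": "Pride and prejudice"}),
        ("650", {"a": "Courtship"}),
    ]},
    {"id": "rec2", "fields": [
        ("100", {"a": "Austen, Jane"}),
        ("245", {"a": "Emma"}),
        ("650", {"a": "Courtship"}),
        ("650", {"a": "Young women"}),
    ]},
]

AUTHOR_TAGS = {"100", "700"}   # main and added personal name entries
SUBJECT_TAGS = {"650"}         # topical subject headings


def extract_entities(records):
    """Turn author and subject strings into entities that point at records."""
    authors = defaultdict(set)
    subjects = defaultdict(set)
    for record in records:
        for tag, subfields in record["fields"]:
            # Subfield $a carries the name or heading; trim trailing punctuation.
            value = subfields.get("a", "").strip().rstrip(".,")
            if not value:
                continue
            if tag in AUTHOR_TAGS:
                authors[value].add(record["id"])
            elif tag in SUBJECT_TAGS:
                subjects[value].add(record["id"])
    return authors, subjects


authors, subjects = extract_entities(RECORDS)
for name, record_ids in sorted(authors.items()):
    print(name, "->", sorted(record_ids))
```

Once an author or subject is an entity in its own right, rather than a string buried in a field, you can ask which records share it, which is where the relationships start to appear.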
Later in the morning we had Working with the WorldCat API from David Walker. Dave had been given early access to this SRU-based API and had produced a nice mashup between WorldCat searches and holdings information for his library. This talk spawned much traffic on the code4lib IRC channel about what this API didn’t give you that you would get from WorldCat Local. Roy Tennant, on IRC but not at the conference, responded with this list: article citations, faceted browsing, citation formatting, cover art, etc., plus other things you would need to build yourself (e.g., interoperation w/local systems, etc.)
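For anyone who hasn't met SRU, a search is just an HTTP GET with a handful of standard parameters and a CQL query. The sketch below builds such a URL. The parameter names come from the SRU 1.1 standard, but the base URL and the `dc.title` index are placeholders I have assumed; the real WorldCat API endpoint, its index names and any access key it needs are not shown here.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, cql_query, max_records=10, start_record=1):
    """Build an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",  # the standard SRU search operation
        "version": "1.1",
        "query": cql_query,             # the search itself, expressed in CQL
        "startRecord": start_record,    # 1-based offset into the result set
        "maximumRecords": max_records,  # page size
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
url = build_sru_url("https://sru.example.org/catalog", 'dc.title = "semantic web"')
print(url)
```

A mashup like Dave's would fetch a URL of this kind, parse the XML that comes back, and then look up local holdings for each record returned.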
As the day has rolled out, it is clear that the world of library techno-geekery has moved on since code4lib 2007. Gone is the endless array of individually impressive, but collectively repetitive, ‘what we have done by putting our library catalogue into Solr’ presentations. In its place is a series of subjects that, on the surface, are related only by libraries.
Sitting back at the end of the day I feel a theme coming on - by working with the many cool things that folks are doing it can be better for all - be it: libraries contributing records to OpenLibrary; or mining the MARC we already have to build Semantic MARC; or sharing what we do in Code4lib Journal; or using WorldCat APIs to deliver data in your interface; or using Zotero to harvest your research materials and in the future share them with others; or even volunteering to help train libraries in developing countries in Open Source Library systems.
There may be individual differences in emphasis and approach between the conference attendees, but we are all wedded to the same idea of working together for the benefit of all.
I can’t wait to see what tomorrow brings….












