
20 October 2006

Is there a place for P2P

Posted by Richard Wallis at October 20, 2006 01:51 PM

David Bigwood was thinking out loud the other day in his Catalogablog posting P2P OPACs:

Here's an idea, not even half-baked, how about peer-to-peer (P2P) networks of OPACs? Only available items would display. I'd get to pick the institutions I'd have display and whether to display non-circulating items. Something like Limewire.

Having struggled both with the effects of teenage family members installing Limewire and its predecessors on the home PC, and with the question of how to scale the traditional search of a single library's collection up to a reliable, performant query across overlapping ad hoc groups of library collections, I have also wondered whether the P2P (peer-to-peer) technologies underpinning the former could help with the latter.

David's thought of using P2P, with the music-sharing application Limewire as an example, is, when you deconstruct it, an attempt to address a few well-known problems in the library domain.

  • Identifying and locating library collections - how a collection is described, where it is physically located, and how it is accessed electronically are all concerns in this area, which resource directories, many of which have come and gone, have attempted to address. In the music-sharing P2P world, the major concern is getting a copy of the file, with little concern as to where it comes from.

    There are several current examples of these library directories around, often limited by project, type/size of library, geographic location, commercial constraints, etc. Then there is the Silkworm Directory in the Talis Platform, a directory that is open and wiki-like in philosophy, in which anyone can enter any library collection and then use an open API to query that information.

  • The grouping together of an ad hoc set of library collections to search within. - These could be as organized as all the academic libraries within 50 miles of a city, or as random as a student's university library, the local library near her dorm, and the library in her home town: totally logical to the student, random to everyone else.

    A little-known aspect of the Silkworm Directory (little known because Paul Miller only mentioned it in his Access 2006 presentation (pdf) last week) is its ability to create ad hoc groups and then query by the members of those groups.

  • The constant searching across many dissimilar collections. - Anyone who has used, or tried to pull together, a federated search across many library catalogs, traditionally using Z39.50, will have horror tales of the way locally implemented indexing rules can make a mockery of search and results ranking.

    Now if we could consistently index, search, and rank in a single store all the holdings of the collections we are interested in, as defined in a directory, then, providing it was scalable and performant, this problem would disappear. This is the approach successfully taken by the Googles of the world. It is also how the Bigfoot element of the Talis Platform operates. (See my recent posting for a description of how Bigfoot APIs are driving the recently announced Project Cenote interface.)

  • Filter the results of a search by the libraries in a group that have holdings. - P2P, in the same way that Z39.50 federated search does, could help in this area by directly querying individual library collections. But I suspect that it would suffer the same problem as current federated search: the fastest response you get is limited by the speed of the slowest resource. P2P file sharing addresses this with caching and by downloading from several places simultaneously, neither of which is really applicable when you are trying to get information from a specific collection.

    The Talis Platform's holdings stores address these issues by storing holdings statements, aggregated across many collections and freely contributed by libraries, alongside bibliographic stores. This is done in such a way as to enable bibliographic results to be augmented with holdings information on the fly as results are returned from an API call.

  • Filter the results of a search by libraries that have in-stock items. - This final step is probably the most difficult to solve in a live situation, as any store can become out of date the moment a book is borrowed from a particular collection. P2P may well have a valuable application in this area, be it filtering a result set of known holdings, or keeping stores up to date on a minute-by-minute basis.
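
To make the last two filtering steps concrete, here is a minimal Python sketch. It assumes an aggregated holdings store that can be read as simple records and an ad hoc group defined as a set of library identifiers. Every name in it (Holding, STUDENT_GROUP, HOLDINGS, augment, and the library and work identifiers) is hypothetical and invented for illustration; none of it is the Talis Platform's actual API or data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Holding:
    library: str     # hypothetical library identifier
    work_id: str     # hypothetical bibliographic identifier
    available: bool  # last known availability; may already be stale


# Hypothetical ad hoc group, as a student might define it in a directory.
STUDENT_GROUP = {"university", "dorm-local", "home-town"}

# Hypothetical aggregated holdings store contributed by several libraries.
HOLDINGS = [
    Holding("university", "w1", True),
    Holding("dorm-local", "w1", False),
    Holding("city-central", "w2", True),
    Holding("home-town", "w3", False),
]


def augment(work_ids, group, holdings, in_stock_only=False):
    """Map each work id to the group libraries that hold it.

    Works held by no library in the group are dropped. With
    in_stock_only=True, only holdings marked available are counted.
    """
    results = {}
    for work_id in work_ids:
        libraries = sorted(
            h.library
            for h in holdings
            if h.work_id == work_id
            and h.library in group
            and (h.available or not in_stock_only)
        )
        if libraries:
            results[work_id] = libraries
    return results


search_hits = ["w1", "w2", "w3"]  # ids returned by some bibliographic search
held = augment(search_hits, STUDENT_GROUP, HOLDINGS)
in_stock = augment(search_hits, STUDENT_GROUP, HOLDINGS, in_stock_only=True)
print(held)
print(in_stock)
```

The holdings filter depends only on data that changes slowly, so an aggregated store serves it well. The in-stock filter depends on the available flag, which is only as fresh as its last update. That is the gap where a more direct, P2P-style check against the individual library might earn its keep.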

It remains to be seen how P2P could be used, but it should not be dismissed as only a technique used for [often illegal] music downloading.

David says his thought might be 'half-baked', but there are some useful ingredients in his recipe. How well some of them would scale in the wider library environment I'm not so sure, but a hybrid of P2P with some of the high volume, scalable, performant, open data, open API aspects of the Talis Platform - now that may well have legs.


Comments

"Now if we could consistently index, search, and rank in a single store all the holdings of the collections we are interested in, as defined in a directory, providing it was scalable and performant this problem would disappear."
Or multiple stores with a consistent index and infinite merging capacity. This is an excellent use for Lucene, which has open and globally worked-on relevancy algorithms and implementations, and really strong capacity for non-western languages as well. The good news is that the world is converging on a consistent index format, and one that is more network friendly than almost anything that has preceded it.

Posted by: art at October 20, 2006 06:59 PM

Lucene is one of many technologies used in the inner workings of the Talis Platform.

Posted by: Richard Wallis at October 21, 2006 10:15 AM

Cool, providing an accessible Lucene index would be a great way of deduping at a collections level: "show me everything in collection x that is not in collection y" and so on.

Posted by: art at October 21, 2006 11:45 PM

You might want to look at the BRICKS project; it's about P2P digital libraries.

http://www.brickscommunity.org/

The software is open, and I know they are interested in more people setting up BRICKS nodes.

I blogged about it and some other distributed DL projects at

ECDL 2006 - tutorial - Distributed Infrastructures for Digital Libraries

Posted by: Richard Akerman at October 25, 2006 04:21 PM
