Panlibus

License

Creative Commons License

Archive for the 'Web 2.0' Category

Connections, Connections, Connections

It’s a little disconcerting when your own words from months ago are quoted back at you from a distance. That’s the trouble with the blogosphere: it is so easy for what you have said to be linked into the conversation in ways you never expected. Trouble? No, it is one of its major benefits, disconcerting or not!

Recently Mark Dahl quoted something I said a while back. I was discussing how we must stop developing destination applications and start delivering the information and functionality that users want to where they are working, for instance inside the Learning Management System/eLearning System/VLE (or whatever you call them down your way). Apparently I boasted that the new Reading List (Course Reserves) application Talis are working on "doesn’t even have a user interface". The reason I gave at the time was that students don’t need yet another destination to go to in order to find the information they need, so why build one?

Providing the functionality to link resources to courses in a way that adds value well beyond the simple attempts to be found in ILS/LMS systems, and their course management system counterparts, is an obvious development. What is less obvious, at first, is that you don’t need to build a user interface for it - the student is already in a library system, or a learning management system, or a portal, or Facebook, or whatever - why can we not deliver the functionality directly into that environment? Well, today the answer to that question is that those applications are not very good at embedding Web Services directly into their interfaces.

This is why Talis development team member Julian Higman (featured in the February issue of the Library Platform News) was very quick to comment on Mark’s post: "I’m working on the reading list application at Talis that you mention, and it certainly does have a user interface!" Having calmed Julian down (I jest), we both agreed that the fact it was necessary to build a user interface for this product is symptomatic of the inability of most applications, in the University domain, to consume web services and usefully integrate their functionality into a user’s workflow.

As I commented previously, the online university today is a collection of many silos that the user [student, professor, researcher] is expected to know how to navigate, let alone be able to identify the connections between data in those silos. I expect that this comes as a bit of a shock to the average new student: I thought I had come to this university to learn about my chosen subject, not to spend a significant amount of time and effort becoming an expert in the use of a multiplicity of different applications and services that are supposedly here to help me.

Peter Brantley was on the money for Mark in his post, about building a Flickr-like system for academia, when he said "However, what will make the application ultimately successful is the availability of open services that permit re-use: mashups that encourage integration with other services and content."

I heartily agree, but only as an interim step.  Most of today’s systems are not integrated in any way, so mashing their outputs, exposed via APIs, together in a Web 2.0 way will be a major step forward.  Doing this still misses the underlying links that are usually only apparent as connections in the eye of the user, if they happen to appear on the screen together.  When we can follow those links between data across silos we will remove the false barriers, imposed by technology thus far, and expose our users to the world of linked data.  

Below is a diagram I am working on to hopefully help people visualise what I mean. Utilising Web 2.0 technologies we bring together [mashup] the output from various application silos into one interface. A great improvement over Web 1.0, where each application would present its data on its own independent, and different, screen. Utilising Web 3.0 [Semantic Web] technologies, links between data in separate silos can be identified and presented as connections and relationships in a single Web of Data - much closer to a representation of the real world.

[Diagram: 2.0vs3.0]

I would be interested in feedback on this diagram.  Does it help, or does it make things more confusing?

Megaphone picture published by Paul Keleher in Flickr.

Silos, Silos, Silos

I spent yesterday in Birmingham’s excellent ICC convention centre at the JISC Conference 2008.

As with all multi-track conferences, apart from the keynote sessions, it is always difficult to get an overall view of the mood, and themes of concern, for the large group of people that congregate at such an event.

Looking back over the day, the word that came to the surface, appearing in most presentations and conversations accompanying refreshments, was silos.  

Describing the way academic libraries have to deliver services to their users, the phrase "providing seamless access to the silos of data" showed up a few times.  In the same way the silos of information held in the VLEs, VREs, Archives, Repositories, and the Library, need to be made easily available to the students/academic staff/researchers that use them.

Academia has a problem, well at least in its online presence. The many excellent efforts to draw together sets of data and resources utilising pre-Web 2.0 technologies have inevitably resulted in the creation of many silos of data that users have to interact with on a silo-by-silo basis.

Take for instance the average library web site with its many and varied sources of data on offer for you to search - and often that is without taking into consideration repositories, archives, and the like held outside of the library’s direct influence. Why on earth should a consumer of university information and learning services have to know which virtual box data is hidden in, before they are able to search for it? OK, perhaps I’m being a little disingenuous with that last remark, as I know the answers to my own question. Firstly, until very recently, systems were never designed on the assumption that they would sit alongside other [peer] systems and that users would want or need to search them in parallel with those peer systems. Secondly, the data, and metadata, standards used for holding information in these systems often differ greatly from each other.

There have been many projects over the years, producing a federated search across either disparate data sets inside an institution, or similar data sets across disparate institutions - these have had varying degrees of success, but none has really solved the problem. Another approach has been to just solve the single-sign-on problem so that users can get to the individual resources without having to negotiate various login hurdles, so at least it feels like the university owns all the resources.

In a couple of yesterday’s sessions Web 2.0 was presented as the solution to these problems.  [Subject to solving the identity and single-sign-on access problems which are generic in any integration project] Mashup access to our resources so that all the data-sets can appear as one on a single [portal] interface.  Yes, that will be better from the end user point of view - single access point, single user interface, single style, integration with other social tools like blogs, wikis, Facebook, etc. - Web 2.0 can offer much to make the user’s life and interaction simpler and more pleasurable.

But, is Web 2.0 solving the basic problem?  I contend that it is not.  No matter how you provide access to silos of data, delineated by technical, data standard, financial, and political practices - they are still silos.   The real value of providing access to more than one set of data is the links and associations between the individual elements of that data.  

Two  scenarios….

It is useful to know, by searching a course management system, that Prof Joe Bloggs takes lectures on a particular course. It is useful to know, by searching an electronic resource management system, that Prof Joe Bloggs has published several papers on the subject. It is useful to know, by searching the library catalogue, that Prof Joe Bloggs has written a book on the subject which is in the library. It is useful to know, by searching Technorati, that Prof Joe Bloggs has blogged about the subject.

Whereas….

It would be really valuable if the University System [knowing what course you were on] provided a link to your lecturers, one of those being Prof Joe Bloggs. By following links provided by the system you would be able to see the blog posts he had made on the subject of your course; the books in the library that he has authored on the subject; the lectures he has given, or is giving, on your and possibly other courses; the papers he had published on the subject; the co-authors of those papers; the courses those co-authors are associated with; and on across the graph of relationships between people and things inside and outside of the university.
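To make the second scenario a little more concrete, here is a minimal sketch in Python of what "following links" across silos could look like. Everything in it is an assumption made for illustration: the identifiers, the relationship names, and the idea that each fact is held as a simple subject/relationship/object triple. It is not how any particular university system or the Talis Platform stores its data.

```python
from collections import defaultdict

# Hypothetical facts, each of which would today sit in a different silo
# (course system, catalogue, e-resource system, blog search).
# All identifiers and relationship names are invented for illustration.
TRIPLES = [
    ("student:you", "enrolled_on", "course:web-101"),
    ("course:web-101", "taught_by", "person:joe-bloggs"),
    ("person:joe-bloggs", "wrote_book", "book:linked-data-primer"),
    ("person:joe-bloggs", "published", "paper:silos-in-academia"),
    ("person:joe-bloggs", "blogged", "post:why-links-matter"),
    ("paper:silos-in-academia", "co_author", "person:jane-doe"),
    ("person:jane-doe", "teaches", "course:data-201"),
]


def build_graph(triples):
    """Index the triples by subject so outgoing links can be followed."""
    graph = defaultdict(list)
    for subject, relationship, obj in triples:
        graph[subject].append((relationship, obj))
    return graph


def follow_links(graph, start, max_hops):
    """Breadth-first walk from `start`.

    Returns a list of (hops, subject, relationship, object) tuples for
    every link reachable within `max_hops` steps.
    """
    seen = {start}
    frontier = [start]
    found = []
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for node in frontier:
            for relationship, obj in graph.get(node, []):
                found.append((hop, node, relationship, obj))
                if obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return found


if __name__ == "__main__":
    for hop, subject, relationship, obj in follow_links(
        build_graph(TRIPLES), "student:you", max_hops=4
    ):
        print(hop, subject, relationship, obj)
```

The point of the sketch is that the student never searches a silo by name: starting from their own record, the lecturer, his book, his papers and his co-authors all fall out of walking the links.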

The former is what most seem to be working towards today. It would be of great benefit to achieve it, and Web 2.0 principles and technologies will be a great help in achieving it. The latter scenario is what we should be striving for - not only delivering the data and information held, or licensed, by the university, but also extracting the massive value in the links and references between that data.

The emergence of Semantic Web technologies holds out the possibility of being able to deliver on the ambitions implicit in that second scenario.  Whilst getting to grips with Web 2.0 we should look beyond it to Semantic Web (often labelled Web 3.0) techniques and technologies and release the value locked up in the links between the data we create and hold.

In the bags of those that attended JISC 2008 there was a card inviting those that are interested in sharing their thoughts, experience, and ideas about how we step beyond the current confines of the VLE, portal, repository, and library, to register their interest in being invited to attend a Talis Research Day on the subject. If you have a keen interest in learning, teaching and research and are excited by emerging technologies and would like to attend - register your interest.

Silo photo published by Zesmerelda in Flickr.

Will this one be the right ID

OpenID, to quote the web site, is an open, decentralized, free framework for user-centric digital identity.

OpenID starts with the concept that anyone can identify themselves on the Internet the same way websites do - with a URI (also called a URL or web address). Since URIs are at the very core of Web architecture, they provide a solid foundation for user-centric identity.

By OpenID-enabling a web site it can accept your login credentials from your chosen OpenID Provider (which could even be your own system). The outcome being that if all sites that you use were OpenID enabled you would only ever need to use one set of credentials to log in to all of them - the Holy Grail of Internet - no more notepad documents or whatever to keep track of all those account names and passwords!

To find out more try this 5 minute informative screencast on Simon Willison’s blog, and Wikipedia.

I’m getting an attack of Déjà Vu whilst writing this [no, not the movie, which looks fun by the way, or the fascinating ‘the web as we remember it’ site that I tripped over whilst looking up the term]. We have been here before. Remember the launch of Microsoft’s Passport, or i-Names, or our first Talking with Talis podcast with Dick Hardt, Founder and CEO of sxip Identity?

These and many other peaks of web excitement over the last few years have tried to address the tricky problem of trying to tell all the sites on the web who you are in a secure, reliable, and trusted way. Testament to how intractable this problem has been so far is that nobody can even agree on a standard scheme for what a password prompt will accept - I have yet to work out a password which will satisfy the criteria for upper/lower alpha/numeric min/max length on all the sites I visit. (And it drives me wild!)

All the initiatives to provide a solution for a single shareable identity rely upon the fact that some central web presence, which all the other sites will reference, will hold your actual credentials. This is not necessarily a single central source; OpenID and others envisage that you could choose from many.

From my point of view this is the problem for all of them. Passport failed to take off because of this - ‘Let Microsoft become the arbiter of all Internet identity - Yeah right!!’ Others have tried to avoid this by distributing the ability to host these identity stores across many organizations, but the fundamental problem still remains - trust. Who is going to trust some third party to hold your identity or to provide validation of an identity for login and/or single sign-on functionality? A service provider may trust an organization like a bank, but would you want your bank acting as the validator of your ID - what happens when you go overdrawn? An individual may trust an open source community site, but would a service provider?

I wish OpenID, which builds on much that has gone before, well, but I have a feeling that even this will not gain critical mass. I wish I did know the answer - I could put my feet up and retire on the proceeds! But brains far bigger than mine still don’t appear to have found this particular silver bullet.

Pessimistically, I think there is a possibility that this will not be solved in a globally accepted way for a long, long time, or until we all get fitted with a personal MAC Address at birth. The present technically unsatisfactory situation is, unfortunately, just good enough to enable the wheels of Internet commerce to keep turning. If we could find a way to make the acceptance of something like OpenID a business critical issue for the likes of Amazon, eBay, and the rest, things might well be different.

Afterthought
Of course Libraries are universally trusted organizations which are used to handling people’s identity information. Now what if we could somehow enable all those borrower/patron records to be used to underpin something like OpenID? That might create a critical mass of data that would provide some momentum. The problem currently is that there is no commonly implemented standard way to get at that information - same old [library] story - what we need is a Platform!

Take your data with you….

As InfoWorld reported earlier this week: Google CEO Eric Schmidt, at the Web 2.0 Summit in San Francisco said Google wants to make the information it stores for its users easily portable so they can export it to a competing service if they are dissatisfied. He went on:

Making it simple for users to walk away from a Google service with which they are unhappy keeps the company honest and on its toes, and Google competitors should embrace this data portability principle

If you look at the historical large company behavior, they ultimately do things to protect their business practices or monopoly or what have you, against the choice of the users.

The more we can, for example, let users move their data around, never trap the data of an end-user, let them move it if they don’t like us, the better.

I wonder what Google’s opinions are on sharing data. It’s one thing to be happy for your leaving customers to take their data with them [in a usable format]; it is another to be happy to share the data of your current customers [with their permission] with your competitors to add value to your customers’ lives.

Obviously Schmidt’s comments are aimed at individual users of Google’s hosted software-as-a-service applications. Will this attitude cover aggregations of broader data - digitized book contents for instance? One key to the open movement of open data between those organizations who hold and allow access to it, is licensing. Discussion around licensing in the Open data world is a topic increasing in volume.

In presentations I attended at the recent Stellenbosch Symposium it was made clear that the research community should be discouraged from signing over all rights to their publications to publishers - some right to hold Open access copies in their institution’s repository should be retained. Then there is the constant justification by Google around what they are doing with the digitization of books. There is the Free Our Data campaign in the UK, and many other examples.

There are also discussions around not only how things are licensed, but what can be the subject of a license.

The Talis Community License (TCL), which has received some Web 2.0 Summit coverage of its own on TechWeb, addresses a hole in the current spectrum of open data licensing which is not covered by the Open Source licenses such as GPL at one end, or by the Creative Commons movement at the other. Both of these cover creative output - source code and creative works respectively.

The problem comes when you try to protect [or enable] the use of ‘an aggregation’ of either facts [which in themselves can not be copyrighted] or individually protected/licensed elements in a data set. In Europe this access to an aggregation is covered by something called Database Right, but this is not a Global phenomenon.

To many this hole in the spectrum of Open Data Licensing is not obvious, and only becomes apparent after working through some examples. As the realisation of the need for something like the TCL spreads across the community, we hope that it represents a useful contribution to the evolution of Open Licensing.

In a separate but associated discussion that is emerging around the Open availability of bibliographic records, LibraryThing’s Tim Spalding made the following comment on the Code4Lib listserv:

As I’ve been saying at conferences, anyone who wants to build an open-source repository of MARC records, with or without wiki-like access, will get my (and LibraryThing’s) direct support. I think it’s going to happen. I only we had the time to do it directly. Maybe we’ll get to it if no one else does….

…An open-source alternative to the current system is going to happen. The only question is when. The project is doable, and would be of enormous importance.

So where do the non-libraries and small libraries who do not want, or more likely cannot afford, to pay expensive fees to get at bibliographic records go at the moment? This has to change.

One of Tim O’Reilly’s original key aspects of Web 2.0 is ‘Data as the driving force’ - it’s been a slow boiler but that is starting to become more obvious by the day.

Why Nodalities?

“I read the Panlibus blog - I note Talis has another house blog called Nodalities - why is this and why/who should be reading it??”

One of the major recurring themes from myself and others in Panlibus postings is Library 2.0 and its more general cousin Web 2.0. If you followed the links I provided to their descriptions in Wikipedia you will have discovered that they are both labels for a collection of attributes rather than specifications.

I have yet to read a complete concise definition of what Web 2.0 or Library 2.0 ‘is’ [and probably never will]; nevertheless it is far simpler to look at an application or service and pronounce to the world that it is very Web 2.0 and be fairly confident that people will understand what you mean.

Web 2.0 is virtually all about technology, Web Services, Service Oriented Architecture, Social Networking tools, etc. etc., whereas its Library relative mixes all of that with a heavy dose of using those Web 2.0 tools and the customer handling & social skills of the library community to provide a better service to library users. Debates about the use of mobile phones, and the provision of coffee, in a Library environment are often found in the Library 2.0 world.

We at Talis are the ‘Technology Guys’ in the Library equation, and although interested in all that is debated, our motivations are all about how new and emerging technologies [currently labelled Web 2.0] can be beneficially applied in the Library world. To this end you will find me and my colleagues evangelising on the subject both here and at conferences around the world such as these: Access2006, Internet Librarian International, Stellenbosch Symposium, Internet Librarian 2006, and the Charleston Conference.

The Talis Platform is an excellent example of applying Web 2.0, Semantic Web [to mention another ‘label’], SOA, and other technologies to provide innovative solutions to the liberating of library data, functionality, and services for the benefit of all.

In the process of proposing and delivering those [currently library specific] solutions, we are pushing both the theoretical and practical boundaries of web technologies and the theories and standards that are behind them - especially in the World Wide Web Consortium where you find Talis involved with several committees. In doing this we are very active members, with much to contribute and say, of the world community driving forward these technologies.

This is where Nodalities comes in. You will note [today] that there is a posting from me picking up points from the blogs of Ian Davis and Sam Tunnicliffe, from our Platform Team, who are currently at the Web 2.0 Summit in San Francisco. If you are interested, like I am, in the way that all things Web are [and are being predicted to be] moving, you will find what they are reporting most engrossing.

Reading between the lines of what is being presented it is clear that the advances already being demonstrated by the Talis Platform are only the first step in a massive change in the way large sets of data and metadata (often only linked by semantics), can be marshalled, related together, and combined to change the way information is used in the future.

Depending on the context, you will find Talis people attending and/or speaking at both Library and more general conferences across the world. Our knowledge, and understanding, of the issues surrounding the library and information industries is very valuable input into the wider technology world. As we have demonstrated, this is a two-way street. It is absolutely certain that our knowledge and understanding of the Web 2.0 world is already adding unique value to the world of libraries.

So to answer the question at the start of this posting…..

If you are in the library community and want to keep abreast of technology advancements - read Panlibus. If you are in the wider web community and are interested in what we are doing, and have to say about, applying these technologies as a Platform in real world situations - read Nodalities. I suspect most people, although with concentration on one, will find postings of interest in both Panlibus and Nodalities.

Get somebody else to do it!

37,000 feet above what I think should be the Sahara desert (not that I can tell as it is pitch black outside the window of this South African Airways 747) in a mini power cut.

How smug did I feel, after listening to Paul Miller’s complaints in his Access 2006 presentation (podcast here) that he had no seat-back screen on his flight to Canada, to find just the thing in my seat on this flight to Johannesburg heading towards the upcoming Stellenbosch Symposium. My smugness bubble was soon burst upon discovering that I was in the middle of a block of twelve seats with power failure - no reading light, no music, no personal entertainment system ! ;-{ So me, and the group of ten Belgian tourists I seem to have ended up in the middle of, have had to resort to that traditional participative pastime of conversation - there are some traditions that are worth maintaining.

There are some things though that benefit from technological advances. From my earlier postings you would quite rightly get the impression that I think some of the things Amazon are doing with their utility web services (S3, SQS, EC2, MT) are pretty damn cool. I already personally use a nifty tool called JungleDisk to back up the 4Gb of data on my home PC (when do they get the time to listen to all that music, and will they ever stop storing their mp3’s in with their documents and spreadsheets) in the Amazon Simple Storage Service (S3) for less than $2 per month.

S3 came to the rescue on another front. Because I like using images to liven up my presentations, the PowerPoint file for my keynote in Stellenbosch runs to a whole 22MB. Getting something that size to a couple of people in advance is not the easiest of tasks, as it would give many of the most accommodating email systems indigestion. Whilst scratching my head about this problem, I suddenly had one of those well durrr moments that we get from time to time. Upload the file to S3, make it publicly visible, and let Amazon and the recipients’ web browsers do the work for me - simple. So, with the aid of another bit of nifty software I can recommend - John Spurlock’s NS3 - that’s exactly what I did. Another knock-on benefit that didn’t initially occur to me is the peace of mind that if I lose the memory stick in my pocket, and the back up CD goes missing with my luggage, at the same time as my laptop has a nervous breakdown, all I need is access to a browser and I can get my presentation on line in a few minutes.

I don’t think I’m alone in having a recent well durrr. I think the technical team behind Second Life had one too:

The client you download may just seem like a 5-minute nuisance to you. Magnified ten thousand times, it becomes a severe issue for our webservers on days when we release a new version- tens of thousands of people all rushing to download them at the same time. An average of 30 MB per download, multiplied by however many folks who want to login to this Second Life thing, comes out to a lot of bits

Rather than continue to pile on webservers just for this purpose, which has somewhat diminishing returns, we have elected to move the client download over to Amazon’s S3 service, which is basically a big file server.
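The arithmetic behind that "lot of bits" is easy to sketch. The 30 MB average comes from the quote above; the download counts are my own assumed readings of "tens of thousands", and I am using decimal units (1 GB = 1000 MB), so treat this as a back-of-envelope guesstimate rather than anything Second Life published.

```python
def release_day_transfer_gb(downloads, size_mb=30):
    """Total data served for a release, in decimal gigabytes.

    `size_mb` defaults to the 30 MB average quoted above;
    `downloads` is an assumed figure supplied by the caller.
    """
    return downloads * size_mb / 1000


if __name__ == "__main__":
    # Hypothetical download counts standing in for "tens of thousands".
    for downloads in (10_000, 50_000):
        total = release_day_transfer_gb(downloads)
        print(f"{downloads} downloads of 30 MB is about {total:.0f} GB")
```

Even the lower assumed figure works out at 300 GB to push through your own webservers in a short burst, which is the sort of spike that is awkward to size hardware for.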

How many teams behind academic/library projects, startups etc., must there be out there worrying about sizing their servers, backing their data up, and guesstimating the bandwidth required if they become popular? If I was in their position I would be seriously considering offloading the job to someone else for a few dollars a month on my credit card. - Ah, no credit card: that is a massive obstacle for many an institution!

This is starting to sound like a sales pitch for Amazon. It’s not intended to be (but if Jeff is listening - remember your friends at Christmas time). If you want raw compute power, or the storage and distribution of files, or heavy lifting done for you, you could do yourself a favor and take a look at what Amazon are doing.

But what if you need a more specialized kind of heavy lifting? What about the storage, indexing, and searching of bibliographic data? What about the augmenting of such data with book-jacket images; links to disparate but related information such as articles in Wikipedia, reviews, etc; library holdings records; links into those libraries’ OPACs? All doable individually by many a project team, but all of it without compromising your response to deliver it yesterday with a new cool user-interface? And without having to create yet another updated version of the last application you built, from scratch?

The Talis Platform, or more specifically its component services Silkworm (open directory for Collections, Locations, OPAC deep-link definitions, Collection Groupings, and potentially much much more), Bigfoot (highly scalable large data stores, designed to hold, index, search, and augment generic data), Symphony (possibly a new one for you Talis project name spotters out there - orchestrates the interaction between other platform services), is getting ready to saddle up and deliver a few well durrr moments in our world.

I say getting ready, as we are still putting a few things in place like expanding the API documentation in TDN to cover the Bigfoot APIs (mind you, based on the play with it and discover how to use it yourself approach that I blogged about recently, it’s questionable how much documentation you need), but as demonstrated by Project Cenote there is plenty there already.

Like it or hate it, the Cenote interface is very different in its look. It is also very different in its construction - it’s all UI and no application. By that I mean, all the Cenote team had to worry about was capturing user input and displaying bibliographic results in a stunning interface. How the data behind it was collected, stored, indexed, and searched was never a concern for them - they got somebody else to do that. The platform is doing all the heavy lifting for them. It can, and will, do the same for others - well durrr.

Want to know a bit more? - Just ask, either here or in the TDN

A cloud of clouds

Let me start with a question - what is the collective noun for clouds? In trying to dream up a catchy title for this post, which you will discover once I’ve stopped waffling is about Word Clouds, I tried to discover from colleagues and places like answers.com what you call a collection of clouds. Answers received so far: a host, a storm, a front, and the one I chose - a cloud. I’m sure someone out there will be able to put me right on this, I’ll be monitoring the comments with interest.

Anyway, why am I so interested in [word] clouds all of a sudden? Well, it is not all of a sudden. I’ve been interested in word/tag clouds for a while, as a device for serendipitous browsing through a set of metadata based upon the popularity of words within, or tags associated with, information.

Flickr, Technorati, and LibraryThing are all well known examples of the use of these clouds in a user interface. More examples are appearing almost daily.

The thing that triggered me to write this post was the appearance of a word cloud on the site for the BBC’s radio station Radio 1. Scroll down to the bottom of the page and you should see a display of the most popular words contained in SMS text messages sent to the station. This is refreshed every couple of minutes or so, and so gives an insight into what the station’s audience is thinking about. With the station often receiving in excess of 1,000 messages per hour, the theme behind the words displayed is an aggregate of a fair amount of input. The tool that displays this also checks for well known words, like the name of a group or DJ, and makes them a clickable link to more information.

The thing that struck me about this implementation is that the BBC just put it there with no explanation or hints, expecting that their online audience will understand that words in larger fonts are more popular than others in smaller fonts and the ones in blue are clickable. Not that many months ago I remember having to explain those concepts to those seeing Flickr and del.icio.us tag clouds for the first time.
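For anyone curious about the mechanics, the idea is simple enough to sketch in a few lines of Python. This is my own illustration and not the BBC’s implementation: the function names, the pixel sizes, the linear scaling, and the lookup table of "known" words (with its example.org address and made-up band name) are all assumptions.

```python
import re
from collections import Counter

# Hypothetical table of well known words that should become links.
KNOWN_LINKS = {"thewidgets": "https://example.org/artists/thewidgets"}


def word_counts(messages, top_n=20):
    """Count lower-cased words across all messages, most popular first."""
    words = []
    for message in messages:
        words.extend(re.findall(r"[a-z']+", message.lower()))
    return Counter(words).most_common(top_n)


def font_size(count, lowest, highest, min_px=12, max_px=36):
    """Scale a count linearly between the smallest and largest font size."""
    if highest == lowest:
        return min_px
    return round(min_px + (count - lowest) * (max_px - min_px) / (highest - lowest))


def build_cloud(messages, top_n=20):
    """Return one entry per word: the word, its font size, and a link if known."""
    counts = word_counts(messages, top_n)
    if not counts:
        return []
    highest = counts[0][1]
    lowest = counts[-1][1]
    return [
        {
            "word": word,
            "size_px": font_size(count, lowest, highest),
            "link": KNOWN_LINKS.get(word),
        }
        for word, count in counts
    ]


if __name__ == "__main__":
    sample = ["love thewidgets", "thewidgets rock", "rock on thewidgets"]
    for entry in build_cloud(sample):
        print(entry)
```

A real cloud would also want to drop common words such as "the" and "on", which this sketch deliberately leaves out to stay short.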

The Web 2.0/Library 2.0 world is one where new user interface metaphors appear and become accepted very rapidly. Even so, I am still aware of some libraries who shy away from making changes to their OPACs until ‘there has been training’. All I can say to such organizations is that I think you will find your online audience is more astute and open to change than you think. By all means offer some ‘How to get the most from the new features’ sessions, but if you have to train in the basics you have probably got your interface wrong.

Another thing that made me think about word clouds today was a comment somebody made in a telephone conversation about the Aquabrowser OnLine trials, which I posted about the other day, at libraries such as Islington Libraries that have contributed to the Talis Platform. The comment, passed on from a further education college, was that the word cloud in the Aquabrowser OnLine interface could be of great help to those with dyslexia in identifying different spellings etc. This is another good example of how offering access to data by using new and innovative user interface metaphors, in addition to the traditional ones, can have unexpected beneficial consequences.


Is there a place for P2P

David Bigwood was thinking out loud the other day in his Catalogablog posting P2P OPACs:

Here’s an idea, not even half-baked, how about peer-to-peer (P2P) networks of OPACs? Only available items would display. I’d get to pick the institutions I’d have display and whether to display non-circulating items. Something like Limewire.

Having struggled with the effects of teenage family members installing Limewire and its predecessors on the home PC, and with how we scale the traditional search of a single library’s collection up to a reliable performant query of information within overlapping ad hoc groups of library collections, I have also wondered if the P2P (peer-to-peer) technologies underpinning the former could be helpful with the latter.

David’s thought, using P2P and the music sharing application Limewire as an example, is, when you deconstruct it, attempting to address a few well known problems in the library domain.

  • Identifying and locating Library collections - how the collection is described, physically located, and accessed electronically are all concerns in this area which resource directories, many of which have come and gone, have attempted to address. In the music sharing P2P world, the major concern is getting a copy of the file with little concern as to where it comes from.

    There are several current examples of these library directories around, often limited by project, type/size of library, geographic location, commercial constraints, etc. Then there is the Silkworm Directory in the Talis Platform, an open directory, wiki-like in philosophy, in which anyone can enter any library collection and then use an open API to query that information.

  • The grouping together of an ad hoc set of library collections to search within. - These could be as organized as all the academic libraries within 50 miles of a city, or as random as a student’s university library, the local library near her dorm, and the library in her home town - totally logical to the student - random to everyone else

    A little known aspect of the Silkworm Directory, as Paul Miller only mentioned it in his Access 2006 presentation (pdf) last week, is its ability to create ad hoc groups and then query by the members of those groups.

  • The constant searching across many dissimilar collections. - Anyone who has used or tried to pull together a federated search across many library catalogs, traditionally using Z39.50, will have horror tales of the way locally implemented indexing rules can make a mockery of search and results ranking.

    Now if we could consistently index, search, and rank in a single store all the holdings of the collections we are interested in, as defined in a directory, then, providing it was scalable and performant, this problem would disappear. This is the approach successfully taken by the Googles of the world. It is also how the Bigfoot element of the Talis Platform operates. (See my recent posting for a description of how Bigfoot APIs are driving the recently announced Project Cenote interface.)

  • Filter the results of a search by the libraries in a group that have holdings. - P2P, in the same way that Z39.50 federated search does, could help in this area by directly querying individual library collections. But I suspect that it would suffer the same problems as current federated search: the fastest response you get is based on the speed of the slowest resource. P2P addresses this with caching and by downloading from several places simultaneously, which are not really applicable where you are trying to get information from a specific collection.

    The Talis Platform’s holdings stores address these issues by storing holdings statements, aggregated across many collections and freely contributed by libraries, alongside bibliographic stores. This is done in such a way as to enable bibliographic results to be augmented with holdings information on the fly as results are returned from an API call.

  • Filter the results of a search by libraries that have in stock items. - This final step is probably the most difficult to solve in a live situation as any store can become out of date at any time that a book is borrowed from a particular collection. P2P may well have valuable application in this area, be it filtering a results set of known holdings, or keeping stores up to date on a minute by minute basis.
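To make the last two steps in that list a little more concrete, here is a rough sketch in Python of filtering a set of search results by an ad hoc group of libraries, first by holdings and then, optionally, by what is on the shelf. This is purely illustrative and is not the Talis Platform API: the function name `filter_by_group` and the in-memory `holdings` and `available` structures are my own assumptions, standing in for whatever stores or live queries would sit behind a real service.

```python
def filter_by_group(results, holdings, group, available=None):
    """Keep only results held by at least one library in the group.

    results   - list of dicts, each with an "id" key (bibliographic records)
    holdings  - dict mapping record id -> set of library names holding it
    group     - set of library names making up the ad hoc group
    available - optional set of (record id, library) pairs currently in
                stock; when given, libraries without a copy on the shelf
                are dropped as well
    All of these structures are hypothetical stand-ins for real stores.
    """
    filtered = []
    for record in results:
        # Libraries in the group that hold this record.
        holders = holdings.get(record["id"], set()) & group
        if available is not None:
            # Narrow further to libraries with a copy in stock right now.
            holders = {lib for lib in holders if (record["id"], lib) in available}
        if holders:
            # Augment the bibliographic record with the holdings information.
            filtered.append({**record, "libraries": sorted(holders)})
    return filtered
```

The hard part, as noted above, is not the filtering itself but keeping something like the `available` data current from minute to minute.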

It remains to be seen how P2P could be used, but it should not be dismissed as only a technique used for [often illegal] music downloading.

David says his thought might be ‘half-baked’, but there are some useful ingredients in his recipe. How well some of them would scale in the wider library environment I’m not so sure, but a hybrid of P2P with some of the high volume, scalable, performant, open data, open API aspects of the Talis Platform - now that may well have legs.
