There be nuggets in them there data!
Synchronicity - that’s the long word for it - that feeling that once an idea has lodged in your brain, there is a whole queue of similar ones waiting around the corner to pop up and say “I’m another example of that!”
My latest bout of this started last week with John Battelle’s book The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture, which I am reading at the moment. It offers his analysis of search engines’ current role as the “database of our intentions”: the repository of humanity’s curiosity, exploration, and expressed desires. By analyzing what we do and where we go with our millions of clicks every hour, it becomes possible to identify trends and attempt to predict what we are searching for, the holy grail of the search business.
Next was Dave Pattern’s posting ‘Lending Paths‘ about the work he has been doing analyzing the loan information from the University of Huddersfield’s library system.
I wondered if you could really predict the future borrowing pattern of a user based on a specific book — in other words, if they borrow book X will they then go on to borrow book Y and then book Z?
Anyway, I’ve knocked together a basic script that will extrapolate the most likely lending path (both past and future) for a specific book.
For example, here’s the lending path for “Learning SQL: a step by step guide using Oracle”: http://library.hud.ac.uk/perl/lendingpath/bib.pl?418925
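To make the idea concrete, here is a minimal sketch of how a lending path might be extrapolated. It is not Dave’s actual script: the loan records, the function names, and the greedy “most common next book” rule are all assumptions made for illustration, and it only looks forward, not back.

```python
from collections import Counter, defaultdict


def build_transitions(loans):
    """Count how often book A is followed by book B in a borrower's history.

    `loans` is an iterable of (borrower_id, loan_date, book_id) tuples.
    The record layout is hypothetical, not the real library system schema.
    """
    by_borrower = defaultdict(list)
    for borrower, loan_date, book in loans:
        by_borrower[borrower].append((loan_date, book))

    transitions = defaultdict(Counter)
    for history in by_borrower.values():
        history.sort()  # ISO date strings sort chronologically
        for (_, previous), (_, following) in zip(history, history[1:]):
            if previous != following:
                transitions[previous][following] += 1
    return transitions


def lending_path(transitions, start, max_steps=3):
    """Greedily follow the most common next loan, never revisiting a book."""
    path = [start]
    current = start
    for _ in range(max_steps):
        followers = transitions.get(current)
        if not followers:
            break
        # Highest count first; ties broken alphabetically so the result is stable.
        candidates = sorted(
            (-count, book) for book, count in followers.items() if book not in path
        )
        if not candidates:
            break
        current = candidates[0][1]
        path.append(current)
    return path


# Invented sample data: four borrowers, books X, Y, Z and W.
sample_loans = [
    ("u1", "2006-01-10", "X"), ("u1", "2006-02-01", "Y"), ("u1", "2006-03-01", "Z"),
    ("u2", "2006-01-15", "X"), ("u2", "2006-02-20", "Y"),
    ("u3", "2006-01-20", "X"), ("u3", "2006-02-25", "W"),
    ("u4", "2006-03-01", "Y"), ("u4", "2006-04-01", "Z"),
]

print(lending_path(build_transitions(sample_loans), "X"))  # ['X', 'Y', 'Z']
```

A real version would need far more loans per title before the counts meant anything, and would probably want to ignore course reading lists that push everyone through the same books in the same order.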
So the Library System as a “database of our intentions” appears, on the surface at least, to have legs - interesting work, Dave.
Then [in the synchronicity stakes] came Swivel, as previewed by Techcrunch.
Swivel Co-founders Dmitry Dimov and Brian Mulloy start off by describing their company as “YouTube for Data.” That’s a good start for someone trying to understand it, because the site allows users to upload data - any data - and display it to other users visually. The number of page views your website generates. Or a stock price over time. Weather data. Commodity prices. The number of Bald Eagles in Washington state. Whatever. Uploaded data can be rated, commented and bookmarked by other users, helping to sort the interesting (and accurate) wheat from the chaff. And graphs of data can be embedded into websites. So it is in fact a bit like a YouTube for Data.
But then the real fun begins. You and other users can then compare that data to other data sets to find possible correlation (or lack thereof). Compare gas prices to presidential approval ratings or UFO sightings to iPod sales. Track your page views against weather reports in Silicon Valley. See if something interesting occurs.
And better yet, Swivel will be automatically comparing your data to other data sets in the background, suggesting possible correlations to you that you may never have noticed.
Hmm - an automatic correlation possibility checker. Now, I’ve always wanted one of those. I wonder what nuggets of intention will be thrown up from mining all that data - if Swivel takes off, that is. Time will tell.
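For the curious, a background correlation checker of the kind described above could be sketched roughly as follows. How Swivel actually does it is not described in the preview, so this is only an assumption: it compares every pair of equal-length data series using the Pearson coefficient and reports the pairs above a threshold. The function names, the threshold, and the sample numbers are all invented.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def suggest_correlations(datasets, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    suggestions = []
    for (name_a, a), (name_b, b) in combinations(datasets.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            suggestions.append((name_a, name_b, r))
    # Strongest relationships first, whether positive or negative.
    suggestions.sort(key=lambda s: abs(s[2]), reverse=True)
    return suggestions


# Made-up numbers purely for illustration; not real measurements.
datasets = {
    "gas_price": [1, 2, 3, 4, 5],
    "approval_rating": [10, 8, 6, 4, 2],
    "ufo_sightings": [3, 1, 4, 1, 5],
}

for name_a, name_b, r in suggest_correlations(datasets):
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

With enough data sets thrown into the pot, some pairs will line up by pure chance, so anything a checker like this suggests is a prompt for a closer look rather than a finding.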
Of course, making sure that the data in all these examples is Open will make it more valuable.
[Update] Synchronicity strikes again! Having posted this, I then flipped over to read Nodalities [Panlibus’ wider web sister blog] to see that my colleague Paul Miller had also posted about Swivel - take a look at what he said, especially around licensing and data quality issues.
(Pan photo taken by Digital Retina displayed in Flickr)













December 7th, 2006 at 7:37 am
The Google book is really good. The Database of Intentions concept is very true… hmm. I smell a post for tomorrow.