Nodalities

From Semantic Web to Web of Data

Author Archive

What is it with all this Library stuff?

Talis is starting to get quite a reputation as a thought leader and innovator in the Semantic Web space, what with being featured as one of the 10 Semantic Apps to Watch on ReadWriteWeb, and everything. A quick glance through that list soon highlights that Talis is a different type of company, different from the start-ups behind most of the applications and developments in this space. Talis is nearly 40 years old.

At first glance, taking an excursion around the Applications area of www.talis.com, or reading the posts on our sister blog Panlibus, or listening to the many podcasts hosted by myself that are published there, you could be forgiven for thinking that all we did was sell software and systems to Public and University Libraries.

So what is this old library company up to with their Semantic Web based Platform? A question that Microsoft Evangelist Jon Udell probed, when he interviewed me for his Interviews with Innovators series on IT Conversations. Worth a listen, as you will see how our thinking has evolved from some of the challenges to be found in the metadata rich world of libraries.

Listen Now:

Download MP3 [32 mins, 31Mb]

In November I gave a presentation at the Talis Insight Conference. This conference was for library professionals in the UK, who were interested in what we were doing with Platforms & Semantic Web, and how it could be relevant to libraries. By placing the library technology world in context against the background of the waves of technology in the wider world of the web, it helped the audience understand the future benefits that this work will bring to the library technology sphere. Take a look, you might find it interesting.

With their long history of working with rich metadata, it is clear that libraries are fertile ground for identifying problems that Semantic Web technologies could help solve. For instance, by evolving librarians’ cataloging practices to embrace RDF, some of their unfulfilled ambitions, in the area of simply linking together authors and their works, could be realised.

Three of my colleagues (Rob Styles, Danny Ayers, and Nadeem Shabir) have published a paper, Semantic Marc, MARC21 and the Semantic Web [pdf], on Rob’s blog. This paper takes you through the process of translating MARC21 [the lingua franca of library metadata] records directly into RDF, then building on that basic RDF representation to produce a more readable form, and then on to identifying and walking the graph of relationships between authors, their pseudonyms and their books. A very readable and enlightening paper, well worth a read.
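
A minimal sketch of that record-to-graph idea, assuming a heavily simplified record layout. Tags 100 (main entry, personal name) and 245 (title statement) are real MARC21 tags, but the record ids, the predicate names (title, creator, pseudonymOf) and the helper functions are hypothetical and are not taken from the paper:

```python
# Simplified MARC-like records: each maps a tag to a single value.
# Tags 100 and 245 are real MARC21 tags; ids and predicates are invented.
RECORDS = [
    {"id": "rec1", "100": "King, Stephen", "245": "The Shining"},
    {"id": "rec2", "100": "Bachman, Richard", "245": "Thinner"},
]

# Pseudonym -> real name, the kind of link a flat catalogue record hides.
PSEUDONYMS = {"Bachman, Richard": "King, Stephen"}


def record_to_triples(record):
    """Turn one record into (subject, predicate, object) triples."""
    book = "book:" + record["id"]
    author = "person:" + record["100"]
    return [
        (book, "title", record["245"]),
        (book, "creator", author),
    ]


def build_graph(records, pseudonyms):
    """Merge all record triples and add the pseudonym relationships."""
    triples = set()
    for record in records:
        triples.update(record_to_triples(record))
    for alias, real in pseudonyms.items():
        triples.add(("person:" + alias, "pseudonymOf", "person:" + real))
    return triples


def titles_by(person, triples):
    """Walk the graph: titles by a person and by any of their pseudonyms."""
    identities = {person} | {
        s for s, p, o in triples if p == "pseudonymOf" and o == person
    }
    books = {s for s, p, o in triples if p == "creator" and o in identities}
    return sorted(o for s, p, o in triples if p == "title" and s in books)


graph = build_graph(RECORDS, PSEUDONYMS)
print(titles_by("person:King, Stephen", graph))
```

Asking for the real name returns the titles published under both names, which is the author-to-works linking that is awkward to express with flat records.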

So, hopefully having browsed through that lot you will have a better idea why a library company with a nearly 40 year long heritage is actively engaging with Semantic Web stuff - because libraries provide concrete examples, lots of rich metadata that many want to link to.

Libraries have given us an insight in to issues to be found when dealing with large quantities of rich and variable data and metadata - no wonder we are so at home with the Semantic Web.


Second Life Client Source released under GPL

As I broadly hinted at last month, Linden Lab has announced today the release of the source code for the Second Life client under the GNU GPL v2 license.

Linden see this as embracing the inevitable:

In 1993, NCSA released their liberally licensed, but proprietary, Mosaic 2.0 browser with support for inline images arguably heralding the start of the web as we know it today. In an act of either acceptance of the inevitable or simple desperation, Netscape Communications released the bulk of the Netscape Communicator code base to form the foundation of projects as Mozilla, Firefox, and Thunderbird.

We are not desperate, and we welcome the inevitable with open arms.

They also provide a vision of the way forward beyond this initial release of client source:

A lot of the Second Life development work currently in progress is focused on building the Second Life Grid — a vision of a globally interconnected grid with clients and servers published and managed by different groups. Expect many changes and updates in the coming months in support of this architecture. Much of the recent work has centered on securing the code against potential threats. More recently and still in development, we are moving more of the communications to reliable and cryptographically strong secure channels.

Definitely a platform play.

Unlike the few doom laden end-of-Second-Life-as-we-know-it prophets sprinkled through the comments on the announcement, I share the majority view and welcome this announcement. Having had a browse around the segments of the Second Life Wiki dedicated to this release, I have a few concerns that Linden are new to the support of Open Source projects - no live access to a version control repository yet for instance. They will soon learn that letting the code out of the door is just the start.

Now let’s see what the world can do with it…


The 3D Internet

If you have been reading Nodalities’ sister blog Panlibus recently, you cannot have failed to notice that we at Talis have not only been watching Second Life with interest, but are actively involved in helping the library community experiment with it and understand what it is all about.

Most of the chatter in those postings, and many others on the subject, was around how you interact with people and represent your organization inside a virtual world. I attended a Second Life event in London yesterday, hoping to find out more about it and how it might influence what Talis is doing in Second Life. The presentations were slanted at the marketing fraternity, educating the audience on how/why they can represent their brands within Second Life. The main speaker was Glenn Fisher, Director, Marketing Programs for Linden Lab [the organization that runs Second Life], followed by a presentation from Rivers Run Red [a London branding agency helping to shape Second Life] ‘Creating successful brand presence in Second Life - Integrating Second Life in to your marketing offerings’ - see what I mean.

Nevertheless it was an interesting session which reflected the experience of introducing Adidas, BBC Radio 1, and others into the virtual world. It was unfortunate that a demonstration of how to create stuff inside Second Life was spoiled by a combination of lack of Mac experience from the demonstrator and what looked like a fairly serious system crash. The joys of live demonstrations - tell me about it!

What I found most interesting, and what prompted this posting, can be summed up in a throwaway line from Glenn near the end - “We want Second Life to become The 3D Internet” (my emphasis).

There was much questioning and discussion in the session about the quality of the experience for users of Second Life, and the addition of new capabilities such as the integration of mobile phone messaging, VOIP, and TV channel streaming into the world. After a while a theme started to emerge: the fact that Linden had announced the open sourcing of the client and the Second Life APIs “soon” was used as the answer to many questions, both in and after the sessions.

  • Are you working on the graphical side of SL, to compete with things like World of Warcraft? - No, but once we have OS’d the client and the APIs we fully expect others to build both cut-down and enhanced clients.
  • Will it be possible to simply scan documents in to software in the real world, and have them appear as objects in SL? - With access to the OS Client & APIs, many things like this will be possible.
  • Could SL actions, purchases etc., be fully integrated with a user’s real world? - Yes, by using the OS client and APIs ……..

In answers to questions about current performance issues, or the lack of performance, it was clear that Linden are concentrating upon the robustness and scalability of the SL Platform - not just as a short term strategy but as a business model. “By opening up the Platform APIs and the client source code, we will attract others to build on our platform to add value for themselves and the community within SL.” Not sure if those were the exact words, but you get the gist of it. With the odd word changed here and there, you could hear that same statement from Amazon, eBay, Salesforce.com, the Windows Live division of Microsoft, Talis and many others, each providing a Platform approach to facilitate the delivery of solutions which add value to all solutions built on that Platform.

One thing is different though with Linden’s proposition. Their Platform is one on which you can build an equivalent of every major Internet user interaction, but in a 3D world. Search, Chat, Messaging, Information dissemination, Blogging, Social Tagging/Networking/Interaction, Entertainment, Media streaming, Advertising, Retailing, eCommerce, Customer support, are all things that could be replicated in [and in most cases be enhanced by] a 3D equivalent. Instead of clicking on your web browser favorite to be taken to view the local electrical retailer’s web site, you could activate your avatar’s favorite and be teleported to the retailer’s virtual storefront where you could not only satisfy your query but also opt to interact with other customers you find there. Suspend your disbelief (resulting from using Linden’s current UI) for a moment, and project forward a few years - I might just be describing the new metaphor for human Internet interaction.

Yeah, so what, you say - it’s only like projecting forward Tim Berners-Lee’s Web vision from the mid nineties to now. Well yes it is BUT, up until now nobody as such has owned the Platform people use to interact. Once you have paid a provider for access, the Internet’s resources are mostly open and free, and definitely not in the hands of one controlling body. By contrast, the Linden Platform that runs Second Life, complete with its own currency, is owned by Linden. From a financial point of view Second Life is more like a Country than a commercial software application. Project it forward and we could all end up online-living in the same country under the governance of the owners of that Country.

So today I think I might have seen a vision of one possible online future - The 3D Internet. [Now that would really warrant the use of the Web 3.0 label]. One which may cause some science-fiction writers to take the Google logo off their State Police uniforms and replace it with a Linden one.

Or maybe I’ve had too much coffee…

(Second Life - Platform image taken by Eric RiceEr displayed in Flickr)


If all you have is a hammer….

A previous boss of mine was well known for coming up with a suitable saying for many a situation. One of his favorites was “When all you have in your toolkit is a hammer - everything starts to look like a nail”.

This came to mind whilst reading Ian Davis’ latest report on events at the Web 2.0 Summit in San Francisco. He was in the session from MySQL’s Mårten Mickos. Read what he has to say and you’ll see what I mean:

I’m baffled.

I’ve just watched Mårten Mickos from MySQL give a 10 minute talk on what he terms the “Great Database in the Sky” almost exactly describing our community’s vision of a “web of data” while remaining completely ignorant of the semantic web.

To start, he characterised Google as giving unstructured people access to unstructured data whereas MySQL gives structured people access to structured data, meaning that MySQL is targeted towards developers who understand how to structure data “properly”. A strange polarisation in my view, but I guess he’s trying to put clear blue water between the Google approach and the traditional database approach. At Talis, we don’t see this distinction at all and our core platform technology, Bigfoot, unifies structured and unstructured data.

He went on to describe his vision of a skype for database access, combining my data, your data and public data into the next generation OLAP, running a trillion transactions per day. An example could be weather data and he asked what if you could run a SQL statement across all the data sources in the world, something like SELECT CurrentWindDirection, CurrentWindSpeed FROM AllTheWorldsWeatherStations, MyOwnWeatherStation, MyFriendsWeatherStation.

It’s a noble goal, but he’s not the first to suggest it. It’s also not a future vision because you can do it today with Sparql. It’s at the heart of Bigfoot and there are many other public services that can be used to learn and experiment. You can even query across HTML pages containing embedded structured data.

He followed it up by saying if this were achievable then a whole new generation of web 2.0 applications could be possible. Nothing controversial there, we share the same vision! But we think it’s closer than he does.

What else? Oh yes, he said “we may need a DNS of SQL servers” and that “routing may be an issue”. Another point of agreement, that’s why we built a directory of data collections and services and built web services to route straight into that content.

Then, “how do you make data definitions understandable to others?”. That’s almost like a problem statement for RDF! And yet he didn’t mention it in his list of technologies that might be candidates for the solution: RSS, Atom, Jabber, HTML, HTTP, XML, SQL and SMS.

He concluded his talk with the tagline “The data is the platform” and then took a question from the audience: “How is this different from the semantic web?”.

This is where it became evident that there is a deep disconnect between the traditional database community and the semantic web community. Mårten’s response was rather vague, that this wasn’t as broad as the semantic web and that the semweb includes unstructured data so wasn’t appropriate.

What a shame and what a failure of the semantic web community if the CEO of MySQL AB cannot see how his vision for an interconnected web of data is the same as ours! We must try harder and demonstrate at all levels the value of the semantic web approach to people like Mårten. SWEO and SWIG will help, but the convincing arguments will come from the practical applications of the semantic web being developed to solve real world problems.

Which is why I’m at Talis.

See what I mean.
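
To make the weather station example from Ian’s post a little more concrete, here is a minimal sketch of the underlying idea: data from separate sources, expressed as triples, can be merged into one graph and queried as a whole. The station names, readings and vocabulary are invented for illustration, and the SPARQL text is shown only for comparison; the Python below does not execute it and does not use any real SPARQL endpoint.

```python
# For comparison only: roughly how the question might be phrased in SPARQL,
# using a made-up vocabulary. This string is not executed by the code below.
QUERY = """
PREFIX ex: <http://example.org/weather#>
SELECT ?station ?direction ?speed
WHERE { ?station ex:windDirection ?direction ; ex:windSpeed ?speed . }
"""

# Three independent sources, each a set of (subject, predicate, object)
# triples. All values are invented.
WORLD = {
    ("station:heathrow", "windDirection", "SW"),
    ("station:heathrow", "windSpeed", 12),
}
MINE = {
    ("station:my-garden", "windDirection", "W"),
    ("station:my-garden", "windSpeed", 7),
}
FRIEND = {
    ("station:friend-roof", "windDirection", "NW"),
    ("station:friend-roof", "windSpeed", 9),
}


def merge(*sources):
    """Merging graphs of triples is just a set union."""
    return set().union(*sources)


def current_wind(graph):
    """Return (station, direction, speed) for every station that has both."""
    directions = {s: o for s, p, o in graph if p == "windDirection"}
    speeds = {s: o for s, p, o in graph if p == "windSpeed"}
    return sorted(
        (s, directions[s], speeds[s]) for s in directions if s in speeds
    )


for row in current_wind(merge(WORLD, MINE, FRIEND)):
    print(row)
```

The point of the sketch is that nothing in the query function needs to know which source a reading came from, which is what separates a web of data from a list of separately addressed databases.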

Ian is right though: Mr Mickos’ limited vision may be as much due to a failure of communication from the semantic web community as to his conviction that database technology (but bigger and more distributed) is all we need. I suspect that it would not be too difficult to find some semantic web fundamentalists with equally strong but opposite opinions.

As with most real world issues, the best solutions occur when [often previously opposing] communities work together.

But if all you think you need in your toolkit is a hammer….

(Hammer image taken by darkmatter uploaded to Flickr)


Web 2.0 Summit

Two of my colleagues, Ian and Sam, are taking in the renamed Web 2.0 Summit (formerly the Web 2.0 Conference) in San Francisco.

With the impressive set of speakers on the programme, it’s hardly surprising that they are churning out an interesting set of postings from the sessions on their personal blogs.

After arriving in SF from the International Semantic Web Conference in Georgia they immediately got stuck in to the Summit programme.

So far Sam has posted from the following sessions:

  • The Next Internet Infrastructure

    An open services architecture needs to be freely licenced, hostable, extendable and capable of supporting an emergent ecosystem.
    The web contains plenty of open content, but islands of authentication. Authentication needs to be first order in the next generation architecture, so that it becomes possible to extend the way we do things on the web now to things that we currently can’t because of the lack of inbuilt trust mechanisms, i.e. you can’t apply web principles to healthcare, finance etc. yet.

  • Advertising 2.0

    The panel predicts the emergence of a new network not of content producers but of content distributors.
    Shifting dynamics of web advertising - away from measuring success in terms of direct response to advertising and toward brand building which has been more prevalent in traditional media, but which has the potential to be much more accountable & measurable as user behaviour becomes more visible across the web and through time. To me, this needs to dovetail with the ideas explored in the previous session around making identity data available while preserving anonymity.

  • Enterprise 2.0 Mashups

    Marc Benioff, CEO of Salesforce.com is on stage now. He makes the claim that Salesforce are doing for enterprise services what Amazon is doing for infrastructure, i.e. removing the “muck” and enabling innovation. He characterises one aspect of salesforce as an “Elastic Database, that scales” and in that regard, I can see a lot of overlap with some facets of the platform we’re building. If I were in hyphen overdrive, I might describe Bigfoot as ultimately-flexible-data-storage-discovery-and-retrieval-as-a-service (but with added semantic goodness, of course).
    AppExchange is a marketplace for business services built on the salesforce platform, and I see parallels between it and the Content Orchestration components of our platform, like Silkworm and Symphony. These sorts of components are all really about making it easy to compose applications by plugging together bits of data and functionality from all over the web.

  • Ning

    Marc Andreessen and Gina Bianchini from Ning are up now. Gina says they launched their first products a couple of months ago, but I remember checking out Ning at least a year ago.
    Oh dear, they’ve ground to a halt, looks like the presenters aren’t immune from the connectivity problems either.

Whilst Ian has so far commented on:

  • Yahoo!’s Web 2.0 Strategy

    The first workshop of the first day. I got into the room early and am right at the back in the corner… next to the power!! Why, oh why don’t these conferences ever sort out the power? That’s what comes of choosing a 150 year old building to host the conference, beautiful though it is. Onto the workshop…
    Brief notes only…
    First up is a set of slides describing Flickr’s building blocks of participation which is very similar to our thinking at Talis:
    user generated content - not licensed from providers but contributed by users
    user organised content - tagging, categorising
    user and publisher distributed content
    user developed functionality - exposed api etc
    The discussion moves onto tagging and how it gives social context particularly through the recent introduction of geocoding.

  • Whose Data?

    I’m now sitting in the Whose data is it? workshop which is just starting. It turns out that the workshop now has a new title, “Open Data Workshop”, which sits very well with our work on open data licences.
    First up is Marc Hedlund who is referring to the O’Reilly open data quote that Paul blogged on a little while back. Hmm, he’s even referring to open data licences, but only mentioning Creative Commons by name. Points out that all the big map providers use MapTech data and then moves on to describe the OpenStreetMap project, one of my favourite examples of the new open data movement.

  • Hmmm SOA?

    Being more resource-oriented than service-oriented, I approach this session with trepidation.
    First up is Carol Jones from IBM to talk about a trio of software patterns for Web 2.0. The first is “Software as a Service” which has the following characteristics:
    Service, not software
    User-driven adoption
    Value on demand
    Low cost of entry
    Public infrastructure
    Most importantly… tight feedback loop between providers and consumers

  • It’s All About the Infrastructure

    One thing that strikes me about all the talks and presentations at this conference is that they all assume ubiquitous net access. Kind of ironic then that the wireless access here has gone the way of oceanic flight 815. So, since this is the web and I like to link in my posts, having two out of three page requests fail makes for very little blogging from me at the moment. Even though I’m sitting right under what looks like a huge wifi access point bolted to the ceiling and have great signal strength, it’s completely wasted when DHCP and DNS are out. You’d think at the Web 2.0 conference they’d actually have wireless that worked, wouldn’t you?

    By some ultimate form of serendipity we just had Debra Chrapaty from Microsoft with a 10 minute presentation which gave me the inspiration for this post’s title. The presentation was a rather interesting tour of the new data centres that Microsoft are building. Truly awesome investments. It also illustrated the depths of competition that Google and MS find themselves in - literally competing for electricians to kit out their data centres.

I look forward to many other good posts from the intrepid duo on the last day of the Summit.


d.Construct Rocked in Brighton

Well I’m glad I dragged myself away from the sunny beach and in to the Brighton Corn Exchange for a great day at d.Construct 2006.

A ‘cozy’ conference with 350 attendees and a great set of speakers. Opening with Jeff Barr, doing his usual good job of promoting Amazon Web Services, and closing with Jeffrey Veen talking passionately about designing for the user experience, the day was filled with people with interesting things to say standing up in front of an audience interested in what they were hearing. The ideal recipe for a successful conference. Congratulations to the team that organized it.

Something new for the conference was the supporting d.construct06 backnetwork site, provided by conference sponsors Madgex.

This excellent site, with which the vast majority of attendees registered, provides the facility for linking up with those you meet that is often missing at these sorts of conferences. With feeds from attendees’ blogs and Flickr pictures tagged with dconstruct06, the site provides an extra aid to the social networking side of a conference that others could learn from.

The after conference event, in which free booze and food supplies are rumored, calls…….


Jeff Barr: SAAS + HAAS = AWS

4 of 5

Jeff Barr - SAAS + HAAS = AWS

At d.Construct 2006 Amazon’s Jeff Barr gave an interesting insight in to the [many] services now provided by Amazon Web Services.

The new buzz word that stuck in my brain was ‘Hardware as a Service’, which describes what I’ve been going on about for a while now. Unless you are really big, who needs to invest in hardware any more!

Review by
Richard Wallis
(Richard) on 08 Sep 2006

License CC Attribution 2.0 License.



Life’s a beach in Brighton

In Brighton for d.Construct 2006. Looks like it’s going to be a great day with a great set of speakers.

I must admit that a pre-breakfast stroll over the road from the hotel, to an empty beach next to a flat calm sea, under a cloudless blue sky, has tested my resolve to go a few hundred yards inland to sit in a conference centre for most of the day.

Those speakers better fulfill their promise of a great day!

Subject to wifi and/or 3G coverage I’ll let you know.


A pain relief for Cross-Domain Scripting?

Almost without exception, the developers I have spoken to who have tried their hand at Web 2.0 AJAX have reported a journey through states of joy, frustration, and concern around the use of that wonderful tool, the XMLHttpRequest.

The joy starts early on - “You mean it is that easy to get data from a server and dynamically insert it into my web page without doing a refresh!”

Rapidly drops to frustration - “You mean to say that I can only get data from the server from whence the page came :-(”

Moves into creative mode - “If I create this little php script as a proxy on my server, my requests can see the world!”

Followed by a worrying - “I wonder if my server will cope with the proxy load, what if its address becomes public knowledge?”

Then comes the blinding light of JSON - “Absolutely magic! In a script tag I can insert a call to a JSON service anywhere without any of these Cross-Domain scripting issues to worry about”

Finally the question of trust occurs - “This JSON stuff being injected into the browser has full access to everything - do I trust the people that coded the service to only do what they say they were going to do, or will the viewers of my page become unwitting accomplices in a distributed denial of service attack?”

On the final point, trust is an issue of who you get your service from. Users of the JSON versions of the Talis Platform APIs can rest assured that our services will only do what we say they will. So there! - problem solved. Well it is until some high profile instance of a very useful JSON delivered service is found to be a front for a password capturing scam - then see all and any trust, for all and any JSON service, dissolve like mist on a hot summer’s morning.

So in summary: although useful, XMLHttpRequest is too restrictive to be really useful in orchestrating cross-domain web services together in the browser client. And although using JSON in a script tag is powerful and bypasses the limitations of XMLHttpRequest, it is not widely supported, and where it is, can I trust it?
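
For readers who have not seen the script tag technique from the service side, here is a minimal sketch of what such a service typically returns. The pattern is commonly called JSONP; the function name jsonp_body and the callback name showHoldings are hypothetical, and this is not how the Talis Platform APIs are implemented.

```python
import json
import re

# A deliberately conservative pattern for callback names. Real services may
# accept more (dotted names, for example); this restriction is an assumption.
_CALLBACK = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_body(callback, data):
    """Wrap JSON data in a call to the page's own callback function.

    The browser loads the result through a script tag and executes it,
    which is why the cross-domain restriction does not apply, and also
    why the page has to trust whatever the service chooses to send back.
    """
    if not _CALLBACK.fullmatch(callback):
        raise ValueError("unsafe callback name: %r" % callback)
    return "%s(%s);" % (callback, json.dumps(data))


print(jsonp_body("showHoldings", {"title": "Example", "copies": 3}))
```

The response is a script rather than data, so a dishonest service could return any code it liked in place of the expected function call. That is the trust problem described above.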

Greater minds than mine have been thinking about this. Jeff Barr reports a lunchtime conversation with Peter Nixey of Web Kitchen around this subject. Peter, in his posting Why XHR should become opt-in cross-domain, relates the tale of how the owner of a fictitious pub is having issues with his local council, to highlight what I have been describing.

Peter goes on to propose a solution to the XMLHttpRequest cross-domain restriction problem, by allowing servers to opt in to receiving cross-domain requests and then informing the browser of that opt-in with a new HTTP header.
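
A minimal sketch of what a server-side opt-in of this kind could look like. The header used here, Access-Control-Allow-Origin, is the one browsers later adopted in the CORS standard; it is used as an assumption about the general shape of such a mechanism and is not the header named in Peter’s proposal. The origin http://pub.example and all function and class names are hypothetical.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical origin that this server agrees to accept requests from.
ALLOWED_ORIGIN = "http://pub.example"


class OptInHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"status": "open"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        # The opt-in: tell the browser which origin may read this response.
        self.send_header("Access-Control-Allow-Origin", ALLOWED_ORIGIN)
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_opt_in_header():
    """Start the server on a free local port and read back the header."""
    server = HTTPServer(("127.0.0.1", 0), OptInHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        value = response.getheader("Access-Control-Allow-Origin")
        response.read()
        connection.close()
        return value
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_opt_in_header())
```

The design point is that the decision sits with the server being called: a server that sends no such header keeps today’s restriction, so nothing that currently relies on it is weakened.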

I think this idea has much going for it. What do you think? Peter is asking for feedback. Let him know what you think.

Maybe this is something that the W3C Web APIs Working Group should consider for inclusion in to their Draft XMLHttpRequest Specification.

Update:
My thanks to my colleague Ian Davis, a member of our team who is involved with the W3C Web APIs Working Group. He has pointed me at a proposal for a cross-site XMLHttpRequest, which starts an interesting conversation on the Working Group mailing list.


Let the loose coupling take place in the cloud

A posting over on the Amazon Web Services Blog about the Openfount Queued Server attracted my attention last week.

The Openfount Queued Server looks really interesting. It took me a little while to understand the architecture, but this was time well spent.

Just like Jeff, it took me a while to get my head around what was different about what Openfount were doing. The description on their site, to be fair, does describe what is happening, but the light-bulb-above-the-head moment didn’t really occur until I had read Jeff’s posting, supplemented by Bill Donahue’s comments.

So what is going on? In simple terms the Web Service client is talking to its server via a message queue. So what is radical about that, you may ask - we’ve had message queue based client server communication for years. Well, the difference is that the message queue is hosted by an open third party: Amazon’s S3, the Simple Storage Service.

The client never talks to the server directly. Instead, it uses APIs from the Queued Server toolkit to write messages into a queue for processing by the server. The server processes messages and then writes return values into another queue, where the client picks them up and displays the results.

Because the server is actively polling the queue for messages, it need not have any public interface at all. No domain name, no IP address, and no vulnerability to well-known generic attacks. In fact, the server could be running behind a cable modem with a dynamic IP address and no one would be the wiser, since it simply reaches out to the world via standard HTTP requests over port 80. In this model the server never accepts incoming calls directly. It reaches out and pulls in requests, and can be very, very choosy about the requests that it accepts.

The clients and servers are isolated and protected from each other. It appears that the client need not even know the domain name or the IP address of the server!

Because everything is handled via a queue, there is no reason why multiple servers could not be used to handle the load, with Internet bandwidth & S3 being the choke points. Looking at the blurb on the S3 site it is a very wide choke point indeed. As Bill Donahue says in his comments, they actually want to use Amazon’s SQS (Simple Queue Service) to do the actual queuing, but whilst it is in Beta they have cross-domain scripting issues which prevent them using both S3 & SQS.
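
The request and response flow described above can be sketched as follows. This is an in-memory stand-in, not the Openfount toolkit or the S3/SQS APIs: the HostedQueue class, the message fields and the upper-casing “work” are all invented for illustration.

```python
from collections import deque


class HostedQueue:
    """Stand-in for a queue hosted by a third party (S3 or SQS in the post)."""

    def __init__(self):
        self._items = deque()

    def put(self, message):
        self._items.append(message)

    def poll(self):
        """Return the next message, or None if the queue is empty."""
        return self._items.popleft() if self._items else None


# One queue in each direction; client and server only ever talk to these.
requests = HostedQueue()
responses = HostedQueue()


def client_send(request_id, text):
    """The client writes its request into the hosted queue."""
    requests.put({"id": request_id, "text": text})


def server_poll_once():
    """The server reaches out, pulls one request and writes a reply.

    Returns False when there was nothing to process. Any number of servers
    could run this same loop against the same pair of queues.
    """
    message = requests.poll()
    if message is None:
        return False
    responses.put({"id": message["id"], "result": message["text"].upper()})
    return True


client_send(1, "hello")
while server_poll_once():
    pass
print(responses.poll())
```

Both sides only ever make outbound calls to the queues, so the server needs no public address. Adding capacity means starting another process that polls the same request queue.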

Back in January 2005 I mused about the use of and reasoning behind SQS. In that posting I was equating SQS with Sun’s $1 per CPU-hour Utility Computing Service, and looking back I still believe I was right to.

The SOA Operating Platform is starting to emerge. Get your CPU cycles from a supplier like Sun, get your network attached storage and queuing infrastructure from someone like Amazon, get your mapping application services from someone like Google, get your payment services from someone like PayPal, get your Library Domain specific Web Services from someone like Talis. Who, other than the core utility processing, storage, and queuing service providers, needs to invest in infrastructure anymore?

Globally distributed SOA seems to be looming out of the mist, and it is a different shape to what was envisioned a few years back. No sign of the network of UDDI servers scattered about the planet, just lots of loosely coupled applications sitting above utility core services and domain specialist platform providers.

There is one other massive benefit of what is being provided by Openfount - all the calls to the message queues originate from the servers and clients themselves, so nothing ever has to call in to them.

Anyone who has tried to implement a solution requiring a client to access a service that is behind a firewall will immediately grasp the importance of this. Agreeing the security policy with a firewall administrator for access out to an external service can at times be difficult. Agreeing a security policy for requests from outside a firewall to an internal server is more often than not impossible.
