Import/export and the Web
A post on the Open Data Definition list from Ben Werdmuller asks an interesting question - is syndication an easier sell than import/export?
Ok, background first: Open Data Definition is a proposed format for transfer of data between systems, with DataPortability in mind. In many respects it’s a ‘lite’ reinvention of RDF, targeted at the average Web developer. While I and others might question the underlying assumption that RDF is too difficult for typical Web developers, and perhaps express a little gut-reaction pushback, there’s nothing inherently wrong with something like this if it fills a (possibly significant) niche, and plays nicely with other Web standards. Design-wise, there is a sanity check which can be applied, the Test of Independent Invention:
If someone else had already invented your system, would theirs work with yours?
Does/could RDF work with ODD? Well, nearly. Yes, in that it should be reasonably straightforward to map between RDF graphs and ODD’s format (there’s an interesting little complication in its indirection of metadata that’d take a bit of figuring, but bashing it with SPARQL & XSLT for a while would probably suggest a good approach). It fails right now because ODD doesn’t as yet allow for transparent interpretation: it has no XML namespace, and so doesn’t really place itself on the global Web. Any automatic conversion would have to be done by sniffing the content, which means an agent needs complete prior knowledge of the format. [If the ODD folks are willing to give the format a namespace, I'll volunteer to sort out the mappings & GRDDL bits]. Hmm, I wonder if they’ve tried nesting ODD in other XML formats yet…
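To make the namespace point a bit more concrete, here's a rough sketch of how an agent might decide what it has been handed. The Atom namespace is the real one; the ODD namespace URI and the handler labels are made up purely for illustration, since ODD doesn't define a namespace.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical namespace URI, for illustration only: ODD does not define one.
ODD_NS = "http://example.org/ns/odd#"

# Map from namespace URI to a label for the code that would handle it.
HANDLERS = {
    ATOM_NS: "atom",
    ODD_NS: "odd",
}


def identify(xml_text: str) -> str:
    """Work out which format a document is from its root element's namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a namespaced tag as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
        return HANDLERS.get(namespace, "unknown")
    # No namespace at all: the agent is left guessing from element names,
    # which only works if it already knows the format.
    return "needs-sniffing"
```

With a namespace the dispatch is a simple lookup; without one, everything falls into the "needs-sniffing" branch, which is the situation described above.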
Anyhow, back to Ben’s question. I think he has a point - syndication should be a relatively easy sell these days because of RSS/Atom. But marketing aside, there are several different ways to get the data from system A to system B:
- import/export where the data is transferred through an intermediary (i.e. the desktop)
- one-off direct transfer (system B does a GET to system A)
- polling - traditional syndication, periodic transfer
- linkage - lazy polling, any transfer happens on demand
At this point in time, the first of these isn’t exactly Web-friendly, typically requiring a human intermediary for its operation. In future, with smarter clients maintaining a local cache of data, something like this might make more sense. Such clients could act as proxies for any of the other modes of connection. But let’s assume this kind of capability’s already here. If you stand back, the same thing is happening in all these cases - the receiver is given an identifier for the resource of interest (the profile data or whatever) and can use HTTP on it as appropriate. This is completely independent of what’s in the data itself - even though RSS/Atom formats contain a series of time-stamped entries, the way they get processed is up to the consumer. These different modes are also orthogonal to authentication/authorization and privacy or copyright issues. Each is, in its own way, using linked data. To get more information about something, the consumer follows its nose and dereferences the URIs. ‘Course if you bring message content into the equation and/or allow an arbitrary number of agents in the interaction, the number of possible modes explodes.
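Here's a minimal sketch of that "same thing happening" idea: one-off transfer, polling and linkage all written as thin wrappers around a single conditional GET on the resource's URI. The class, method names and the fake origin are my own inventions for illustration, and the fetcher is injected so the sketch runs without a network; in practice it would wrap a real HTTP GET with If-None-Match.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple

# A fetcher takes (uri, known_etag) and returns (status, etag, body).
Fetcher = Callable[[str, Optional[str]], Tuple[int, Optional[str], bytes]]


class Consumer:
    """System B: holds an identifier for the resource and GETs it as needed."""

    def __init__(self, uri: str, fetch: Fetcher) -> None:
        self.uri = uri
        self.fetch = fetch
        self.etag: Optional[str] = None
        self.body: Optional[bytes] = None

    def refresh(self) -> bool:
        """Conditional GET; returns True if the local copy changed."""
        status, etag, body = self.fetch(self.uri, self.etag)
        if status == 304:  # Not Modified: keep what we have
            return False
        self.etag, self.body = etag, body
        return True

    def transfer_once(self) -> Optional[bytes]:
        """One-off direct transfer: a single GET."""
        self.refresh()
        return self.body

    def poll(self, times: int) -> int:
        """Polling: the same GET repeated; returns how many polls saw a change."""
        return sum(self.refresh() for _ in range(times))

    def on_demand(self) -> Optional[bytes]:
        """Linkage: GET only when somebody actually asks for the data."""
        if self.body is None:
            self.refresh()
        return self.body


def fake_origin(store: Dict[str, bytes]) -> Fetcher:
    """Hypothetical stand-in for system A, answering with ETag-style validators."""
    def fetch(uri: str, etag: Optional[str]) -> Tuple[int, Optional[str], bytes]:
        body = store[uri]
        current = hashlib.sha1(body).hexdigest()
        if etag == current:
            return 304, current, b""
        return 200, current, body
    return fetch
```

Nothing in the consumer looks inside the body, which is the point: whether the bytes are Atom, RDF or ODD is a separate question from which mode of transfer is in play.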
So yeah, ok, what point am I trying to make here…dunno, it just seems somehow significant that questions like “syndication or import/export?” should arise at all, given the underlying infrastructure. It’s more telling of the silo nature of many current Web systems - themselves generally products of a pre-Web mindset - than anything to do with the Web itself. This too shall pass, as they say.
See also: Walled gardens: mapping the parties
PS. Reminds me - in my little DP video I had a mockup of a “Connect!” button. It was only a mockup because of the deadline for videos, the implementation I had in mind being essentially OpenID + HTTP GET + SPARQL CONSTRUCT…
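For the curious, the HTTP GET + SPARQL CONSTRUCT part of that might look roughly like the sketch below. It only builds the request URL, following the SPARQL protocol's convention of passing the query in a `query` parameter on a GET. The endpoint, the function name and the choice of FOAF properties are assumptions for illustration, not what the mockup actually did, and the OpenID step is left out entirely.

```python
from urllib.parse import urlencode


def construct_request_url(endpoint: str, profile_uri: str) -> str:
    """Build a GET URL asking a SPARQL endpoint to CONSTRUCT a small profile graph."""
    # The FOAF properties here are just an example of "profile data".
    query = (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "CONSTRUCT { <%(p)s> foaf:name ?name ; foaf:knows ?friend }\n"
        "WHERE { <%(p)s> foaf:name ?name . "
        "OPTIONAL { <%(p)s> foaf:knows ?friend } }"
    ) % {"p": profile_uri}
    # urlencode takes care of escaping the query text for use in a URL.
    return endpoint + "?" + urlencode({"query": query})
```

System B would then do a plain GET on the resulting URL and get back an RDF graph containing just the bits it asked for.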





