Nodalities

From Semantic Web to Web of Data

Archive for the 'Tech Talk' Category

OPO: modelling dynamic online presence

At Talis, we’re very interested in the development of the Semantic Web, and we’re always happy when other members of this space share what they’re doing with us. I was contacted a couple of weeks ago by Milan Stankovic, a member of the Good Old AI research network from Belgrade. He’s been working on the OPO (Online Presence Ontology), which aims to model the dynamic aspects of a user’s presence online: taking a leaf out of Twitter’s book, but tying it in semantically with the rest of the web. I’ve asked him to share a bit about their project with us.

So, Milan, what is “online presence”, and in what way is it “dynamic”?

I think that expansion of socialising services, like social networks, Twitter, lifestreaming services, etc. has significantly changed the way we socialize. When our friends publish custom messages on social networks, send tweets or set their IM statuses, we become more aware of their current activities and thoughts. When we assemble all that information we get a rich image of their presence in the online world.

Since the data that forms this image is spread over different services (and often repeated) we came up with the idea that it could be useful to make a model for its semantic representation and meaningful exchange. So we created an ontology – the Online Presence Ontology (OPO) to enable the integration of those pieces of information about a user’s online presence. Apart from that, OPO also enables the transfer of online presence related data from one service to another without the loss of semantics.

We believe that with the expansion of internet-enabled mobile devices, as users are more and more online, the topic of online presence will gain even more importance. Maybe even new ways to express your state of being present online will arise in this context. For this reason we did our best to make OPO flexible and extensible enough to survive the evolution of the online presence concept itself.

So, does this have anything to do with the already-existing FOAF ontology?

For understanding OPO and the notion of online presence itself, a comparison to FOAF might be essential. It is very important to distinguish the static and more persistent properties modeled by FOAF (like name, gender, homepage, etc.) from the frequently changing properties addressed by OPO (like custom message and IM status). OPO is actually meant for representing dynamic aspects of user profiles, and we may say that it complements FOAF in a way. It is therefore quite natural that OPO is connected to FOAF through some properties.

How do you see this actually being implemented?

Apart from facilitating the integration of online presence data from various sources, OPO can also be beneficial for transferring data from one service to another. I personally know users who copy-paste their custom messages from gTalk to Facebook. This manual work is an annoyance we can easily relieve users from by introducing a meaningful data exchange between services. The first thing we need is a semantic representation and then the exchange mechanisms can be built on top of the ideas outlined by the Data Portability initiative.

The domain where we consider OPO’s contribution to be of greatest importance is the exchange of IM statuses. Currently different IM platforms use different status scales, and when users from different platforms meet in inter-platform chat (on services like Meebo, Digsby, etc.) their statuses are exchanged over the XMPP protocol by mapping them all to the very poor status scale used in XMPP. In those mappings much of the semantics of the original statuses is lost. To address this issue, OPO allows precise descriptions of IM status characteristics so that they can be meaningfully exchanged between platforms.
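
To make the loss concrete, here is a minimal Python sketch. The four XMPP `show` values (away, chat, dnd, xa) come from the XMPP presence specification; the platform names and their status labels are invented for illustration and are not taken from OPO or from any real IM client.

```python
# XMPP presence allows only these <show/> values (no <show/> means "available").
XMPP_SHOW = {"away", "chat", "dnd", "xa"}

# Hypothetical platform-specific statuses mapped onto the XMPP scale.
TO_XMPP = {
    ("platform_a", "on the phone"): "dnd",
    ("platform_a", "in a meeting"): "dnd",
    ("platform_a", "out to lunch"): "away",
    ("platform_b", "be right back"): "away",
    ("platform_b", "idle for an hour"): "xa",
}

def to_xmpp(platform, status):
    """Map a rich, platform-specific status to the XMPP scale (None if unknown)."""
    return TO_XMPP.get((platform, status))

def collisions(mapping):
    """Group original statuses by the XMPP value they collapse into."""
    grouped = {}
    for original, show in mapping.items():
        grouped.setdefault(show, []).append(original)
    # Keep only the XMPP values that hide more than one original status.
    return {show: srcs for show, srcs in grouped.items() if len(srcs) > 1}

lost = collisions(TO_XMPP)
```

Once "on the phone" and "in a meeting" have both become `dnd`, the receiving platform cannot tell them apart; that is the information a richer description of the status would preserve.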

So, where are you taking this next?

We are currently working to extend the ontology with new features. One of the improvements will be the ability to add geographical location to your Online Presence. This will support travel tweeting and will have applications in the recently emerged location-based social networks.

Another interesting extension will be support for describing the current music track that users sometimes state on IM platforms. Compared to the existing possibility to see the name of the song my IM contacts are listening to, semantic representation of music should bring the functionality to a higher level, by allowing IM programs to find and let me play that music. The infrastructure for this is already provided by the Music Ontology project as well as DBTune; we just have to connect it with OPO.

We will soon put this new version of the ontology for public review on the project website and we hope to get community comments and attract the community to participate in making the ontology even more usable.

In parallel we are working on plugins for some social networks and IM programs in order to bring the enabled interoperability to life.

Thanks, Milan.

If you’d like to check out the ontology yourself, or to read more about it, you can find it here:
OPO Website : http://www.milanstankovic.org/opo/
OPO URI : http://ggg.milanstankovic.org/opo/ns/

A passing observation on SaaS

Back in January, I noticed an intriguing idea from Jeff Jarvis : @twitcrit: instareviews. Basically to use the Twitter microblogging tool to post mini reviews. I couldn’t resist having a quick go at an implementation of what Jeff described. Fast mover that he is, Dave Winer got an implementation together ahead of me - see Jeff’s subsequent post.

Now programming skill doesn’t really come into this: the application is pretty straightforward and only took me a couple of hours to write. I assume Dave used his own platform based on Frontier, the service being maintained by himself. I used the Talis Platform. Although I work for Talis, I have nothing to do with the maintenance of the service - in effect I’m a 3rd party coding against a Web API (one based entirely on standard HTTP, but that’s another story).

Five months later, the twitcrit idea didn’t really catch on, and to be honest I’d pretty much forgotten about it. But checking back, my app is still live. Also in the meantime it’s been happily aggregating the data that’s passed through. I never got around to a proper search interface, but because the store is SPARQL-enabled, it is all searchable. Now check Dave’s version.
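
For anyone wondering what "searchable via SPARQL" amounts to in practice, here is a rough Python sketch of a query sent over plain HTTP, following the standard SPARQL Protocol (query text in a `query` parameter). The endpoint URL and the vocabulary in the query are placeholders I made up for illustration; they are not the real addresses or terms used by the app.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the SPARQL service URL of the actual store.
ENDPOINT = "http://example.org/stores/twitcrit/services/sparql"

# Hypothetical query: the property URI is an assumption, not the app's vocabulary.
QUERY = """
SELECT ?review ?text
WHERE { ?review <http://example.org/vocab#text> ?text }
LIMIT 10
"""

def sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request (query passed as ?query=...)."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+xml"})

req = sparql_request(ENDPOINT, QUERY)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```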

So my passing observation on SaaS is that in delegating infrastructure maintenance, you can just write your app and forget about it.

Google App Engine and the Joy of WebArch

[Image: Google App Engine logo]
Responses to the announcement of the Google App Engine have been mixed, from Tim Bray’s somewhat negative Sharecropping, to an awful lot of “very cool“s, with Niall Kennedy’s tech description providing a reasonably neutral common ground. I’ve been meaning to post about it, but I’ve a couple of pressing deadlines and haven’t had time. I didn’t think “Python - great! But this thing really isn’t forward-looking” would be doing it justice. However this morning I ran across a couple of blog posts on which I felt obliged to comment, and I just realised that most of my main points about the Google App Engine leaked out into those comments. So with apologies in lieu of better treatment, here goes -

Comment on Gabe Wachob’s Google App Engine: Its the Architecture Stupid! :

Nice post! The first I’ve seen to highlight the significance of the architecture.

While I think your analysis is generally on the nail, I’m not so sure about the conclusions. The thing is, App Engine architecture isn’t Web architecture.

As you point out there are nice reusable abstractions (like events etc), but the primary interfaces are all down at the code level.

“If you build your app on the Google App Engine architecture, it will scale to unlimited levels without any extra effort.” - yes, but only on the Google App Engine.

Rather than hoping for open source implementations of similar toolkits, if an HTTP facade were put over things like BigTable, the specific implementation wouldn’t matter - to change that you’d only have to change a few URIs, not all your code. (One for the LazyWeb).

Commoditization (commodification?) works best where there are common standards. A railroad engine isn’t a commodity if you have to build your own track :-)

See also: Cloud: commodity or proprietary?

Comment on Swaroop C H’s Web dev frameworks vs RIA :

[On the question of how one develops both client- and server-side with frameworks] I’d suggest that if Web standards are used as a common interface, it really doesn’t matter!

Ok, an example. A while ago I needed an easy personal activity tracker. I wanted it in my face a bit on the desktop, which called for something RIA-ish. I wanted the data available in a form that’s reusable, and I want a straightforward view on the web (so my colleagues could see what I was working on).

So I wrote a little desktop app in Java. It’s essentially MVC, with a fairly trivial domain-specific model - I have activity items with title, description and tags.

Server-side I have a Talis Platform store. The desktop app communicates with the server by POSTing a chunk of the domain-specific information expressed as an RDF/XML doc - the stores have this kind of interface out of the box.
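
As a rough illustration of that step (in Python rather than the Java of the actual app), the sketch below builds a small RDF/XML document for one activity item and wraps it in an HTTP POST. The store URL and item URI are placeholders, and the choice of Dublin Core properties for title, description and tags is my assumption for the example, not necessarily the vocabulary the original app used.

```python
import xml.etree.ElementTree as ET
from urllib.request import Request

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

def activity_rdfxml(uri, title, description, tags):
    """Serialise one activity item as a small RDF/XML document (bytes)."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{RDF}}}RDF")
    item = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": uri})
    ET.SubElement(item, f"{{{DC}}}title").text = title
    ET.SubElement(item, f"{{{DC}}}description").text = description
    for tag in tags:
        # One dc:subject per tag (an assumed modelling choice).
        ET.SubElement(item, f"{{{DC}}}subject").text = tag
    return ET.tostring(root, encoding="utf-8")

def post_request(store_url, body):
    """Wrap the document in an HTTP POST with the RDF/XML media type."""
    return Request(store_url, data=body, method="POST",
                   headers={"Content-Type": "application/rdf+xml"})

# Hypothetical URIs throughout.
body = activity_rdfxml("http://example.org/activity/1",
                       "Write blog post", "Comment on App Engine", ["blog", "webarch"])
req = post_request("http://example.org/stores/mystore/items", body)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```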

For my simple Web view of the data, I have a little bit of PHP which does a SPARQL query on the store (standard SPARQL-over-HTTP endpoint also comes out of the box) and uses XSLT to transform it into the JSON consumed by SIMILE’s Timeline viewer.

Unfortunately I broke the Timeline viewer bit of the app (I think I got out of sync with SIMILE’s scripts). But hopefully you get the idea - small domain-specific components, loosely-coupled using a standard general-purpose protocol (HTTP) and a standard general-purpose data model (RDF).
For reuse, I can query the store however I like.
[I got distracted and forgot to link to implementation note: More Dogfood]

Ok, I’m showing my bias towards a data-oriented shared model in these comments. But if you wanted to narrow things down a little and be more content-oriented (and maybe placate Mr. Bray a little), swap out the RDFisms and replace them with Atom/AtomPub. The key point is providing a common interface based on standard models, message formats and protocols. (Interop between Atom, RDF and any other systems which respect WebArch is generally doable because of that common interface).

One other point I’d like to add which I suspect speaks volumes about Google’s mentality is the difference between a real aeroplane and Google App Engine’s snazzy logo. Compare and contrast with the image above:

[Image: ecojet.png]

A Chat with Dave Beckett

Today’s podcast is an interview with Dave Beckett (blog), Software Architect at Yahoo!

dajobe

Dave’s been a contributor to the Semantic Web initiative since before it had that name, originally coming from a background in parallel computing. As well as having worked on many of the key specifications around RDF, he’s responsible for the Redland toolkit, a comprehensive set of open source libraries for RDF. Dave maintains Planet RDF, an aggregation of Semantic Web blogs, as well as various tools in support of Semantic Web Interest Group (SWIG) communications. Until the quantity of material got out of hand, his RDF Resource Guide was the definitive collection. He derived the human-friendly RDF notation Turtle, which recently appeared as a W3C Team Submission, co-authored with Tim Berners-Lee. It was Dave, as a member of the Data Access Working Group (DAWG), who coined the acronym SPARQL - SPARQL Protocol and RDF Query Language (which incidentally solved another naming problem).

The topics covered include how he got involved in these technologies in the first place, Redland and a couple of Dave’s experiments: the triplr service (“Stuff in, triples out”) and Flickcurl, a C library for the Flickr API. He offers his thoughts around some of the technologies and specifications he’s been involved in, along with other developments around the Web - check the list of links below. While having limits on what he could say in public, he also mentioned the use of RDF inside Yahoo! (more announcements on the way apparently).

There are a couple of quotes I can’t resist pulling out. I asked Dave about how well he thought the Semantic Web was coming along, and he pointed out that, like the Web, there wouldn’t be any specific point in time at which one might say it was a success. But he added:

For me, in the work we’re doing with Yahoo! internally, it’s already a success…we’ve done work better, faster and we’ve done things we couldn’t do before because we were using this style of technology. It’s not always publicly visible because it’s a kind of data technology…but it’s a success for Yahoo! content and metadata problems I’ve been working on.

Dave also talks a little about open data, a nice line being:

The reason I got involved with the Semantic Web was…I wanted control of my data.

If you want to hear more, Dave will be speaking at the Semantic Technology Conference in San Jose in May, where he plans to go deeper into why Yahoo! is using RDF, the benefits and more detail of their projects.

One final quote:

Have fun with the Semantic Web…it’s about connecting things together, about getting the jobs done.

 
 Standard Podcast [43:30m]: Play Now | Play in Popup | Download (114)
Creative Commons License

During the conversation, we refer to the following resources:

Nitpicking Alex’s Semantic Web Patterns

Alex Iskold just published quite a lengthy blog post called Semantic Web Patterns: A Guide to Semantic Technologies. Overall it’s good stuff, and Alex has been doing a great job of promoting the Semantic Web over on Read/WriteWeb and elsewhere. He’s also one of the Semantic Gang featuring in the latest podcast series from oor Paul. (I’ve not listened to that yet - I’ll try it with a dogwalk shortly).

Because of all this I feel a little disloyal in being critical, but without clarification some of the points in Alex’s post could lead to misconceptions, the bane of Semantic Web outreach. One thing I can’t disagree with Alex about is the way the Semantic Web means different things to different people (cue elephant analogy). So with that proviso and all due respect etc, here we go:

1. Bottom-Up and Top-Down
Alex says:

“The bottom-up approach is focused on annotating information in pages, using RDF, so that it is machine readable. The top-down approach is focused on leveraging information in existing web pages, as-is, to derive meaning automatically.”

Ok, while one could (and I will) quibble with the content of these definitions, they do make a pretty clear distinction. The only thing is, the phrases “bottom-up”/“top-down” have already been used fairly extensively in the Semantic Web context to describe at least two different (but related) distinctions.

The first of these is with regard to decision-making, in the same sense as within the management hierarchy of an organization. The naive stereotype for this distinction would give, say, top-down = “those in power in standards orgs call the shots” versus bottom-up = “grassroots developers determine the direction”. Given that specifications can appear as authoritative rules, it’s easy to see how this perception might emerge. (This is a naive distinction, because it fails to consider the influence of the community that goes into defining specifications and in determining which survive the natural selection of deployment in the wild).

The second usage of “bottom-up”/“top-down” is more technical, in regard to how you arrive at your world/domain model. Top-down would be starting your model from a generalized level and working towards more specific levels, bottom-up the reverse. Clearly if there’s to be global interoperability, taking the top-down approach would imply there’s one true model that everyone follows. In the past this has led to some awful misconceptions around RDF, where people have assumed that the models (i.e. vocabularies, RDF Schemas, ontologies) are created on high - probably by the W3C. Quite the opposite is true. While RDF is a framework (and hence might be viewed as a top-level language), it’s essentially neutral on who, where and how domain models are created. Because things, classes of things, relationships between things and so on are identified using URIs, anyone can create their own vocabularies. This retains a base level of global interop, and enables web-scale independent development. (I once saw a list email containing a line like “the namespace begins with http://purl.org, so it must be something to do with RSS 1.0 people at the W3C” - no, no, no!).

So basically while Alex’s “bottom-up”/”top-down” may be internally consistent, it’s a little idiosyncratic.

2. Annotation Technologies: RDF, Microformats, and Meta Headers
There’s quite a bit I could quibble with in this section, but I’ll stick to the one point I think is most significant. It can be very misleading to think of RDF merely as an annotation and/or metadata tool. While it can be, and very often is, used for annotation (typically descriptions of documents) and metadata (descriptions of data) purposes, it is also used to talk about things directly. Alex provides an example: “Alex IS the father of Alice, Lilly, and Sofia”. This is plain old data. The same data could be expressed in a database table called “fatherOf” with “Alex” appearing three times in the left-hand column with the right-hand column containing “Alice”, “Lilly”, “Sofia”. RDF is a data technology; one big difference from traditional RDBMSs is that relations (tables, properties, “fatherOf”) can only have two values - the subject and object of the relation (2 columns, “fathers”/“children”). Another big difference is that both things and the relationships between things are generally identified using URIs, which enables the Web part of the Semantic Web.
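
The table-versus-triples point can be sketched in a few lines of Python. The URIs under example.org are invented for the illustration; nothing here is a real vocabulary.

```python
# Hypothetical URIs; any URI space you control would do.
EX = "http://example.org/people/"
FATHER_OF = "http://example.org/vocab#fatherOf"

# The relational view: a two-column table called "fatherOf".
father_of_table = [("Alex", "Alice"), ("Alex", "Lilly"), ("Alex", "Sofia")]

# The RDF view: each row becomes a (subject, predicate, object) triple,
# with the table name turned into the predicate URI.
triples = {(EX + father, FATHER_OF, EX + child)
           for father, child in father_of_table}

def objects(graph, subject, predicate):
    """Return all objects for a given subject and predicate, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)

children = objects(triples, EX + "Alex", FATHER_OF)
```

Because the subject, predicate and object are all URIs, a graph published elsewhere can refer to the same Alex or reuse the same `fatherOf` property without any shared database schema.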

3. Consumer and Enterprise
I think it’s good that Alex highlights consumer/enterprise and vertical/horizontal aspects of the Semantic Web, they are worthy of discussion. But regarding the “killer app” of the Semantic Web - one might equally well ask “what is the killer app of the Web?” (this is Tim Berners-Lee’s own response in the 2001 Sci Am article).

There’s another source of misconceptions in this section: “RDF offers a way to communicate using XML-based language…”. While strictly speaking that’s probably correct, it gives the impression that RDF is XML-based, which it isn’t. RDF is a data model, an abstract language. Formats and serializations (of which there are several, both XML and non-XML) are secondary. Given the recent work around GRDDL, it’d be more accurate to say “XML offers a way to communicate using RDF-based language…”.
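
A small Python sketch may help separate the model from its formats: the same triple is written out once as N-Triples (a non-XML, line-based format) and once as RDF/XML. The vocabulary and people URIs are made up, and the RDF/XML writer only handles the simple case shown here (URI objects, predicates from one known namespace).

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
VOCAB = "http://example.org/vocab#"          # hypothetical vocabulary
TRIPLES = [
    ("http://example.org/people/Alex", VOCAB + "fatherOf",
     "http://example.org/people/Alice"),
]

def to_ntriples(triples):
    """Line-based, non-XML serialisation: one triple per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

def to_rdfxml(triples):
    """XML serialisation of the same triples (URI objects only)."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("v", VOCAB)
    root = ET.Element(f"{{{RDF}}}RDF")
    for s, p, o in triples:
        node = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": s})
        local = p[len(VOCAB):]               # assumes every predicate is in VOCAB
        ET.SubElement(node, f"{{{VOCAB}}}{local}", {f"{{{RDF}}}resource": o})
    return ET.tostring(root, encoding="unicode")

as_ntriples = to_ntriples(TRIPLES)
as_rdfxml = to_rdfxml(TRIPLES)
```

Both strings carry exactly the same statement; a consumer that parses either one ends up with the same triple.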

This confusion around XML messes up Alex’s arguments on scalability somewhat - I’m sure someone somewhere is using an XML DB for RDF, but most I’ve seen are either built on top of RDBMSs or are RDF-native. (Non-generic, domain-specific data can be stored pretty much any way you like - if semweb interfaces were exposed I suppose you could call it an RDF store of sorts…). Also while RDF storage technology isn’t anywhere near as mature as that of RDBMSs, it does draw on essentially the same foundations - and sometimes the same people - so the picture isn’t as bad as one might imagine. Genuinely large RDF stores are starting to appear, and even then it’s worth remembering (as Alex points out) the aim is for the big database to be the Web itself. (My own standard line on this is that triplestores are just local caches of chunks of the Semantic Web).

4. Semantic APIs
As Paul Downey put it, Web APIs Are Just Web Sites - the same goes for the Semantic Web. Alex talks about some of the online APIs for extracting RDF from natural language. While these are nifty, potentially any Web site or service could with appropriate tweaking be a Semantic API. The original RSS was a Semantic API - descriptions of news-like items delivered using RDF over HTTP. While the latest syndication format, Atom, might not be RDF, it’s good Web-friendly data that can be mapped to RDF (work is in progress on conventions for that).

Semantic Web technologies also have an ace card up their sleeves here, in the form of SPARQL. RDF stores and (with the appropriate wiring) any online RDF can be queried using a straightforward SQL-like language, operating over standard HTTP. A seriously powerful addition to the Web API toolkit.

Right now the ability to make mashups (client- or server-side) is limited by the effort needed to integrate across different APIs (the n-squared thing). RDF can make integration trivial. Even without RDF/SPARQL being available, a lot of the pain of integration can be alleviated if the data is mapped to RDF then integrated.
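
Here is a toy Python sketch of the "map to RDF, then integrate" idea. The two services, their JSON shapes and every URI are invented for the example. The point is that each source needs one mapping into triples, rather than one adapter per pair of sources, and the merge itself is just a set union.

```python
# Two hypothetical services with different shapes for data about the same pet.
service_a = [{"id": "42", "name": "Nena", "owner": "stefan"}]
service_b = [{"petId": "42", "vet": "Dr. Jones"}]

BASE = "http://example.org/"                  # hypothetical URI space

def map_a(rows):
    """Map service A records to triples."""
    for r in rows:
        pet = BASE + "pet/" + r["id"]
        yield (pet, BASE + "vocab#name", r["name"])
        yield (pet, BASE + "vocab#owner", BASE + "person/" + r["owner"])

def map_b(rows):
    """Map service B records to triples."""
    for r in rows:
        yield (BASE + "pet/" + r["petId"], BASE + "vocab#vet", r["vet"])

# Integration is set union: shared URIs make the data line up.
graph = set(map_a(service_a)) | set(map_b(service_b))

def describe(graph, subject):
    """Collect everything known about one subject, keyed by property name."""
    return {p.split("#")[1]: o for s, p, o in graph if s == subject}

nena = describe(graph, BASE + "pet/42")
```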

I don’t think we’ll ever see every single service offering Semantic Web-friendly APIs. But to the Web 2.0 style sites, the Web is a competitive environment. Services which do support RDF and/or SPARQL will be able to benefit from the lowering of the integration barrier, and over time increasingly tend to have a commercial advantage over services which don’t. The ball is rolling and the field is wide open.

5. Search Technologies
“Perhaps the first significant blow to the Semantic Web has been the inability thus far to improve search.” - er, well, no. Search, at least as we know and love it today, is an artifact of the document Web. Success for the Semantic Web wouldn’t be improving search, but marginalizing it.

The information carried by the document Web, the stuff we’re interested in, is generally expressed in human-readable text inside the documents. There’s a semantic air gap between the protocols and languages of the current Web (HTTP, HTML…) and the information that’s being conveyed. Search engines bridge that gap through the use of heuristics based around string matching on queries and indexed documents. Semantic Web technologies offer a couple of ways of minimizing the gap. Through the increased use of metadata, more explicit matching can be made. Before anyone throws the metacrap arguments at me, consider the improvements already brought by metadata-rich syndication feeds and folksonomy tagging.

The other way of reducing the gap that comes to mind is…not to create gaps in the first place. Take an online train timetable. Right now it’ll likely be contained in a database somewhere, exposed through HTML with a form or two. To access the data we are at the mercy of whatever specific front-end the service provider has offered. To make a mashup with it we’d be making site-specific calls, at best through a RESTful API. But if the data was also available without the document Web-oriented intermediation, say as RDF/XML documents, or perhaps better still a SPARQL endpoint, mashups would be trivial.

Incidentally, I remember the train timetable scenario coming up on the microformats list a while back. At the time it seemed nonsensical to me to follow the suggestion over there of having e.g. one microformatted-HTML page for each record in the database. In retrospect I think that was potentially a very good solution - assuming the microformat followed best practices, using a profile etc, then this would be equivalent to publishing all the data as linked RDF. A GRDDL-aware consumer would in fact see it that way. The bonus advantage is having the (inherently in sync) HTML material available too.

Anyhow, back to search. The current Web does contain one notable kind of explicit, machine-readable semantics: the link. This page is related to that page. I don’t think it’s coincidence that the most successful search heuristic to date - Google’s PageRank - is based on this data source.
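
To show how far plain link data can go, here is a simplified PageRank-style calculation in Python using power iteration. It is a textbook sketch, not Google's implementation; the damping factor of 0.85 is the commonly cited value, and the three-page "web" is invented.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for t in pages:
                    new[t] += damping * rank[page] / n
        rank = new
    return rank

# A tiny made-up web: a and b both link to c, and c links back to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(web)
```

Page c, with two inbound links, ends up ranked highest, and b, which nothing links to, lowest. The only input is "this page is related to that page".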

My standard line on search is “search engines act as indexes of the Web, the Semantic Web is its own index”, or more succinctly “the best way to find things is not to lose them in the first place”.

6. Contextual Technologies
I don’t really disagree with what Alex says in this section, but would add that Semantic Web languages make it much easier to deal with contexts - which can be expressed directly, without the need for interpreting natural language. There are already a few pretty neat faceted browsing tools around, I reckon these things are going to get a lot neater over the next few years.

7. Semantic Databases
See above about triplestores in Consumer and Enterprise.

Twine and Freebase are really nice applications, although I believe Freebase’s connection to the rest of the (Semantic) Web is still pretty suboptimal. Twine’s still in beta, but has already come an awful long way (I put it in my open-in-tabs-regularly bookmarks). What they both demonstrate is that something which looks to the end user like a regular shiny Web 2.0 application can be built at a significant scale using RDF/RDF-like technologies. Where these things have an opportunity to get much more interesting than similar traditional products is in exploiting the Semantic Web angle. I do hope they hook up to the Linking Open Data cloud soon.

Conclusion
The Semantic Web does mean different things to different people, and maybe I’m being overly orthodox in seeing RDF+HTTP as the distinguishing features of these particular Semantic Technologies. But I’m glad I got that off my chest. Now for that dogwalk with Semantic Gang.

A Chat with Richard Cyganiak

Latest recording on technical matters is a chat with Richard Cyganiak, who’s currently working on the Sindice Semantic Web search engine, though is probably best known for his leading role in the Linking Open Data project (maintaining the cloud diagram :-)

In the podcast Richard describes various technical details of these projects, and talks about the nature of data on the Web in the wild, as RDF, microformats and increasingly RDFa. He also discusses some of the practical issues in mapping existing databases to the Semantic Web (the kind of techniques Tim Berners-Lee mentioned in his podcast with Paul a few weeks ago).

Richard naturally mentions the principles of Linked Data :

  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information.
  4. Include links to other URIs, so that they can discover more things.

Listen Now

Download MP3 [47 mins, 44Mb]

A Chat with Tom Morris

Today’s verbal delight features Semantic Web hacker (and philosopher) Tom Morris, initially talking about using XML to describe real-world things, mentioning the advantages of RDF. He then describes his experiences with the Ruby programming language, and offers thoughts on practical aspects of working in the distributed environment of the Web. Tom tells of ideas he has around using Bluetooth with RDF, before giving his opinion of platforms like Facebook, and related novel aspects of online gaming. He concludes by talking about his recent experience of organizing SemanticCamp London, and encouraging other people to try the BarCamp approach to conferences.

Listen Now

Download MP3 [52 mins, 48Mb]

During the conversation, we refer to the following resources:

A Chat with Uldis Bojars

On Friday I had the pleasure of a chat with Uldis Bojars (also known as CaptSolo), who’s recently been developing social network-oriented applications using Semantic Web technologies. His main area of work over the past few years has been the SIOC project (Semantically-Interlinked Online Communities), and in the podcast he discusses how the SIOC project anticipated (and fulfils many of the requirements of) the DataPortability initiative. As you can see from the list of links below, there’s a lot happening in this space.

Uldis used two phrases I don’t recall hearing before, but expect to hear a lot more in future. When discussing the recently updated Semantic Radar Firefox plugin he described how it passes data along the “Semantic Web food chain“, and then regarding SIOC Explorer, mining social networks based on “object-centered sociality” (this is described in the paper). Good stuff.

Listen Now

Download MP3 [49 mins, 46Mb]

During the conversation, we refer to the following resources:

Semantic Web…in a nutshell?

PS. John has just posted a *video* : DataPortability and me, JB

John Breslin has just posted Semantic Web for Dummies, suggesting (after Stefan Marti):

XML customised tags, like:
<dog>Nena</dog>
+ RDF relations, in triples, like:
(Nena) (is_dog_of) (Kimiko/Stefan)
+ Ontologies / hierarchies of concepts, like:
mammal -> canine -> Cotton de Tulear -> Nena
+ Inference rules like:
If (person) (owns) (dog), then (person) (cares_for) (dog)
= Semantic Web!

I have a bone to pick with this. While it’s a very nice summary of the Semantic, where’s the Web?

It would be seriously unfair of me to pick on John too much, especially since the list above probably still is the world view shared by most in the Semantic Web community (and I admit that’s how I saw things myself, until relatively recently). It also corresponds to the traditional layer cake representation of the Semantic Web stack of technologies. I should also mention that John’s been a driving force behind the SIOC (Semantically-Interlinked Online Communities) project, which is seriously Web, and is addressing an area now very much in focus in the Web community at large, social data. (I’m also really pleased John was able to attend the DataPortability telecon, that initiative really needs semweb folks to explain the work that’s already been done in the field…and I’m afraid a 6am start is simply beyond me).

It’s often stated that the Semantic Web is an extension of the existing Web. What isn’t always clear is the correspondence between the two. The current Web is built of documents and hyperlinks between them. If we generalize ‘documents’ to ‘things’ and ‘hyperlinks’ to ‘relationships’, we get Tim Berners-Lee’s Giant Global Graph (I went into a bit more detail on this in Evolving the Link). This abstract model has a lot in common with various conventional ideas from software architecture: object-orientation, entity-relationship modeling and even the relational model behind most databases. There are plenty of differences in detail between these models, but the biggest difference of all in the Semantic Web perspective is that the model is overlaid onto the Web.

So as a first pass at bringing the Web back into the picture, try the following as a Semantic Web 101:

  1. A uniform naming scheme for every kind of thing: documents, people, real-world objects, concepts etc.
  2. A data model which allows you to express relationships between named things
  3. Formats and other data structures which allow you to express information in this data model
  4. A protocol which enables related data to be discovered
  5. User tools which support the above

#1 is Uniform Resource Identifiers, the most significant subset of which is HTTP URLs of the Web.

#2 is RDF, as necessary augmented with ontological and/or rule-oriented techniques.

#3 is pretty much anything in which data can be expressed: obviously RDF formats like RDF/XML and Turtle, but also HTML through the use of RDFa, microformats and Embedded RDF; virtually any XML can be transparently interpreted as RDF through GRDDL; custom translators are available for formats like iCalendar or even CSV data; mapping tools are available for relational databases and systems like LDAP. Basically it usually isn’t necessary to rewrite any application to take advantage of Semantic Web techniques.

#4 is the HTTP protocol, and what Tim Berners-Lee has called “the basic follow-your-nose way the Web works“.

#5 is a side that’s taken a back seat while developer tools like APIs have been in development. Existing applications can usually be made Semantic Web-aware, but there’s a whole lot more that can be done in this area with regard to tools for manipulating generic data, and the development of new applications that would be difficult or even impossible without the (Semantic) Web and its technologies to draw upon.
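
As an illustration of the custom translators mentioned under #3, here is a minimal Python sketch that turns CSV rows into N-Triples. The URI space and the convention of turning column names into property URIs are assumptions for the example. It only escapes backslashes and double quotes, so values containing newlines or keys with URI-unsafe characters would need more care.

```python
import csv
import io

BASE = "http://example.org/"      # hypothetical URI space for rows and columns

def csv_to_ntriples(text, key_column):
    """Turn each CSV row into N-Triples lines, one per non-key column."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}item/{row[key_column]}>"
        for column, value in row.items():
            if column == key_column:
                continue
            # Minimal literal escaping: backslashes first, then double quotes.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} <{BASE}vocab#{column}> "{escaped}" .')
    return lines

sample = "id,title,tag\n1,Write report,work\n2,Walk dog,home\n"
ntriples = csv_to_ntriples(sample, "id")
```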

I think it would be fair to say that Semantic Web evangelism has had its share of wrong turns. Way too much time has been spent in arguments over data formats, and the relative complexity of the layers further up the stack has no doubt caused many to reject the technologies after a cursory review. While the stack does have an appealing consistency, there’s little obvious relevance for regular developers. When thinking in terms of ontologies and so on, it’s hard not to slip into designing things top-down and schema-first, which is pretty well the opposite of what is emerging as a more effective approach. The RDF model makes it straightforward to design systems data-first, and when working with an existing, deployed Web, this has definite advantages in terms of allowing incremental development and all-round flexibility.

Anyhow, the Web has been rediscovered in this context - the realisation that what we’re talking about, first and foremost, is Linked Data. Whether that data is concerned with the documents of the traditional Web, the people of social networks or whichever aspect of the world drifts into the limelight next, the same standard technologies can be used to look after it.

Come to think of it, all the above is effectively summarised in Tim Berners-Lee’s Linked Data rules:

  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information.
  4. Include links to other URIs, so that they can discover more things.
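
The four rules can be sketched as a "follow your nose" step in Python. The starting URI and the sample triples are invented (the FOAF property URIs are real); the request asks for RDF via a standard HTTP Accept header, and a real crawler would parse the response rather than use the hard-coded data standing in for it here.

```python
from urllib.request import Request

def linked_data_request(uri):
    """Rules 2 and 3: build an HTTP GET asking for RDF about the thing named by uri."""
    accept = "text/turtle, application/rdf+xml;q=0.9, */*;q=0.1"
    return Request(uri, headers={"Accept": accept})

def follow_links(triples, seen):
    """Rule 4: URIs found in object position are the next things to look up."""
    return sorted({o for _, _, o in triples
                   if o.startswith("http://") and o not in seen})

# Hypothetical URI (rule 1) and data standing in for a parsed response.
start = "http://example.org/people/alice"
req = linked_data_request(start)
parsed = [
    (start, "http://xmlns.com/foaf/0.1/knows", "http://example.org/people/bob"),
    (start, "http://xmlns.com/foaf/0.1/name", "Alice"),
]
todo = follow_links(parsed, seen={start})
# urllib.request.urlopen(req) would perform the lookup; omitted to stay offline.
```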

A Chat with Benjamin Nowack

Just before the weekend I (Danny Ayers) had the pleasure of a telephone chat with Benjamin Nowack, in which he described his views on rapid Semantic Web development in PHP, providing a Unique Selling Proposition to web design agencies and his ARC RDF Classes for PHP among a few other things - see the list below. Particularly timely was his reference to using an ARC-based scutter (RDF crawler) with WordPress for gathering blog commentators’ social graphs.

Here’s the recording (mp3, 37 minutes, 34MB).

Notes: