Nodalities

From Semantic Web to Web of Data

The Semantic Web - is everyone confused?

The Economist. Tim O’Reilly. Nova Spivack. Danny Ayers. Read/Write Web’s Alex Iskold. Kingsley Idehen. Brad Feld. Over the last few days all of them have been amongst those writing to clarify their understanding of the Semantic Web and where it’s going.

Each piece is thoughtful, each piece is well worth a read, and each differs somewhat from the others in outlook as they delve into ‘ontologies’, ‘classic approaches’, ‘machine intelligence’, ‘SPARQL’, ‘Turtle’ and other geekiness [meant in the nicest possible way]. I do wonder, though, if all of them are bypassing some fundamental points as they seek to clarify their own perspectives to themselves, to one another, and to the world; points with which I suspect that each may actually agree.

First, I definitely don’t think that a company, technology or approach can only be either ‘Web 2.0’ or ‘Semantic Web’. Sure, some companies will see themselves (or pitch themselves) in one space or the other, but there’s going to be an ever-increasing number that reside firmly in both. Ultimately, of course, companies won’t be Web 2.0 or Semantic Web. Figures in the FT this week, suggesting that

“The pull-back was particularly acute in Silicon Valley, as big Web 2.0 investors such as Benchmark Capital, Kleiner Perkins Caufield & Byers and Omidyar Networks, the private financing vehicle of Ebay founder Pierre Omidyar, cut back on their investments.”

might more logically be interpreted as supporting this argument. Companies will simply be companies that solve a particular set of problems for a particular set of audiences. Some of the tools in the toolbox they use to do this will be Web 2.0-ish, some will be Semantic Web-ish, some will be both, and some will be neither. Those things that currently differentiate us - and to which we apply labels in order to reinforce the differentiation - will become mainstream, run of the mill, mundane, and simply expected. That’s progress, and it’s a good thing. Web 2.0 won’t go away. The Semantic Web won’t go away. Shouting about either might, and that doesn’t have to mean that their importance has diminished.

Second, ‘collective intelligence’ applies equally to both. Tim O’Reilly’s absolutely right that it’s been a key differentiator of many Web 2.0 darlings;

“By contrast, I’ve argued that one of the core attributes of ‘web 2.0’ (another ambiguous and widely misused term) is ‘collective intelligence.’ That is, the application is able to draw meaning and utility from data provided by the activity of its users, usually large numbers of users performing a very similar activity. So, for example, collaborative filtering applications like Amazon’s ‘people who bought item this also bought’ or last.fm’s music recommendations, use specialized algorithms to match users with each other on the basis of their purchases or listening habits. There are many other examples: digg users voting up stories, or wikipedia’s crowdsourced encyclopedia and news stories.”

It’s also front and centre in Semantic Web work, though; for example, in the work from ourselves, Radar Networks and others. See this white paper [PDF] for one, and watch here and here for public sight of internal developments… soon. The connections that RDF makes so manifest are a perfect way to express, traverse, and mine the habits, behaviours and desires of the collective.

Third, ‘a formal ontology’ is not a requirement, and nor is pushing structure in the face of the user.

Tim makes a good point here;

“The Semantic Web is a bit of a slog, with a lot of work required to build enough data for the applications to become useful. Web 2.0 applications often do a half-assed job of tackling the same problem, but because they harness self-interest, they typically gather much more data. And then solve for their deficiencies with statistics or other advantages of scale.”

I’m not sure, though, that SemWeb/Web 2.0 is the dichotomy here. Rather, it’s a split between purist, all-encompassing, and hugely flexible on the one hand and pragmatic and ‘good enough’ on the other. I would agree that the stereotype would often place Semantic Web developers on one side of that divide and Web 2.0 startups on the other. The technology is not the point there, though, so much as the mindset. Believe me, we can do some great stuff to harness self-interest, gather much more data, and solve the deficiencies with statistics and other advantages of scale in a Semantic Web-ey Platform… :-)

“But I predict that we’ll soon see a second wave of social networking sites, or upgrades to existing ones, that provide for the encoding of additional nuance. In addition, there will be specialized sites — take Geni, for example, which encodes geneaology — that will provide additional information about the relationships between people. Rather than there being a single specification capturing all the information about relationships between people, there will be many overlapping (and gapping) applications, and an opportunity for someone to aggregate the available information into something more meaningful.”

Too right, Tim. But I’d definitely suggest that those building the second wave should be talking to Talis, to Radar Networks, to Metaweb and to some of the other proponents of a new and far more Web 2.0-inspired Semantic Web paradigm. There are way too many synergies there to ignore…

Dan Brickley’s comments in response to one aspect of Danny’s argument are also interesting;

“Let me clear something up. Danny mentions a discussion with Tim O’Reilly about SemWeb themes.

Much as I generally agree with Danny, I’m reaching for a ten-foot bargepole on this one point:

‘While Facebook may have achieved pretty major adoption for their approach, it’s only very marginally useful because of their overly simplistic treatment of relationships.’

Facebook, despite the trivia, the endless wars between the ninja zombies and the pirate vampires; despite being centralised, despite [insert grumble] is massively useful. Proof of that pudding: it is massively used. ‘Marginal’ doesn’t come into it.”

Too true. I’ve complained about Facebook, too [for example here and here]. But I use it, and millions of others use it. And it serves a purpose. That doesn’t mean it can’t be better.

Turning, finally, to Alex’ post;

“The first problem is that RDF and OWL are complicated. Even for scientists and mathematicians these graph-based languages take time to learn and for less-technical people they are nearly impossible to understand. Because the designers were shooting for flexibility and completeness, the end result are documents that are confusing, verbose and difficult to analyze.”

Well, yes and no. That’s what tools are for. And in a large number of cases the RDF may actually be auto-generated as part of some process of aggregation or value addition of which the data creator or manager need have no explicit awareness. The RDF may very well be generated from an aggregation of tiny snippets of data from large numbers of transactions; the interaction of a single user with a single resource doesn’t have to result in a whole RDF document of its own. More on that later.
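To make that a little more concrete, here is a minimal sketch in Python of how tiny snippets from many transactions might be folded into one shared set of RDF-style triples, without any user ever seeing RDF. The `ex:` names, the transaction fields and both helper functions are invented for illustration; this is an assumption about how such aggregation could work, not a description of any particular platform.

```python
# Hypothetical transaction records; the field names are assumptions for this sketch.
transactions = [
    {"user": "alice", "action": "borrowed", "item": "book-42"},
    {"user": "bob", "action": "borrowed", "item": "book-42"},
    {"user": "alice", "action": "rated", "item": "book-7"},
]


def transactions_to_triples(records):
    """Turn each transaction into one (subject, predicate, object) triple."""
    triples = set()
    for record in records:
        triples.add(
            (f"ex:{record['user']}", f"ex:{record['action']}", f"ex:{record['item']}")
        )
    return triples


def subjects_for(triples, predicate, obj):
    """Traverse the aggregated graph: who has this relationship to this item?"""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


graph = transactions_to_triples(transactions)
borrowers = subjects_for(graph, "ex:borrowed", "ex:book-42")
```

Each interaction contributes a single statement to the shared graph, and the interesting questions are asked of the aggregate rather than of any one user's document.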

And, also from Alex;

“Going back to John Markoff’s example of a computer booking a perfect vacation, one can’t help but think of a travel agency. In the good old days, you would go to the same agent over and over again. Why? Because just like your friends, your doctor, your teacher, the travel agent needs to know you personally to be able to serve you better.

The travel agent remembers that you’ve been to Prague and Paris, which is why he offers you a trip to Rome. The travel agent remembers that you’re a vegetarian and orders the pasta meal for you on your flight. Over time people learn and memorize facts about life and each other. Until machines can do the same, knowledge of semantics, limited or full is not going to be enough to replace humans.”

Exactly. And that’s where network effects, collective intelligence, behavioural observation and all the rest kick in. The knowledge comes from observation of an awful lot of behaviour; not from having the traveller fill in some long-winded and tedious form detailing an RDF graph representation of their travel preferences for all situations. Context matters. I, for example, want a window seat on short-haul flights, and an aisle seat on long-haul flights. It’s not a simple preference one way or the other. I don’t have a preferred airport to depart from, as so many other factors come into play. I’ll go to a more distant departure airport for a better departure or travel time, for example. I won’t always travel with the airlines I’ve got frequent flier cards for… but they don’t have to be cheapest before I can or will. It’s more complex than that. Current systems don’t understand.
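As a rough illustration of learning from observation rather than from a form, the sketch below (in Python) infers a seat preference per context by counting what a traveller actually chose. The booking data and the `learn_preferences` helper are made up for this example, and real systems would need far more context than flight length alone.

```python
from collections import Counter, defaultdict

# Hypothetical observed bookings as (context, seat chosen); invented data.
observed = [
    ("short-haul", "window"),
    ("short-haul", "window"),
    ("short-haul", "aisle"),
    ("long-haul", "aisle"),
    ("long-haul", "aisle"),
]


def learn_preferences(bookings):
    """Count seat choices per context and keep the most frequent one."""
    counts = defaultdict(Counter)
    for context, seat in bookings:
        counts[context][seat] += 1
    return {context: seats.most_common(1)[0][0] for context, seats in counts.items()}


preferences = learn_preferences(observed)
```

The point is only that the preference is keyed on context, so the same traveller can come out as ‘window’ in one situation and ‘aisle’ in another without ever having said so.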

“Perhaps the worst challenge facing the semantic web is the business challenge. What is the consumer value? How is it to be marketed? What business can be built on top of the semantic web that can not exist today? Clearly the example of instant travel match is not a ‘wow.’ It’s primitive and, in a way, uninteresting because many of us are already quite adept at being our own travel agent using existing tools. But assuming that there are problems that can be solved faster, there is still a question of specific end user utility.”

Talis. Radar Networks. Joost. Metaweb. Garlik. Need I go on? (I can… :-) )

“The way the semantic web is presented today makes it very difficult to market. The ‘we are a semantic web company’ slogan is likely to raise eyebrows and questions. RDF and OWL clearly need to be kept under the hood. So the challenge is to formulate the end user value in ways that will resonate with people.”

Absolutely right! SWEO is part of the answer. Companies like ours getting out and showing what can be done, and why it’s valuable, is crucial too… and we’re getting there.

And to answer my initial question: no, I don’t think everyone is confused by or about the Semantic Web. We do, though, have a lot of different niche views of value (or lack thereof), clamouring for attention. These overlapping - and not necessarily incorrect - perspectives certainly could appear to be a result of confusion, if viewed from the outside. Language is a complicated thing, and these are complex ideas. Describing one with the other requires a number of iterations to arrive at clarity, but we’re getting there.

There’s a lot more to say, but this post has now gone on long enough (especially as I initially meant to simply point you at some interesting blog posts…).


2 Responses

  1. Paul Gearon Says:

    In terms of breaking down the perceived dichotomy between the Semantic Web and Web 2.0, I’d like to go beyond stating that formal ontologies are not required in the Semantic Web, and claim the technologies of the Semantic Web can help enable Web 2.0.

    The Semantic Web is based on an “Open World” model. This means that there may always be more information that is currently unknown, and that all Semantic Web systems must be able to accept any new information, no matter the form. New information may be basic data: e.g. the new books a user just bought; or it may be data describing the structure of other data: e.g. I want the system to recognise the Chinese form of a person’s name, as well as the English form. In the latter example, the Chinese forms of a name may well have been in the system already, but not being used in automated operations.

    Not only does the Semantic Web accept all new information, but importantly, it does not require any specific information. For instance, I may stipulate that all people must have a first name, but the system must be capable of working with people whose first names have not been revealed.

    So while the Semantic Web provides the capability of describing (via ontologies) the data you may have, it requires nothing of you beyond basic consistency. This allows you to have as large or as small an ontology as you want. You can then choose only to use data that fits the ontology neatly, or use data that occurs mostly outside of the ontology.

    How does this work with Web 2.0? RDF is a mechanism for linking information together. It lets you form a “Web” linking anything, including people, communities, documents, relationships, and web pages. This is the perfect foundation for any kind of web! RDF also lets you grow your system continuously. OWL then allows you to describe those portions of the new web that you understand and recognise, enabling intelligent automation on that part of the data. If everything is being done by people (such as on a purely social network) then perhaps there is no need for automation. If you want to do it all by computer, then it becomes worthwhile to use OWL extensively. But at no stage are requirements forced on users.

    The Semantic Web enables; it should never enforce or impede. Web 2.0 applications can only benefit from the functionality derived from using Semantic Web technologies.

  2. Mills Davis Says:

    I prefer to market using the term “web 3.0”. The message is that we’re talking about something beyond web 2.0.

    It’s not about the technology. The most important arguments to advance concern value. Be clear about who benefits and what s/he cares about. The good news is that there are plenty of examples that can be cited, and it is possible to differentiate the web 3.0 value proposition.
