Nodalities

From Semantic Web to Web of Data

License

Creative Commons License

Archive for September, 2007

Carlos González-Cadenas Talks with Talis about ExperienceOn Ventures and transforming travel

In our latest Talking with Talis podcast, I talk with Carlos González-Cadenas about ExperienceOn Ventures, the technology they are developing, and their first ‘product’; the transformative travel site BeFogg.com.

Carlos has written about his particular take on ‘Web 3.0’, and we explore his perspective further during the call.

Carlos is Founder and CEO of ExperienceOn Ventures, based in the Spanish city of Barcelona.

Download MP3 [45 mins, 21Mb]

This conversation was conducted using Skype on Friday 14 September, recorded with Ecamm Network’s Call Recorder for Skype, and edited on a Mac with Garageband.

For further Talking with Talis podcasts on the emerging Web of Data, see here.

Rules for a Realistic Semantic Web?

In one of my linkblog entries earlier this week I made the following claim:

IMHO OWL isn’t part of the petatriple future of the semweb. Nor is SPARQL…

A recent post by Chimezie touched on this too:

I’ve been spending quite a bit of time on FuXi mainly because I am interested in empirical evidence which supports a school of thought which claims that Description Logic based inference (Tableaux-based inference) will never scale as well the Logic Programming equivalent - at least for certain expressive fragments of Description Logic (I say expressive because even given the things you cannot express in this subset of OWL-DL there is much more in Horn Normal Form (and Datalog) that you cannot express even in the underlying DL for OWL 1.1). The genesis of this is a paper I read, which lays out the theory, but there was no practice to support the claims at the time (at least that I knew of). If you are interested in the details, the paper is “Description Logic Programs: Combining Logic Programs with Description Logic” and written by many people who are working in the Rule Interchange Format Working Group.

It is not light reading, but is complementary to some of Bijan’s recent posts about DL-safe rules and SWRL.

A follow-up is a paper called “A Realistic Architecture for the Semantic Web” which builds on the DLP paper and makes claims that the current OWL (Description Logic-based) Semantic Web inference stack is problematic and should instead be stacked ontop of Logic Programming since Logic Programming algorithm has a much richer and pervasively deployed history (all modern relational databases, prolog, etc..)

I’m not a DL expert but, based on my research, it seems to me that DL-based inference for OWL isn’t going to deliver for the semantic web any time soon. Of course, by this I mean it’s not going to scale in such a way that makes real-time inferencing over petatriples viable. Besides, OWL and its variations are still very limited in their expressivity and not particularly useful for many classes of applications. Maybe rule systems can deliver instead?
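
To make the rule-based alternative a little more concrete, here is a minimal sketch of Datalog-style forward chaining over a handful of triples. Everything in it is made up for illustration (the ex: names, the two RDFS-style rules, the apply_rules and forward_chain helpers); it is not FuXi's API or anything taken from the papers mentioned above, just the general shape of applying rules until nothing new can be derived.

```python
# Illustrative only: naive forward chaining over (subject, predicate, object) triples.
# The ex: names and both rules are assumptions invented for this sketch.

def apply_rules(triples):
    """Return the triples derivable in one pass of two RDFS-style rules."""
    derived = set()
    for (a, p1, b) in triples:
        for (c, p2, d) in triples:
            if b != c:
                continue
            # Rule 1: subClassOf is transitive.
            if p1 == "rdfs:subClassOf" and p2 == "rdfs:subClassOf":
                derived.add((a, "rdfs:subClassOf", d))
            # Rule 2: an instance of a class is also an instance of its superclasses.
            if p1 == "rdf:type" and p2 == "rdfs:subClassOf":
                derived.add((a, "rdf:type", d))
    return derived


def forward_chain(triples):
    """Apply the rules repeatedly until no new triples appear (a fixpoint)."""
    closure = set(triples)
    while True:
        new = apply_rules(closure) - closure
        if not new:
            return closure
        closure |= new


facts = {
    ("ex:Hotel", "rdfs:subClassOf", "ex:Accommodation"),
    ("ex:Accommodation", "rdfs:subClassOf", "ex:TravelProduct"),
    ("ex:ritz", "rdf:type", "ex:Hotel"),
}
closure = forward_chain(facts)
print(("ex:ritz", "rdf:type", "ex:TravelProduct") in closure)
```

A naive loop like this compares every pair of triples on every pass, so it says nothing about petatriple scale; it only shows what "rules over triples" means in practice.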

Ning grows…

Marc has the figures…

Gartner says ‘No’ to Semantic Web?

The wonder of RSS alerts means that I, of course, caught sight of this at the time, but the trials and tribulations of living with someone as they turned six (and the party isn’t until this weekend - will we survive?) pushed it toward the back of my mind. Reading Joe McKendrick today returned my attention to the matter at hand.

Writing in NetworkWorld, Jon Brodkin reports that Gartner are

“…avoiding the temptation to give a new label to the latest technologies such as virtual worlds and the semantic Web, saying they’re not providing the same kind of fundamental change as blogs, wikis and social networking tools.

‘It’s not going to be another era like Web 2.0,’ Phifer said. ‘However, there will be some very interesting innovative things coming out. If you’re in love with numbering schemes, maybe it’s Web 2.1.’”

Aside from obvious comments about the silliness of the point release debate, and the ludicrousness of mungeing ‘virtual worlds’ with the ‘semantic Web’, I can’t help feeling that they miss the point. The amalgamation of Semantic Web capabilities with point-in-time technologies such as today’s blogs, wikis, and social networking tools will be far more transformative than that which we’ve seen over the past few years.

As I keep arguing, though, it’s not an either/or type of debate. The participative elements of Web 2.0’s ‘collective intelligence’ are ripe for taking to a whole new level with the Semantic Web-inspired capabilities of the Talis Platform and related pragmatic instantiations of the Semantic Web dream in the real world.

Maybe Gartner are looking in the wrong places. Might I suggest they spend time talking to one or two of our Advisory Group for a far more informed perspective?

Web 2.0, the Semantic Web, Web 3.0 and San Francisco

Polarising the distinctions between ‘Web 2.0’ and the ‘Semantic Web’ would appear to be flavour of the month in the world of the tech blogs, despite the obvious synergies to be realised at the interface between the two.

It’s good, then, to see ‘Semantic Web’ mindshare in that bastion of the Web 2.0 world, Tim O’Reilly and John Battelle’s annual Web 2.0 Summit.

Talis Platform Advisory Group stalwart (and podcast victim) Nova Spivack will be talking about his work at Radar Networks and, I hope, giving his audience a first look at the product his team have been quietly building for the past couple of years.

Danny Hillis will also be talking about Metaweb (and their first ‘product’, Freebase) in the same session. I can’t claim Danny as an Advisory Group member, but I can say that his Minister of Information, Jamie Taylor, is doing a sterling job in that regard. He was a good podcast victim, too… :-)

The third speaker in that slot is Barney Pell, who isn’t an Advisory Group member, a podcast victim, or the boss of such. Hmm… I’m clearly remiss there and will have to do something to remedy that particular oversight!

Talis CTO Ian Davis and I are hopping on the claustrophobic aluminium tube over to San Francisco for the event, and will be trying to do our usual bit to join dots, see synergies, and generally munge (it’s a technical term here, trust me) Semantic with 2.0 to see what falls out the other end. I last attended this particular event way back in the dim mists of 2005, so look forward to seeing how perceptions at, of, and from it have changed in the intervening years. I certainly know that we have changed, and that our vision of the interface between the best of the Semantic Web and the best of Web 2.0 grows clearer and stronger every day…

If you’re in or around San Francisco 16-21 October and fancy a chat, feel free to get in touch via any of the usual channels [irc paul_miller on freenode, Skype napm1971, AIM/iChat talis_paul@mac.com, email paul.miller@talis.com, mobile/cell +44 7769 740083, carrier pigeon/owl to Castle Talis].

This Week’s Semantic Web

Selected links related to Semantic Web technologies for the week ending 2007-09-24

Quote of the Week

best thing we ever did, was make those tshirts!

- danbri
on education and outreach

~

Sources include Planet RDF, various other blogs, Semantic Web Interest Group IRC Chatlogs & Scratchpad, ESW Wiki, SemWebCentral, Sweet Tools, W3C Semantic Web Activity, mailing lists, personal emails etc etc. If you see anything suitable this coming week, please mail me or use the del.icio.us tags “semweb weekly” - thanks!

Seeking a licence for Open Data

20 December 2007 Update: search over… licence found!

In the world of creative works, notions espoused by Lawrence Lessig and others over a number of years are becoming increasingly well understood. A Creative Commons license, for example, is recognised as giving the holder of rights an ability to prospectively grant certain permissions rather than limit use of their work by expecting all comers to request these permissions, again and again. Those rights are not cast aside, removing all opportunities to protect your work, your name, or your potential revenue stream. Rather, you are provided with a means to explicitly declare that your work may be used and reused by others in certain ways without their needing to request permission. Any other use is not forbidden; those uses must simply be negotiated in the ‘normal’ way… a normal way that also applied to those uses covered by Creative Commons licenses before the advent of those licenses.

Creative Commons licenses are an extension of copyright law, as enshrined in the legal frameworks of various jurisdictions internationally. As such, they don’t really work terribly well for a lot of (scientific, business, whatever) data… but the absence of anything better has led people to try slapping Creative Commons licenses of various types on data that they wish to share. It will be interesting to see what happens the first time one of those licenses needs to be upheld by a court!

At Talis, we have an interest in seeing large bodies of structured data available for use. Through the Talis Platform, we offer one means whereby such data may be stored, used, aggregated and mined, although we clearly recognise that similar data may very well also be required in similar contexts.

Recognising that contributors of such data need to be reassured as to the uses to which we - and others - may put their hard work, we spent some time a couple of years ago drafting something then called the Talis Community Licence. This draft licence is based upon protections enshrined in European Law, and has been used ‘in anger’ for a while to cover contributions of millions of records to one particular application on the Talis Platform.

There has been plenty of talk around ‘open data’ here on Nodalities, and on our sister blog Panlibus. See, for example, this recent post from Rob Styles. There were also fascinating discussions at the WWW2007 conference earlier this year.

Despite interest in open (or ‘linked’) data, licenses to provide protection (and, of course, to explicitly encourage reuse) are few and far between. Amongst zealous early adopters, there does seem to be a tendency to either (mis)use a Creative Commons license, to say nothing whatsoever, or to cast their data into the public domain. None of these strategies are fit for application to business-critical data.

Building upon our original work on the TCL, we recently provided funding to lawyers Jordan Hatcher and Charlotte Waelde. They were tasked with validating the principles behind the license, developing an effective expression of those principles that could be applied beyond the database-aware shores of Europe, and working with us to identify a suitable home in which this new licence could be hosted, nurtured, and carried forward for the benefit of stakeholders far outside Talis.

Today, Jordan posted the latest draft of this license (now going by the name ‘Open Data Commons’), some rationale, and pointers to various ways in which he - and we - are seeking input and further validation.

As my colleague Rob (again!) has argued, curators of data need an option on the permissions continuum between free-for-all and locked down. The Open Data Commons, née Talis Community Licence, offers that option.

Take a look. Think about how you would use it. Consider what sort of administrative framework you would want behind such a license. Join the conversation.

The fourth platform… ?

Interesting. I wrote on Monday about Marc Andreessen’s latest essay on Internet Platforms. Talis Platform Advisory Group member Jon Udell chips in with a brief but interesting take on this today.

“There’s a level 4 platform waiting in the wings. At level 4, the cloud of storage and computation is partly centralized in a handful of intergalactic clusters, and partly distributed across a network of humble peers. Microsoft’s forthcoming Internet service bus is one example of a level 4 platform. I hope, and expect, we’ll see others.”

Have a read, and follow Jon’s links. As usual, it’s worth it…

Read/Write Web paints a prettier picture of the Semantic Web

Alex Iskold’s ‘Semantic Web: Difficulties with the Classic Approach’ for Read/Write Web was one of the posts rolled up into yesterday’s outpouring here on Nodalities.

He’s been busy during the (my) night, and I woke this morning to ‘Top-Down: A New Approach to the Semantic Web’.

There’s some good stuff in here, although Alex’ opening gambit doesn’t gel with my impression of today’s Semantic Web, or with the exemplars I pointed to yesterday. Alex writes;

“While the original vision of the layer on top of the current web, which annotates information in a way that is ‘understandable’ by computers, is compelling; there are technical, scientific and business issues that have been difficult to address.

One of the technical difficulties that we outlined was the bottom-up nature of the classic semantic web approach. Specifically, each web site needs to annotate information in RDF, OWL, etc. in order for computers to be able to ‘understand’ it.

As things stand today, there is little reason for web site owners to do that. The tools that would leverage the annotated information do not exist and there has not been any clearly articulated business and consumer value. Which means that there is no incentive for the sites to invest money into being compatible with the semantic web of the future.”

The sorts of work that I referred to yesterday, and that Nodalities often discusses, are very much geared towards many attributes of Alex’ “Top Down” approach, even if I don’t wholly recognise his dividing up of the world of which we’re a part…

“But there are alternative approaches. We will argue that a more pragmatic, top-down approach to the semantic web not only makes sense, but is already well on the way toward becoming a reality. Many companies have been leveraging existing, unstructured information to build vertical, semantic services. Unlike the original vision, which is rather academic, these emergent solutions are driven by business and market potential.

In this post, we will look at the solution that we call the top-down approach to the semantic web, because instead of requiring developers to change or augment the web, this approach leverages and builds on top of current web as-is.”

However, it also addresses some of his concerns with his “Top Down”;

“Despite being effective, the somewhat simplistic top-down approach has several problems. First, it is not really the semantic web as it is defined, instead its a group of semantic web services and applications that create utility by leveraging simple semantics. So the proponents of the classic approach would protest and they would be right. Another issue is that these services do not always get semantics right because of ambiguities. Because the recognition is algorithmic and not based on an underlying RDF representation, it is not perfect”

As with so many other areas, the path forward lies in the middle, somewhere between the extremes typified by Alex’ Top Down and Bottom Up. Neither is sustainable or effective in the long term, and I doubt that any of the companies or products that he places in each bucket would consider themselves as staying there; even if they recognise that description of themselves as accurate today.

Yes, we’re seeing quite narrow verticals adopting semantically-assisted approaches to delivering a patently better service. But to have the silos grow better at the same time as they grow more absorbing and sticky is not necessarily a good thing for the end user. As Danny suggests in comments, we need to move from ‘on the web’ to ‘of the web’.

The Semantic Web - is everyone confused?

The Economist. Tim O’Reilly. Nova Spivack. Danny Ayers. Read/Write Web’s Alex Iskold. Kingsley Idehen. Brad Feld. Over the last few days all of them have been amongst those writing to clarify their understanding of the Semantic Web and where it’s going.

Each piece is thoughtful, each piece is well worth a read, and each differs somewhat from the others in outlook as they delve into ‘ontologies’, ‘classic approaches’, ‘machine intelligence’, ‘SPARQL’, ‘Turtle’ and other geekiness [meant in the nicest possible way]. I do wonder, though, if all of them are bypassing some fundamental points as they seek to clarify their own perspectives to themselves, to one another, and to the world; points with which I suspect that each may actually agree.

First, I definitely don’t think that a company, technology or approach can only be either ‘Web 2.0’ or ‘Semantic Web’. Sure, some companies will see themselves (or pitch themselves) in one space or the other, but there’s going to be an ever-increasing number that reside firmly in both. Ultimately, of course (and figures in the FT this week, suggesting that

“The pull-back was particularly acute in Silicon Valley, as big Web 2.0 investors such as Benchmark Capital, Kleiner Perkins Caufield & Byers and Omidyar Networks, the private financing vehicle of Ebay founder Pierre Omidyar, cut back on their investments.”

might more logically be interpreted as supporting this argument) companies won’t be Web 2.0 or Semantic Web. They will be companies that solve a particular set of problems for a particular set of audiences. Some of the tools in the toolbox they use to do this will be Web 2.0-ish, some will be Semantic Web-ish, some will be both, and some will be neither. Those things that currently differentiate us - and to which we apply labels in order to reinforce the differentiation - will become mainstream, run of the mill, mundane, and simply expected. That’s progress, and it’s a good thing. Web 2.0 won’t go away. The Semantic Web won’t go away. Shouting about either might, and it doesn’t have to mean that their importance has diminished.

Second, ‘collective intelligence’ applies equally to both. Tim O’Reilly’s absolutely right that it’s been a key differentiator of many Web 2.0 darlings;

“By contrast, I’ve argued that one of the core attributes of ‘web 2.0’ (another ambiguous and widely misused term) is ‘collective intelligence.’ That is, the application is able to draw meaning and utility from data provided by the activity of its users, usually large numbers of users performing a very similar activity. So, for example, collaborative filtering applications like Amazon’s ‘people who bought item this also bought’ or last.fm’s music recommendations, use specialized algorithms to match users with each other on the basis of their purchases or listening habits. There are many other examples: digg users voting up stories, or wikipedia’s crowdsourced encyclopedia and news stories.”

It’s also front and centre in Semantic Web work, though; for example, in that from ourselves, Radar Networks and others. See this white paper [PDF] for one, and watch here and here for public sight of internal developments… soon. The connections that RDF makes so manifest are a perfect way to express, traverse, and mine the habits, behaviours and desires of the collective.
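
As a rough illustration of mining collective behaviour from RDF-style connections, here is a small sketch of the 'people who bought this also bought' idea run over subject-predicate-object triples. The user: and item: identifiers, the ex:bought predicate and the also_bought helper are all hypothetical, and plain Python tuples stand in for a real triple store; this is a toy, not a description of how Amazon, Talis or anyone else actually does it.

```python
from collections import Counter, defaultdict

# Hypothetical purchase statements as (subject, predicate, object) triples.
triples = [
    ("user:ann", "ex:bought", "item:guidebook"),
    ("user:ann", "ex:bought", "item:phrasebook"),
    ("user:bob", "ex:bought", "item:guidebook"),
    ("user:bob", "ex:bought", "item:phrasebook"),
    ("user:bob", "ex:bought", "item:adapter"),
    ("user:cat", "ex:bought", "item:guidebook"),
    ("user:cat", "ex:bought", "item:phrasebook"),
    ("user:dan", "ex:bought", "item:adapter"),
]


def also_bought(triples, item):
    """Count other items bought by users who bought `item`, most common first."""
    items_by_user = defaultdict(set)
    for user, predicate, obj in triples:
        if predicate == "ex:bought":
            items_by_user[user].add(obj)
    counts = Counter()
    for items in items_by_user.values():
        if item in items:
            # Every other item in this user's basket co-occurs with `item`.
            counts.update(items - {item})
    return counts.most_common()


print(also_bought(triples, "item:guidebook"))
```

Nobody in this toy data set declared a relationship between guidebooks and phrasebooks; it falls out of walking the connections between users and the things they bought.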

Third, ‘a formal ontology’ is not a requirement, and nor is pushing structure in the face of the user.

Tim makes a good point here;

“The Semantic Web is a bit of a slog, with a lot of work required to build enough data for the applications to become useful. Web 2.0 applications often do a half-assed job of tackling the same problem, but because they harness self-interest, they typically gather much more data. And then solve for their deficiencies with statistics or other advantages of scale.”

I’m not sure, though, that SemWeb/Web 2.0 is the dichotomy here. Rather, it’s a split between purist, all-encompassing, and hugely flexible on the one hand and pragmatic and ‘good enough’ on the other. I would agree that the stereotype would often place Semantic Web developers on one side of that divide and Web 2.0 startups on the other. The technology is not the point there, though, so much as the mindset. Believe me, we can do some great stuff to harness self-interest, gather much more data, and solve the deficiencies with statistics and other advantages of scale in a Semantic Web-ey Platform… :-)

“But I predict that we’ll soon see a second wave of social networking sites, or upgrades to existing ones, that provide for the encoding of additional nuance. In addition, there will be specialized sites — take Geni, for example, which encodes geneaology — that will provide additional information about the relationships between people. Rather than there being a single specification capturing all the information about relationships between people, there will be many overlapping (and gapping) applications, and an opportunity for someone to aggregate the available information into something more meaningful.”

Too right, Tim. But I’d definitely suggest that those building the second wave should be talking to Talis, to Radar Networks, to Metaweb and to some of the other proponents of a new and far more Web 2.0-inspired Semantic Web paradigm. There are way too many synergies there to ignore…

Dan Brickley’s comments in response to one aspect of Danny’s argument are also interesting;

“Let me clear something up. Danny mentions a discussion with Tim O’Reilly about SemWeb themes.

Much as I generally agree with Danny, I’m reaching for a ten-foot bargepole on this one point:

‘While Facebook may have achieved pretty major adoption for their approach, it’s only very marginally useful because of their overly simplistic treatment of relationships.’

Facebook, despite the trivia, the endless wars between the ninja zombies and the pirate vampires; despite being centralised, despite [insert grumble] is massively useful. Proof of that pudding: it is massively used. ‘Marginal’ doesn’t come into it.”

Too true. I’ve complained about Facebook, too [for example here and here]. But I use it, and millions of others use it. And it serves a purpose. That doesn’t mean it can’t be better.

Turning, finally, to Alex’ post;

“The first problem is that RDF and OWL are complicated. Even for scientists and mathematicians these graph-based languages take time to learn and for less-technical people they are nearly impossible to understand. Because the designers were shooting for flexibility and completeness, the end result are documents that are confusing, verbose and difficult to analyze.”

Well, yes and no. That’s what tools are for. And in a large number of cases the RDF may actually be auto-generated as part of some process of aggregation or value addition of which the data creator or manager need have no explicit awareness. The RDF may very well be generating an aggregation of tiny snippets of data from large numbers of transactions; the interaction of a single user with a single resource doesn’t have to result in a whole RDF document of its own. More on that later.

And, also from Alex;

“Going back to John Markoff’s example of a computer booking a perfect vacation, one can’t help but think of a travel agency. In the good old days, you would go to the same agent over and over again. Why? Because just like your friends, your doctor, your teacher, the travel agent needs to know you personally to be able to serve you better.

The travel agent remembers that you’ve been to Prague and Paris, which is why he offers you a trip to Rome. The travel agent remembers that you’re a vegetarian and orders the pasta meal for you on your flight. Over time people learn and memorize facts about life and each other. Until machines can do the same, knowledge of semantics, limited or full is not going to be enough to replace humans.”

Exactly. And that’s where network effects, collective intelligence, behavioural observation and all the rest kick in. The knowledge comes from observation of an awful lot of behaviour; not from having the traveller fill in some long-winded and tedious form detailing an RDF graph representation of their travel preferences for all situations. Context matters. I, for example, want a window seat on short-haul flights, and an aisle seat on long-haul flights. It’s not a simple preference one way or the other. I don’t have a preferred airport to depart from, as so many other factors come into play. I’ll go to a more distant departure airport for a better departure or travel time, for example. I won’t always travel with the airlines I’ve got frequent flier cards for… but they don’t have to be cheapest before I can or will. It’s more complex than that. Current systems don’t understand.
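
To show what 'knowledge from observation' might look like in the simplest possible case, here is a sketch that learns a seat preference per flight context from a list of past bookings instead of asking the traveller to fill in a form. The bookings data, the context labels and the seat_preferences helper are all invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical observed bookings: (flight context, seat actually chosen).
bookings = [
    ("short-haul", "window"),
    ("short-haul", "window"),
    ("short-haul", "aisle"),
    ("long-haul", "aisle"),
    ("long-haul", "aisle"),
    ("long-haul", "window"),
    ("long-haul", "aisle"),
]


def seat_preferences(bookings):
    """Return the most frequently chosen seat for each flight context."""
    seats_by_context = defaultdict(Counter)
    for context, seat in bookings:
        seats_by_context[context][seat] += 1
    return {
        context: seats.most_common(1)[0][0]
        for context, seats in seats_by_context.items()
    }


# A single context-free preference, for comparison.
overall = Counter(seat for _, seat in bookings).most_common(1)[0][0]

print(seat_preferences(bookings))
print(overall)
```

With this made-up data the context-free answer is 'aisle', which is wrong for short-haul flights; keeping the context alongside each observation is what lets the window/aisle split show up at all.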

“Perhaps the worst challenge facing the semantic web is the business challenge. What is the consumer value? How is it to be marketed? What business can be built on top of the semantic web that can not exist today? Clearly the example of instant travel match is not a ‘wow.’ It’s primitive and, in a way, uninteresting because many of us are already quite adept at being our own travel agent using existing tools. But assuming that there are problems that can be solved faster, there is still a question of specific end user utility.”

Talis. Radar Networks. Joost. Metaweb. Garlik. Need I go on? (I can… :-) )

“The way the semantic web is presented today makes it very difficult to market. The ‘we are a semantic web company’ slogan is likely to raise eyebrows and questions. RDF and OWL clearly need to be kept under the hood. So the challenge is to formulate the end user value in ways that will resonate with people.”

Absolutely right! SWEO is part of the answer. Companies like ours getting out and showing what can be done, and why it’s valuable, is crucial too… and we’re getting there.

And to answer my initial question; No, I don’t think everyone is confused by or about the Semantic Web. We do, though, have a lot of different niche views of value (or lack thereof), clamouring for attention. These overlapping - and not necessarily incorrect - perspectives certainly could appear to be a result of confusion, if viewed from the outside. Language is a complicated thing, and these are complex ideas. Describing one with the other requires a number of iterations to arrive at clarity, but we’re getting there.

There’s a lot more to say, but this post has now gone on long enough (especially as I initially meant to simply point you at some interesting blog posts…).