Nodalities

From Semantic Web to Web of Data


Linked Data In(ter)action

By Benjamin Nowack

| This article will feature in Nodalities Magazine, Issue 6

In recent months, the Semantic Web community has accelerated its progress around web-enhanced information and knowledge management. Specifications such as RDF and SPARQL are increasingly applied by developers and organizations, and RDF software is maturing. Even the initial chicken-and-egg problem around data and applications has now been solved by the Linking Open Data (LOD) project, which is bringing dataset after dataset online, each following recommended practices for simplified information access and repurposing. The time has finally come to move on and create the distributed data applications we have been dreaming of for so long.

Just like the Web’s true innovation was not hypertext as such, but freeing it from isolated CD-ROMs, the Semantic Web’s value proposition is not information integration per se, but doing it on a global scale. Network effects will play an important role and have to be considered by application developers. Mashups on a semantic web are not one-off combinations of existing sources and APIs. They will feed their added value back into a self-reinforcing Linked Data Ecosystem, thus enabling chains of applications, with each reaping the benefits of the previous one. RDF developers these days often use terms like “Meshup” or “Hyperdata” to describe the direction they are headed.

Linked Data is all about portability and off-site use: the more users an application attracts, the more it will let them take their data with them and also integrate external sources. With a bit of luck, we will see not one, but a wealth of killer applications, where the “unique selling proposition” is personal and defined by each user individually.

Despite the ongoing advances, some pieces of the puzzle are still missing. This becomes clearer when we correlate the current state of the Linked Data market to a typical information life cycle classification. While we can name solutions for each value-increasing process (Creation, Organization, Utilization, Distribution, Discovery), the Utilization and Application stage represents a bottleneck. Products are starting to benefit from Linked Data, but few are also redistributing their internally enriched information. Additionally, the Creation phase today is mostly driven by dedicated efforts such as the LOD project, although data manipulation and enhancement should also be possible while people are interacting with semantic web content.

Figure: Linked Data Value Spiral

A few months ago, Talis researcher Tom Heath wrote an inspiring IEEE Internet Computing essay titled “How Will We Interact with the Web of Data?” where he described the upcoming challenges and opportunities in the context of human-computer interaction. He suggested that on a web where the granularity is increased from documents to arbitrary things, user interfaces should treat individual objects as first-class citizens, ideally providing context-specific functionality, direct manipulation, and coherence across personal usage scenarios. Application models that go beyond browsing and which are both universal and user-friendly are an ongoing challenge.

A system that aims at finding a sweet spot between simplicity and standardized interaction is Paggr (paggr.com). The basic idea is to combine successful Web 2.0 solutions and trends with Tim Berners-Lee’s concept of an “RDF Clipboard” for polymorphic data exchange between desktop applications. The required technical trick for copy-by-reference across desktop and web applications was introduced by Ray Ozzie three years ago through his “Live Clipboard”. Around the same time, AJAX and converging browser capabilities mass-enabled interactive HTML elements, and personal portal builders such as Netvibes brought widgets and drag and drop to end-users. The growing number of open datasets and technical possibilities finally led to a first prototype for building Linked Data Dashboards a few months ago.

The system used Netvibes-like pages with three resizable columns that could be populated with so-called Sparqlets. A Sparqlet is a SPARQL-powered widget, defined by a set of queries and result templates. The output consists of machine-readable HTML which addresses three essential requirements:

  • Widgets can easily be copied to other dashboards; their complete definition is retrievable via HTTP (by de-referencing the widget identifier).
  • Individual items in a widget can be interactively linked to other items, as each element is associated with a URI. This makes semantic drag and drop possible, such as dragging a person representation onto a map or an address book widget.
  • Augmented data can instantly be fed back into the personal or public data cloud.
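
To make the Sparqlet idea more concrete, here is a minimal sketch in Python of a widget defined by a query and a result template. It is not Paggr's actual implementation: the `Sparqlet` class, its field names, the example.org URIs, the sample rows, and the choice of RDFa-style `about` and `property` attributes are all assumptions made for illustration. The query is stored but not executed; the rows stand in for results from a SPARQL endpoint.

```python
from dataclasses import dataclass
from html import escape
from string import Template


@dataclass
class Sparqlet:
    """Hypothetical widget definition: an identifier, a query, and a result template."""
    uri: str            # widget identifier (assumed to be dereferenceable via HTTP)
    title: str
    query: str          # SPARQL SELECT query; not executed in this sketch
    item_template: str  # string.Template markup for a single result row

    def render(self, rows):
        """Render result rows (dicts of variable name -> value) as HTML."""
        items = []
        for row in rows:
            # Escape every value so URIs and labels are safe in attributes and text.
            safe = {name: escape(value, quote=True) for name, value in row.items()}
            items.append(Template(self.item_template).substitute(safe))
        return (
            f'<div class="sparqlet" about="{escape(self.uri, quote=True)}">\n'
            f"<h3>{escape(self.title)}</h3>\n"
            "<ul>\n" + "\n".join(items) + "\n</ul>\n</div>"
        )


friends = Sparqlet(
    uri="http://example.org/sparqlets/friends",
    title="People I know",
    query="""
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        SELECT ?person ?name WHERE { ?me foaf:knows ?person . ?person foaf:name ?name }
    """,
    item_template='<li about="$person"><span property="foaf:name">$name</span></li>',
)

# Made-up rows, shaped like the bindings a SPARQL endpoint might return.
sample_rows = [
    {"person": "http://example.org/people/alice", "name": "Alice"},
    {"person": "http://example.org/people/bob", "name": "Bob & Co"},
]

html_output = friends.render(sample_rows)
print(html_output)
```

Because every list item carries the URI of the thing it represents, a dashboard could in principle read that identifier back out of the markup when the item is dragged or copied, which is the property the three requirements above rely on.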

Figure: Architecture

The prototype received encouraging and very helpful feedback at the International Semantic Web Conference (and even won a prize). We are clearly not ready for the mainstream user yet, but building on established interaction models seems to be a promising acceptance strategy. The next iteration of Paggr is now almost finished and we are looking forward to putting it online. The first public applications will be limited to focused use cases (such as an organizer for conference attendees) as we are still working on certain interface behaviors, but a private alpha phase with fewer restrictions is planned, too.

Linked Data Dashboards face a number of usability challenges. The big question is how to tie the wealth of possibilities to a generic user interface without sacrificing work efficiency. Application convenience often boils down to feature reduction and contextual options, possibly combined with shortcuts for common tasks. To reduce complexity, Paggr lets the user (or app creator) break the theoretically infinite possibilities down into separate dashboards, where options and relations can be further spread across widgets.

The more complicated part starts at the widget level. Semantic drag and drop is often multi-modal: dragging an event onto a calendar does not necessarily mean “Add”, there are many ways to link two persons to each other, and so on. Also, working with Linked Data is sometimes like having a backstage pass for a concert: very exciting, but also a bit rough, easily overwhelming, and if you open the wrong door, you can quickly find yourself getting kicked out. Raw data (or equally ugly RDF/HTML dumps) is always just a link away, so application designers will try to carefully shield non-developers from being exposed to things like DBpedia pages. For developers, on the other hand, this equivalent to the early Web’s “view source” feature can be very valuable.
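
The multi-modal nature of semantic drag and drop can be illustrated with a small Python sketch. Everything in it is hypothetical: the `DROP_ACTIONS` table, the type and widget names, and the action labels are invented for illustration and do not describe how Paggr resolves a drop. The point is only that a drop may map to zero, one, or several meaningful actions, and the interface has to ask when there is more than one.

```python
# Hypothetical registry: (type of dragged item, kind of drop target) -> possible actions.
DROP_ACTIONS = {
    ("Event", "calendar"): ["add to calendar", "check for conflicts", "show on this date"],
    ("Person", "map"): ["show location"],
    ("Person", "address_book"): ["add contact"],
    ("Person", "person"): ["knows", "works with", "is related to"],
}


def candidate_actions(item_type, target_kind):
    """Return every action that makes sense for this combination (possibly none)."""
    return DROP_ACTIONS.get((item_type, target_kind), [])


def handle_drop(item_uri, item_type, target_kind):
    """Decide whether a drop can be applied directly, needs a choice, or is rejected."""
    actions = candidate_actions(item_type, target_kind)
    if not actions:
        # No meaningful interpretation: refuse the drop.
        return {"status": "rejected", "item": item_uri}
    if len(actions) == 1:
        # Unambiguous: apply without bothering the user.
        return {"status": "apply", "item": item_uri, "action": actions[0]}
    # Ambiguous: the interface should offer a contextual menu.
    return {"status": "ask", "item": item_uri, "options": actions}


alice = "http://example.org/people/alice"
print(handle_drop(alice, "Person", "map"))
print(handle_drop("http://example.org/events/iswc2008", "Event", "calendar"))
```

Keeping the unambiguous case free of prompts reflects the concern for work efficiency raised earlier, while the ambiguous case is a natural fit for a contextual menu.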

Now, what exactly are the requirements and nice-to-haves, and (how) can they be implemented through widgets without leading to cluttered screen real estate? As mentioned above, in order to support drag and drop as well as copy and paste between different browser tabs or even at the operating system level, we can use a technical trick introduced by Live Clipboard: transparent form fields that natively provide “right-click / paste” and similar functionality. For a consistent user experience, this means that we need distinguishable (but unobtrusive) fields for each interactive element. In Paggr, small Semantic Web icons next to widget items and title bars signal the availability of advanced options. They enable:

  • widget filtering
  • copying widget or item identifiers
  • removing items from and adding items to widgets
  • interlinking individual items
  • custom contextual menus

Figure: Paggr Widget

The approach of using dedicated interaction zones has desirable side-effects. Non-expert users are less likely to get confused, as the general markup keeps its expected behavior. It also becomes possible to disable the semantic extensions simply by deactivating and hiding the icons. A public dashboard or shared meshup may look and feel just like a normal website.

There are still several unresolved issues, and future iterations could well require a complete redesign, but Paggr is just one of a growing number of consumer-oriented Linked Data systems. After years of hard infrastructure work, the Semantic Web community is finally starting to benefit from its investments. Data-wise, we have probably reached the tipping point already. Even former critics are starting to make their information available in RDF; efforts like microformats, once regarded as competitors, have become accessible from SPARQL; and services like OpenCalais, Yahoo!’s SearchMonkey, or the Zemanta API are constantly reinforcing the network effects of structured open data. It should only be a matter of months until we see the first fully-fledged Linked Data applications for end-users.

Benjamin Nowack is the developer of Paggr. He runs semsol, a tiny Semantic Web agency in Düsseldorf, Germany.

A conference comes of age: a review of the 7th International Semantic Web Conference (ISWC2008)

| This post will feature in Nodalities Magazine, issue 5.

What are the factors that indicate a coming of age? An increased
self-awareness perhaps, or an acceptance and understanding of a broad
range of views, even if they contradict your own? If these factors do
indicate a certain maturity, then I would argue that the International
Semantic Web Conference series has come of age.

Last year’s event in Busan, Korea felt like a watershed moment, with
an increasing focus on practical applications that exploited Semantic
Web technologies, in addition to the highly theoretical papers
typically seen at events of this sort. This year’s conference in
Karlsruhe, Germany, and the seventh in the series overall, maintained
this momentum. But more so than in previous years, I detected a subtle
change in the mood of the conference. In addition to a tangible sense
of excitement that the Semantic Web was getting ready for the
mainstream, I detected a certain pluralism within the community,
manifested as a greater openness to divergent views and an increase in
attention to topics that might have previously been overlooked.

This willingness to express and accept divergent views was nowhere more
apparent to me than in the panel titled “An OWL too far?”. This
discussion saw senior members of the Semantic Web community openly
challenge each other’s views on the proposed second version of OWL, the
Web Ontology Language. Perhaps the views held by the likes of Stefan
Decker, Frank van Harmelen and Ian Horrocks have always been divergent
on this issue, but seeing the differences of opinion aired so openly
was a new experience for me. Far from indicating a damaging lack of
unity in the field, I read this as a clear sign that the community can
engage in open and constructive debate without throwing the toys out
of the pram.

Earlier in the week I had sat on a similarly provocative panel at the
OWL Experiences and Directions workshop, titled “How might OWL fail?”.
As a relative outsider I decided to focus on the OWL
community’s need to improve its marketing and demonstrate its
relevance to the wider world, and expected a degree of hostility to
this message. Instead I sensed a slight deflation at the criticism
that was quickly followed by a desire to engage with the problem and
actively address it.

Perhaps the most powerful sign of how far the Semantic Web community
has come was in the entries to the annual Semantic Web Challenge. This
year the contest had two tracks: the Open Track, which is analogous to
the regular challenge in previous years and has a more established set
of judging criteria; and the Billion Triples Track, an attempt to
stimulate people to generate value from and add value to increasingly
large data sets, with the definition of what constitutes “value” being
more open-ended.

The quality in both tracks was exceptionally high, but one feature
that ran through most of the finalists struck me in particular – the
emphasis on the user experience. Previous challenges have always
attracted user-oriented applications as well as backend technologies,
but this year felt different. Whether the application was supporting
personal aggregation of one’s distributed information, as in the Open
Track winner Paggr; enabling location-oriented browsing of the
Semantic Web on a mobile phone, as in DBpedia Mobile, which took
second place in the Open Track; or providing structured browsing over
billions of RDF triples, as in SemaPlorer, winner of the Billion
Triples Track, the vast majority of entries recognised the need to
both add value to the data *and* provide a compelling user experience
on top of it.

For me this indicates not just an awareness but an acceptance on the
part of the Semantic Web community that no amount of research and
development at the backend will make a difference if clear user
benefits are not delivered. If this serves as evidence that the ISWC
series has come of age, then I would argue that along with it so has
the Semantic Web community at large. It may have taken some time, but
I have no doubt that this maturity has been earned.