Nodalities

From Semantic Web to Web of Data
License

Creative Commons License

Author Archive

Wikileaks and the Guardian

I spoke with the Guardian’s Simon Rogers, editor of the Data Blog, about their decision to publish thousands of facts from the Wikileaks Afghan War Diary. In this podcast, Simon introduces Wikileaks and its use by journalists, and reiterates the Guardian’s strategy of publishing raw data alongside stories and comment. During the conversation, Simon explained his perspective on publishing these leaked data and what people can do with them, pointing out that the Guardian doesn’t put any restrictions on reuse of the facts.

One of the major applications of these raw data, especially anything containing geographical information, is the ability to visualise them. One of the first things the Guardian produced from the leaked data was an interactive map of Improvised Explosive Device incidents affecting troops and civilians.

The opening up of the data behind such applications could prove to be a powerful catalyst for wider visualisation and applications built around the presence of authoritative journalistic facts. Putting the raw data in the hands of the web’s hackers has been a bold move from the Guardian, and I hope to see new and better stories come from the tools made possible by a supply of useful information.

Open Day: Linked Data and Health

We’ve seen and reported on the rise of Linked Data from concept to practice, and our Open Days have been a great opportunity to explore and explain Linked Data very broadly. The broad discussions have allowed many people to imagine using semantics with their own data, as publishers, developers, information architects etc. across many different industries and applications. But one area in which we are particularly interested is health.

Biomedical science is full of structured and semi-structured information, much of which crosses the organising boundaries we’ve created for it. Every aspect of medical practice, research and policy makes use of (and in most cases creates supplementary) information, and it’s become plain that much of this data is stored, hidden and often inaccessible.

I attended some sessions on biomedical semantics at SemTech last month, and was hugely intrigued by the state of health data world-wide. There are many usable ontologies for medical science, for example, which show the relationships among biological knowledge and clinical use; but much of the data used on the front line is not part of this structure. There seems to be much that could be gained from taking a Linked approach to these data!

Mark Birbeck and Dr Michael Wilkinson, in last month’s Nodalities Magazine, introduced the idea of “A Linked Data Platform for Innovation,” a project of the National Innovation Centre for joining clinicians to linked visualisations through a widget-like Linked Data platform:

The NIC is committed to using Semantic Web technologies as a way to significantly improve the speed and quality of decision-making in the area of health technology innovations.

So, we’ve decided to join forces with some of these minds and host an event to explain and explore biomedical data. We’ll be at No 76 Portland Place on 19th August from 10AM to 4PM. We’ve invited Dr Nigam Shah from Stanford University to talk to us about the state of global health data, and to suggest several ways in which linking can be done in the very near future. We will also cover the topic of Linked Data (what it is, and how it works), as well as taking a quick look at how it’s being used across the web already. The people behind the NIC’s clinical widget platform will also be there to introduce their project.

Places are free of charge but limited, so make sure to sign up to reserve your place.

We’d very much like to keep the spirit of an Open Day. This event is open for discussion, examination and exploration of using the Semantic Web in life sciences, so come armed with ideas, questions and problems!

Talis will be putting on lunch, and we will also have a ready supply of coffee on hand to help the discussions.

Image: “Science is Knowledge” by Zach Beauvais, is a mashup of “3D Stone Cells” by BlueRidgeKitties, and “Glass Bottles I” by Tim O’Brien via flickr. They are used under CC: BY, NC, SA licenses.

Facebook: David Recordon talks with Talis about the Social Graph

We’ve covered the launch of Facebook’s Open Graph protocol in Nodalities Magazine, discussing its potential impact on Linked Data. So, I invited David Recordon—Facebook’s Senior Open Programs Manager—to talk with Talis about Facebook and the Open Graph Protocol. We ended up talking all about the protocol, how developers can make use of it (and why), as well as touching on Facebook’s view of social networking as a graph.

The Open Graph Protocol page has information about the protocol itself. Facebook’s f8 developers’ conference site also has links with more information for developers.

SemTech quick notes

I’m here in San Francisco at the Semantic Technology conference with a cohort of Talisians and a bunch of the world’s Semantic Web companies and thinkers. I’ll pull together some various posts from the things happening here, but if you wanted to follow my more raw and unpolished notes, mostly from sessions I’m attending, you can have a look at my tumblog. I’m not promising to cover everything, but I’ll have a go ;)

Push-Data: Alex Passant talks about sparqlPUSH

This year, Talis sponsored the final Scripting for the Semantic Web challenge at the Extended Semantic Web Conference (ESWC). The winners of the challenge were Alexandre Passant and Pablo Mendes for their sparqlPUSH project. sparqlPUSH brings an element of real-time to working with Linked Data. Instead of needing to poll for new data periodically, sparqlPUSH works alongside PubSubHubbub to effectively push data out to where you need it. I spoke with Alex Passant at the conference about the challenge, real-time data, and sparqlPUSH. Alex also wrote about the scripting challenge on his blog.
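
To make the push idea a little more concrete, here is a minimal Python sketch of the subscription request a client might send to a PubSubHubbub hub. The `hub.mode`, `hub.topic` and `hub.callback` parameter names come from the PubSubHubbub specification; the URLs and the function name are hypothetical, and none of this is taken from sparqlPUSH’s own code.

```python
from urllib.parse import urlencode


def build_subscription_body(topic_url: str, callback_url: str) -> str:
    """Return the form-encoded body of a PubSubHubbub subscribe request."""
    params = {
        "hub.mode": "subscribe",       # ask the hub to start pushing updates
        "hub.topic": topic_url,        # the feed we want to hear about
        "hub.callback": callback_url,  # where the hub should POST new content
    }
    return urlencode(params)


# Hypothetical URLs, for illustration only.
body = build_subscription_body(
    "http://example.org/feeds/sparql-results",
    "http://example.org/subscriber/callback",
)
print(body)
```

The subscriber would POST this body to the hub once, and from then on the hub delivers updates to the callback URL, so the subscriber no longer needs to poll.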

Extending the Semantic Web (from Crete, with love)

This is my first year attending the ESWC (formerly the “European Semantic Web Conference”, now the “Extended Semantic Web Conference”; cleverly, the acronym still works) near Heraklion on Crete. It’s only a couple of days in, but I thought it’d be a good time to report back to the Nodalities readers. ESWC is a gathering of some of the world’s most influential Semantic Web thinkers, and for me it’s been a few days of meeting people in the flesh with whom I’ve been in touch online for years. As one bloke put it: “What’s kept you away?”

Well, I’m extremely glad I’ve not been kept away this year, and have been excited to see what’s been built recently. ESWC is a very academic conference; indeed I’m quietly auditing the PhD Symposium as I type this. There are papers, PhD symposia, demos and expositions on topics covering anything from ontology development to MapReduce processing of RDF triples. It seems a very fertile seedbed, with many of these ideas having the potential of growing into projects, startups, papers and possibly industries.

I’ve made a subtle and largely subconscious transition by blogging mostly about projects that are up and running. This has been important because the Semantic Web world is no longer one of “someday,” but a world of current and continuous activity. So, I’ve talked about visualisations of data, products running on Linked Data, data.gov.uk et al.; and I’ve held back on discussing the purely possible. It’s been exciting and uplifting to see the conceptual evolve to the proven and working. But this is a reflection of progress, of moving from hypothesis to implementation. It doesn’t mean the concepts have stopped flowing. It’d be a very short story in the history of human communication if the Semantic Web had used up all of its possibilities in ten years!

ESWC is a little microcosm of the wider research going on in Linked Data and related fields. It seems to me that Big Ideas need the traditional frameworks of academic investigation. Questions need to be asked and answered and debated and tried and broken and rebuilt. Much of this science will not become technology, and this is wholly acceptable because it gives the Big Ideas a lot of scope to be refined.

ESWC is just such a place. PhD students and researchers fill the schedule with proposals and reports, and many possibilities are being constantly debated around coffee, beer, and the beach. It’s been a thoroughly fascinating few days, and I’m very much looking forward to more over the next few.

As a quick note, Talis sponsored the Scripting for the Semantic Web challenge for this, its final year. Alexandre Passant and Pablo Mendes won the prize with sparqlPUSH.

Open Day Roundups

Well, we’ve had scores of people attend Platform Open Days now. Some have come to the Talis Offices in Birmingham, and others have joined us in Manchester and London. We’ve had a lot of fun, and some fascinating discussions, and I’m very much looking forward to the next one (16th June, in London).

Many people have asked whether the full slides can be found anywhere, so I thought I’d do a quick round-up of the slides, and share them as images on flickr to make it even easier to follow along.

Just follow the links from the images below to a slideshow of the talk.

Here’s the Introduction to Linked Data, covering who Talis is, RDF, and how to Identify, Describe and Respond:

Here’s the Overview of the Talis Platform, explaining our RESTful API, data storage and SPARQL endpoints:

And here are the slides for our introduction to SPARQL—complete with spaceships:

Richard Wallis’ talk about Linked Data in Action can be seen over here, with more details and a dedicated Screencast.

Facebook and the Open Graph: good for Linked Data?

This post will feature in Nodalities Magazine issue 10.


In April, I was watching the twitterverse explode during Facebook’s f8 conference, as a steady stream of links and gasps and applause and intentions to delete profiles poured out. My initial reaction to quickly-scanned third-hand reports was essentially: “Oh no.” The message I was getting was that personal information would be made more public, and that more places would start sporting the little fb: “like” box you see on sites using Facebook Connect. I was concerned because there have been many conflicting messages around Facebook and privacy, and this movement to include a wider presence online would essentially pull more people into a huge walled garden.

Watch the f8 video sessions, though, and some interesting things begin to emerge. The main announcement at f8 was an update to Facebook’s API. Indeed, it wasn’t so much an update as a rewrite, moving from an older and complicated SOAP architecture with Facebook Connect to a more RESTful approach, giving services a simpler and more straightforward way to interact with content within Facebook using HTTP. OK, so this isn’t particularly ground-breaking, nor is it very exciting in itself. What is far more interesting is that Facebook’s engineers are using the word “graph” to describe their ecosystem with the launch of the Open Graph API.

Firstly, their new Open Graph API is built to enable social plugins which let users on other sites pull content into Facebook. So, little “like” boxes will let someone authenticated on Facebook but viewing, for example, a movie site click to identify a film they, well… like. This information is recorded on their Facebook profile. But the interesting thing here is that the social plugin is identifying items and objects within pages, and the engineers who introduced the plugins are talking about linking to these individual things. They identify the fact that when someone indicates that they like a movie, that’s exactly what they’re doing. They’re not “liking” the page which contains the review, but the film itself. These social plugins tell Facebook that a person is expressing some kind of relationship with objects from the wider web. They talk about these people (Facebook users), and the things they’re interacting with (content, objects… things), as existing like points in a graph. There are people and objects, and there are relationships between these nodes. In essence, they’re talking about linking data.
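
As a rough illustration of what “people, objects and relationships as a graph” means in practice, here is a small Python sketch that stores each relationship as a (subject, relationship, object) triple. The identifiers and relationship names are made up for this example and are not Facebook’s own.

```python
# Each edge is a (subject, relationship, object) triple.
# All identifiers below are hypothetical.
edges = [
    ("user:alice", "likes", "film:the-third-man"),
    ("user:bob", "likes", "film:the-third-man"),
    ("user:alice", "friend_of", "user:bob"),
]


def subjects_of(graph, relationship, obj):
    """Return every node that has the given relationship to obj."""
    return [s for s, r, o in graph if r == relationship and o == obj]


def objects_of(graph, subject, relationship):
    """Return every node the subject points at via the given relationship."""
    return [o for s, r, o in graph if s == subject and r == relationship]


fans = subjects_of(edges, "likes", "film:the-third-man")
alice_likes = objects_of(edges, "user:alice", "likes")
print(fans, alice_likes)
```

Note that the “like” points at the film itself, not at the page reviewing it, which is the distinction the engineers were drawing.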

To make this possible, they’ve written the Open Graph Protocol, based on RDFa. Site owners can begin marking up their content, flagging their pages as Open Graph objects that Facebookers can start to like. The Protocol contains a vocabulary of object types, and addresses physical location and contact information. So, I can now type in a few lines of metadata in my header, and start declaring the objects in my content. It’s all starting to sound very Linked Data, isn’t it? The Tetherless World blog even has a post showing a mapping between the Open Graph Protocol and RDF, exposing metadata to the wider Linked Open Data cloud. The long and short of this is that anyone who wants people to be able to join their content with the ecosystem of Facebook users can do so using a very simple semantic markup process.
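
For a sense of what those few lines of header metadata look like, here is a short Python sketch that renders Open Graph `<meta>` tags for a page. The `og:title`, `og:type`, `og:url` and `og:image` properties are the basic ones the protocol defines; the helper function, the film and the example.org URLs are assumptions made for illustration.

```python
from html import escape


def open_graph_meta(properties: dict) -> str:
    """Render one <meta> tag per Open Graph property, one tag per line."""
    lines = []
    for name, value in properties.items():
        lines.append(
            f'<meta property="og:{escape(name)}" content="{escape(value)}" />'
        )
    return "\n".join(lines)


# Hypothetical film page; "movie" is an object type from the early vocabulary.
film_page = {
    "title": "The Third Man",
    "type": "movie",
    "url": "http://example.org/films/the-third-man",
    "image": "http://example.org/images/the-third-man.jpg",
}
tags = open_graph_meta(film_page)
print(tags)
```

The resulting tags would sit in the page’s `<head>`, which is all a site needs to declare the page as an object in the graph.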

So, Facebook’s nearly half-a-billion users will soon start to make use of semantic links, and millions of sites will begin to mark up their content using Linked Data. Indeed, a reported 50,000 sites implemented social plugins within the first week! This is properly exciting, because it will dump billions of triples out there on the web, and give more developers a boost in dealing with machine-readable information. It hasn’t, however, completely negated my initial feeling about Facebook and the sprinkling of the web with thousands and thousands of likes.

But…

Many people have renewed their concerns about Facebook’s stance on privacy. Some of the Open Graph API-accessible fields now default to public until a user opts out. Marshall Kirkpatrick talks about the vulnerability of this centralisation. Despite the more “open” direction in which the Open Graph points, it’s very clear that all this data (users’ own graphs of likes and relationships) will be a valuable asset, and Facebook holds the keys to this personal data. They’ve already begun partnership deals with Microsoft, Yelp and Pandora; so users’ data will start to flow more freely between Facebook-selected organisations.

Liz Gannes over on GigaOM points out that Facebook is making itself a single point of failure for the web, and illustrates in another post that a Facebook outage on 23rd April also took down partner-site plugins. Robert Scoble, while admiring their ambition, also points out that the move requires a lot of trust in Facebook. We need to trust Facebook with our own personal information, and trust it to look after the information we’re feeding it about our interests, relationships and tastes. It also raises questions about security: the stakes are higher if an account is hacked, or (as happened to the Scobleizer himself) disabled.

My Thoughts:

So, the main impetus for using the Open Graph Protocol is to tie in with the Facebook ecosystem. This is not a Linked Data evangelism project, or the combined efforts of thousands of Semantic Web developers, but the logical move of a huge company to better manage their data. They’ll be creating billions of nodes on a huge social graph; and for developers the initial purpose will be to join a group, and cash in on a quick win (if you happen to have a use for social networking in your app/service/site). We’ll see that little “like” button appear all over the place, even on sites which seem a bit odd (Share your next auto check-up on Facebook!) as the bandwagon sets off.

This means we’ll start seeing a lot of RDFa-like semantic metadata on pages all over the web, and this increase will be almost exclusively using JUST the Open Graph. But it will mean more folks will be asking for RDFa, and more developers will begin learning it as another string to their bows. I wonder how long it’ll take before they start asking what else they can do with all this graph-data? Teaching people the value of machine-readable data (through a popular, specific application like Facebook) has the benefit of increasing developer knowledge and inquisitiveness.

This could be catalytic: allowing a rapid change in the direction of the Semantic Web. From a Linked Data perspective, though, I think a lot of this RDFa will be “wasted” as it’s implemented only for the purpose of joining in with the FB sphere, and is under-utilised. But, I think the interesting stuff will emerge as more innovators quickly find the limitations of Facebook’s controlled vocabulary and data-hoarding ambition and begin to see the potential the bigger Graph brings to their repertoires. What happens when thousands of developers are taught something that’s by definition boundless?

So, we’re left with a question of what we’ll build, and what the Linked Data community does in reaction. For my part, I think the most important message to raise is to mix your data freely. When people begin to see the existing ecosystem of Linked Data, and that it’s not just Facebook’s own-branded metadata, we’ll start to see innovative mashing, and thousands of new services. What will you build?

Talking with Pro Tsiavos

Opening up data in the scientific community is something that’s become increasingly important, and getting the licensing and rights in order is a matter of urgent attention. With more and more researchers needing more and more data, there is a growing need for clear information on what can be done with which data.

I spoke with LSE Research Fellow Dr. Prodromos (Pro) Tsiavos, who is researching open licensing, data sharing and publishing data in the public sector and who is also a Legal Project Lead for Creative Commons England and Wales. Pro’s main focus has been with the cultural commons, and we discussed culture-shifts in science, and how the expectation that data are shared and licensed changes the way research may be done in future.

Open Day… Manchester

So, the Platform Open Day Roadshow has now begun. We will be doing our first non-Birmingham day up in Manchester on 14th May. It’ll be at the University of Manchester Visitors’ Centre on the penultimate day of Future Everything.

We like to keep the Days Open, meaning we want you to take what you need from them, so make sure to leave us feedback on what you’d like to learn. As a rough overview, we’ll be covering Linked Data, including what it means to make data into “Linked Data”. There will be an overview of RDF, and a tutorial on SPARQL: the query language of the Semantic Web. We will also show you examples and demonstrations of Linked Data in action: apps, mashups and visualisations built on Linked Data.

The Open Days are free of charge, limited to 30 folk (discussion doesn’t seem to happen with larger groups), and we’re putting on lunch. So, make sure to reserve your place here.