Augmenting Last.fm Data with BBC data on the Talis Platform

A short while back, I created a Linked Data wrapper on the Last.FM API for Events and Artists. The artist data links to the BBC’s data about each artist using owl:sameAs.

Now that the BBC RDF is available in a Talis Platform store, I can put some of my Last.FM data into a store (it’s currently generated on the fly from the Last.FM API), search on it, and then augment it with data from the BBC.

So I put some Last.FM data into the Sandbox1 store.

Now I can search on it with the items query endpoint like:

http://api.talis.com/stores/sandbox1/items?query=Black

This gives us the results as RSS 1.0, which is also RDF/XML, and contains a graph with 12 resources in it.

We can now pass the URI of this (or any RSS 1.0) document to the BBC-Backstage store’s Augment Service like this:

http://api.talis.com/stores/bbc-backstage/services/augment?data-uri=http%3A%2F%2Fapi.talis.com%2Fstores%2Fsandbox1%2Fitems%3Fquery%3DBlack

The Augment service will look at the URIs in the RSS results, and add DESCRIBEs for any of those URIs that it finds in its own store, giving you back the RSS augmented with BBC data.
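The augment URL above is simply the items-search URL, percent-encoded and passed as the data-uri parameter. A minimal JavaScript sketch of building both URLs follows; the helper names itemsUrl and augmentUrl are my own and not part of any Talis library, and only the URL patterns come from the examples above.

```javascript
// Hypothetical helpers; only the URL patterns are taken from the post.
var PLATFORM = 'http://api.talis.com/stores/';

// URL of a store's items (search) service for a given query string.
function itemsUrl(store, query) {
  return PLATFORM + store + '/items?query=' + encodeURIComponent(query);
}

// URL of a store's augment service, pointed at an RSS 1.0 document.
function augmentUrl(store, dataUri) {
  return PLATFORM + store + '/services/augment?data-uri=' + encodeURIComponent(dataUri);
}

var searchUrl = itemsUrl('sandbox1', 'Black');
var augmentedUrl = augmentUrl('bbc-backstage', searchUrl);
console.log(augmentedUrl);
```

The important detail is the call to encodeURIComponent: the search URL contains its own "?" and "=", which would otherwise be misread as part of the augment request.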

So the graph we get back now contains 15 resources, where the BBC-Backstage store has found descriptions for 3 of the URIs in the original RSS.
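To make the change in resource count concrete, here is a toy in-memory model of the augmentation step. It only illustrates the idea described above and is not how the Platform implements it: the augment function, the simplified graph objects and the example.org URIs are all assumptions.

```javascript
// Toy model: a graph maps subject URI -> { predicate URI: [values] }.
// Values are plain strings here; every value is treated as a possible URI.
function augment(graph, store) {
  var result = {};
  var mentioned = {};

  // Copy the original graph and note every URI it mentions.
  Object.keys(graph).forEach(function (subject) {
    result[subject] = graph[subject];
    mentioned[subject] = true;
    Object.keys(graph[subject]).forEach(function (predicate) {
      graph[subject][predicate].forEach(function (value) {
        mentioned[value] = true;
      });
    });
  });

  // Add the store's description of any mentioned URI not already described.
  Object.keys(mentioned).forEach(function (uri) {
    if (store[uri] && !result[uri]) {
      result[uri] = store[uri];
    }
  });
  return result;
}

var rss = {
  'http://example.org/artist/1': {
    'http://www.w3.org/2002/07/owl#sameAs': ['http://example.org/bbc/artist/1']
  },
  'http://example.org/artist/2': {
    'http://www.w3.org/2000/01/rdf-schema#label': ['Some band']
  }
};
var bbcStore = {
  'http://example.org/bbc/artist/1': {
    'http://www.w3.org/2000/01/rdf-schema#label': ['Band one (BBC)']
  },
  'http://example.org/bbc/artist/9': {
    'http://www.w3.org/2000/01/rdf-schema#label': ['Not mentioned in the feed']
  }
};

var augmented = augment(rss, bbcStore);
console.log(Object.keys(augmented).length);
```

In this toy run the two original resources gain one extra description, in the same way that the real result grew from 12 resources to 15.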

For further information, see Leigh Dodds’s slides on Getting Started with the Talis Platform.

Using Twinkle to SPARQL the Platform

A few years ago I wrote Twinkle, a simple GUI for working with SPARQL. While it’s not the most polished of user interfaces and it’s in sore need of an update, it’s still serviceable and has been used successfully as a development tool by teams of engineers I’ve worked with in the past.

I gave a short talk on Twinkle at an Oxford SWIG meeting, so you can flick through the slides if you want a quick overview of the functionality. I also moved the code to a Google Code project to start the process of updating it.

Twinkle can work with a range of different data sources and includes a full SPARQL client, so you can use it with any SPARQL endpoint that is accessible from your desktop. Out of the box, Twinkle is already configured to work with the GovTrack and DBpedia endpoints, but you can easily add more by changing the configuration.

If you download and unzip the distribution into a directory you should end up with an etc/config.n3 file. This file contains all of the configuration that drives the user interface, including a section that configures remote SPARQL endpoints, e.g.:


<http://dbpedia.org/sparql> a sources:Endpoint
    ; sources:defaultGraph "http://dbpedia.org"
    ; rdfs:label "DBpedia.org".

<http://www.rdfabout.com/sparql> a sources:Endpoint
    ; rdfs:label "GovTrack.us".

The above snippet configures two remote endpoints, and applies labels to them so that they appear in the Twinkle UI, under the “Remote Services” section on the left-hand menu. Because some endpoints, such as DBpedia, require you to specify a default graph in the SPARQL protocol request, you can also specify that in the configuration if necessary.
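For context, in the SPARQL protocol the default graph travels as a default-graph-uri parameter alongside the query itself. A rough sketch of how a client might build such a GET request URL is shown below; sparqlRequestUrl is a hypothetical helper for illustration and not Twinkle's own code.

```javascript
// Hypothetical helper: build a SPARQL protocol GET URL.
// `defaultGraph` is optional, mirroring sources:defaultGraph in the config.
function sparqlRequestUrl(endpoint, query, defaultGraph) {
  var url = endpoint + '?query=' + encodeURIComponent(query);
  if (defaultGraph) {
    url += '&default-graph-uri=' + encodeURIComponent(defaultGraph);
  }
  return url;
}

var sampleQuery = 'SELECT * WHERE { ?s ?p ?o } LIMIT 10';

// DBpedia is configured with a default graph, GovTrack without one.
var dbpediaUrl = sparqlRequestUrl('http://dbpedia.org/sparql', sampleQuery, 'http://dbpedia.org');
var govtrackUrl = sparqlRequestUrl('http://www.rdfabout.com/sparql', sampleQuery);

console.log(dbpediaUrl);
console.log(govtrackUrl);
```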

If you have a Platform Store, or just want to access some data held in the Platform, then you can use Twinkle to perform your SPARQL queries. For example I have a store containing NASA space flight data. The SPARQL endpoint for this store is at:

http://api.talis.com/stores/space/services/sparql

So to register this in Twinkle, I can edit the configuration file and include the following snippet:


<http://api.talis.com/stores/space/services/sparql> a sources:Endpoint
    ; rdfs:label "NASA Space Data".

Once you’ve restarted the UI you should now be able to click on the Remote “NASA Space Data” service and open up a window into which you can start executing SPARQL queries.

If you’re new to SPARQL, or are interested in playing with the above space data, then you can look over the following slides from a recent SPARQL training session that I ran:


By rob

The slides contain a number of sample queries that should help get you started. Unfortunately some of the diagrams don’t look great in slideshare, but you should be able to download them for a closer look.

Authoring RDF data with SPARQL

Yesterday Yves Raimond and I presented a tutorial at WOD-PD where we created some Turtle data and used my online semantic converter tool to convert the data to RDF/XML and POST it to the Platform store we set up for the tutorial (wod-pd-sandbox).

In fact though, every SPARQL endpoint that supports CONSTRUCT is already a Turtle-to-RDF/XML converter. You can write Turtle with no variables in the CONSTRUCT graph, leave the WHERE graph pattern empty, and you will get back RDF/XML.

For example:

PREFIX ex: <http://example.org/>
CONSTRUCT {
  ex:Jimmy ex:eat ex:World .
}
 WHERE {}

returns

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ex="http://example.org/" >
  <rdf:Description rdf:about="http://example.org/Jimmy">
    <ex:eat rdf:resource="http://example.org/World"/>
  </rdf:Description>
</rdf:RDF>
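The wrapping is mechanical enough to script. Here is a small sketch, assuming the triples are already written in syntax that is valid in both Turtle and SPARQL and that the prefixes are supplied separately (constructQuery is a hypothetical helper name, not part of any tool mentioned here):

```javascript
// Hypothetical helper: wrap variable-free triples in a CONSTRUCT query
// with an empty WHERE pattern.
function constructQuery(prefixes, triples) {
  // Turn { ex: 'http://example.org/' } into SPARQL PREFIX declarations.
  var header = Object.keys(prefixes).map(function (name) {
    return 'PREFIX ' + name + ': <' + prefixes[name] + '>';
  }).join('\n');
  return header + '\nCONSTRUCT {\n  ' + triples.trim() + '\n}\nWHERE {}';
}

var query = constructQuery(
  { ex: 'http://example.org/' },
  'ex:Jimmy ex:eat ex:World .'
);
console.log(query);
```

Note that Turtle's own @prefix lines are not valid inside a SPARQL query, which is why the sketch takes the prefixes as a separate argument.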

You can also use CONSTRUCT to create new data inferred from existing data. For instance, I wanted to add some triples about the conference, and I knew that everyone in the store with a URI in the store’s own namespace had been following the tutorial, and so was also attending the conference. So I made this query, and then POSTed the results into the store:

PREFIX schema: <http://api.talis.com/stores/wod-pd-sandbox/items/Schema/>
PREFIX sandbox: <http://api.talis.com/stores/wod-pd-sandbox/items/Things/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

CONSTRUCT {

    schema:Conference a rdfs:Class ;
        rdfs:isDefinedBy schema: ;
        rdfs:label "Conference" .

    schema:startDate a rdf:Property ;
        rdfs:isDefinedBy schema: ;
        rdfs:label "start date" .

    schema:endDate a rdf:Property ;
        rdfs:isDefinedBy schema: ;
        rdfs:label "end date" .

    schema:attendee a rdf:Property ;
        rdfs:isDefinedBy schema: ;
        rdfs:label "attendee" ;
        owl:inverseOf schema:attended .

    schema:attended a rdf:Property ;
        rdfs:isDefinedBy schema: ;
        rdfs:label "attended" ;
        owl:inverseOf schema:attendee .

    sandbox:WOD-PD a schema:Conference ;
        rdfs:label "Web of Data" ;
        schema:startDate "2008-10-22" ;
        schema:endDate "2008-10-23" ;
        schema:attendee ?person .

    ?person schema:attended sandbox:WOD-PD .
}
WHERE {
    ?person a <http://xmlns.com/foaf/0.1/Person> .

    FILTER(REGEX(STR(?person), "sandbox/items/People/"))
}
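If the interplay between the WHERE pattern and the CONSTRUCT template is unfamiliar, this plain JavaScript sketch mimics what the query does with the ?person bindings: keep the URIs that match the regular expression, then emit the two inverse triples for each. The person URIs are made up for illustration, attendanceTriples is a hypothetical name, and a real SPARQL engine does far more than this.

```javascript
var SCHEMA = 'http://api.talis.com/stores/wod-pd-sandbox/items/Schema/';
var WOD_PD = 'http://api.talis.com/stores/wod-pd-sandbox/items/Things/WOD-PD';

// For each person URI that passes the FILTER, produce the attendee and
// attended triples from the CONSTRUCT template as [s, p, o] arrays.
function attendanceTriples(personUris) {
  var triples = [];
  personUris.filter(function (uri) {
    // Same pattern as REGEX(STR(?person), "sandbox/items/People/")
    return /sandbox\/items\/People\//.test(uri);
  }).forEach(function (person) {
    triples.push([WOD_PD, SCHEMA + 'attendee', person]);
    triples.push([person, SCHEMA + 'attended', WOD_PD]);
  });
  return triples;
}

// Invented sample bindings: one inside the store's namespace, one outside.
var people = [
  'http://api.talis.com/stores/wod-pd-sandbox/items/People/alice',
  'http://example.org/people/bob'
];
var triples = attendanceTriples(people);
console.log(triples);
```

Only the first URI survives the filter, so the result holds one attendee triple and its inverse.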

I used PREFIX to declare a prefix for a couple of namespaces with the store’s contentbox URIs - this meant that these URIs would dereference and work as Linked Data - 303ing to their RDF descriptions. This is a really nice feature of the platform, and makes it easy to mint new URIs that will play nice on the semantic web.

You might also have noticed that there are some new properties and classes defined there in the CONSTRUCT. This isn’t absolutely ideal - there is no documentation, and the terms are unlikely to be used again - but on the other hand, the descriptions are dereferenceable according to the principles of Linked Data, and just as persistent as the data they describe. Moreover, as Richard Cyganiak said today - if you worry about doing RDF ‘right’ to the extent that it stops you doing RDF, you’re not doing it right.

Tutorial: jQuery and the Talis platform

We will use the jQuery.Talis plugin to create a simple HTML and JavaScript interface to a Talis store.

The Talis plugin is a small wrapper that simplifies retrieving JSON from the Platform remotely (via JSONP). It allows you to query the Platform, and specify callback functions for dealing with the retrieved data.

We’ll have a text box to type a search string into; this will retrieve results (of matching resource descriptions) from the platform store, and display them in a list of links. Clicking on the links will display the resource description.

1. The HTML:

We are going to need three elements for this:

  1. A text input for typing the search strings into:
        <label for="search">Search <input type="text" name="search" id="search"/></label>
  2. A list to insert the search results into:
        <ol id="results"></ol>
  3. A div to display the resource descriptions in:
        <div id="description"></div>

2. The JavaScript:

At the command line, switch to the directory you saved your HTML file in, and do:

    svn co http://n2.talis.com/svn/playground/kwijibo/js/Talis.jQuery.plugin/trunk/ js/

Now we link to the javascript files from the bottom of the <body> of our html page:

<script type="text/javascript" charset="utf-8" src="js/jquery.js"></script>
<script type="text/javascript" charset="utf-8" src="js/Talis.jQuery.js"></script>
<script type="text/javascript" charset="utf-8" src="js/jsRDF.jquery.js"></script>

(jsRDF.jquery.js is just a small, nascent library for manipulating RDF/JSON.)

Now open another script tag, and we’ll write some javascript to connect our html with the platform:

First, we’ll declare some variables we’ll want to use:

var RSS_ITEM = 'http://purl.org/rss/1.0/item';
var RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
var MY_STORE = 'schema-cache';

For this tutorial, I’m using the schema-cache store, which contains many RDF and OWL vocabularies.

Now, what we want is to query the platform when we type in the text box, so:

$("#search").keyup(function(){
    var query = $("#search").val();
    $.Talis.Store(MY_STORE).items(query, function(json){
        /*  we do something with the json data from the platform in here ... */
    });
});

What’s happening here is that we take the text typed into the text box (#search) and query the items service of our store with it. The second parameter of the Store.items method is a callback function, in which you specify what to do with the data once it has been retrieved.

The Platform items service returns the results as an RSS feed, which the jQuery.Talis plugin fetches for us as RDF/JSON. Here we only want the items of the feed, not the RSS feed resource itself, so we need to keep only the resources that have rss:item as a value of their rdf:type property:

    var RDF = $.jsRDF(json);
    var rss_items = RDF.filter({p:RDF_TYPE, o:{value:RSS_ITEM}});

Here, we are loading the data into a jsRDF object, which has methods for manipulating it. We’re using the filter method to select the resources that have an rdf:type of rss:item. Now we want to render them in the page inside our #results list:

// clear the results of the previous search before rendering the new ones
$("#results").empty();
$.each(rss_items, function(uri, properties){
    $("#results").append('<li><a href="'+uri+'">'+RDF.get_label(uri)+'</a></li>');
});
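For readers who have not met RDF/JSON before: the data is an object keyed by subject URI, then by predicate URI, with an array of value objects at the bottom. The sketch below shows the idea behind filtering on rdf:type in plain JavaScript. It is not the jsRDF implementation; filterByValue is a hypothetical name and the sample data is invented.

```javascript
var RSS_ITEM = 'http://purl.org/rss/1.0/item';
var RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';

// Return only the subjects that have `object` as a value of `predicate`.
function filterByValue(json, predicate, object) {
  var matches = {};
  Object.keys(json).forEach(function (uri) {
    var values = json[uri][predicate] || [];
    var hit = values.some(function (v) { return v.value === object; });
    if (hit) {
      matches[uri] = json[uri];
    }
  });
  return matches;
}

// Invented sample: one feed (channel) resource and one item resource.
var sample = {
  'http://example.org/feed': {
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': [
      { type: 'uri', value: 'http://purl.org/rss/1.0/channel' }
    ]
  },
  'http://example.org/item/1': {
    'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': [
      { type: 'uri', value: 'http://purl.org/rss/1.0/item' }
    ]
  }
};

var items = filterByValue(sample, RDF_TYPE, RSS_ITEM);
console.log(Object.keys(items));
```

Because the result keeps the same subject-keyed shape, it can be iterated with $.each(items, function(uri, properties){ ... }) just like rss_items above.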

OK. We want clicking on those links to show the resource description, so we’ll define a function for retrieving that description from the store and rendering it, then we’ll bind it to the click event of the links in the results list:

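function browse(){
    // `this` is the link that was clicked; its href holds the resource URI
    var uri = this.href;
    $.Talis.Store(MY_STORE).lcbd(uri, function(data){
        var RDF = new $.jsRDF(data);
        $('#description').html( RDF.to_html(uri) );
        // make the links inside the rendered description browsable too
        $('#description dd a').click(browse);
    });
    // stop the browser from following the link
    return false;
}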

We get the URI of the resource from the @href attribute (which we set when we were rendering the search results), then we call the lcbd method on our store (LCBD is short for labelled concise bounded description, and returns the properties of the resource, and labels for all the resources our description references). Again, we use the $.jsRDF object to render the description as html (it uses a definition list for rendering the properties of the resource).

After we’ve rendered the description in the #description div, we also bind the click event on the links to the resource’s properties to the browse function, so that clicking on those links will retrieve and render the resources being linked to.
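As a rough picture of what rendering a description as a definition list involves, here is a standalone sketch. It is an assumption about the general approach, not the code behind to_html; the function names and the sample data are hypothetical.

```javascript
// Escape text so it is safe to place inside HTML content and attributes.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render one resource from an RDF/JSON object as a <dl>:
// one <dt> per property, one <dd> per value, with URI values as links.
function describeAsHtml(json, uri) {
  var properties = json[uri] || {};
  var html = '<dl>';
  Object.keys(properties).forEach(function (predicate) {
    html += '<dt>' + escapeHtml(predicate) + '</dt>';
    properties[predicate].forEach(function (v) {
      var text = escapeHtml(v.value);
      html += v.type === 'uri'
        ? '<dd><a href="' + text + '">' + text + '</a></dd>'
        : '<dd>' + text + '</dd>';
    });
  });
  return html + '</dl>';
}

// Invented sample description with one literal and one URI value.
var description = {
  'http://example.org/thing': {
    'http://www.w3.org/2000/01/rdf-schema#label': [
      { type: 'literal', value: 'A <thing>' }
    ],
    'http://www.w3.org/2000/01/rdf-schema#seeAlso': [
      { type: 'uri', value: 'http://example.org/other' }
    ]
  }
};

var html = describeAsHtml(description, 'http://example.org/thing');
console.log(html);
```

The links this produces sit inside dd elements, which is the kind of markup the '#description dd a' selector in browse relies on.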

And that’s pretty much it.