Subscribe

Augmenting Last.fm Data with BBC data on the Talis Platform

A short while back, I created a Linked Data wrapper on the Last.FM API for Events and Artists. The artist data links to the BBC’s data about each artist using owl:sameAs.

Now that the BBC RDF is available in a Talis Platform store, I can put some of my Last.FM data into a store (it’s currently generated on the fly from the Last.FM API), search on it, and then augment it with data from the BBC.

So I put some Last.FM data into the Sandbox1 store.

Now I can search on it with the items query endpoint like:

http://api.talis.com/stores/sandbox1/items?query=Black

This gives us the results as RSS 1.0, which is also RDF/XML, and contains a graph with 12 resources in it.

We can now pass the URI of this (or any RSS 1.0) document to the BBC-Backstage store’s Augment Service like this:

http://api.talis.com/stores/bbc-backstage/services/augment?data-uri=http%3A%2F%2Fapi.talis.com%2Fstores%2Fsandbox1%2Fitems%3Fquery%3DBlack

The Augment service will look at the URIs in the RSS results, and add DESCRIBEs for any of those URIs that it finds in its own store, giving you back the RSS augmented with BBC data.

So the graph we get back now contains 15 resources, where the BBC-Backstage store has found descriptions for 3 of the URIs in the original RSS.

For further information, see Leigh Dodd’s slides on Getting Started with the Talis Platform.

Metamorph Open Source project for Semantic Converter Web Service

I’ve published the code behind the Talis Convert Service (production release at stable URL coming soon) as an open source project on Google Code, called Metamorph .

Metamorph is a service aimed at semantic web developers. It is much like triplr, babel, swignition and any23 (please leave a comment pointing to any other similar services).

You give it a(n http) URI, an (optional) input format, and an output format, and it will fetch the document from the web, and convert it into the output format.

Understood input values include:

  • Semantic HTML (RDFa, eRDF, microformats, POSH)
  • RDF (XML, Turtle, JSON)
  • SPARQL-XML
  • Facet XML (the response format of the facets service available on all platform stores)

Output for all input formats can be:

  • JSON
  • JSONP
  • HTML

If the input is some form of RDF, you can also ask for:

  • RDF (XML, Turtle, JSON, - and the default HTML is rendered as RDFa)
  • RSS 1.0
  • TriX
  • Exhibit (web page, JSON, JSONP)

In addition, if the input is an RDF format, you can specify multiple data URIs, and the results will be merged in the output document. For instance, this conversion merges data from two of my homepages, and a Turtle file.

I’m thinking about removing the TriX output, as I’m not sure it would be used by anyone - the reason I didn’t bother to write a parser for it was because I haven’t seen any data published as TriX in the first place.

I welcome any input on what else would be useful from this web service. I suspect that more output options, while fairly easy to add, would not be very useful. More input options may be useful, but perhaps not significantly so.

I suspect what might be more useful, and more likely to distinguish this from similar RDF converter services, are graph transformation services, which might include:

  • Diffs
  • Intersects
  • Smushing
  • Augmenting on property and class type URIs with labels and comments, perhaps retrieved from SchemaCache

Metamorph is coded in PHP, and uses ARC for parsing RDF and HTML, and serialising RDF/XML and Turtle.

Please use the issue tracker for raising any bugs or feature requests.

Store Admin Interface

If you have a Talis store, or even if you’re just interested in browsing around existing talis stores, you might be interested in an admin interface  I’ve been working on.

Once you have selected a store, you can browse resources by type (rdf:type), search across the contentbox index, edit resources, view pending jobs and send new ones, import data, and configure the field-predicate mapping for your stores.

Please send bug reports and feature requests to keith dot alexander at talis.com

If you do want a talis store, just ask in #talis on irc.freenode.net, or email danny dot ayers  at talis.com

Batch Changesets ARC Plugin

Platform Release 12 included a very useful new feature: the ability to send more than one changeset in a single POST to your store.

To generate a batch changeset from 2 versions of an RDF graph, you can use an ARC plugin called Talis_ChangeSetBuilderPlugin.

To use it:


	  $args = array(
			'before' => $before, //can be rdf/xml, turtle, or an ARC simpleIndex array
			'after' => $after,  //can be rdf/xml, turtle, or an ARC simpleIndex array
		);
		$cs = ARC2::getComponent('Talis_ChangeSetBuilderPlugin', $args);
		$cs_response = $store->get_metabox()->apply_versioned_changeset($cs); 

The plugin also relies upon the IndexUtils Plugin. The easiest way to get them all set up is to change to your arc directory and do:


svn co http://n2.talis.com/svn/playground/kwijibo/PHP/arc/plugins/trunk/ plugins

Tutorial: jQuery and the Talis platform

We will use the jQuery.Talis plugin to create a simple html+js interface to a talis store.

the Talis plugin is a small wrapper that simplifies retrieving json from the platform remotely (via jsonp). It allows you to query the platform, and specify callback functions for dealing with the retrieved data.

We’ll have a text box to type a search string into; this will retrieve results (of matching resource descriptions) from the platform store, and display them in a list of links. Clicking on the links will display the resource description.

1. The HTML:

We are going to need three elements for this:

  1. A text input for typing the search strings into:
    <label for="search">Search<input type="text/submit/hidden/button" name="search" id="search"/></label>
  2. A list to insert the search results into.
        <ol id="results"></ol>

    and:

  3. A div to display the resource descriptions in:
        <div id="description"></div>

2 The Javascript

At the command line, switch to the directory you saved your HTML file in, and do:

    svn co http://n2.talis.com/svn/playground/kwijibo/js/Talis.jQuery.plugin/trunk/ js/

Now we link to the javascript files from the bottom of the <body> of our html page:

<script type="text/javascript" charset="utf-8" src="js/jquery.js" mce_src="js/jquery.js"></script>
<script type="text/javascript" charset="utf-8" src="js/Talis.jQuery.js" mce_src="js/Talis.jQuery.js"></script>
<script type="text/javascript" charset="utf-8" src="js/jsRDF.jquery.js" mce_src="js/jsRDF.jquery.js"></script>

(jsRDF.jQuery.js is just a small, nascent library for manipulating RDF/JSON )

Now open another script tag, and we’ll write some javascript to connect our html with the platform:

First, we’ll declare some variables we’ll want to use:

var RSS_ITEM = 'http://purl.org/rss/1.0/item';
var RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
var MY_STORE = 'schema-cache';

For this tutorial, I’m using the schema-cache store, which contains many RDF and OWL vocabularies.

Now, what we want is to query the platform when we type in the text box, so:

$("#search").keyup(function(){
    var query = $("search").val();
    $.Talis.Store(MY_STORE).items(query, function(json){
        /*  we do something with the json data from the platform in here ... */
    });
});

What’s happening here, is we are taking the text that has been typed in the textbox (#search), and querying the items service of our store with it. The second parameter of the Store.items method is a callback function, in which you can specify what to do with the data when it is retrieved.

The platform items service returns the results in an RSS feed, which the jQuery.Talis plugin fetches for us in rdf/json, however, for this, we only want the items of the feed, not the RSS feed resource itself, so we need to filter in only the resources that have rss:Item as a value of their rdf:type property:

    var RDF = $.jsRDF(json);
    var rss_items = RDF.filter({p:RDF_TYPE, o:{value:RSS_ITEM}});

Here, we are loading the data into a jsRDF object, which has methods for manipulating it. We’re using the filter method to select the resources that have an rdf:type of rss:Item. Now we want to render them in the page inside our #results list:

$.each(rss_items, function(uri, properties){
    $("#results").append('<li><a href="'+uri+'" mce_href="'+uri+'">'+RDF.get_label(uri)+'</a></li>');
});

OK. We want clicking on those links to show the resource description, so we’ll define a function for retrieving that description from the store, and rendering it, then we’ll bind it the onclick event of the links in the results list:

function browse(uri){
    var uri = this.href;
    $.Talis.Store(MY_STORE).lcbd(uri, function(data){   

        var RDF = new $.jsRDF(data);
        $('#description').html( RDF.to_html(uri) );
        $('#description dd a').click(browse);
    });
    return false;
}

We get the URI of the resource from the @href attribute (which we set when we were rendering the search results), then we call the lcbd method on our store (LCBD is short for labelled concise bounded description, and returns the properties of the resource, and labels for all the resources our description references). Again, we use the $.jsRDF object to render the description as html (it uses a definition list for rendering the properties of the resource).

After we’ve rendered the description in the #description div, we also bind the click event on the links to the resource’s properties to the browse function, so that clicking on those links will retrieve and render the resources being linked to.

And that’s pretty much it.

Talis Store Plugin for ARC

The PHP coders amongst you may be interested in a Talis Store Plugin. To install it:

cd arc/plugins #yoru ARC plugins directory

svn co http://n2.talis.com/svn/playground/kwijibo/PHP/arc/plugins/trunk/talis/ talis
svn co http://n2.talis.com/svn/playground/kwijibo/PHP/arc/plugins/trunk/ARC2_SPARQLSerializerPlugin/ARC2_SPARQLSerializerPlugin.php ARC2_SPARQLSerializerPlugin.php

Then to use it:

require_once '../ARC2.php';   

/* configuration */
$talis_config = array(
  // 'db_user' => 'your_username',
  // 'db_pwd' => 'your_password',
  'store_name' => 'kwijibo-dev3', // your store name
   'fetch_graphs' => false, // If set to true, using FROM will fetch the graph as a datasource over the web, and store it in /meta
);
$store = ARC2::getComponent('Talis_StorePlugin', $talis_config);
$store->query("LOAD ")

What this does is let you use a Talis store instead of the ARC mysql store. It supports a subset of ARC’s SPARQL+ functionality. Specifically, it supports INSERT and DELETE (which I could translate to Changesets thanks to Benji’s SPARQL parser), but not the aggregate functions (which I don’t see a way to support in a client-layer at this point).

Some differences:

Named Graphs are currently a bit different in Talis stores - you can’t (yet) create your own on the fly as you can with ARC, so LOAD will put the data into the public graph by default.

Talis platform transforms bnodes into URIs, so .

I also added a few methods to the api:

$store->import($arc_store);
$store->export($arc_store);

(The idea is that you can move data between an ARC store and a Talis store).

I also added a $store->change($before_rdf, $after_rdf) method for submitting changes to an RDF graph.

It’s quite interesting comparing the two different ways of making changes (changesets and SPARQL+). I think that changesets (especially with the coming Batch Changeset support) are maybe a bit more amenable to programmatic resource updates from forms and the like. However, changesets are a bit verbose to hand-write for making quick edits and testing stuff, or pattern-based changes, and I’m finding SPARQL+ really handy for stuff like this.

What I’ve been thinking would be pretty neat would be if the SPARQL parser could be a bit more user extensible, and pre-query hooks could be set up (like ARC’s triggers, which happen post-query), so that plugin/hook writers could extend the SPARQL functionality, or just do stuff pre-query. Use cases might include:

  • rewriting SPARQL for performance improvement, or access control
  • pre-fetching data from FROM graphs over the web and adding it to the store (you can set a ‘fetch_graphs’=> true parameter in the config array you set up the talis store with, and it will do this)
  • adding versioned changesets to the ARC store
  • inventing new keywords - eg: ABOUT <http://example.org/foo> could be rewritten to DESCRIBE ?s WHERE {{ ?s rdf:subject <http://example.org/foo> } UNION {?s cs:subjectOfChange <http://example.org/foo> } } - Similarly you could add syntactic support for rollbacks, transactions, updates

You can see more usage examples at: http://n2.talis.com/svn/playground/kwijibo/PHP/arc/plugins/trunk/talis/Talis_StorePlugin.demo.php

Noodling with Atom/RDF

Now that GRDDL’s a Recommendation, it’s about time we started using it. One particular bit of (potentially) low-hanging fruit is Atom (RFC 4287) - cleanly specified XML, well deployed for bloggish content syndication and increasingly having interesting extensions shoehorned in.

Anyhow, more on that some other time. I finally got around to trying a long-standing item on my to-do list: RDFize the wonderful Planet Venus aggregator. I reckon a persistent, queryable store of interesting subscriptions is a must-have part of any respectable personal knowledgebase. I haven’t time right now to go into detail on how it works (and in it’s current form you probably wouldn’t want to know), but basically these minimal Python scripts transform Venus’s Atom cache into RDF/XML and post the result of to a Talis Platform store (after first checking the entry isn’t already in the store). So far I’ve got it working enough to make some data available for SPARQling.
If you go to this SPARQL Query form, select the “twitcrit reviews” endpoint with the dropdown and enter a query like this:

PREFIX ar: <http://djpowell.net/schemas/atomrdf/0.3/>

SELECT DISTINCT ?entry ?tp ?title ?cp ?content
WHERE {
[
a ar:EntryInstance ;
ar:entry ?entry;
ar:title [ ?tp ?title ] ;
ar:content [ ?cp ?content ]
]
}
LIMIT 10


- you should see some results.


Next steps are to set up some local caching (thinking of just keeping a list of cache filenames) and turning it over to use the Changeset Protocol rather than the basic unversioned model posts it’s doing now. Once those are in place I’ll make a cron job for it.

There are quite a few different atom2rdf XSLT’s in circulation, the current best-bet frontrunner being one atom2rdf-18.xsl from David Powell, so I used that. Here’s the Venus install, I just pulled out a bunch of the semweb related feeds from my Bloglines subscriptions (note that I cleared the cache earlier today, there was way too much stuff in it for testing).

jQuery.Talis

jQuery.Talis is a plugin for the popular javascript library jQuery. It acts as a wrapper around the talis convert service, for retrieving json, through jsonp, from the Platform.

You can read about it on the n2 wiki and download it from the n2 svn.

You can use it like this:


$.Talis.Store('schema-cache').sparql('DESCRIBE ',
      function(data){
        $("#Person h1").html(data['http://xmlns.com/foaf/0.1/Person']['http://www.w3.org/2000/01/rdf-schema#'][0].value);
});

(This would fetch a description of the foaf:Person class from the http://api.talis.com/stores/schema-cache store, and insert the rdfs:label into the DOM.)

I don’t want to declare this stable yet, but it is usable in it’s current form (I use it in the SIOC Comments Widget). The size currently comes in at ~4k without any compression or minification. So far, it’s only been tested in Firefox, Safari and Opera, so reports of cross-browser problems, and any other bugs, would be appreciated.

Which Store to SPARQL?

We’ve got quite a lot of different stores in the Talis Platfom, some of which have some pretty interesting data. The question is, what’s in them? A while ago, I polled all the stores in the platform (you can get a list as HTML or RDF at http://api.talis.com/stores) for some basic stats on the rdf:types and predicates in each store, and saved them in the silkworm-dev store.

It just occurred to me that, using (for example), ARC’s standalone SPARQL parser, I ought to be able to parse a query, and generate another query for the silkworm-dev store, to find a list of stores that you could run that query on and get some data back.

I guess this will get even more interesting when we add Store Groups into the mix (a coming-feature, where you can query a group of stores at once).

I’ll have to try it sometime soon :)

Platform release 9 is Live

We have successfully released Platform Version 9 into the live environment this evening. The release went smoothly with no problems and was completed between 18:00GMT and 18:22GMT. The outage also included a complete restart of the metadata store to include some performance tuning modifications.

Release notes for version 9 can be found here

If you have any issues or problems please do not hesitate to contact us through one of the usual channels.