Using Moriarty for Serving Linked Data

Although Moriarty is a general-purpose library for building applications with the Talis Platform (and is tried and tested in Talis Prism and Talis Aspire), one of its most common uses is simply to provide a browsable interface for linked data held in a Talis Platform store. Typically these scripts take the URI sent by the web browser and use a SPARQL query or the Talis Platform’s describe service to fetch linked data about that URI. They then style that as HTML or send it directly back as RDF. They all need to deal with the same series of technical details: 303 redirects, content negotiation, converting RDF to HTML and so on.
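
To make those details a bit more concrete, here is a minimal sketch in Python (not Moriarty's own code, which is PHP) of the content negotiation and 303 redirect steps. The list of supported formats, the file extensions and all of the function names are assumptions made up for illustration.

```python
# Hypothetical formats a linked data script might offer, in order of preference.
SUPPORTED = ["text/html", "application/rdf+xml", "text/turtle"]

# Hypothetical mapping from media type to the suffix of the document URI.
EXTENSIONS = {
    "text/html": ".html",
    "application/rdf+xml": ".rdf",
    "text/turtle": ".ttl",
}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q values keep their order in the header.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_format(accept_header, supported=SUPPORTED):
    """Return the best supported media type, or None if nothing matches."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return supported[0]
    return None


def plan_response(resource_uri, accept_header):
    """Answer a request for a resource URI with a 303 to a document URI."""
    media_type = choose_format(accept_header)
    if media_type is None:
        return 406, {}
    location = resource_uri + EXTENSIONS[media_type]
    return 303, {"Location": location, "Vary": "Accept"}
```

A browser sending `text/html,application/xhtml+xml,*/*;q=0.8` would be redirected to the HTML view, while an RDF client sending `application/rdf+xml` would be sent to the RDF document. Partial wildcards such as `text/*` are deliberately left out to keep the sketch short.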

I’ve worked on more comprehensive libraries (e.g. Paget) to manage this kind of publishing but I thought the simple case of fetching and styling the data would make a good example of how to use Moriarty. I spent a bit of time this afternoon putting an example script together based on several I’ve written in the past. You can find the result in the dataspace subdirectory of the examples folder. That subdirectory contains four files:

  • dataspace.php — this is the example script. It contains the logic to fetch the relevant description, negotiate the best output format and style the result appropriately. It’s not designed to be called directly, but to be included from a configuration file…
  • index.php — this is an example configuration file. It is designed to be dropped into a web server directory and then intercept all requests beginning with that URI. It contains the configuration describing which Talis Platform store to use, where cache files can be written and where to find Moriarty and ARC2. The last thing it does is to load dataspace.php which then handles the browser request.
  • sample.htaccess — this is a sample .htaccess file for Apache webservers. It redirects all requests via index.php.
  • plain.tmpl.html — this is the default template used to render the HTML views. This can be overridden in the configuration.

Using the example script is simple: just copy index.php to the root directory of your linked data space. If you’re using Apache then you need to copy sample.htaccess into the same directory and rename it to .htaccess. Edit index.php so it refers to your store and your URIs and that’s it! You can see it in action with the default template on my own linked data space.

About Moriarty… Moriarty is a simple PHP library for accessing the Talis Platform. It follows the Platform API very closely and wraps up many common tasks into convenient classes while remaining very lightweight. It also provides some simple RDF classes that are based on the excellent ARC2 class library. Moriarty is being developed by a small community of developers and is in continual beta, subject to a slow stream of updates. You can read more about Moriarty on the n² wiki or visit its Google Code project.

Augmenting Last.fm Data with BBC data on the Talis Platform

A short while back, I created a Linked Data wrapper on the Last.FM API for Events and Artists. The artist data links to the BBC’s data about each artist using owl:sameAs.

Now that the BBC RDF is available in a Talis Platform store, I can put some of my Last.FM data into a store (it’s currently generated on the fly from the Last.FM API), search on it, and then augment it with data from the BBC.

So I put some Last.FM data into the Sandbox1 store.

Now I can search on it with the items query endpoint like:

http://api.talis.com/stores/sandbox1/items?query=Black

This gives us the results as RSS 1.0, which is also RDF/XML, and contains a graph with 12 resources in it.

We can now pass the URI of this (or any RSS 1.0) document to the BBC-Backstage store’s Augment Service like this:

http://api.talis.com/stores/bbc-backstage/services/augment?data-uri=http%3A%2F%2Fapi.talis.com%2Fstores%2Fsandbox1%2Fitems%3Fquery%3DBlack

The Augment service will look at the URIs in the RSS results, and add DESCRIBEs for any of those URIs that it finds in its own store, giving you back the RSS augmented with BBC data.

So the graph we get back now contains 15 resources, where the BBC-Backstage store has found descriptions for 3 of the URIs in the original RSS.
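
The two request URIs used above can also be assembled in code. This is a small Python sketch using only the standard library; the helper names `items_search_uri` and `augment_uri` are my own invention for illustration and are not part of any Talis client library. The important detail is that the whole search URI has to be percent-encoded before it is passed as the `data-uri` parameter.

```python
from urllib.parse import quote, urlencode

API_BASE = "http://api.talis.com/stores"


def items_search_uri(store, query):
    """Build the URI of a search against a store's items endpoint."""
    return f"{API_BASE}/{store}/items?" + urlencode({"query": query})


def augment_uri(store, data_uri):
    """Build the URI that asks a store's Augment Service to augment data_uri."""
    # safe="" makes quote() encode ':' '/' '?' and '=' as well, so the
    # nested URI survives intact as a single query parameter value.
    return f"{API_BASE}/{store}/services/augment?data-uri=" + quote(data_uri, safe="")


search = items_search_uri("sandbox1", "Black")
augmented = augment_uri("bbc-backstage", search)
print(search)
print(augmented)
```

Fetching `augmented` with any HTTP client should then return the RSS 1.0 document with the extra descriptions merged in.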

For further information, see Leigh Dodds’ slides on Getting Started with the Talis Platform.

voiD, datasets, graphs, documents, and dcterms:isPartOf backlinks

One question I have heard several times now about voiD is how to say that data is part of a dataset.

Frédérick Giasson asked about this recently in #swig, and wondered why the voiD guide recommended using dcterms:isPartOf. I thought, since this is something that has been asked about a few times, I would blog about it and explain the reasoning behind this.

So, it wouldn’t be right to say something like:

<http://lastfm.rdfize.com/artists/Black+Sabbath> dcterms:isPartOf <http://lastfm.rdfize.com/meta.n3#Dataset> .

… because we don’t want to say that “Black Sabbath is part of the lastfm.rdfize.com dataset”.
We want to say “a description of Black Sabbath (composed of triples) is part of the lastfm.rdfize.com dataset”.

One approach to encapsulating this meaning would be to reify each individual triple and state that the triple is part of the dataset … but we felt that this would be neither practical nor popular.

So, in the voiD guide, we advocate that when you publish Linked Data and want to say that the data you are publishing is part of a voiD Dataset, you add a triple linking the document in which the data is published to the dataset, e.g.:

<http://lastfm.rdfize.com/?artistName=Black+Sabbath> dcterms:isPartOf <http://lastfm.rdfize.com/meta.n3#Dataset> .

(where <http://lastfm.rdfize.com/?artistName=Black+Sabbath> is a document containing a description of <http://lastfm.rdfize.com/artists/Black+Sabbath>)

This way, when a Linked Data client dereferences <http://lastfm.rdfize.com/artists/Black+Sabbath> they get redirected to a document, and can follow the dcterms:isPartOf link from the document URI to the voiD Dataset.
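
As a rough illustration of that client-side flow, here is a Python sketch that follows the redirect and then looks for the backlink. To keep it self-contained, the HTTP redirect and the parsed RDF are simulated with two in-memory dictionaries (`REDIRECTS` and `DOCUMENTS`, both hypothetical stand-ins), and the function name `find_datasets` is my own.

```python
DCTERMS_IS_PART_OF = "http://purl.org/dc/terms/isPartOf"

# Hypothetical stand-in for HTTP: resource URI -> document URI it redirects to.
REDIRECTS = {
    "http://lastfm.rdfize.com/artists/Black+Sabbath":
        "http://lastfm.rdfize.com/?artistName=Black+Sabbath",
}

# Hypothetical stand-in for an RDF parser: document URI -> triples found in it.
DOCUMENTS = {
    "http://lastfm.rdfize.com/?artistName=Black+Sabbath": [
        (
            "http://lastfm.rdfize.com/?artistName=Black+Sabbath",
            DCTERMS_IS_PART_OF,
            "http://lastfm.rdfize.com/meta.n3#Dataset",
        ),
    ],
}


def find_datasets(resource_uri):
    """Follow the redirect from a resource, then read the document's backlinks."""
    # Step 1: dereference the resource URI and end up at a document URI.
    document_uri = REDIRECTS.get(resource_uri, resource_uri)
    # Step 2: load the triples published in that document.
    triples = DOCUMENTS.get(document_uri, [])
    # Step 3: keep only "<document> dcterms:isPartOf <dataset>" statements.
    return [
        obj
        for subj, pred, obj in triples
        if subj == document_uri and pred == DCTERMS_IS_PART_OF
    ]


print(find_datasets("http://lastfm.rdfize.com/artists/Black+Sabbath"))
```

A real client would replace the two dictionaries with an HTTP request and an RDF parser, but the lookup itself stays the same: the subject of the backlink is the document URI, not the URI of Black Sabbath.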

What some people don’t like so much is the implication that their dataset consists of documents, when what they really want to say is that their dataset consists of descriptions of resources.

The conceptual problem, if there is one, is that here the document URI is identifying an RDF/XML document, not the graph of RDF data encoded in that document. So, if you wanted to explicitly state that the graph, rather than the document, is part of the dataset, it could perhaps be done like this:

[ a <http://www.w3.org/2004/03/trix/rdfg-1/Graph> ;
  <http://purl.org/vocab/frbr/core#embodiment> <http://lastfm.rdfize.com/?artistName=Black+Sabbath&output=rdf> ;
  dcterms:isPartOf <http://lastfm.rdfize.com/meta.n3#Dataset>
] .

But I’m really not too sure if that is either semantically correct, or in any way a more practically useful description than simply saying the document is part of the dataset.

We (the voiD guide authors) think that the <document> dcterms:isPartOf <dataset> pattern is the most pragmatic approach to making a dataset discoverable from a LOD document.
But we are also open to suggestions for improvement as we evolve the vocabulary and guide in line with popular usage and the requirements of LOD publishers.

What do you think?