Automatically Creating Inverse Changesets and When They Don’t Behave as Expected

The Talis Platform uses changesets as a mechanism for updating RDF. As the configuration of the Platform is itself stored as RDF, we also use changesets to modify its configuration. This can be as part of a release or to make requested changes to a customer’s store.

I recently needed to apply a large number of changesets to the Platform configuration. But before applying them, I wanted to create another set of changesets which would, if necessary, reverse all the changes – I wanted to be able to roll back if anything went wrong.

So my changesets looked something like this:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cs="http://purl.org/vocab/changeset/schema#">
   <cs:ChangeSet rdf:about="http://example.com/changesets#change-1">
    <cs:subjectOfChange rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
    <cs:removal>
      <rdf:Statement>
        <rdf:subject rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
        <rdf:predicate rdf:resource="http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty"/>
        <rdf:object rdf:resource="http://api.talis.com/stores/mystore/exampleconfig/old"/>
      </rdf:Statement>
    </cs:removal>
    <cs:addition>
      <rdf:Statement>
        <rdf:subject rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
        <rdf:predicate rdf:resource="http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty"/>
        <rdf:object rdf:resource="http://api.talis.com/stores/mystore/exampleconfig/new"/>
      </rdf:Statement>
    </cs:addition>
  </cs:ChangeSet>
</rdf:RDF>

This changeset can be reversed by changing the removals to additions and changing the additions to removals. This is easy to achieve with sed:

mkdir -p rollback
for f in changesetdirectory/* ; do
  # Swap additions and removals via temporary placeholders
  sed -e 's/cs:addition/TOBEAREMOVAL/g' -e 's/cs:removal/TOBEANADDITION/g' \
    -e 's/TOBEAREMOVAL/cs:removal/g'  -e 's/TOBEANADDITION/cs:addition/g' "$f" > "rollback/$(basename "$f")"
done

The above script creates an inverse of every changeset in the specified changesetdirectory and places them in the rollback directory. The inverse of the example changeset above is created as below:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cs="http://purl.org/vocab/changeset/schema#">
   <cs:ChangeSet rdf:about="http://example.com/changesets#change-1">
    <cs:subjectOfChange rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
    <cs:addition>
      <rdf:Statement>
        <rdf:subject rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
        <rdf:predicate rdf:resource="http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty"/>
        <rdf:object rdf:resource="http://api.talis.com/stores/mystore/exampleconfig/old"/>
      </rdf:Statement>
    </cs:addition>
    <cs:removal>
      <rdf:Statement>
        <rdf:subject rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
        <rdf:predicate rdf:resource="http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty"/>
        <rdf:object rdf:resource="http://api.talis.com/stores/mystore/exampleconfig/new"/>
      </rdf:Statement>
    </cs:removal>
  </cs:ChangeSet>
</rdf:RDF>
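The same swap can also be sketched with an XML parser rather than sed, which has the small advantage of only renaming the cs:addition and cs:removal elements and never touching matching text inside a literal. This is a minimal Python sketch, not the script I ran; the function name invert_changeset is made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CS_NS = "http://purl.org/vocab/changeset/schema#"

# Keep the familiar rdf: and cs: prefixes when serialising.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("cs", CS_NS)

ADDITION = "{%s}addition" % CS_NS
REMOVAL = "{%s}removal" % CS_NS


def invert_changeset(xml_text):
    """Return the changeset XML with additions and removals swapped.

    Hypothetical helper: it only renames elements, it does not check
    anything against the store.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag == ADDITION:
            element.tag = REMOVAL
        elif element.tag == REMOVAL:
            element.tag = ADDITION
    return ET.tostring(root, encoding="unicode")
```

Applied to each file in changesetdirectory, this should produce the same kind of inverse as the sed loop, with the output written wherever you keep your rollback changesets.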

So the original changeset removes the triple:

http://api.talis.com/stores/mystore/exampleconfig 

http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty

http://api.talis.com/stores/mystore/exampleconfig/old

and replaces it with:

http://api.talis.com/stores/mystore/exampleconfig 

http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty

http://api.talis.com/stores/mystore/exampleconfig/new

The inverse changeset removes the triple:

http://api.talis.com/stores/mystore/exampleconfig 

http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty

http://api.talis.com/stores/mystore/exampleconfig/new

and replaces the original:

http://api.talis.com/stores/mystore/exampleconfig 

http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty

http://api.talis.com/stores/mystore/exampleconfig/old

Using this technique, I successfully created inverse changesets which, if I had needed to, would have rolled back the changes to the configuration.

However, there is a caveat. The set semantics of a triplestore can be a gotcha.

Suppose the following triple already exists:

http://api.talis.com/stores/mystore/exampleconfig 

http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty

http://api.talis.com/stores/mystore/exampleconfig/alreadyexists

The following changeset could be applied:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cs="http://purl.org/vocab/changeset/schema#">
   <cs:ChangeSet rdf:about="http://example.com/changesets#change-1">
    <cs:subjectOfChange rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
    <cs:addition>
      <rdf:Statement>
        <rdf:subject rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
        <rdf:predicate rdf:resource="http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty"/>
        <rdf:object rdf:resource="http://api.talis.com/stores/mystore/exampleconfig/alreadyexists"/>
      </rdf:Statement>
    </cs:addition>
  </cs:ChangeSet>
</rdf:RDF>

This changeset is accepted but doesn’t actually modify the triples as the triple it adds already existed. Creating an inverse of this changeset gives us:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cs="http://purl.org/vocab/changeset/schema#">
   <cs:ChangeSet rdf:about="http://example.com/changesets#change-1">
    <cs:subjectOfChange rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
    <cs:removal>
      <rdf:Statement>
        <rdf:subject rdf:resource="http://api.talis.com/stores/mystore/exampleconfig"/>
        <rdf:predicate rdf:resource="http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty"/>
        <rdf:object rdf:resource="http://api.talis.com/stores/mystore/exampleconfig/alreadyexists"/>
      </rdf:Statement>
    </cs:removal>
  </cs:ChangeSet>
</rdf:RDF>

However, applying the inverse changeset removes the triple. Because the triple existed before the first changeset was applied, the inverse does not restore the original state: it deletes a triple which was there before we started.

So creating inverse changesets in this way can be useful, but only when you know with certainty that any triples added in the original changeset did not already exist.
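One way around the caveat, sketched here in Python, is to compare the changeset against a snapshot of the store taken before it is applied, and only undo the changes that really happened. This is a rough sketch under assumptions: triples are plain (subject, predicate, object) tuples, the snapshot has been fetched by some other means, and safe_inverse is a hypothetical helper rather than anything the Platform provides.

```python
def safe_inverse(existing, additions, removals):
    """Build an inverse changeset that only undoes real changes.

    existing  -- set of (subject, predicate, object) triples in the store
                 before the original changeset is applied
    additions -- triples the original changeset adds
    removals  -- triples the original changeset removes
    Returns (inverse_additions, inverse_removals).
    """
    # Only remove triples that the original changeset genuinely introduced.
    inverse_removals = set(additions) - set(existing)
    # Only restore triples that were really there to be removed.
    inverse_additions = set(removals) & set(existing)
    return inverse_additions, inverse_removals


S = "http://api.talis.com/stores/mystore/exampleconfig"
P = "http://schemas.talis.com/2006/bigfoot/configuration#exampleproperty"

# The problem case: the "added" triple was already in the store.
existing = {(S, P, S + "/alreadyexists")}
to_add, to_remove = safe_inverse(
    existing, additions={(S, P, S + "/alreadyexists")}, removals=set()
)
# Both sets are empty, so the rollback leaves the pre-existing triple alone.
```

The trade-off is that the inverse is now tied to the state of the store at the time the snapshot was taken, so it has to be generated just before the original changesets are applied.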

SPARQL Hacks: moving query logic into data

There are too many terms that mean the same thing sometimes. Take labels. rdfs:label is perhaps the most obvious choice if you want to label something in RDF, but there are a whole bunch of semantically equivalent predicates in high usage for doing the same thing. For a while, it seems, it was common practice for every vocabulary to define their own equivalent – though very few bother to rdfs:subPropertyOf rdfs:label (and some predate rdfs:label), so even if you can do some reasoning in your query engine, this might not help you much. So when you want to get the label for something, but you don’t know which predicate the data uses, you might end up doing something like this:


construct { ?s rdfs:label ?l }
where
{
?s ?p ?o
optional
{ ?s rdfs:label ?l }
optional
{ ?s foaf:name ?l }
optional
{ ?s sioc:name ?l }
optional
{ ?s dc:title ?l }
optional
{ ?s dcterms:title ?l }
}

Nasty. And maybe later you find another label predicate in the data somewhere and have to go modify your queries.

But, if I add these triples to my store:


<#a> rdfapp:labelPredicate dc:title, rdfs:label, dcterms:title, foaf:name, sioc:name .

I can instead do:


prefix rdfapp: <http://kwijibo.talis.com/vocabs/rdfapp#>
construct { ?s rdfs:label ?l }
where
{
<#a> rdfapp:labelPredicate ?labelPredicate .
?s ?labelPredicate ?l .
}
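The same trick can be sketched outside SPARQL. In this small Python illustration the list of label predicates lives in the data, so supporting a new one means adding a triple rather than editing code. The triples are plain tuples and the prefixed names are just strings standing in for full URIs; the labels function and the sample data are made up for the example.

```python
LABEL_PREDICATE = "rdfapp:labelPredicate"


def labels(triples, config_subject="#a"):
    """Return sorted (subject, label) pairs, using label predicates listed in the data."""
    label_predicates = {
        o for s, p, o in triples
        if s == config_subject and p == LABEL_PREDICATE
    }
    return sorted((s, o) for s, p, o in triples if p in label_predicates)


triples = [
    ("#a", LABEL_PREDICATE, "rdfs:label"),
    ("#a", LABEL_PREDICATE, "foaf:name"),
    ("#alice", "foaf:name", "Alice"),
    ("#doc", "dc:title", "Untitled"),
]

before = labels(triples)  # [("#alice", "Alice")]

# Teach the application about dc:title by adding data, not code.
triples.append(("#a", LABEL_PREDICATE, "dc:title"))
after = labels(triples)   # [("#alice", "Alice"), ("#doc", "Untitled")]
```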

voiD, datasets, graphs, documents, and dcterms:isPartOf backlinks

One question I have heard several times now regarding voiD is how to say that data is part of a dataset.

Frédérick Giasson asked about this recently in #swig, and wondered why the voiD guide recommended using dcterms:isPartOf. I thought, since this is something that has been asked about a few times, I would blog about it and explain the reasoning behind this.

So, it wouldn’t be right to say something like:

<http://lastfm.rdfize.com/artists/Black+Sabbath> dcterms:isPartOf <http://lastfm.rdfize.com/meta.n3#Dataset> .

… because we don’t want to say that “Black Sabbath is part of the lastfm.rdfize.com dataset”.
We want to say “a description of Black Sabbath (composed of triples) is part of the lastfm.rdfize.com dataset”.

One approach to encapsulating this meaning would be to reify each individual triple and state that the triple is part of the dataset … but we felt that this would be neither practical nor popular.

So, in the voiD guide, we advocate that when you publish Linked Data, and you want to say that the data you are publishing is part of a voiD Dataset, you add a triple linking the document in which the data is published, to the dataset. eg:

<http://lastfm.rdfize.com/?artistName=Black+Sabbath> dcterms:isPartOf <http://lastfm.rdfize.com/meta.n3#Dataset> .

(where <http://lastfm.rdfize.com/?artistName=Black+Sabbath> is a document containing a description of <http://lastfm.rdfize.com/artists/Black+Sabbath>)

This way, when a Linked Data client dereferences <http://lastfm.rdfize.com/artists/Black+Sabbath> they get redirected to a document, and can follow the dcterms:isPartOf link from the document URI to the voiD Dataset.
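As a rough illustration of that lookup, here is a Python sketch with no network access: the redirects dictionary stands in for the HTTP redirect from the resource URI to its document, and the triples are plain tuples. The function name dataset_for is hypothetical.

```python
DCTERMS_IS_PART_OF = "http://purl.org/dc/terms/isPartOf"


def dataset_for(resource, redirects, triples):
    """Return the datasets that the document describing `resource` is part of."""
    # Follow the (simulated) redirect from the thing to its document.
    document = redirects.get(resource, resource)
    return {
        o for s, p, o in triples
        if s == document and p == DCTERMS_IS_PART_OF
    }


artist = "http://lastfm.rdfize.com/artists/Black+Sabbath"
document = "http://lastfm.rdfize.com/?artistName=Black+Sabbath"
dataset = "http://lastfm.rdfize.com/meta.n3#Dataset"

redirects = {artist: document}
triples = [(document, DCTERMS_IS_PART_OF, dataset)]

found = dataset_for(artist, redirects, triples)  # {dataset}
```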

What some people don’t like so much, is the implication that their dataset consists of documents, when what they really want to say is that their dataset consists of descriptions of resources.

The conceptual problem, if there is one, is that here the document URI is identifying an RDF/XML document, not the graph of RDF data encoded in that document. So, if you wanted to explicitly state that the graph, rather than the document, is part of the dataset, it could perhaps be done like this:

[ a <http://www.w3.org/2004/03/trix/rdfg-1/Graph> ;
<http://purl.org/vocab/frbr/core#embodiment> <http://lastfm.rdfize.com/?artistName=Black+Sabbath&output=rdf> ;
dcterms:isPartOf <http://lastfm.rdfize.com/meta.n3#Dataset>
] .

But I’m really not too sure if that is either semantically correct, or in any way a more practically useful description than simply saying the document is part of the dataset.

We (the voiD guide authors) think that the <document> dcterms:isPartOf <dataset> pattern is the most pragmatic approach to making a dataset discoverable from a LOD document.
But we are also open to suggestions for improvement as we evolve the vocabulary and guide in line with popular usage and the requirements of LOD publishers.

What do you think?

A MalBestPractice with RDF: Making Assumptions

Michael Hausenblas has a new blog post listing some common malpractices when working with RDF.

RDF is a model, not a format

I especially agree with his point about “Thinking of RDF on the serialisation level” (as a malpractice) – grabbing values from RDF/XML or RDFa with XPath or regexes is not wise. It is making an unsafe assumption about the stability of the serialisation. In fact, if you are writing a Linked Data application, there are very few assumptions you can safely make, about either the serialisation, or the model.

RDF isn’t SQL, XML, OO …

So maybe my favourite MalBestPractising is: trying to treat RDF too much like some other software paradigm – too much like a relational database, too much like OO, too much like XML. It’s enticing to try to write software that treats RDF as if it was something that the mainstream of software development are more familiar with, to try to use the same kind of techniques and shortcuts. But these shortcuts often rely on assumptions that can’t be made about RDF data (at least, not proper, organic, free-range RDF from the web). You can’t assume that the same RDF graph will be serialised the same way as last time. You can’t assume that the http://xmlns.com/foaf/0.1/ namespace will always be bound to the foaf prefix. You can’t assume that a resource will, or won’t have a particular property, just because it has another property, or a particular type. If you don’t know that a statement exists, you can’t assume it doesn’t, only that you don’t know about it. et cetera.

Not making these assumptions can be tedious, and at times problematic, but ultimately, the fewer assumptions you write into your code, the more interesting, open, and ‘webby’ your application can be.
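To make the prefix point concrete, here is a small Python sketch: compare full URIs, expanded with whatever prefix map the document itself declares, rather than matching on the string "foaf:name". The expand function and the two prefix maps are made up for illustration.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def expand(qname, prefixes):
    """Expand a prefixed name using the prefix map declared by this document."""
    prefix, _, local = qname.partition(":")
    return prefixes[prefix] + local


# Two documents that bind their prefixes differently.
doc_one = {"foaf": "http://xmlns.com/foaf/0.1/"}
doc_two = {"f": "http://xmlns.com/foaf/0.1/", "foaf": "http://example.org/not-foaf#"}

same_property = expand("foaf:name", doc_one) == expand("f:name", doc_two)       # True
looks_the_same = expand("foaf:name", doc_one) == expand("foaf:name", doc_two)   # False
```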

Less assumption, less code, more data, more web

The huge game-changing thing about web development with the Web of Data though, is not the set of assumptions you can’t make, but the assumptions you don’t have to make. Thanks to the Follow Your Nose principle espoused by Linked Data, you don’t need to write assumptions about your data into your code; you can instead let the application “follow its nose” to find out more about the data.

You can follow vocabulary term URIs to find out how they can be used, how they can be labeled, and what inferences can be drawn from their use. You can follow owl:sameAs and rdfs:seeAlso links to find out more about a resource. You can use semantic index services like Sindice to find occurrences of a URI or keyword across the Web of Data. You can follow dcterms:isPartOf links from RDF documents back to voiD Datasets, which will often have links you can follow to licenses that tell you how the data can be used, and to other services (such as SPARQL endpoints).

The more data is published, not just within datasets, but about datasets, and about services, the more we can write applications that open up to the web, and the fewer lines of code we will need to do it!

Metamorph Open Source project for Semantic Converter Web Service

I’ve published the code behind the Talis Convert Service (production release at stable URL coming soon) as an open source project on Google Code, called Metamorph.

Metamorph is a service aimed at semantic web developers. It is much like triplr, babel, swignition and any23 (please leave a comment pointing to any other similar services).

You give it a(n http) URI, an (optional) input format, and an output format, and it will fetch the document from the web, and convert it into the output format.

Understood input values include:

  • Semantic HTML (RDFa, eRDF, microformats, POSH)
  • RDF (XML, Turtle, JSON)
  • SPARQL-XML
  • Facet XML (the response format of the facets service available on all platform stores)

Output for all input formats can be:

  • JSON
  • JSONP
  • HTML

If the input is some form of RDF, you can also ask for:

  • RDF (XML, Turtle, JSON; the default HTML is rendered as RDFa)
  • RSS 1.0
  • TriX
  • Exhibit (web page, JSON, JSONP)

In addition, if the input is an RDF format, you can specify multiple data URIs, and the results will be merged in the output document. For instance, this conversion merges data from two of my homepages, and a Turtle file.

I’m thinking about removing the TriX output, as I’m not sure it would be used by anyone – the reason I didn’t bother to write a parser for it was because I haven’t seen any data published as TriX in the first place.

I welcome any input on what else would be useful from this web service. I suspect that more output options, while fairly easy to add, would not be very useful. More input options may be useful, but perhaps not significantly so.

I suspect what might be more useful, and more likely to distinguish this from similar RDF converter services, are graph transformation services, which might include:

  • Diffs
  • Intersects
  • Smushing
  • Augmenting on property and class type URIs with labels and comments, perhaps retrieved from SchemaCache
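To give a feel for the first two of those, here is a minimal Python sketch of diffs and intersects over graphs held as sets of triples. It is not Metamorph code (Metamorph is PHP), the function names are made up, and it assumes the triples contain no blank nodes, which would make a real diff considerably harder.

```python
def graph_diff(before, after):
    """Return (added, removed) triples between two graphs held as sets."""
    before, after = set(before), set(after)
    return after - before, before - after


def graph_intersect(*graphs):
    """Return the triples common to every graph given."""
    graph_sets = [set(g) for g in graphs]
    return set.intersection(*graph_sets) if graph_sets else set()


g1 = {("#me", "foaf:name", "Keith"), ("#me", "foaf:nick", "kwijibo")}
g2 = {("#me", "foaf:name", "Keith"), ("#me", "foaf:homepage", "http://example.org/")}

added, removed = graph_diff(g1, g2)
common = graph_intersect(g1, g2)
```

Here added holds the foaf:homepage triple, removed holds the foaf:nick triple, and common holds the foaf:name triple.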

Metamorph is coded in PHP, and uses ARC for parsing RDF and HTML, and serialising RDF/XML and Turtle.

Please use the issue tracker for raising any bugs or feature requests.

GRDDLing DeWitt’s Friends

DeWitt Clinton has a great write-up of Creating a HTML “friends” page from a Google Reader subscription list, a bit of hackery which leads to a hCard microformat-enriched friends list. A little tweak to the HTML can make it more machine-friendly, just adding a HTML Meta Data profile URI:

<head profile="http://www.w3.org/2006/03/hcard">

That profile is GRDDL-enabled, so any GRDDL-aware agent can interpret the source document as RDF. This part’s easy to demonstrate, thanks to the online W3C GRDDL service. So I’ve put a tweaked version of the HTML online, and here’s DeWitt’s friends page as RDF (in Turtle syntax, rendered a little verbosely).

Having set this up I realised the data wasn’t actually expressing the friend relationship, so went on to put together some SPARQL to sort that out – below. But afterwards I realised that DeWitt’s HTML was actually expressing the relationships using XFN class names, but again without the profile URI to make it machine-friendly. So another tweak:

<head profile="http://www.w3.org/2006/03/hcard http://www.w3.org/2003/g/td/xfn-workalike">

- the corresponding service output (scroll down to see the extra bits). I suppose I should mention that you can have as many space-separated profiles as you like, and the GRDDL-aware agent will interpret them independently, just accumulating all the triples. The second profile URI adds xfn:friend relationships. I think it would have been more useful with foaf:knows as well, but it is only a demo. One of these days the microformats folks might get around to tweaking the official profile appropriately…

The SPARQL I mentioned looks like this:

prefix rdf:
prefix vcard:
prefix foaf:

CONSTRUCT
{
[ a foaf:Person;
foaf:homepage ;
foaf:name "DeWitt Clinton" ;
]
foaf:knows
[ a foaf:Person;
foaf:homepage ?homepage ;
foaf:name ?name ] .
}
WHERE
{
[ a vcard:VCard ;
vcard:url ?homepage ;
vcard:fn ?name ]
}

- when applied to DeWitt’s data (as RDF), this will map it across from the vCard vocabulary – finding the appropriate ?variables by matching the pattern in the WHERE clause, inserting those ?variables into the CONSTRUCT clause to produce some new RDF.

I tried this on the Redland SPARQL demo, and I think it’s producing the RDF I wanted. Unfortunately the serialization is really ugly – lots of bnodes, and it’s hard to check visually. It appears to confuse Tabulator too, and the W3C RDF Validator which is handy for this kind of visualization appears to be down. (Here’s a copy of the RDF/XML). Still, it was only a workaround – with the right profiles in place it’s not needed.

I’m not sure if there’s a microformat way of expressing that the source data was a subscription/reading list. To get the richest RDF out it might be easier to do what DeWitt did, but to a full RDF serialization rather than microformatted HTML (which is effectively a CustomRdfDialect), producing something like Planet RDF’s blogroll.

More recording studio RDF

Yves Raimond responded to my post about Music/Audio Equipment Lists with Describing a recording session in RDF. I like it – looks useful.

Coincidentally I found myself doing something closely related yesterday. I wanted to better organize the various ‘songs’ we’ve put together over the past few months. Our music room (formerly the cats’ dining room) doesn’t pick up the house wi-fi so I just made things up as I went along. Yves’ session data is more fine-grained than what I was after for this job, but I’m pretty sure with a bit of tweaking something consistent is possible.

Here’s a sample of what I came up with:


@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix : <http://purl.org/stuff/studio/> .

[ a :ProgressReport;
  dc:date "2008-03-18";
    :subject
  [ a :WorkInProgress;
      :shortName "gloriaok" ;
      dc:title "Gloria" ;
      :origin [ rdfs:label "Cover" ];
      :style "blues rock";
      :currentState "lots down";
      :nextAction "redo vocals";
      :nextAction "mix bass"
  ]
] .

As well as resolving the Music Ontology overlap, I’d also like to align this with my general-purpose Project Vocabulary, so that not only will it keep things better organised (I had a few out-of-sync variations of the same tune), but whenever I finally get around to building the GTD tools it’ll also help me decide what to do next.

Shorter term, sticking the stuff in a store with a SPARQL endpoint would make a handy reference. Right away it seemed there were a couple of opportunities for automation – several of the :nextAction values were “archive”, a lot were “delete”. A simple script should be able to take care of those.
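Such a script might start out something like the Python sketch below. It assumes the progress reports have already been pulled out of the store into plain dictionaries; the works_needing function and every entry other than "gloriaok" are invented for the example.

```python
def works_needing(action, works):
    """Return the short names of works whose next actions include `action`."""
    return [w["shortName"] for w in works if action in w["nextAction"]]


works = [
    {"shortName": "gloriaok", "nextAction": ["redo vocals", "mix bass"]},
    {"shortName": "oldjam", "nextAction": ["archive"]},      # made-up entry
    {"shortName": "falsestart", "nextAction": ["delete"]},   # made-up entry
]

to_archive = works_needing("archive", works)  # ["oldjam"]
to_delete = works_needing("delete", works)    # ["falsestart"]
```

The actual archiving and deleting would then just loop over those lists, which I’d want to eyeball before letting anything near the delete key.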

Noodling with Atom/RDF

Now that GRDDL’s a Recommendation, it’s about time we started using it. One particular bit of (potentially) low-hanging fruit is Atom (RFC 4287) – cleanly specified XML, well deployed for bloggish content syndication and increasingly having interesting extensions shoehorned in.

Anyhow, more on that some other time. I finally got around to trying a long-standing item on my to-do list: RDFize the wonderful Planet Venus aggregator. I reckon a persistent, queryable store of interesting subscriptions is a must-have part of any respectable personal knowledgebase. I haven’t time right now to go into detail on how it works (and in its current form you probably wouldn’t want to know), but basically these minimal Python scripts transform Venus’s Atom cache into RDF/XML and post the result to a Talis Platform store (after first checking the entry isn’t already in the store). So far I’ve got it working enough to make some data available for SPARQLing.
If you go to this SPARQL Query form, select the “twitcrit reviews” endpoint with the dropdown and enter a query like this:

PREFIX ar: <http://djpowell.net/schemas/atomrdf/0.3/>

SELECT DISTINCT ?entry ?tp ?title ?cp ?content
WHERE {
[
a ar:EntryInstance ;
ar:entry ?entry;
ar:title [ ?tp ?title ] ;
ar:content [ ?cp ?content ]
]
}
LIMIT 10


- you should see some results.


Next steps are to set up some local caching (thinking of just keeping a list of cache filenames) and turning it over to use the Changeset Protocol rather than the basic unversioned model posts it’s doing now. Once those are in place I’ll make a cron job for it.

There are quite a few different atom2rdf XSLTs in circulation, the current best-bet frontrunner being atom2rdf-18.xsl from David Powell, so I used that. Here’s the Venus install; I just pulled out a bunch of the semweb related feeds from my Bloglines subscriptions (note that I cleared the cache earlier today, there was way too much stuff in it for testing).

Drupal and the opportunity of RDF

At the start of this week, Dries Buytaert presented the keynote at DrupalCon 2008. The most exciting revelation came at the end: Drupal’s future is in the semantic web.

While Dries talks about the semantic web, and RDF, you don’t hear much reaction from the crowd; but then he says “Let me show you a video of the future”, and proceeds to demonstrate SPARQLing on linked data from sources like dbpedia, dbtunes, geodata, events, friends lists, and google spreadsheets, mashed-up in Exhibit.

This gets a lot of applause :)

In the keynote, he puts emphasis on data interoperability, decentralisation, remote querying, and how having a lot of data is great fun :)

It’s a really great talk, with a lot of excellent quotes about the value of RDF for Drupal. Here are some of my favourites:

Web 3.0 (much as I hate to use the term) is all about infinite interoperability

We have the opportunity to be mentioned in the history books of the web … This is where the web is going. And this right time, and the right place, to make it happen.

Using RDF you can connect all these different parts of data, that live in different parts of the web.

RDF turns the web into a database

The real opportunity we have here is to start sprinkling this map [of linked open data sources] with Drupal. Every single Drupal site can be an RDF repository that people can query

Google are trying to build a world social graph, connecting people … but what we are doing with RDF is connecting not just people, but everything

With RDF, the import/export problem we have in Drupal just goes away. It just works, without having to describe database schemas… It just works. It’s a problem that is already solved.

You can listen to the audio of the presentation at archive.org (~45MB – the RDF stuff starts at around 53 minutes), and view a video of the RDF demonstration.

You can also read more about Drupal and RDF here.

Styles of Web Application – FlowPHP

Ian blogged a while back about why MVC is a rubbish pattern for web development because it doesn’t describe the problem in a way that helps you understand it better. I completely agree, and it’s surprising how much “received wisdom” there is about MVC being the right way to do things, but the natural response is, Well, what isn’t a rubbish pattern then?

Someone asks that in the comments on the blog post, and Ian replies:

Doesn’t REST define the pattern you need: resource/representation? Your application uses the URI to locate the appropriate resource and asks it to produce the appropriate representation.

I’m not completely happy with that as an answer though. To me, REST defines the interface to your application, and while it helps define at least that part of the problem, it doesn’t really give you enough of a solution. It doesn’t help you decide how to structure your code in the same way that MVC does (even if that decision is ultimately suboptimal).

I’ve been writing web apps in a similar style to that used by RESTful frameworks like Tonic and web.py, which I guess could be described as what rsinger called “_VC” on #talis the other day. Basically you have different ‘Resource’ classes that map to your application’s url design and return representations when, eg, a GET, or a POST method is called on them. A great boon of developing with RDF is that, because all data is the same shape, you can do things pretty generically, and write less domain-specific code. So I tried to keep my resource classes as generic as possible, and have different url routes set up the classes with different parameters as need be.

However, I’ve been growing pretty dissatisfied with this way of doing it, because it still seemed to be obscuring too much of the problem for me conceptually. There was still a problem of, ‘OK, where is the best place to put this’, and a constant tension between trying to extend a generic class to cope with another situation, and writing a new one to do what you want. So I’d end up with a lot of classes that did a lot of pretty similar things (retrieving SPARQL queries, parsing them, passing data to the template), but not similar enough to be able just to do it with one class. I also found that class inheritance was a slightly messy way to share functionality, and it could be annoying to try to remember which class was used for which url space, and look it up in the routing configuration, and it wasn’t very amenable to serving representations derived from a combination of data sources.

So the other day I had an idea for a different style, which I’m pretentiously code-naming ‘FlowPHP’ (pronounced floaf – the P is silent ;) ).

The motivation is to try to model the process of receiving a request and returning a response as a chain of modular bits of code that create a response from the incoming request, and filter it until it is served. I’ve been trying this idea out, and so far, it looks like this:


try {
    $KwijiboDev1 = new Store('http://api.talis.com/stores/kwijibo-dev1');
    $R = new Request(array('SERVER' => $_SERVER, 'GET' => $_GET));
    switch (true):
        case $R->is('GET', '/posts'): // method is GET and url is /posts
            $R->response()
                ->checkCache()
                ->RDFList($KwijiboDev1, SIOC.'Post')
                ->SmushGraph()
                ->serve('posts', 'main');
            break;
        case $R->is('GET', '/post', array('uri')): // GET /post with a required 'uri' parameter
            $R->response()
                ->checkCache()
                ->CBD($KwijiboDev1, $R->GET['uri'])
                ->serve('post', 'main');
            break;
        default:
            throw new HTTP_404("Page could not be found");
    endswitch;
}
catch (Exception $e) {
    echo $e->serve('error', 'main');
}

So what this is doing, is:

  • building a Request object with data from the $_SERVER and $_GET variables.
  • Checking the HTTP REQUEST METHOD, the REQUEST URI, and (optionally) for the existence of any required parameters.
  • Processing the Request and serving a Response by:
    1. Checking for a cached version we could serve first
    2. Retrieving the data: eg, CBD
    3. processing the data (eg: SmushGraph)
    4. Serving it in templates (serve() takes a variable length list of templates as parameters, rendering each inside the next template in the list)
  • Responding with an appropriate error if necessary (eg, HTTP 404, 405, 406, 500 – I pinched the idea of modelling 4xx and 5xx as Exceptions from Konstrukt)

Each ‘method’ in the chain, up until ‘serve()’, is returning the altered response object for the next method to manipulate. The methods that deal with adding data to the response, doing stuff with data, etc, aren’t really methods at all, but dynamically-called functions from a separate file. The reason I did it like this is I think it might be more modular and extensible, whilst not necessitating the creation of lots of different subclasses of Response.
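The dynamic-dispatch part is perhaps easier to see stripped down. Here is a minimal sketch of the same idea in Python rather than PHP: the Response class has no step methods of its own, looks unknown method names up in a registry of plain functions, and returns itself so the calls can be chained. Everything here (STEPS, the step decorator, add_data, dedupe) is invented for the illustration and is not the FlowPHP code.

```python
STEPS = {}


def step(func):
    """Register a plain function as a chainable step."""
    STEPS[func.__name__] = func
    return func


@step
def add_data(response, items):
    response.data.extend(items)


@step
def dedupe(response):
    response.data = sorted(set(response.data))


class Response:
    def __init__(self):
        self.data = []

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails,
        # i.e. for step names that are not real methods.
        try:
            func = STEPS[name]
        except KeyError:
            raise AttributeError(name) from None

        def chained(*args, **kwargs):
            func(self, *args, **kwargs)
            return self  # hand the altered response to the next step

        return chained

    def serve(self, *templates):
        # Render the data, then wrap it in each template in turn.
        body = ", ".join(self.data)
        for template in templates:
            body = template.format(content=body)
        return body


page = (
    Response()
    .add_data(["b", "a", "b"])
    .dedupe()
    .serve("<ul>{content}</ul>", "<body>{content}</body>")
)
# page == "<body><ul>a, b</ul></body>"
```

Adding a new step is then just a matter of writing another function in the registry, with no new subclass of Response needed.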

This is still all evolving of course, and some/all of the ideas might turn out to be rubbish, but the thing I’m liking so far is the transparency: I think it’s relatively easy to see what’s going on with the code – what happens where, and when. The thing I’m experimenting with, I suppose, is the level of abstraction – my previous approach was perhaps too high-level and inflexible, which resulted in either lots of code, or lots of configuration, and the routing was kept too separate from the logic of returning the response.

The particular tension I’m finding with trying to develop FlowPHP at the moment, is to find a good idiom for setting variables midway through the chain of events – I’m loath to have to break out of the chained methods, but maybe that’s only for aesthetic reasons.