Batch Changesets ARC Plugin

Platform Release 12 included a very useful new feature: the ability to send more than one changeset in a single POST to your store.

To generate a batch changeset from two versions of an RDF graph, you can use an ARC plugin called Talis_ChangeSetBuilderPlugin.

To use it:


$args = array(
    'before' => $before, // can be RDF/XML, Turtle, or an ARC simpleIndex array
    'after'  => $after,  // can be RDF/XML, Turtle, or an ARC simpleIndex array
);
$cs = ARC2::getComponent('Talis_ChangeSetBuilderPlugin', $args);
$cs_response = $store->get_metabox()->apply_versioned_changeset($cs);

The plugin also relies upon the IndexUtils Plugin. The easiest way to get them all set up is to change to your arc directory and do:


svn co http://n2.talis.com/svn/playground/kwijibo/PHP/arc/plugins/trunk/ plugins
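
To give a feel for the before/after comparison that a changeset builder has to make, here is a minimal sketch. It is not the plugin's actual code: the index_diff function is a hypothetical helper, and the indexes are assumed to be flattened arrays of the form subject => predicate => list of object strings.

```php
<?php
// Hypothetical helper: returns the triples present in $a but missing from $b.
// Both arguments are flattened indexes: subject => predicate => list of object strings.
function index_diff(array $a, array $b) {
    $diff = array();
    foreach ($a as $s => $props) {
        foreach ($props as $p => $objects) {
            $known = isset($b[$s][$p]) ? $b[$s][$p] : array();
            foreach ($objects as $o) {
                if (!in_array($o, $known, true)) {
                    $diff[$s][$p][] = $o;
                }
            }
        }
    }
    return $diff;
}

$before = array('http://example.com/foo' => array(
    'http://purl.org/dc/elements/1.1/title'   => array('Old title'),
    'http://purl.org/dc/elements/1.1/creator' => array('Alice'),
));
$after = array('http://example.com/foo' => array(
    'http://purl.org/dc/elements/1.1/title'   => array('New title'),
    'http://purl.org/dc/elements/1.1/creator' => array('Alice'),
));

$removals  = index_diff($before, $after); // triples a changeset would remove
$additions = index_diff($after, $before); // triples a changeset would add
```

A real changeset groups these additions and removals by subject of change, which is why a graph touching several resources produces a batch of changesets.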

Rollbacks in Moriarty

Editing resources in the metabox of Talis Platform stores is done with Changesets. If you choose to use the versioned changesets API, your changesets will be stored as data in the metabox as well.

The great practical benefit of doing this is you can then reverse previous ChangeSets to return a resource to its previous state. You can read about one way to reverse changesets on the wiki. You can also now create rollback changesets from Moriarty with the new Rollback class.

To use it:


define('MORIARTY_ARC_DIR', 'arc/');
require 'moriarty/store.class.php';
require 'moriarty/rollback.class.php';  

//create a store object
$store = new Store('http://api.talis.com/stores/my_store');  

//Instantiate the Rollback class with a sparql service object:
$sparql = $store->get_sparql_service();
$rollback = new Rollback($sparql);  

//Call the to_changeset method, with a changeset's uri as the argument
$HTTP_Response = $rollback->to_changeset('http://api.talis.com/stores/my_store/items/1200302910905#self');  

// the body of the response is the changeset you need to revert back to the
// state of the resource before the changeset that you have given the URI of  

if ($HTTP_Response->is_success()) {
    // Submit the rollback changeset to the metabox
    $rollbackResponse = $store->get_metabox()->apply_versioned_changeset($HTTP_Response->body);

    if ($rollbackResponse->is_success()) {
        // relax!
    } else {
        // throw an error
    }
}
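
Conceptually, reversing a changeset means swapping its additions and removals. The sketch below illustrates that idea on a plain array. The array layout and the reverse_changeset function are assumptions made for illustration only; they are not how the Rollback class or the Platform represent changesets.

```php
<?php
// Hypothetical in-memory representation of a changeset (not a Moriarty structure).
function reverse_changeset(array $cs) {
    return array(
        'subjectOfChange' => $cs['subjectOfChange'],
        'changeReason'    => 'Rollback of: ' . $cs['changeReason'],
        'additions'       => $cs['removals'],  // what was removed gets added back
        'removals'        => $cs['additions'], // what was added gets removed again
    );
}

$original = array(
    'subjectOfChange' => 'http://example.com/foo',
    'changeReason'    => 'Fix title',
    'additions'       => array(array('http://purl.org/dc/elements/1.1/title', 'New title')),
    'removals'        => array(array('http://purl.org/dc/elements/1.1/title', 'Old title')),
);
$rollback_cs = reverse_changeset($original);
```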

Moriarty Update

After a short break, it’s time for an update to Moriarty. Actually, the changes in this version have been under development for several weeks, but I wasn’t able to release them until Platform release 13 went live at the beginning of this week. There is one organisational change and two functional changes, one of which is a major addition. The notes in this blog post relate to revision 679 in Moriarty’s subversion project.

Firstly, constants.inc.php has been deprecated in favour of moriarty.inc.php, which is less likely to clash with other file names. constants.inc.php is now just a shell that includes moriarty.inc.php, so no code should break. However, you should update your applications to include moriarty.inc.php because I shall be removing constants.inc.php entirely in some future release.

This renaming is in preparation for a wider breaking change that I would like to make. Because PHP has traditionally had no namespacing capability, the community has adopted library naming conventions to avoid name conflicts. For example, classes in Konstruct are prefixed with k_ (like k_Document) and classes in ARC are prefixed with ARC2_ (e.g. ARC2_RDFXMLParser). Moriarty doesn’t do this, which leads to a higher chance of naming clashes with client code. The right thing to do in a future release is to rename all the classes, so instead of Store we might have MORIARTY_Store or M_Store. I’d like some feedback on what you prefer, so please do comment on this post.

It’s worth remembering that moriarty.inc.php defines the MORIARTY_DIR constant, setting it to be the directory in which moriarty.inc.php lives (this isn’t new, constants.inc.php used to do this). The preferred way of including Moriarty classes is like this:

require_once '/path/to/moriarty.inc.php';
require_once MORIARTY_DIR . 'store.class.php';

The major piece of new functionality in Moriarty is HTTP caching support. The Platform supports etags and other related caching headers in many places and for a long time I’ve wanted Moriarty to automatically take advantage of these. I added this support over a period of several weeks, refining it and tuning it so that it could work with the minimum of effort on the client developer’s part. Enabling caching in Moriarty is very simple. Just define a constant called MORIARTY_HTTP_CACHE_DIR and set it to be a valid, writable directory. Moriarty will then start using that directory to cache responses from HTTP requests. For example, add something like this at the main entry point of your code:

define('MORIARTY_HTTP_CACHE_DIR', '/var/cache');

Moriarty uses cached etag headers to intercept standard GET requests and turn them automatically into conditional ones. Although this still requires a network transaction, the amount of bandwidth used for a cache hit is very small.

This kind of caching is smart. Dumb caching just keeps content for a predetermined time period and only requests a fresh version when that period has expired, so it won’t be aware of any changes in the source until minutes or hours later. This may work well for content that doesn’t change often, but it causes extreme difficulties for interactive applications that update content as well as read it. Many Platform-based applications use a simple pattern of fetching a current resource description, diffing it with the one entered by a user and generating a changeset to apply to the store. Dumb caching interferes with this by not fetching the true state of the resource description, and fixing that requires close coordination between user-supplied updates and cache invalidation. Conditional GETs avoid the problem by revalidating the cached content with the source on every request. The result is a slight trade-off in performance for better consistency.
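
To make the revalidation step concrete, here is a minimal, self-contained sketch of a conditional GET. It does not use Moriarty's internals: fake_origin stands in for the remote server, and conditional_get and the in-memory $cache array are hypothetical names chosen for this illustration.

```php
<?php
// Stand-in for the remote server: answers 304 when the client's etag still matches.
function fake_origin($if_none_match) {
    $etag = '"v1"';
    if ($if_none_match === $etag) {
        return array('status' => 304, 'etag' => $etag, 'body' => '');
    }
    return array('status' => 200, 'etag' => $etag, 'body' => '<rdf:RDF/>');
}

// Hypothetical cache-aware GET; $cache maps URI => array('etag' => ..., 'body' => ...).
function conditional_get($uri, array &$cache) {
    $etag = isset($cache[$uri]) ? $cache[$uri]['etag'] : null;
    $response = fake_origin($etag); // a real client would send If-None-Match: $etag
    if ($response['status'] === 304) {
        // Not modified: the server sent no body, so reuse the stored one
        return $cache[$uri]['body'];
    }
    // New or changed content: remember its etag for next time
    $cache[$uri] = array('etag' => $response['etag'], 'body' => $response['body']);
    return $response['body'];
}

$cache  = array();
$first  = conditional_get('http://example.com/foo', $cache); // full 200 response
$second = conditional_get('http://example.com/foo', $cache); // 304, body comes from the cache
```

Every call still reaches the origin, which is what keeps the cached copy consistent with the source.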

If you’re confident that you only need dumb caching, you can switch it on by defining the MORIARTY_HTTP_CACHE_READ_ONLY constant somewhere in your application. Moriarty doesn’t care about the value of this constant, just whether it is defined or not. When it is defined, Moriarty uses the max-age directive in HTTP response headers to determine how long retrieved content should be considered fresh. It intercepts HTTP requests to the Platform and, if it finds a fresh cache entry, immediately returns that without making any network request. If the cache entry is stale, the request proceeds as normal and the entry is updated with the newly retrieved content. Use this constant when your application is predominantly read-only and you don’t care if content is stale for a few hours.
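
A freshness check based on max-age can be sketched in a few lines. The is_fresh function below is a hypothetical helper written for this post, not part of Moriarty, and it only looks at the max-age directive of a Cache-Control header value.

```php
<?php
// Hypothetical helper: is a cache entry still fresh according to max-age?
// $cache_control is the Cache-Control header value stored with the entry,
// $stored_at and $now are Unix timestamps.
function is_fresh($cache_control, $stored_at, $now) {
    if (!preg_match('/max-age=(\d+)/', $cache_control, $m)) {
        return false; // no max-age given: treat the entry as stale
    }
    return ($now - $stored_at) < (int) $m[1];
}

$fresh = is_fresh('max-age=3600', 1000, 1500); // 500 seconds old, within the hour
$stale = is_fresh('max-age=3600', 1000, 5000); // 4000 seconds old, past the hour
```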

Moriarty supports one other caching-related constant: MORIARTY_HTTP_CACHE_USE_STALE_ON_FAILURE. Define this constant if you want Moriarty to return a cache entry when it can’t communicate with the Platform. This enables your application to continue running even if there are network problems, trading freshness of content for apparent liveness. (I use this constant in my application when I’m developing offline on the train. While I’m on the network I hit a few pages to freshen up the cache and then when I disconnect I can still browse and test the application using cached content.)
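
The stale-on-failure idea can be sketched as follows. This is an illustration under stated assumptions rather than Moriarty's implementation: get_with_stale_fallback is a hypothetical function, $fetch is any callable that returns a body string or throws on network failure, and the cache is a plain array.

```php
<?php
// Hypothetical fetch wrapper that falls back to a cached copy on failure.
function get_with_stale_fallback($uri, $fetch, array &$cache, $use_stale) {
    try {
        $cache[$uri] = call_user_func($fetch, $uri); // refresh the cache on success
    } catch (Exception $e) {
        if (!$use_stale || !isset($cache[$uri])) {
            throw $e; // nothing to fall back to, so surface the failure
        }
        // otherwise fall through and serve the stale entry
    }
    return $cache[$uri];
}

// Simulate working offline: the fetch always fails, but a cached copy exists
$cache   = array('http://example.com/foo' => 'cached copy');
$offline = function ($uri) { throw new RuntimeException('network down'); };
$body    = get_with_stale_fallback('http://example.com/foo', $offline, $cache, true);
```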

One caveat you need to be aware of: the cache files are not encrypted. Avoid using Moriarty’s caching support if you are dealing with private or secure information that you don’t want stored unencrypted on a web server file system. I might add encryption if there is demand.

Finally, this version of Moriarty includes support for the new describe service. This was included as part of release 13 of the Platform and is now the preferred way of obtaining resource descriptions from the metabox or a private graph. See the section labeled “GET” in the Metabox documentation. You can use it in Moriarty like this:

$store = new Store('http://api.talis.com/stores/mystore');
$mb = $store->get_metabox();
$response = $mb->describe('http://example.com/foo');

The best thing is that the new describe service supports etags for resource descriptions, which means that Moriarty’s new caching functions can really speed up applications that use describe heavily (and if you’re building open world applications, then you should be). The Platform’s SPARQL services don’t currently support etags, so caching is less efficient. To support efficient HTTP caching we’d need to determine whether the resultset has changed since the last time the client issued the query. The only way to do that in SPARQL is by executing the query which could be very cheap or it could be horrendously expensive. The describe service packages up a very common use of SPARQL into a constrained service that is very easy to relate to changes in the underlying graph. That means we can really optimise this service, provide decent caching support and generally boost performance a lot more easily than we can for arbitrary SPARQL queries. Expect us to expand on the describe service in future releases and also to bite off a few other constrained derivatives of SPARQL.

About Moriarty… Moriarty is a simple PHP library for accessing the Talis Platform. It follows the Platform API very closely and wraps up many common tasks into convenient classes while remaining very lightweight. It also provides some simple RDF classes that are based on the excellent ARC2 class library. Moriarty is primarily being developed by Ian Davis and is in continual alpha, subject to occasional rapid bursts of change. You can read more about Moriarty on the n² wiki and get its source from the n² subversion repository

Moriarty Version 1.0

Tonight Moriarty turns 1.0. We’re starting to use Moriarty more seriously within Talis so we need some discipline around its development. To help with this I’ve formally tagged the current version of Moriarty as 1.0. The intention is that all versions of Moriarty with the same major version number will be backwards compatible, so version 1.5 will be a drop-in replacement for 1.0. Version 2.0, however, might see us introduce some breaking changes. We’ll try to avoid that of course, but often it’s inevitable.

You can download the latest release: moriarty-1.0.tgz or you can check it out of subversion using http://n2.talis.com/svn/playground/iand/moriarty/tags/1.0/. The trunk is still the bleeding edge and can be found here: http://n2.talis.com/svn/playground/iand/moriarty/trunk/

Breaking Changes for Moriarty

As I alluded to earlier, I have made some breaking changes to Moriarty (now in subversion as revision 657). These changes are to the index structure used by SimpleGraph and make it compatible with the RDF/PHP Specification. Most of the effects will be internal to Moriarty, but some applications may be using the index directly via the get_index method.

I think these are the last breaking changes needed for the foreseeable future so this is probably going to be version 1.0. More on that and versioning policy in a while.

Specifically the changes to the index structure are:

  • The val key is renamed to value
  • The dt key is renamed to datatype
  • The type key now takes values of uri | bnode | literal instead of iri | bnode | literal
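
If your application holds data in the old index layout, the three renames listed above can be applied mechanically. The upgrade_index function below is a hypothetical helper, not something shipped with Moriarty, and it assumes the index is shaped as subject => predicate => list of object arrays.

```php
<?php
// Hypothetical converter from the old index layout to the new one.
function upgrade_index(array $old) {
    $renames = array('val' => 'value', 'dt' => 'datatype');
    $new = array();
    foreach ($old as $s => $props) {
        foreach ($props as $p => $objects) {
            foreach ($objects as $o) {
                $converted = array();
                foreach ($o as $key => $v) {
                    // Rename val and dt; copy any other key unchanged
                    $k = isset($renames[$key]) ? $renames[$key] : $key;
                    $converted[$k] = $v;
                }
                if (isset($converted['type']) && $converted['type'] === 'iri') {
                    $converted['type'] = 'uri';
                }
                $new[$s][$p][] = $converted;
            }
        }
    }
    return $new;
}

$old = array('http://example.com/foo' => array(
    'http://example.com/knows' => array(
        array('type' => 'iri', 'val' => 'http://example.com/bar'),
    ),
    'http://example.com/age' => array(
        array('type' => 'literal', 'val' => '42', 'dt' => 'http://www.w3.org/2001/XMLSchema#integer'),
    ),
));
$new = upgrade_index($old);
```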

Moriarty Facets

The latest batch of changes to Moriarty made it into subversion at the end of last week (svn revision 655). The main change is the addition of a new FacetService class. You use it in the usual way, either indirectly via the Store:

$store = new Store("http://api.talis.com/stores/mystore");
$fs = $store->get_facet_service();

Or directly if you know its URI:

$fs = new FacetService("http://api.talis.com/stores/mystore/service/facet");

Using the FacetService class is pretty simple: just call the facets method passing in the query, an array of fields to facet on and optionally the number of terms to return for each facet. As usual this method returns an HttpResponse:

$response = $fs->facets('query', array('field1','field2'));
if ($response->is_success()) {
  // do something useful
}
else {
  // mummy...
}

You can parse the XML response using the parse_facet_xml method which returns a nested array of data representing the facet data:

array (
  'field1' => array (
        0 => array ( 'value' => 'term1', 'number' => '5' ),
        1 => array ( 'value' => 'term2', 'number' => '4' ),
        2 => array ( 'value' => 'term3', 'number' => '2' ),
       ),
  'field2' => array (
        0 => array ( 'value' => 'term4', 'number' => '5' ),
        1 => array ( 'value' => 'term5', 'number' => '4' ),
        2 => array ( 'value' => 'term6', 'number' => '2' ),
       ),
)

If you like living dangerously then you can combine both the previous steps into one using facets_to_array. If an error occurs this method simply returns an empty array:

$facets = $fs->facets_to_array('query', array('field1','field2'));
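
Whichever route you take, the nested array is easy to walk. Here is a small sketch that turns it into display lines. The $facets data is made up to match the shape described above, so the snippet runs without a store.

```php
<?php
// Sample data in the shape described above; the terms and counts are invented.
$facets = array(
    'field1' => array(
        array('value' => 'term1', 'number' => '5'),
        array('value' => 'term2', 'number' => '4'),
    ),
);

$lines = array();
foreach ($facets as $field => $terms) {
    foreach ($terms as $term) {
        // The counts arrive as strings, so cast before formatting
        $lines[] = sprintf('%s: %s (%d)', $field, $term['value'], (int) $term['number']);
    }
}
echo implode("\n", $lines), "\n";
```

Because facets_to_array returns an empty array on error, the loops simply produce no lines in that case.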

That’s it. A simple class for a simple but powerful service. You can read more about the Facet Service on the n² wiki.

There are a couple of big changes that I want to make pretty soon and I’m giving a heads up here because they may not be backwards compatible. The version of ARC I’m using is quite out of date (January 2008) so I need to update to the latest version. I’m not sure what that will involve. Maybe it’ll be completely smooth with no significant changes needed.

The second change is needed to make SimpleGraph’s index compatible with our RDF/PHP specification. I can see at least one major breaking change: I need to rename the hash key “val” to “value”. That is a pretty major breakage but I want to make Moriarty compatible with the RDF/PHP spec and with ARC2. I’m going to try and do that very soon.

Ask Moriarty?

Another day, another incremental improvement to Moriarty (svn revision 490)! After my last set of changes I thought I’d better hurry up and add the copy_to function to the FieldPredicateMap too. You can now clone Field/Predicate Maps from one store to another:

  $fp = new FieldPredicateMap("http://api.talis.com/stores/mystore/config/fpmaps/1");
  $response = $fp->get_from_network();
  if ( $response->is_success() ) {
    $new_fp = $fp->copy_to("http://api.talis.com/stores/otherstore/config/fpmaps/1");
    $new_fp->put_to_network();
  }

I then set about thinking through my plan for adding HTTP caching support to Moriarty. I want this to work automatically and transparently, taking advantage of conditional GETs on the Platform. I’ll let it be switched off by defining a constant but I want it to be there by default so the developer gets the benefit without any effort.

I stubbed out some initial ideas for the HttpCache class on the train this morning. Then at lunchtime today, Danny pinged me on IRC wondering why Moriarty didn’t have SPARQL ASK support. “Not by design”, I said, “more by lack of time. But it should be easy to add, give me 15 minutes”. Then I promptly went into a series of meetings that ate the rest of my day. In the end the code did only take 15 minutes, but I finished it 11 hours later than I expected. Hopefully Danny didn’t spend all that time waiting for me to respond on IRC :-)

You can perform an ASK query on a store like this:

  $store = new Store("http://api.talis.com/stores/mystore");
  $sparql = $store->get_sparql_service();
  $response = $sparql->ask( "ASK WHERE { ?s a ?type . }" ); // is there any typed resource in the store?
  if ($response->is_success()) {
    $result = $sparql->parse_ask_results( $response->body);
  }

Enjoy, Danny!

Query Profiles in Moriarty

I just committed another batch of changes to Moriarty (svn revision 482). This version contains some important changes to the way classes are included (thanks to prompting by kwijibo on #talis over the weekend). Previously Moriarty assumed that your classes were in directories in your include path. Now Moriarty expects its classes to reside in the directory defined by MORIARTY_DIR. If this isn’t already defined then Moriarty will define it to be the same directory as that containing constants.inc.php. A similar constant MORIARTY_ARC_DIR defines the directory where Moriarty expects to find ARC2.php. If this isn’t set then it will assume ARC is in a sibling directory. Take a look at constants.inc.php for the logic.
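
The fallback logic described above can be sketched like this. It is an approximation for illustration and the real constants.inc.php may differ; in particular, the sibling directory name "arc" is an assumption.

```php
<?php
// Sketch of the fallback logic; the real constants.inc.php may differ.
if (!defined('MORIARTY_DIR')) {
    // Default to the directory containing this file
    define('MORIARTY_DIR', dirname(__FILE__) . DIRECTORY_SEPARATOR);
}
if (!defined('MORIARTY_ARC_DIR')) {
    // Assumes ARC lives in a sibling directory named "arc"
    define('MORIARTY_ARC_DIR', dirname(dirname(__FILE__)) . DIRECTORY_SEPARATOR . 'arc' . DIRECTORY_SEPARATOR);
}
```

Because each constant is only defined when it is missing, an application can override either location by defining it before including Moriarty.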

I also added support for query profiles which control the relative weights applied to each field in a text search. In Moriarty this class is a NetworkResource so you can easily populate the object by getting it from the network:

  $qp = new QueryProfile("http://api.talis.com/stores/mystore/config/queryprofiles/1");
  $response = $qp->get_from_network();
  if ( $response->is_success() ) {
    // do something with qp...
  }

Setting a query profile for a store is also quite easy. This example shows how to create a new query profile, set some field weights and then save it to the Platform:

  $qp = new QueryProfile("http://api.talis.com/stores/mystore/config/queryprofiles/1");
  $qp->add_field_weight('name', '2.0'); // the name field is twice as important as average
  $qp->add_field_weight('comments', '0.5'); // the comments field is half as important as average
  $response = $qp->put_to_network();
  if ( $response->is_success() ) {
    // do something with qp...
  }

You can also remove field weights and replace them with alternate ones:

  $qp = new QueryProfile("http://api.talis.com/stores/mystore/config/queryprofiles/1");
  $response = $qp->get_from_network();
  if ( $response->is_success() ) {
    $qp->remove_field_weight('comments');
    $qp->add_field_weight('comments', '3');
  }

Finally, I added a utility function to assist when copying a query profile from one store to another. It recalculates all the URIs so they apply to the new store rather than the old one. Here’s how you could use it to clone a query profile from one store to another:

  $qp = new QueryProfile("http://api.talis.com/stores/mystore/config/queryprofiles/1");
  $response = $qp->get_from_network();
  if ( $response->is_success() ) {
    $new_qp = $qp->copy_to("http://api.talis.com/stores/otherstore/config/queryprofiles/1");
    $new_qp->put_to_network();
  }
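
The URI recalculation amounts to swapping one store's base URI for another's. The rebase_uri function below is a hypothetical sketch of that idea, not the code behind copy_to.

```php
<?php
// Hypothetical helper: move a URI from one store's namespace to another's.
function rebase_uri($uri, $old_base, $new_base) {
    if (strpos($uri, $old_base) === 0) {
        return $new_base . substr($uri, strlen($old_base));
    }
    return $uri; // URIs outside the old store are left alone
}

// Trailing slashes keep "mystore" from also matching a store named "mystore2"
$old_base = 'http://api.talis.com/stores/mystore/';
$new_base = 'http://api.talis.com/stores/otherstore/';

$moved = rebase_uri('http://api.talis.com/stores/mystore/config/queryprofiles/1#name', $old_base, $new_base);
$other = rebase_uri('http://example.com/unrelated', $old_base, $new_base);
```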

You might be wondering why I chose the long method names get_from_network and put_to_network over shorter ones like load or save? The reason is that I strongly believe that it’s wrong to hide the network from the application. One of Moriarty’s principles is that it is the thinnest wrapper around HTTP that is possible. That’s why many network operations return the actual response object from the HTTP interaction. The developer can then inspect any headers that the server sends. Naming these methods explicitly reminds the developer that these are network operations and not local ones. The developer needs to be aware because networked applications need to be written differently to those operating on a single machine. Networks have latency, so it’s not wise to be calling these methods a thousand times a second and they are unreliable so the developer needs to be able to handle failure gracefully and be prepared to retry (these are a couple of the 8 Fallacies of Distributed Computing). Moriarty doesn’t try to hide these issues from the developer.
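
Since the response object is handed back to you, handling failure is a matter of checking is_success and deciding what to do next. The sketch below shows one way to retry a flaky operation. StubResponse and with_retries are hypothetical names invented for this example; only the is_success method mirrors what Moriarty's responses offer.

```php
<?php
// Minimal stand-in for a response object exposing is_success()
class StubResponse {
    public $status_code;
    public function __construct($status_code) { $this->status_code = $status_code; }
    public function is_success() { return $this->status_code >= 200 && $this->status_code < 300; }
}

// Hypothetical helper: call $operation until it succeeds or attempts run out.
function with_retries($operation, $max_attempts, $delay_seconds = 0) {
    $response = null;
    for ($attempt = 1; $attempt <= $max_attempts; $attempt++) {
        $response = call_user_func($operation);
        if ($response->is_success()) {
            break;
        }
        if ($attempt < $max_attempts && $delay_seconds > 0) {
            sleep($delay_seconds); // give the network a moment before retrying
        }
    }
    return $response; // the caller still inspects the final response
}

// Simulated flaky operation: fails twice, then succeeds
$calls = 0;
$flaky = function () use (&$calls) {
    $calls++;
    return new StubResponse($calls < 3 ? 503 : 200);
};
$response = with_retries($flaky, 5);
```

In real code the closure would wrap a call such as get_from_network. Retrying is safest for idempotent operations like GET and PUT.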

I also added query profile support to the Config class. The get_first_query_profile method is guaranteed to get you the query profile of your store, regardless of its URI. As explained in the FAQ, query profiles can exist in a number of locations depending on the store configuration. I worked out the logic for every existing store on the Platform, so this code will always get your query profile:

  $store = new Store("http://api.talis.com/stores/mystore");
  $config = $store->get_config();
  $qp = $config->get_first_query_profile();

If you just want the query profile URI, you can call $config->get_first_query_profile_uri().

Store Groups in Moriarty

I just committed the latest batch of changes to Moriarty (svn revision 440) which introduces a new StoreGroup class. I also introduced a new NetworkResource class that contains simple logic for fetching, updating and deleting resources on the Platform (using GET, PUT and DELETE).

Moriarty is a simple PHP library for accessing the Talis Platform. It follows the Platform API very closely and wraps up many common tasks into convenient classes while remaining very lightweight. It also provides some simple RDF classes that are based on the excellent ARC2 class library. Moriarty is primarily being developed by Ian Davis and is in continual alpha, subject to occasional rapid bursts of change. You can read more about Moriarty on the n² wiki and get its source from the n² subversion repository.

Store groups are a new Platform feature that we hope to deploy to live in Release 10. I wrote this code to help me perform the acceptance testing on our internal release candidate. A store group is an aggregate view of up to five stores. They present the same contentbox search and sparql services that a normal store does but the results of each type of searching are drawn from the members of the group. Because the group API is the same as that provided by stores you should just be able to point your application code for store searching at a group and get the benefit of cross searching.

You don’t normally need any special permissions to use a store group but if any of the member stores are secured you’ll be asked to authenticate. Note that creation of store groups is currently restricted to Talis Live Services staff but, like store creation, we hope to open this up for general availability in due course.