SPARQL 1.1 Early Access Features
In yesterday’s monthly Talis Platform release we started rolling out early access support for the SPARQL 1.1 query language. We’ve been monitoring the development of SPARQL extensions for some time, and have been watching the Working Group’s activity to get a feel for which new features will be included in the forthcoming revision of the language. For those of you interested in some background, Lee Feigenbaum has a nice presentation that summarizes the Working Group’s current thinking.
One major feature missing from SPARQL 1.0 was support for aggregates, i.e. the ability to count, sum and group results. These features have already been implemented by a number of triple stores, and this work will be standardised as part of SPARQL 1.1. Given our confidence that this feature will be added to the specification, the existing implementation experience, and customer feedback, we have decided to release early access support for these specific features as an experimental enhancement to the Platform SPARQL endpoint.
The documentation on the developer wiki has been updated to start to itemize the supported SPARQL extensions.
Users should be aware that the syntax of the extensions may change, as we’ll be tracking the progress of the Working Group as it clarifies the specification of these features for inclusion in the standard. We’ll provide notice of any expected changes.
Users should also be aware that, while the basic functionality of aggregates is supported in a number of other implementations, care should be taken if queries are intended to be portable across different triple stores and/or services. For example, the Talis Platform hosts mirrors of some other datasets, and a query that uses the new functionality may not run against the original services, either because the basic feature is not supported there or because of minor syntactic differences.
With the warnings out of the way, here are some simple examples of the extensions in practice. The first query uses the BBC programmes and music data hosted in the Platform, and asks for the number of albums released by The Prodigy. The query uses the count() function to count the album titles. The result of the count is assigned to a variable called ?count in the SELECT clause using the new “SELECT expression” syntax.
#How many albums have been released by The Prodigy?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX rel: <http://purl.org/vocab/relationship/>
PREFIX rev: <http://purl.org/stuff/rev#>
SELECT (count(?title) as ?count) WHERE {
  ?group a mo:MusicGroup ;
         foaf:name "The Prodigy" ;
         foaf:made ?album .
  ?album dc:title ?title .
}
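Because the query counts ?title bindings, an album recorded with more than one dc:title would be counted more than once. A minimal sketch of a variant that counts distinct albums instead is given here. It assumes that the endpoint accepts DISTINCT inside count(), which is part of the SPARQL 1.1 aggregate syntax but is an assumption about this early access implementation rather than something confirmed above.

```sparql
# Sketch: count distinct albums rather than title bindings.
# Assumes count(DISTINCT ...) is accepted by the endpoint.
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT (count(DISTINCT ?album) as ?count) WHERE {
  ?group a mo:MusicGroup ;
         foaf:name "The Prodigy" ;
         foaf:made ?album .
  # Kept so that the same albums match as in the query above
  ?album dc:title ?title .
}
```

If every album has exactly one title, both queries should return the same number.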
The second example is a variant of one of the example queries that can be used against the Edubase data. In this case the query retrieves the number of schools closed in each parliamentary constituency in 2008, sorted in descending order of the number of closures. The new GROUP BY keyword is used to group the results by the label of the constituency, so that count() produces one total per label.
#How many schools closed in each parliamentary constituency in 2008?
#In descending order of number of closures
prefix sch-ont: <http://education.data.gov.uk/ontology/school#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label (count(?school) as ?count) WHERE {
  ?school a sch-ont:School ;
          sch-ont:establishmentName ?name ;
          sch-ont:establishmentStatus sch-ont:EstablishmentStatus_Closed ;
          sch-ont:closeDate ?date ;
          sch-ont:parliamentaryConstituency ?cons .
  ?cons rdfs:label ?label .
  FILTER (?date >= "2008-01-01"^^xsd:date && ?date < "2009-01-01"^^xsd:date)
}
GROUP BY ?label
ORDER BY DESC(?count)
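Grouping by label alone would merge two constituencies that happen to share the same label. A hedged sketch of a variant that groups by the constituency resource as well as its label follows. It reuses the prefixes from the query above and assumes that the endpoint accepts more than one variable in GROUP BY, which SPARQL 1.1 allows but which is not confirmed above for the early access implementation.

```sparql
# Sketch: group by the constituency URI as well as its label.
# Assumes multiple GROUP BY variables are accepted by the endpoint.
prefix sch-ont: <http://education.data.gov.uk/ontology/school#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cons ?label (count(?school) as ?count) WHERE {
  ?school a sch-ont:School ;
          sch-ont:establishmentName ?name ;
          sch-ont:establishmentStatus sch-ont:EstablishmentStatus_Closed ;
          sch-ont:closeDate ?date ;
          sch-ont:parliamentaryConstituency ?cons .
  ?cons rdfs:label ?label .
  # >= so that closures on 1 January 2008 are included
  FILTER (?date >= "2008-01-01"^^xsd:date && ?date < "2009-01-01"^^xsd:date)
}
GROUP BY ?cons ?label
ORDER BY DESC(?count)
```

Every non-aggregated variable in the SELECT clause also appears in GROUP BY, which is what the SPARQL 1.1 aggregate rules require.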
We can revise this query to include only those constituencies in which at least 10 schools have closed. To do this we need to keep just those groups where the count is equal to or greater than 10. The new HAVING keyword applies a filter expression to the grouped results, after the aggregates have been calculated and before the results are returned:
prefix sch-ont: <http://education.data.gov.uk/def/school/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label (count(?school) as ?count) WHERE {
  ?school a sch-ont:School ;
          sch-ont:establishmentName ?name ;
          sch-ont:establishmentStatus sch-ont:EstablishmentStatus_Closed ;
          sch-ont:closeDate ?date ;
          sch-ont:parliamentaryConstituency ?cons .
  ?cons rdfs:label ?label .
  FILTER (?date >= "2008-01-01"^^xsd:date && ?date < "2009-01-01"^^xsd:date)
}
GROUP BY ?label
HAVING (?count >= 10)
ORDER BY DESC(?count)
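With portability in mind, it may be safer not to rely on the ?count alias inside HAVING, since some engines may not bring a SELECT alias into scope there. A minimal sketch of the same query with the aggregate expression repeated in the HAVING clause is shown here. Whether this endpoint accepts both forms is an assumption, not something stated above.

```sparql
# Sketch: repeat the aggregate in HAVING instead of using the ?count alias.
# Assumes the endpoint accepts an aggregate expression inside HAVING.
prefix sch-ont: <http://education.data.gov.uk/def/school/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label (count(?school) as ?count) WHERE {
  ?school a sch-ont:School ;
          sch-ont:establishmentName ?name ;
          sch-ont:establishmentStatus sch-ont:EstablishmentStatus_Closed ;
          sch-ont:closeDate ?date ;
          sch-ont:parliamentaryConstituency ?cons .
  ?cons rdfs:label ?label .
  # >= so that closures on 1 January 2008 are included
  FILTER (?date >= "2008-01-01"^^xsd:date && ?date < "2009-01-01"^^xsd:date)
}
GROUP BY ?label
HAVING (count(?school) >= 10)
ORDER BY DESC(?count)
```

The threshold is applied per group, so a constituency with nine closures is dropped entirely rather than appearing with a smaller count.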
The SPARQL extensions page includes a few more examples of the syntax and a list of the operators now supported in the extended query language. If you have any feedback or questions, please leave a comment below.

