SPARQLing data.gov.uk: Edubase Data
Last week the Cabinet Office issued a call for Open Data Developers to sign up for a preview of the forthcoming UK Government public data website. The site includes a directory of existing datasets plus a growing number of datasets that have been converted to RDF and which will shortly be available as Linked Data. This data is being stored in the Talis Platform, which gives developers access to SPARQL endpoints as a means to query the data; we’ll also be including search and other access mechanisms at a later date.
In this series of postings I wanted to show some example SPARQL queries that can be used to access the data. If you’re new to SPARQL then you might want to look at Lee Feigenbaum’s SPARQL by Example tutorial, or my own short slide deck that covers all the basic syntax.
The first dataset I wanted to highlight is an extract of the Edubase dataset available from the Department for Children, Schools and Families. The conversion was carried out by the team at HP Labs and the converted data has been loaded into a Talis Platform store. The public-facing SPARQL endpoint is available at http://services.data.gov.uk/education/sparql.
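If you would rather run the queries from a script than paste them into a form, here is a minimal Python sketch of how a query might be sent to that endpoint over HTTP. It assumes the endpoint accepts a standard SPARQL Protocol GET request with a `query` parameter and will return the SPARQL JSON results format when asked via the `Accept` header; I have not confirmed either against the live service. The function names are my own, not part of any library.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint URL as given in this post.
ENDPOINT = "http://services.data.gov.uk/education/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request for the given query string."""
    url = endpoint + "?" + urlencode({"query": query})
    # Assumption: the store honours content negotiation for JSON results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_bindings(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(payload)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return its rows (requires network access)."""
    with urlopen(build_request(query, endpoint)) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Variables left unbound by an OPTIONAL clause are simply absent from a row's dict, so use `row.get("address1")` rather than indexing when reading optional values.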
Here are some sample SPARQL queries you can use against the data:
#1. Select the names of schools in the Administrative District of the City of London
# Ordering results by name of the school
prefix sch-ont: <http://education.data.gov.uk/def/school/>
SELECT ?name WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name;
sch-ont:districtAdministrative
<http://statistics.data.gov.uk/id/local-authority-district/00AA> .
}
ORDER BY ?name
#2. Which schools in the Bath & North East Somerset (BANES) district have a nursery?
prefix sch-ont: <http://education.data.gov.uk/def/school/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?name WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name;
sch-ont:districtAdministrative
<http://statistics.data.gov.uk/id/local-authority-district/00HA> ;
sch-ont:nurseryProvision "true"^^xsd:boolean
}
ORDER BY ?name
#3. Select the names and addresses of schools in the Administrative District of the City of London
# Ordering results by name of the school
# Note: we use OPTIONAL here as not every school has an address listed in the data
prefix sch-ont: <http://education.data.gov.uk/def/school/>
SELECT ?name ?address1 ?address2 ?postcode ?town WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name;
sch-ont:districtAdministrative
<http://statistics.data.gov.uk/id/local-authority-district/00AA> .
OPTIONAL {
?school sch-ont:address ?address .
?address sch-ont:address1 ?address1 ;
sch-ont:address2 ?address2 ;
sch-ont:postcode ?postcode ;
sch-ont:town ?town .
}
}
ORDER BY ?name
#4. Select the name, lowest and highest age ranges, capacity and pupil:teacher ratio
# for all schools in the Bath & North East Somerset district
# Again we use OPTIONAL to allow for missing data items.
prefix sch-ont: <http://education.data.gov.uk/def/school/>
SELECT ?name ?lowage ?highage ?capacity ?ratio WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name;
sch-ont:districtAdministrative
<http://statistics.data.gov.uk/id/local-authority-district/00HA> .
OPTIONAL {
?school sch-ont:statutoryLowAge ?lowage .
}
OPTIONAL {
?school sch-ont:statutoryHighAge ?highage .
}
OPTIONAL {
?school sch-ont:schoolCapacity ?capacity .
}
OPTIONAL {
?school sch-ont:pupilTeacherRatio ?ratio .
}
}
ORDER BY ?name
#5. What is the URI, name, and opening date of the oldest school in the UK?
prefix sch-ont: <http://education.data.gov.uk/def/school/>
SELECT ?school ?name ?date WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name;
sch-ont:openDate ?date.
}
ORDER BY ASC(?date)
LIMIT 1
#6. Select the name, easting and northing for the 100 newest schools in the UK.
# Can be used to plot them on a map
prefix sch-ont: <http://education.data.gov.uk/def/school/>
SELECT ?school ?name ?date ?easting ?northing WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name;
sch-ont:openDate ?date ;
sch-ont:easting ?easting ;
sch-ont:northing ?northing .
}
ORDER BY DESC(?date)
LIMIT 100
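As a rough illustration of the "plot them on a map" step, here is a small Python sketch that turns result rows into CSV text that a mapping tool could load. It assumes each row is a plain dict keyed by variable name (`name`, `easting`, `northing`) with string values, and that the coordinates parse as numbers; neither is guaranteed by the data. The eastings and northings are presumably Ordnance Survey grid references, so most web maps would need them converted to latitude and longitude first, which this sketch does not attempt.

```python
import csv
import io


def rows_to_csv(rows):
    """Write name/easting/northing rows to CSV text.

    Rows with missing or non-numeric coordinates are skipped.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "easting", "northing"])
    for row in rows:
        try:
            name = row["name"]
            easting = float(row["easting"])
            northing = float(row["northing"])
        except (KeyError, ValueError):
            continue  # incomplete or malformed row
        writer.writerow([name, easting, northing])
    return out.getvalue()
```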
#7. Select the URI, name, easting and northing for all schools opened in 2008
prefix sch-ont: <http://education.data.gov.uk/def/school/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?school ?name ?date ?easting ?northing WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name;
sch-ont:openDate ?date ;
sch-ont:easting ?easting ;
sch-ont:northing ?northing .
FILTER (?date >= "2008-01-01"^^xsd:date && ?date < "2009-01-01"^^xsd:date)
}
#8. Select the URI, name, and the reason for closing for all schools that are currently
# scheduled for closure. The reason is a URI from a controlled vocabulary in the ontology.
prefix sch-ont: <http://education.data.gov.uk/def/school/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?school ?name ?reason WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name ;
sch-ont:establishmentStatus sch-ont:EstablishmentStatus_Open_but_proposed_to_close ;
sch-ont:reasonEstablishmentClosed ?reason .
}
#9. In which parliamentary constituencies did schools close in 2008?
prefix sch-ont: <http://education.data.gov.uk/def/school/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?cons ?label WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name ;
sch-ont:establishmentStatus sch-ont:EstablishmentStatus_Closed ;
sch-ont:closeDate ?date ;
sch-ont:parliamentaryConstituency ?cons .
?cons rdfs:label ?label.
FILTER (?date >= "2008-01-01"^^xsd:date && ?date < "2009-01-01"^^xsd:date)
}
ORDER BY ?cons
#10. In which parliamentary constituencies did schools open in 2008?
prefix sch-ont: <http://education.data.gov.uk/def/school/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?cons ?label WHERE {
?school a sch-ont:School;
sch-ont:establishmentName ?name ;
sch-ont:openDate ?date ;
sch-ont:parliamentaryConstituency ?cons .
?cons rdfs:label ?label.
FILTER (?date >= "2008-01-01"^^xsd:date && ?date < "2009-01-01"^^xsd:date)
}
ORDER BY ?cons
Hopefully that’s enough to get you started. If you want a bit more background on the modelling and a look at the ontology, then read this posting to the uk-government-data mailing list by Stuart Williams.
Note: updated 16 Nov 2009 to reflect changes to the Edubase data. The first version of this dataset was created before the proposed guidelines for public sector URIs were published. The school ontology used in that first dataset had a URI of http://education.data.gov.uk/ontology/school# which has now been replaced with http://education.data.gov.uk/def/school/. Also, the URIs for administrative districts were temporary placeholders containing the phrase “placeholder-id” in their path. These have now been updated to URIs based on the Office for National Statistics district codes, for example http://statistics.data.gov.uk/id/local-authority-district/00AA.
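If you saved queries written against the first version of the dataset, a small Python sketch like the following could help bring them up to date. It swaps the old ontology namespace for the new one and flags any query that still mentions a placeholder district URI. The function names are hypothetical, and the placeholder check is just a substring test: the replacement district URI has to be looked up by hand, since the old placeholder paths are not documented here.

```python
# Namespaces as described in the update note above.
OLD_NS = "http://education.data.gov.uk/ontology/school#"
NEW_NS = "http://education.data.gov.uk/def/school/"


def migrate_query(query):
    """Rewrite the old school ontology namespace to the new one."""
    return query.replace(OLD_NS, NEW_NS)


def has_placeholder_district(query):
    """Report whether the query still refers to a placeholder district URI."""
    return "placeholder-id" in query
```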

