Subscribe

Visualising BBC Programme Categories

Whilst I was exploring the BBC programmes data looking for possible demonstration applications I thought it might be interesting to try and create a visualisation of the relationships between different categories of BBC programmes The BBC datasets use SKOS as a categorization scheme, with separate taxonomies for formats (e.g. documentaries, animation, etc) and genres (e.g. childrens programmes, science fiction, etc). If you poke around a little, you can also see a nascent category system for places and people, although there doesn’t seem to be much data there at present (and what is there seems to change regularly).

For my purposes, the genre classifications looked most interesting. Episodes are associated with their genre category via the po:category property. As I was interested in finding relationships between genres, what I was looking for was a way to relate together individual categories, other than by the obvious super/sub-category relationship.

It occured to me that if two categories were associated with the same episode, then this could be viewed as a declaration of some implicit relationship between the categories. Extracting this in SPARQL is straight-forward, as we just need to match episodes that have more than one category:


SELECT ?categoryLabel ?relatedLabel WHERE
{
  ?episode a po:Episode;
    po:category ?category;
    po:category ?related. 

  ?category a po:Genre;
    rdfs:label ?categoryLabel. 

  ?related a po:Genre;
    rdfs:label ?relatedLabel. 

  FILTER (?category != ?related)
}
ORDER BY ?categoryLabel

In the above SPARQL query we match any episode that has at least two categories (because we use two po:category patterns), and where those categories are different (in the FILTER). This excludes the unwanted result where the ?category and ?related variables are bound to the same value. I didn’t bother with pruning out duplicates as this could easily be done on the client-side.

In order to visualise the results, I decided to use MooWheel. This provides a simple Javascript visualisation toolkit for presenting connections between a set of resources. MooWheel can be configured using a JSON data structure, so generating a a MooWheel visualisation from a SPARQL query is relatively straight-forward: the query results can be retrieved as SPARQL/JSON which can then be massaged into the appropriate JSON structure to generate the MooWheel visualisation. Check out the source code of the demonstration for sample code to do this (look at the success callback).

My first attempt at a visualisation simple executed the above query across the entire BBC dataset. This generated a huge wheel of connections between the categories, but ultimately the visualisation wasn’t that useful. So I decided to refine the visualisation to generate separate category wheels for each of the main BBC TV channels. This involved refining the SPARQL query to include an extra triple pattern to limit Episodes to just those associated with a specific channel (po:masterbrand). The following revised query restricts results to BBC 1:


SELECT ?categoryLabel ?relatedLabel WHERE
{
  ?episode a po:Episode;
    po:masterbrand ;
    po:category ?category;
    po:category ?related. 

  ?category a po:Genre;
    rdfs:label ?categoryLabel. 

  ?related a po:Genre;
    rdfs:label ?relatedLabel. 

  FILTER (?category != ?related)
}
ORDER BY ?categoryLabel

The results of this visualisation is much more interesting.

Each of the BBC channels has a different range of programming and this emphasis is really clear in the visualisation. Compare for example BBC 1 and BBC 3, or either with BBC 4. For those of us in the UK who have already internalised this, there may not be a great deal of new information here, but its nice to see how this feature of the dataset can be easily surfaced with very little effort. There’s more analysis that could be done here though, particularly if the BBC open up their programme archives. For example, how do the range of programme categories for a channel change over time? Which programmes actually link the different categories together? Could other visualisations provide more insight into the programming than a simple relationship wheel? For example, could a treemap style visualisation give some indication of the amount of schedule time devoted to a particular category of programme?

Why not see what you can come up with?

One Response

  1. goatslayer2001 Says:

    Hi,

    The visualization seems to have disappeared, is it possible to see it returned it was a striking demonstration of the use of linked data.

    Thanks,

    rs

Leave a Reply