<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
	>
<channel>
	<title>Comments on: Data Migration using SPARQL and Changesets</title>
	<atom:link href="http://blogs.talis.com/n2/archives/659/feed" rel="self" type="application/rss+xml" />
	<link>http://blogs.talis.com/n2/archives/659</link>
	<description>All about developing with the Talis Platform</description>
	<lastBuildDate>Wed, 21 Dec 2011 19:51:10 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: kwijibo</title>
		<link>http://blogs.talis.com/n2/archives/659/comment-page-1#comment-1025</link>
		<dc:creator>kwijibo</dc:creator>
		<pubDate>Fri, 10 Jul 2009 10:53:30 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.talis.com/n2/?p=659#comment-1025</guid>
		<description>Interesting points. Rewriting the incoming queries could provide some backwards compatibility - and I guess, if the query is a CONSTRUCT or a DESCRIBE, you ought to also rewrite the RDF on the way out again, otherwise the client code could still break. But maybe this should be for special cases, where there are specific breakages you want to avoid while still migrating. I&#039;d side in general with the view that the client is responsible for coping with the data it gets, and the server is responsible for a stable API, though not necessarily stable data. But it&#039;s pretty likely there will also be special cases where backwards-compatible data is important too, and there rewriting queries and data is a good solution (and what about when the data is retrieved by dereferencing LOD URIs rather than through SPARQL? What should happen then?).

You could also leave the old triples in the store instead of rewriting, though that has potential drawbacks: it bloats both your store and your response documents (you could possibly filter the old triples out of the RDF you return unless they are specified in the SPARQL query). In some cases having both old and new triples might make your data wrong or inconsistent.

The backwards-compatible store idea could be OK if the data is fairly static - you could sniff the SPARQL query for old triple patterns and redirect to the appropriate endpoint maybe?

There are lots of different ways you might change your data too - it could be a typo or switching vocab terms, or it could be a radically different modeling; in which case, it would be harder for the client to anticipate the change (and might even require more extensive rewriting of the client application).


Assuming that it is the client&#039;s responsibility to adjust to changes in data pulled down from the wild web, there are a couple of things that I think might help:

1. A way of bundling up equivalent terms / graph patterns for specific situations. Maybe some collections of owl:sameAs and rdfs:subPropertyOf statements and the like, and a bit of reasoning, could work, but maybe a better solution would be something like profiles, where terms/graph patterns are stated to be functionally substitutable for a given task, e.g. here are a bunch of predicate URIs you can look through if you want to display something as a label, or here are a bunch of things you can try if you want to show an image associated with a (non-IR) resource; or (trickier) here&#039;s a bunch of patterns you can try if you want to find tags/taggings. These profiles (for want of a better word) could be kept up to date and served up from somewhere, and the client application could retrieve and cache them - so then the client could be vocabulary agnostic, and cope with evolutions in data modeling, provided the profiles are kept up to date.

2. Ways for the server to hint to clients when a data migration has taken place - and what to do about it. I suppose the lightest-weight option is to include old and new triples in the response for a while and hope the client picks up on it. The server could include extra metadata in the response document, or in its own (RDF, of course) service description, or dataset description (voiD provides terms to say what vocabularies a dataset is using, for example).</description>
		<content:encoded><![CDATA[<p>Interesting points. Rewriting the incoming queries could provide some backwards compatibility &#8211; and I guess, if the query is a CONSTRUCT or a DESCRIBE, you ought to also rewrite the RDF on the way out again, otherwise the client code could still break. But maybe this should be for special cases, where there are specific breakages you want to avoid while still migrating. I&#8217;d side in general with the view that the client is responsible for coping with the data it gets, and the server is responsible for a stable API, though not necessarily stable data. But it&#8217;s pretty likely there will also be special cases where backwards-compatible data is important too, and there rewriting queries and data is a good solution (and what about when the data is retrieved by dereferencing LOD URIs rather than through SPARQL? What should happen then?).</p>
<p>You could also leave the old triples in the store instead of rewriting, though that has potential drawbacks: it bloats both your store and your response documents (you could possibly filter the old triples out of the RDF you return unless they are specified in the SPARQL query). In some cases having both old and new triples might make your data wrong or inconsistent.</p>
<p>The backwards-compatible store idea could be OK if the data is fairly static &#8211; you could sniff the SPARQL query for old triple patterns and redirect to the appropriate endpoint maybe?</p>
<p>There are lots of different ways you might change your data too &#8211; it could be a typo or switching vocab terms, or it could be a radically different modeling; in which case, it would be harder for the client to anticipate the change (and might even require more extensive rewriting of the client application).</p>
<p>Assuming that it is the client&#8217;s responsibility to adjust to changes in data pulled down from the wild web, there are a couple of things that I think might help:</p>
<p>1. A way of bundling up equivalent terms / graph patterns for specific situations. Maybe some collections of owl:sameAs and rdfs:subPropertyOf statements and the like, and a bit of reasoning, could work, but maybe a better solution would be something like profiles, where terms/graph patterns are stated to be functionally substitutable for a given task, e.g. here are a bunch of predicate URIs you can look through if you want to display something as a label, or here are a bunch of things you can try if you want to show an image associated with a (non-IR) resource; or (trickier) here&#8217;s a bunch of patterns you can try if you want to find tags/taggings. These profiles (for want of a better word) could be kept up to date and served up from somewhere, and the client application could retrieve and cache them &#8211; so then the client could be vocabulary agnostic, and cope with evolutions in data modeling, provided the profiles are kept up to date.</p>
<p>2. Ways for the server to hint to clients when a data migration has taken place &#8211; and what to do about it. I suppose the lightest-weight option is to include old and new triples in the response for a while and hope the client picks up on it. The server could include extra metadata in the response document, or in its own (RDF, of course) service description, or dataset description (voiD provides terms to say what vocabularies a dataset is using, for example).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ian Davis</title>
		<link>http://blogs.talis.com/n2/archives/659/comment-page-1#comment-1022</link>
		<dc:creator>Ian Davis</dc:creator>
		<pubDate>Fri, 10 Jul 2009 09:34:21 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.talis.com/n2/?p=659#comment-1022</guid>
		<description>Dan, I think you could do rewriting of queries, or some OWL might help. But practically speaking I would just leave the original triples there for a while. This does raise the question of which side should bear the cost of changing data models: the client or the server. There&#039;s an argument to say that since clients are dealing with arbitrary data, they should be flexible in what they look for in a dataset. I have some notes on &quot;open world&quot; development that I really need to write up as a blog post....</description>
		<content:encoded><![CDATA[<p>Dan, I think you could do rewriting of queries, or some OWL might help. But practically speaking I would just leave the original triples there for a while. This does raise the question of which side should bear the cost of changing data models: the client or the server. There&#8217;s an argument to say that since clients are dealing with arbitrary data, they should be flexible in what they look for in a dataset. I have some notes on &#8220;open world&#8221; development that I really need to write up as a blog post&#8230;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Elliot Smith</title>
		<link>http://blogs.talis.com/n2/archives/659/comment-page-1#comment-1019</link>
		<dc:creator>Elliot Smith</dc:creator>
		<pubDate>Fri, 10 Jul 2009 09:28:05 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.talis.com/n2/?p=659#comment-1019</guid>
		<description>Nice one Keith, could come in handy!</description>
		<content:encoded><![CDATA[<p>Nice one Keith, could come in handy!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dan Brickley</title>
		<link>http://blogs.talis.com/n2/archives/659/comment-page-1#comment-1016</link>
		<dc:creator>Dan Brickley</dc:creator>
		<pubDate>Fri, 10 Jul 2009 09:15:53 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.talis.com/n2/?p=659#comment-1016</guid>
		<description>Thanks for writing this up. Nice and practical :)

Any thoughts on how to deal with code &quot;out there&quot; which might be using the old graph patterns? Can a back-compatible extra store (or named graph) be generated in similar fashion? But I guess SPARQL endpoint details would need to change too. Have you looked into the pros/cons of rewriting incoming queries?</description>
		<content:encoded><![CDATA[<p>Thanks for writing this up. Nice and practical <img src='http://blogs.talis.com/n2/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Any thoughts on how to deal with code &#8220;out there&#8221; which might be using the old graph patterns? Can a back-compatible extra store (or named graph) be generated in similar fashion? But I guess SPARQL endpoint details would need to change too. Have you looked into the pros/cons of rewriting incoming queries?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

