Subscribe

Importing Large RDF Documents: Streaming Parsing of RDF/XML with ARC2

A common trouble when parsing RDF is running out of memory because the document is too large. ARC2 solves this problem (for RDF/XML) by being able to stream it.

If you want to take advantage of the streaming, you just need to extend the ARC2_RDFXMLParser class and overwrite the addT method:

<?php
require 'arc/ARC2.php';
require 'arc/parsers/ARC2_RDFXMLParser.php'; 

class Streamer extends ARC2_RDFXMLParser { 

	function addT($s, $p, $o, $s_type, $o_type, $o_dt = '', $o_lang = ''){
		var_dump($s, $p, $o, $s_type, $o_type, $o_dt, $o_lang);
	} 

} 

$p = new Streamer(); 

$p->parse('big-data.rdf'); 

?>

In this simple example, I’m just var_dumping out the triples as they come in, but of course you should do whatever it is you want to do instead to the triple in that method.