On 21 May 2009, at 17:30, Frederick Giasson wrote:

Hi Daniel,

I've been using the ARC PHP libraries to query my local Virtuoso SPARQL endpoint. While this works fine for small amounts of data, the memory usage of paging through hundreds of pages of results is too much for my PHP process to handle.

Is there a better way to do SPARQL querying against a local Virtuoso than using ARC?

Feel free to tell me to RTFM, but I'd appreciate any thoughts you might have.

Thanks,

Dan


Dan,

What's the configuration of your machine? Basically, how much RAM is in place?

BTW - Have you looked at the Virtuoso tuning guide?

Links:

1. http://docs.openlinksw.com/virtuoso/rdfperformancetuning.html

I think this is related to ARC, PHP, or the SPARQL query he sends, and not to the performance of the Virtuoso data store.


Daniel: make sure that the paging is handled by the SPARQL query itself, using LIMIT and OFFSET. Otherwise, a very large number of triples can be returned, and PHP can run out of memory if ARC creates too many objects.

So, if you are performing the paging in SPARQL, only a small amount of data is loaded into PHP objects (ARC) at a time, and that stays usable.
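
To make the LIMIT/OFFSET idea concrete, here is a rough sketch, written in Python rather than PHP just to keep it short. The names page_query, iter_rows and run_query are made up for illustration and are not part of ARC or Virtuoso; run_query stands in for whatever function sends a query string to the endpoint and returns a list of rows. Note that the query should carry an ORDER BY, since SPARQL does not promise a stable order across pages without one.

```python
from typing import Callable, Dict, Iterator, List


def page_query(base_query: str, page_size: int, page: int) -> str:
    """Append LIMIT/OFFSET so the endpoint does the paging, not the client."""
    offset = page * page_size
    return f"{base_query}\nLIMIT {page_size}\nOFFSET {offset}"


def iter_rows(
    base_query: str,
    page_size: int,
    run_query: Callable[[str], List[Dict]],  # hypothetical: sends the query, returns rows
) -> Iterator[Dict]:
    """Yield rows one page at a time; only one page is held in memory at once."""
    page = 0
    while True:
        rows = run_query(page_query(base_query, page_size, page))
        for row in rows:
            yield row
        # A short (or empty) page means there is nothing left to fetch.
        if len(rows) < page_size:
            return
        page += 1
```

Because iter_rows is a generator, each page can be processed and dropped before the next one is requested, so the client-side footprint should stay at roughly one page of results, assuming the client library itself releases what it allocates.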


Does this answer your question?


Hi Fred,

You're right that it's a PHP issue - Virtuoso is returning results quickly.

The problem is that even though PHP has a max memory of 4GB (out of the machine's 8GB) it still grinds to a halt and gets killed by the kernel after a number of hours of processing results.

I'm using LIMIT and OFFSET in the SPARQL queries.

As far as I can gather from looking at memory usage with xdebug, something that ARC uses (possibly the XML parser) is leaking memory on every iteration, and this is adding up.

Is there a better way to query Virtuoso (with SPARQL queries) locally, rather than having to use the HTTP/XML endpoint?



Thanks,

Dan


--
Daniel Alexander Smith

IAM Group
School of Electronics and Computer Science
University of Southampton
das...@ecs.soton.ac.uk



