Hi,

I developed some machine learning algorithms that I'd like to run against 
various datasets.
Many of them provide Virtuoso-powered SPARQL endpoints online, but running my 
algorithms against them would certainly not be considered "fair use".

Some datasets provide dumps, so I'm able to play nice: load the dumps into a 
local Virtuoso instance and torture that local instance with my algorithms.

How can I do something similar when no dump is available for download, 
but only a SPARQL endpoint?

I was thinking about issuing a `construct where { ?s ?p ?o } limit X offset Y` 
and stepping through the endpoint like that once, but the larger the offset, 
the slower the response. For example:

http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&qtxt=select+*+where+{%3Fs+%3Fp+%3Fo.}+limit+10000+offset+400020000&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on
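
To make the stepping concrete, here is a rough sketch of the paging I have in 
mind. It only builds the request URLs and does not contact the endpoint. The 
helper names `page_query` and `page_urls` are made up for illustration, and I'm 
assuming the standard SPARQL protocol `query` parameter rather than the `qtxt` 
form field in the URL above. Since the query has no ORDER BY, nothing here 
guarantees a stable order between pages.

```python
from urllib.parse import urlencode

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint taken from the URL above
GRAPH = "http://dbpedia.org"            # default graph taken from the URL above


def page_query(limit, offset):
    """Return the CONSTRUCT query for one page of triples (hypothetical helper)."""
    return f"construct where {{ ?s ?p ?o }} limit {limit} offset {offset}"


def page_urls(page_size, num_pages, start_page=0):
    """Yield one GET URL per page, stepping the offset by page_size.

    Assumes the endpoint accepts the SPARQL protocol parameters
    `query` and `default-graph-uri`.
    """
    for page in range(start_page, start_page + num_pages):
        params = {
            "default-graph-uri": GRAPH,
            "query": page_query(page_size, page * page_size),
        }
        yield f"{ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the first three page URLs instead of fetching them.
    for url in page_urls(page_size=10000, num_pages=3):
        print(url)
```

Each URL would be fetched once, one after the other, with the offset growing by 
the page size every step, which is exactly where the slowdown shows up.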

Any suggestions on how to improve this and do it in a "nice" way?
Ideally also without the danger of skipping a lot of data because the results 
come back in a different order between requests?

Best,
Jörn


_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
