Thank you very much, Hugh! You are right: we are going to update the large
data set on a weekly basis, and the updates are on the scale of a couple of
million triples, or fewer. The bulk loader with the delete option sounds
good to me, but it only accepts N-Quad files. Our input files are in
Turtle, dumped from a SQL database. Preparing another set of dump scripts
is not ideal, and converting Turtle to N-Quads adds an extra step to the
pipeline. Is there a way to run the RDF loader 'with delete' on Turtle files?
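For what it's worth, if the SQL dump can emit N-Triples (the one-triple-per-line subset of Turtle), then the N-Quads conversion is only a matter of appending the target graph IRI before each statement's closing dot, so no second set of dump scripts would be needed. A minimal Python sketch, assuming N-Triples input; the graph IRI below is a placeholder:

```python
# Sketch: turn an N-Triples dump into N-Quads by tagging each triple
# with a target graph IRI. Assumes the input is N-Triples (one triple
# per line, ending in " ."), which a SQL dump script can usually emit
# with a trivial change. The graph IRI is a hypothetical example.

GRAPH_IRI = "<http://example.org/weekly-update>"  # placeholder graph

def ntriples_to_nquads(lines, graph_iri=GRAPH_IRI):
    """Yield N-Quads lines: insert the graph label before the final dot."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        # "<s> <p> <o> ."  becomes  "<s> <p> <o> <g> ."
        assert stripped.endswith("."), "not an N-Triples statement"
        yield stripped[:-1].rstrip() + " " + graph_iri + " ."

triples = ['<urn:s> <urn:p> "v" .']
print(list(ntriples_to_nquads(triples)))
```

This keeps the existing dump pipeline intact and adds one cheap streaming pass before the bulk load.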
Otherwise, I think SPARQL DELETE is better for us, since no extra effort
is needed.
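If we stay with SPARQL updates, then since the documents contain only ground triples (no variables or blank nodes), the DELETE DATA / INSERT DATA statements can be generated mechanically. A minimal sketch of batching triples into update strings; the graph IRI is a placeholder, and the resulting text would be submitted via isql, JDBC, or the SPARQL HTTP endpoint:

```python
# Sketch: batch ground triples into SPARQL DELETE DATA / INSERT DATA
# update strings. The graph IRI is a hypothetical placeholder; the
# generated text is what gets sent to the server, by whatever transport.

GRAPH_IRI = "http://example.org/weekly-update"  # placeholder graph

def build_update(op, triples, graph_iri=GRAPH_IRI):
    """op is 'DELETE' or 'INSERT'; triples are N-Triples statements."""
    body = "\n".join("    " + t for t in triples)
    return "%s DATA { GRAPH <%s> {\n%s\n} }" % (op, graph_iri, body)

update = build_update("DELETE", ['<urn:s> <urn:p> "old" .'])
print(update)
```

Batching many triples into one DELETE DATA / INSERT DATA statement keeps the number of round trips down, which matters at the scale of millions of triples per week.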
On Wed, Apr 8, 2015 at 12:59 PM, Hugh Williams <hwilli...@openlinksw.com>
wrote:
> Hi Gang,
>
> To be clear, when you say "I want to update a large RDF store with 10
> billion triples once a week", I presume you are *NOT* loading 10 billion new
> triples every week, but rather that the base 10 billion triples are to be
> updated, with triples/graphs being inserted/deleted/updated, so the overall
> number of triples does not increase (or decrease) on that scale?
>
> If these updates are in the form of documents, i.e. datasets, and they are
> or can be converted to N-Quad format to meet the requirements of the
> Virtuoso RDF Bulk Loader "with_delete" [1] option, then I would say this
> would be the fastest and most efficient way to do it ...
>
> [1]
> http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFBulkLoaderWithDelete
>
> Best Regards
> Hugh Williams
> Professional Services
> OpenLink Software, Inc. // http://www.openlinksw.com/
> Weblog -- http://www.openlinksw.com/blogs/
> LinkedIn -- http://www.linkedin.com/company/openlink-software/
> Twitter -- http://twitter.com/OpenLink
> Google+ -- http://plus.google.com/100570109519069333827/
> Facebook -- http://www.facebook.com/OpenLinkSoftware
> Universal Data Access, Integration, and Management Technology Providers
>
> On 8 Apr 2015, at 12:27, Gang Fu <gangfu1...@gmail.com> wrote:
>
> Will using isql, JDBC, or HTTP make any difference?
>
> On Wed, Apr 8, 2015 at 7:25 AM, Gang Fu <gangfu1...@gmail.com> wrote:
>
>> There are millions of triples to be updated on a weekly basis.
>>
>> On Wed, Apr 8, 2015 at 7:24 AM, Gang Fu <gangfu1...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I want to update a large RDF store with 10 billion triples once a week.
>>> The triples to be inserted or deleted are saved in documents.
>>> There are no variable bindings or blank nodes in the documents,
>>> so I guess the best-fit SPARQL update operations are
>>> INSERT DATA / DELETE DATA.
>>>
>>> What is the best way to do this?
>>> Using JDBC connection pool or http?
>>> Using 'modify graph <graph-iri> insert/delete', or insert/delete data?
>>> Is it possible to run concurrent update jobs?
>>>
>>>
>>> Best,
>>> Gang
>>>
>>>
>>
>
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
>
>
>