Thank you for your advice, Daniel.

Actually, I want to delete only the statements containing the specific
predicate. I don't want to delete all the triples whose subject has that
predicate. As I have already said, I am not comfortable with
DELETE queries.
Is my query wrong? If so, could you suggest the correct one?

Kind regards,
Pantelis Natsiavas

2016-08-17 17:19 GMT+03:00 Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov>:

> So, this has nothing to do with the large vector size, but just to be sure
> the SPARQL is correct - do you wish to delete the subjects (and all their
> triples) where the subject has the predicate, or just the predicate itself?
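>
> To make the two readings concrete, here is a hedged sketch in standard
> SPARQL 1.1 Update syntax. <graph> and <predicate> are the same placeholders
> used in the original question, and ?x is only an illustrative variable
> name. Neither form has been run against this data, and in isql each
> statement would need the SPARQL keyword in front of it.
>
> ```sparql
> # Reading 1: remove only the triples that use the predicate.
> DELETE { GRAPH <graph> { ?s <predicate> ?o } }
> WHERE  { GRAPH <graph> { ?s <predicate> ?o } }
>
> # Reading 2 (an alternative, not a second step): remove every triple
> # of any subject that has the predicate.
> DELETE { GRAPH <graph> { ?s ?p ?o } }
> WHERE  { GRAPH <graph> { ?s <predicate> ?x . ?s ?p ?o } }
> ```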
>
>
>
> As far as avoiding the maximum vector size, I think your best approach is
> to limit the number of matches and repeat the query until there are no
> results, maybe with a count query in between. I have had to use similar
> workarounds to avoid the limits on the maximum number of results and the
> maximum string size. For instance, my first attempts to export large NTriples
> files after processing failed due to these issues. You may be able to
> adapt the code below, but I think that a repeated DELETE query limited to
> a fixed number of triples will be best in your case.
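>
> As a rough sketch of the limit-and-repeat idea (separate from the export
> procedure further down), the delete could be restricted with a subquery
> LIMIT and re-run until the count reaches zero. The batch size of 100000
> is an arbitrary assumption, <graph> and <predicate> are the placeholders
> from the original question, and this has not been tested against a graph
> of this size.
>
> ```sparql
> # Step 1: delete at most one batch of matching triples.
> # In isql, put the SPARQL keyword in front of each statement and run
> # the two statements separately.
> DEFINE sql:log-enable 3
> DELETE { GRAPH <graph> { ?s <predicate> ?o } }
> WHERE {
>   { SELECT ?s ?o
>     WHERE { GRAPH <graph> { ?s <predicate> ?o } }
>     LIMIT 100000 }
> }
>
> # Step 2: count what is left; repeat step 1 until ?remaining is 0.
> SELECT (COUNT(*) AS ?remaining)
> WHERE { GRAPH <graph> { ?s <predicate> ?o } }
> ```
>
> Keeping each batch well under the vector limit reported in the error
> should keep a single statement from having to hold all matches at once.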
>
>
>
> Anyway, the code:
>
>
>
> CREATE PROCEDURE meshrdf_export(in graph_uri varchar, in file_name varchar) {
>     DECLARE banner any;
>     DECLARE env, ses any;
>     DECLARE ses_len, max_ses_len any;
>
>     SET isolation = 'uncommitted';
>
>     max_ses_len := 10000000;
>
>     --
>     -- Truncate file and write a comment line indicating the graph and datetime of export.
>     --
>     --no_c_escapes-
>     banner := sprintf('# <%s> exported at %s\n', graph_uri, datestring(now()));
>     string_to_file (file_name, banner, -2);
>
>     env := vector (0, 0, 0);
>     ses := string_output ();
>
>     FOR (SELECT * FROM (SPARQL
>                         define input:storage ""
>                         SELECT ?s ?p ?o WHERE {
>                           GRAPH `iri(?:graph_uri)` {
>                             ?s ?p ?o
>                           }
>                         } ORDER BY ?s ?p ?o) AS sub OPTION (loop)) DO {
>         http_nt_triple (env, "s", "p", "o", ses);
>         ses_len := length (ses);
>
>         IF (ses_len > max_ses_len) {
>             string_to_file (file_name, ses, -1);
>             ses := string_output ();
>         }
>     }
>     IF (length (ses)) {
>         string_to_file (file_name, ses, -1);
>     }
> }
>
>
>
> Dan Davis, Systems/Applications Architect (Contractor),
>
> Office of Computer and Communications Systems,
>
> National Library of Medicine, NIH
>
>
>
>
>
> *From:* Pantelis Natsiavas [mailto:natsia...@gmail.com]
> *Sent:* Wednesday, August 17, 2016 4:36 AM
> *To:* virtuoso-users <virtuoso-users@lists.sourceforge.net>
> *Subject:* [Virtuoso-users] Deleting large number of triples
>
>
>
> Hi everybody.
>
>
>
> I am trying to delete a large number of triples from a very big graph. The
> graph contains *217.609.545* triples and I want to delete all the triples
> that have a specific predicate (*64.884.016* triples).
>
>
>
> I am trying to do it through the isql-v command line interface, using the
> command:
>
>
>
> SPARQL DEFINE sql:log-enable 3
> WITH <graph>
> DELETE { ?s <predicate> ?o }
> WHERE { ?s <predicate> ?o }
>
>
>
> After some time (I don't know exactly how long), I got the error:
>
>
>
> *** Error 42000: [Virtuoso Driver][Virtuoso Server]FRVEC: array in for
> vectored over max vector length 2000000 > 1000000
> at line 1 of Top-Level:
>
>
>
> I checked virtuoso.log and saw nothing related to this
> error.
>
>
>
> I changed the parameters in virtuoso.ini:
>
> MaxQueryMem      = 8G        ; from 2G
> VectorSize       = 1000      ; not changed
> MaxVectorSize    = 2000000   ; from 1000000
> AdjustVectorSize = 1         ; from 0
>
>
>
> I am not very confident about these changes to the Virtuoso settings, but
> based on http://docs.openlinksw.com/virtuoso/dbadm.html they
> seemed the right thing to do.
>
>
>
> I restarted the VM and retried the whole process. After one hour, the
> memory consumed by Virtuoso reached around 100% and I got an error:
>
> *** Error 08S01: [Virtuoso Driver]CL065: Lost connection to server
>
>
>
> Please note that, because of previous similar errors, I already have the
> following virtuoso.ini settings:
>
> NumberOfBuffers = 1360000
> MaxDirtyBuffers = 1000000
> ThreadCleanupInterval    = 1
> ResourcesCleanupInterval = 1
>
>
>
> My questions:
>
> 1. Is there any way to improve my query in order to facilitate its
> processing? It is the first time I am doing a DELETE query and I am not
> comfortable with it.
>
> 2. Is there any way to "split" the query so that it doesn't need to handle
> all these triples at once?
>
> 3. Alternatively, is there any configuration change that might improve
> memory handling in order to handle such big queries?
>
>
>
> Kind regards,
>
> Pantelis Natsiavas
>
>
>
>
>