Hello Peter,

Messages like
07:42:06 Bad parent link in 506496, coming from 506536, parent link =
249186 unconfirmed, detected after wait.

do not indicate a real problem; this may happen when many threads
continuously try to update the same table in the same places. This was
really unusual before RDF was added to the database; now we will
probably disable this diagnostic entirely.

A message like
09:38:47 Bad parent link in 1732780, coming from 457028, parent link =
1809981 confirmed.
09:38:47 Consult your documentation on how to recover from this situation
09:38:47 GPF: gate.c:738 fatal consistency check failure

is about a real, confirmed error that requires a crash dump and restore;
see the User's Guide for the how-to.
There have been several fixes that eliminated causes of such an error.

> isql 1112 dba dba verbose=on banner=off prompt=off echo=ON
> errors=stdout exec="log_enable (0); ttlp_mt (file_to_string_output
> ('`articles_longabstract_pl.nt '), '', 'http://dbpedia.org/');
> checkpoint; log_enable (1);"

The log_enable() setting does not affect the additional threads that do
the real work.

The next update of the documentation will contain a description of the
fifth parameter of the procedure; for now it is optional and
undocumented. The whole signature is

create procedure DB.DBA.TTLP_MT (
    in strg any,       -- text of the resource
    in base varchar,   -- base IRI to resolve relative IRIs to absolute
    in graph varchar,  -- target graph IRI, parsed triples will appear
                       -- in that graph
    in flags int := 0, -- flags, use 0
    in log_mode int) -- Logging mode.

For speed you may experiment with the log mode. When set to zero, the
routine will log nothing and it can destroy the logical integrity of the
RDF storage. If the loading is terminated with an error, the server
should be _killed_ without a proper shutdown and the transaction log
must be erased before restart. Fast but really unsafe. Do not use it
without a good reason.

When set to one, the routine will log newly allocated IRIs and literals
but not new quads. The RDF storage always remains in a safe state, but
in case of a server crash and a restart with transaction log replay, an
unpredictable number of loaded triples may be lost, so the result of the
log replay may differ from the result of the first run. It should not be
used for 'volatile' data that cannot be, say, re-loaded from the local
filesystem in case of a crash; an application may contain log operations
to replay calls of this procedure if the text of the resource is
permanently available from some source.

When set to two, every action of the loader is individually committed
and logged, so the loading is as safe as any traditional data
manipulation on relational data. It's slow but it's the right choice for
census/financial/customer data.
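
As a minimal sketch of the safe variant, assuming the five-argument
signature above and that the file from your command is readable by the
server (the path is taken from your command only as an illustration),
a call with an explicit log mode of two could look like this:

```sql
-- Sketch only: load the file with the safest logging mode.
-- Arguments follow the signature quoted above:
-- text, base IRI, graph IRI, flags, log_mode.
DB.DBA.TTLP_MT (
    file_to_string_output ('articles_longabstract_pl.nt'), -- text of the resource
    '',                     -- base IRI, empty as in your command
    'http://dbpedia.org/',  -- target graph IRI
    0,                      -- flags, use 0
    2);                     -- log_mode = 2: every action committed and logged
checkpoint;
```

Replacing the last argument with 1 or 0 trades safety for speed as
described above.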

Similarly, DB.DBA.RDF_LOAD_RDFXML_MT also has the log mode as an
undocumented fourth argument. As in the TTLP_MT case, use the value of
two to stay on the safe side unless you know precisely what you're doing.
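
A comparable sketch for the RDF/XML loader, assuming only what is said
above (log mode is the fourth argument); the file name and graph IRI
here are hypothetical placeholders, not something from your setup:

```sql
-- Sketch only: RDF/XML load with the safest logging mode.
-- 'data.rdf' and 'http://example.org/graph' are hypothetical placeholders.
DB.DBA.RDF_LOAD_RDFXML_MT (
    file_to_string_output ('data.rdf'), -- text of the resource
    '',                                 -- base IRI
    'http://example.org/graph',         -- target graph IRI
    2);                                 -- log mode = 2: fully logged
checkpoint;
```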

Note that the meaning of this parameter differs from the meaning of the
argument of log_enable().

Best Regards,
Ivan Mikhailov.


