Kunal,

The log_mode parameter of DB.DBA.RDF_LOAD_RDFXML_MT() does not correspond one-to-one to the mode set by log_enable(). In the multithreaded RDF loaders, quads are always committed without waiting for the end of loading; the log_mode parameter only controls how much data is written to the log.

There is an integrity issue to consider. Some tables map every IRI stored in the database to a 'numeric' value of type IRI_ID, so it is possible to find the existing IRI_ID for a given IRI string (or create a new one), or to find the IRI string for a given IRI_ID. The RDF_QUAD table should not contain IRI_IDs that are not listed in those tables. If this ever happens, it is better to zap the entire RDF storage and reload all RDF data from scratch. That may be impossible, so the integrity violation should be avoided.

The only way to get this integrity violation is to kill the server when some quads are written to the database and logged whereas the new IRI_IDs are not logged. This is possible, for instance, when a transaction with logging disabled loads a resource and allocates a new IRI_ID for a new IRI, the same IRI occurs in triples loaded by a second client with logging enabled, and then a crash happens. After server restart and log replay, the database will contain the quads of the second client but not the IRI_IDs that the first client did not write to the log.

To avoid the problem, the multithreaded parsers use the following logging modes:

log_mode=0 means no logging at all; all RDF data should be re-loaded in case of a server crash.

log_mode=1 means log only IRI_IDs, but not quads. This preserves RDF storage integrity in case of any crash, but the administrator should re-load the resources that were being loaded at the time of the crash; otherwise the storage will contain none of their new quads, or only some of them.

log_mode=2 means logging of both IRI_IDs and quads. This is the default; when unsure, use it.

Best Regards,
Ivan Mikhailov,
OpenLink Software.

On Wed, 2008-02-27 at 15:09 -0800, Kunal Patel wrote:
> Hi all,
>
> When I use the function DB.DBA.RDF_LOAD_RDFXML_MT with the log_mode
> parameter set to 0, does it log all the transactions? Would it be
> more efficient to set the value of log_mode to 2 instead of 0 when
> loading a huge rdf file.
> I believe using the value 2 turns on the row
> by row autocommit which the virtuoso documentation says,
>
> "Row by row autocommit mode is good for any batch operations where
> concurrent updates are not expected or are not an issue. Examples
> include bulk loading of data, materialization of RDF inferred data
> etc." http://docs.openlinksw.com/virtuoso/coredbengine.html
>
> Regards,
> Kunal
>
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users
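
The crash scenario described in the reply can be illustrated with a small toy model. This is only a sketch in Python, not Virtuoso code: ToyStore, replay, dangling_ids and crash_scenario are hypothetical names invented for the illustration, the log is a plain list, and the three log_mode values are modelled only as far as the reply describes them (0 logs nothing, 1 logs IRI_ID allocations, 2 logs IRI_ID allocations and quads).

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy stand-in for the RDF storage (hypothetical, not a Virtuoso API)."""
    iri_ids: dict = field(default_factory=dict)  # IRI string -> IRI_ID
    quads: list = field(default_factory=list)    # tuples of IRI_IDs
    log: list = field(default_factory=list)      # records that survive a crash

    def iri_to_id(self, iri, log_iri):
        """Return the existing IRI_ID or allocate one; log only new allocations."""
        if iri not in self.iri_ids:
            self.iri_ids[iri] = len(self.iri_ids) + 1
            if log_iri:
                self.log.append(("iri", iri, self.iri_ids[iri]))
        return self.iri_ids[iri]

    def load(self, statements, log_mode):
        """Load (graph, subject, predicate, object) tuples under log_mode 0, 1 or 2."""
        for statement in statements:
            quad = tuple(self.iri_to_id(iri, log_iri=log_mode >= 1)
                         for iri in statement)
            self.quads.append(quad)
            if log_mode == 2:
                self.log.append(("quad", quad))


def replay(log):
    """Rebuild a store from the log alone, as after a crash and restart."""
    store = ToyStore()
    for record in log:
        if record[0] == "iri":
            store.iri_ids[record[1]] = record[2]
        else:
            store.quads.append(record[1])
    return store


def dangling_ids(store):
    """IRI_IDs that appear in quads but are missing from the IRI dictionary."""
    known = set(store.iri_ids.values())
    return {i for quad in store.quads for i in quad if i not in known}


def crash_scenario(first_client_log_mode):
    """First client loads with the given mode, second client with full logging."""
    statement = ("urn:g", "urn:s", "urn:p", "urn:o")
    store = ToyStore()
    store.load([statement], log_mode=first_client_log_mode)  # allocates the IRI_IDs
    store.load([statement], log_mode=2)                      # reuses them, logs the quad
    return dangling_ids(replay(store.log))                   # crash, then log replay
```

In this model, crash_scenario(0) leaves quads that refer to IRI_IDs missing from the dictionary, which is the integrity violation the reply warns about, while crash_scenario(1) and crash_scenario(2) leave none. With mode 1 the first client's own quads are still lost on replay, which matches the advice to re-load whatever was being loaded at the time of the crash.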