Hi Ivan,

Thanks for the suggestions. I relaunched the loading process about two hours ago. It ran for about 2 hours, and within the first 5 minutes the system load reached 40!

It just stopped again. I set the number of buffers to 1250000 (I have 16GB of RAM) and I used DB.DBA.RDF_LOAD_RDFXML_MT. However, the server stopped with the following piece of log:

15:53:36 { Loading plugin 1: Type `plain', file `wikiv' in `/virtuoso/data/virtuoso-opensource-full/lib/virtuoso/hosting'
15:53:36   WikiV version 0.6 from OpenLink Software
15:53:36   Support functions for WikiV collaboration tool
15:53:36 SUCCESS plugin 1: loaded from /virtuoso/data/virtuoso-opensource-full/lib/virtuoso/hosting/wikiv.so }
15:53:36 { Loading plugin 2: Type `plain', file `mediawiki' in `/virtuoso/data/virtuoso-opensource-full/lib/virtuoso/hosting'
15:53:36   MediaWiki version 0.1 from OpenLink Software
15:53:36   Support functions for MediaWiki collaboration tool
15:53:36 SUCCESS plugin 2: loaded from /virtuoso/data/virtuoso-opensource-full/lib/virtuoso/hosting/mediawiki.so }
15:53:36 { Loading plugin 3: Type `plain', file `creolewiki' in `/virtuoso/data/virtuoso-opensource-full/lib/virtuoso/hosting'
15:53:36   CreoleWiki version 0.1 from OpenLink Software
15:53:36   Support functions for CreoleWiki collaboration tool
15:53:36 SUCCESS plugin 3: loaded from /virtuoso/data/virtuoso-opensource-full/lib/virtuoso/hosting/creolewiki.so }
15:53:36 OpenLink Virtuoso Universal Server
15:53:36 Version 05.00.3026-pthreads for Linux as of Feb 12 2008
15:53:36 uses parts of OpenSSL, PCRE, Html Tidy
15:53:37 Database version 3016
15:53:41 SQL Optimizer enabled (max 1000 layouts)
15:53:42 Compiler unit is timed at 0.001329 msec
15:53:45 Roll forward started
15:53:45 Roll forward complete
15:53:45 Can't open file /virtuoso/data/virtuoso-opensource-full/var/lib/virtuoso/db//sesJqs1qg, error 2
15:53:46 Can't open file /virtuoso/data/virtuoso-opensource-full/var/lib/virtuoso/db//sesbsZwzc, error 2
15:53:46 Can't open file /virtuoso/data/virtuoso-opensource-full/var/lib/virtuoso/db//sesDfxtF9, error 2
15:53:46 Can't open file /virtuoso/data/virtuoso-opensource-full/var/lib/virtuoso/db//sesFuJxT6, error 2
15:53:47 Can't open file /virtuoso/data/virtuoso-opensource-full/var/lib/virtuoso/db//sesVqUfq5, error 2
15:53:47 Can't open file /virtuoso/data/virtuoso-opensource-full/var/lib/virtuoso/db//sesaOqF63, error 2
15:53:47 Checkpoint made, log reused
15:53:48 HTTP/WebDAV server online at 8891
15:53:48 Server online at 1112 (pid 9253)
15:57:05 Can't open file /virtuoso/data/virtuoso-opensource-full/var/lib/virtuoso/db//sesHu3o0M, error 2
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:24 Server received signal 15
17:27:50 Initiating quick shutdown
17:27:57 Server shutdown complete
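
The numeric codes in the log above can be decoded with a few lines of Python. This is only a minimal sketch for reading the log: it assumes that "signal 15" is the POSIX signal number and that "error 2" is the operating system's errno value, which the log itself does not state. Under those assumptions, signal 15 is SIGTERM (a termination request sent to the process) and error 2 is ENOENT (the file does not exist). The helper names are hypothetical.

```python
import errno
import os
import signal


def describe_signal(number: int) -> str:
    """Return the symbolic name of a signal number (hypothetical helper)."""
    return signal.Signals(number).name


def describe_errno(number: int) -> str:
    """Return 'NAME: message' for an OS error number (hypothetical helper)."""
    return f"{errno.errorcode[number]}: {os.strerror(number)}"


# Assumption: the codes in the Virtuoso log are plain POSIX signal/errno values.
print(describe_signal(15))
print(describe_errno(2))
```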


The ISQL session shows:

SQL> DB.DBA.RDF_LOAD_RDFXML_MT(file_to_string_output('/virtuoso/data/rdf/uniprot.rdf'), 'http://www.cellcycleontology.org/ontology/rdf/uniprot', 'http://www.cellcycleontology.org/ontology/rdf/uniprot');
*** Error 08S01: [Virtuoso Driver]CL065: Lost connection to server
at line 1 of Top-Level:
DB.DBA.RDF_LOAD_RDFXML_MT(file_to_string_output('/virtuoso/data/rdf/uniprot.rdf'), 'http://www.cellcycleontology.org/ontology/rdf/uniprot', 'http://www.cellcycleontology.org/ontology/rdf/uniprot')

Any idea what the problem could be? The initial DB is empty. Do you think that the striping, or any other INI parameter, may have a side effect?

Erick

Ivan Mikhailov wrote:
Eric,

I've loaded Uniprot (both reviewed and not-reviewed parts, all 14.2Gb)
into a database that was about 5Gb before the loading.
I've used DB.DBA.RDF_LOAD_RDFXML_MT without any splitting into parts. It
was a bit slower than I expected but there were no hangs. I've set
checkpoint_interval (6000) before the start, just to know better how
many disk pages I really need for that amount of data, but I'm sure that
made no important difference.

I've used a 2 x Quad Xeon box with 16Gb RAM and 6 cheap SATA disks. During
loading I've used only 1000000 buffers (i.e. only 8Gb of RAM was used
for the database) because my Fedora 2.6.21-6.fc7xen does weird things
if I try to allocate 12Gb (the sum of resident sizes of all processes +
free RAM + buffer RAM suddenly becomes much less than 16Gb; that results
in weird swapping; I don't know the exact reason). This problem with
memory allocation seems to be specific to my box because other Linux
boxes work fine. It happens only during intensive data
loading; I use 1500000 buffers for all other activities (i.e. 12Gb out
of 16Gb are usually for buffers).
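
The buffer figures quoted in this thread (1000000 buffers for about 8Gb, 1500000 for about 12Gb) imply roughly 8 KB per buffer. As a rough check, the sketch below converts a buffer count into memory, assuming each buffer holds one 8 KiB page; that page size is inferred from the figures above and is not stated explicitly in the thread, and the function name is hypothetical.

```python
# Assumption: one buffer = one 8 KiB database page (inferred from the thread).
PAGE_SIZE_BYTES = 8 * 1024


def buffer_memory_gib(number_of_buffers: int,
                      page_size: int = PAGE_SIZE_BYTES) -> float:
    """Approximate RAM, in GiB, taken by the given number of buffers."""
    return number_of_buffers * page_size / 2**30


# Buffer counts mentioned in the thread.
for buffers in (1_000_000, 1_250_000, 1_500_000):
    print(f"{buffers:>9} buffers ~ {buffer_memory_gib(buffers):.1f} GiB")
```

Under that assumption, 1250000 buffers come to a bit under 10 GiB, which sits between the two settings described above on a 16Gb machine.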

Best Regards,

Ivan Mikhailov,
OpenLink Software.

On Tue, 2008-05-06 at 14:02 +0200, Erick Antezana wrote:
Hello,

I am trying to load a very large file (uniprot.rdf); its size is about 45GB! The compressed file (3.5GB) can be found at ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/. I have adapted the virtuoso.ini file a bit, setting the striping options (about 100GB reserved). I have also played with NumberOfBuffers and MaxCheckPointRemap, as suggested in http://docs.openlinksw.com/virtuoso/rdfperformancetuning.html#rdfperfloading . The ISQL statement I am using is:

DB.DBA.RDF_LOAD_RDFXML(file_to_string_output('/virtuoso/data/rdf/uniprot.rdf'), 'http://www.cellcycleontology.org/ontology/rdf/uniprot', 'http://www.cellcycleontology.org/ontology/rdf/uniprot');

However, after the loading process starts, Virtuoso freezes the OS (the system load average rises to 40!), and after some time I get an error message:

SQL> SET AUTOCOMMIT ON;
SQL> DB.DBA.RDF_LOAD_RDFXML(file_to_string_output('/virtuoso/data/rdf/uniprot.rdf'), 'http://www.cellcycleontology.org/ontology/rdf/uniprot', 'http://www.cellcycleontology.org/ontology/rdf/uniprot');

*** Error 08S01: [Virtuoso Driver]CL065: Lost connection to server
at line 2 of Top-Level:
DB.DBA.RDF_LOAD_RDFXML(file_to_string_output('/virtuoso/data/rdf/uniprot.rdf'), 'http://www.cellcycleontology.org/ontology/rdf/uniprot', 'http://www.cellcycleontology.org/ontology/rdf/uniprot')

It seems that the function DB.DBA.RDF_LOAD_RDFXML_MT <http://docs.openlinksw.com/virtuoso/fn_rdf_load_rdfxml_mt.html> (http://docs.openlinksw.com/virtuoso/functionidx.html) could help me deal with large RDF files, perhaps by loading split files (file_to_string_output) as suggested in http://docs.openlinksw.com/virtuoso/rdfperformancetuning.html#rdfperfloading . If that is the case, how would you recommend splitting such a large file? The page http://docs.openlinksw.com/virtuoso/fn_file_to_string_output.html mentions that the initial and final segments should be defined (how long should they be?). Once the data is loaded, will Virtuoso be able to cope with such a DB? Would some tuning of the INI parameters still be needed or suggested?

Thanks in advance for any hints,
Erick


--
==================================================================
Erick Antezana                    http://www.cellcycleontology.org
PhD student
Tel:+32 (0)9 331 38 24                        fax:+32 (0)9 3313809
VIB Department of Plant Systems Biology, Ghent University
Technologiepark 927, 9052 Gent, BELGIUM
er...@psb.ugent.be                  http://www.psb.ugent.be/~erant
==================================================================
