Hello,

I am trying to load a very large file (uniprot.rdf), about 45GB in size; the compressed file (3.5GB) can be found at ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/rdf/. I have adapted the virtuoso.ini file a bit, setting the striping options (about 100GB reserved). I have also experimented with NumberOfBuffers and MaxCheckpointRemap, as suggested in http://docs.openlinksw.com/virtuoso/rdfperformancetuning.html#rdfperfloading . The ISQL statement I am using is:

DB.DBA.RDF_LOAD_RDFXML(file_to_string_output('/virtuoso/data/rdf/uniprot.rdf'), 'http://www.cellcycleontology.org/ontology/rdf/uniprot', 'http://www.cellcycleontology.org/ontology/rdf/uniprot');

However, after I start the load, Virtuoso freezes the OS (the system load average rises to 40), and after some time I get an error message:

SQL> SET AUTOCOMMIT ON;
SQL> DB.DBA.RDF_LOAD_RDFXML(file_to_string_output('/virtuoso/data/rdf/uniprot.rdf'), 'http://www.cellcycleontology.org/ontology/rdf/uniprot', 'http://www.cellcycleontology.org/ontology/rdf/uniprot');

*** Error 08S01: [Virtuoso Driver]CL065: Lost connection to server
at line 2 of Top-Level:
DB.DBA.RDF_LOAD_RDFXML(file_to_string_output('/virtuoso/data/rdf/uniprot.rdf'), 'http://www.cellcycleontology.org/ontology/rdf/uniprot', 'http://www.cellcycleontology.org/ontology/rdf/uniprot')

It seems that the function DB.DBA.RDF_LOAD_RDFXML_MT (http://docs.openlinksw.com/virtuoso/fn_rdf_load_rdfxml_mt.html, listed in the function index at http://docs.openlinksw.com/virtuoso/functionidx.html) could help me deal with large RDF files, perhaps by loading the file in pieces via file_to_string_output, as suggested in http://docs.openlinksw.com/virtuoso/rdfperformancetuning.html#rdfperfloading . If that is the case, how would you recommend splitting such a large file? The page http://docs.openlinksw.com/virtuoso/fn_file_to_string_output.html mentions that the start and end of the segment should be defined (how long should each segment be?). Once the data is loaded, will Virtuoso be able to cope with a database of this size? Would any further tuning of the INI parameters still be needed or recommended?
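To make the splitting question concrete, here is a minimal sketch of what I have in mind, written in Python outside Virtuoso. It streams the RDF/XML file and writes every N top-level elements into a separate, self-contained RDF/XML file. The function name split_rdfxml, the output file naming and the per_chunk value are my own placeholders, and I am assuming that the root element is rdf:RDF and that no rdf:nodeID blank nodes are shared between top-level elements (blank node identifiers are local to a document, so a node split across two files would no longer be the same node after loading).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def split_rdfxml(src, out_prefix, per_chunk=100000):
    """Split an RDF/XML file into smaller RDF/XML files (hypothetical helper).

    Each output file holds up to `per_chunk` top-level elements wrapped in
    its own rdf:RDF root. Returns the list of files written.
    """
    paths, out, count, depth, root = [], None, 0, 0, None
    for event, item in ET.iterparse(src, events=("start-ns", "start", "end")):
        if event == "start-ns":
            prefix, uri = item
            try:
                # Keep the original prefixes in the output where possible.
                ET.register_namespace(prefix, uri)
            except ValueError:
                pass  # reserved prefix pattern; ElementTree picks its own
        elif event == "start":
            depth += 1
            if depth == 1:
                root = item  # the rdf:RDF element
        else:  # "end"
            depth -= 1
            if depth != 1:
                continue
            # A direct child of rdf:RDF has just been read completely.
            if out is None:
                path = "%s%05d.rdf" % (out_prefix, len(paths))
                paths.append(path)
                out = open(path, "w", encoding="utf-8")
                header = '<rdf:RDF xmlns:rdf="%s"' % RDF_NS
                base = root.get(XML_BASE)
                if base is not None:
                    # Carry over xml:base so relative IRIs still resolve.
                    header += " xml:base=" + quoteattr(base)
                out.write(header + ">\n")
            out.write(ET.tostring(item, encoding="unicode") + "\n")
            root.remove(item)  # drop the element so memory use stays flat
            count += 1
            if count == per_chunk:
                out.write("</rdf:RDF>\n")
                out.close()
                out, count = None, 0
    if out is not None:
        out.write("</rdf:RDF>\n")
        out.close()
    return paths
```

Each resulting file could then presumably be loaded on its own with DB.DBA.RDF_LOAD_RDFXML_MT(file_to_string_output(...), ...) using the same base and graph IRIs as above, but I do not know whether this is the recommended way or what chunk size would be sensible.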

Thanks in advance for any hints,
Erick

--
==================================================================
Erick Antezana                    http://www.cellcycleontology.org
PhD student
Tel:+32 (0)9 331 38 24                        fax:+32 (0)9 3313809
VIB Department of Plant Systems Biology, Ghent University
Technologiepark 927, 9052 Gent, BELGIUM
er...@psb.ugent.be                  http://www.psb.ugent.be/~erant
==================================================================
