Hi Daniel,
It does seem strange that your "sparql load ..." queries report
that they have completed successfully when the data has not actually
loaded completely. With datasets of this size we would typically
expect an error of the form "[Virtuoso Server]SR325:
Transaction aborted because it's log after image size went above the
limit" to be returned, because the transaction log fills up while
attempting to load such large datasets. It would be interesting to see
your "virtuoso.log" to check whether any such errors have been
reported there.
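As a rough sketch of how the log could be checked for that error, the following scans a log file for lines containing the error code. It assumes the log is plain text and that the code "SR325" appears literally in the relevant lines; the function name and the default marker are illustrative only.

```python
def find_log_errors(log_path, marker="SR325"):
    """Return every line of the log file that contains the marker string."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        return [line.rstrip("\n") for line in log if marker in line]


if __name__ == "__main__":
    # "virtuoso.log" is assumed to be in the current directory here.
    for hit in find_log_errors("virtuoso.log"):
        print(hit)
```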
As you have done, one way around this is to split the file into
smaller chunks and load them separately. Alternatively, we recommend (and
this is what we use for loading DBpedia and the other large datasets we
host) the following two functions for loading large datasets:
http://docs.openlinksw.com/virtuoso/fn_ttlp_mt.html
http://docs.openlinksw.com/virtuoso/fn_rdf_load_rdfxml.html
Worth noting for next time you have large datasets to load ...
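For the chunking approach mentioned above, a minimal sketch of a splitter might look like the following. It assumes the file holds exactly one complete triple per line, as yours does; a file with @prefix headers or multi-line statements would need a real parser instead. The function name, the output naming scheme and the default chunk size of 100,000 lines are all illustrative choices, not anything Virtuoso requires.

```python
from pathlib import Path


def split_by_lines(source, out_dir, lines_per_chunk=100_000):
    """Split a one-triple-per-line file into numbered chunk files.

    Returns the list of chunk paths in the order they were written.
    """
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunk_paths = []
    out = None
    with source.open("r", encoding="utf-8") as src:
        for index, line in enumerate(src):
            # Start a new chunk file every lines_per_chunk lines.
            if index % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                name = f"{source.stem}-{len(chunk_paths):04d}{source.suffix}"
                path = out_dir / name
                chunk_paths.append(path)
                out = path.open("w", encoding="utf-8")
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths


if __name__ == "__main__":
    for chunk in split_by_lines("musicspace-large.n3", "chunks"):
        print(chunk)
```

Each resulting chunk can then be loaded with its own "sparql load ..." statement, and the triple count checked after each one.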
Best Regards
Hugh Williams
Professional Services
OpenLink Software
Web: http://www.openlinksw.com
Support: http://support.openlinksw.com
Forums: http://boards.openlinksw.com/support
On 26 Apr 2009, at 21:48, Daniel Alexander Smith wrote:
Hi,
I've been experimenting with Virtuoso (open source), and I seem to hit
a limit of about 200K triples per individually imported N3 file.
When I split up the file, it works fine, so I was just wondering if
there's something I'm doing wrong, or if I can get around this limit
somehow.
(Note that I've removed the file in the example below, as it contains
some non-public data.)
I'm using an N3 file with about 3M triples in it, and the import seems
fine:
SQL> sparql load "http://helix.ecs.soton.ac.uk/musicspacedata/musicspace-large.n3" into graph <musicspace>;
callret-0
VARCHAR
_______________________________________________________________________________
Load <http://helix.ecs.soton.ac.uk/musicspacedata/musicspace-large.n3> into graph <musicspace> -- done
1 Rows. -- 5901 msec.
Results are missing, and the number of statements is too low:
SQL> sparql select count(*) where { graph <musicspace> {?d ?f ?g}};
callret-0
INTEGER
_______________________________________________________________________________
222893
1 Rows. -- 20 msec.
The file is ~ 139M, and has 3M+ triples in it (one per line):
helix:musicspacedata das05r$ ls -l
total 135932
-rw-r--r-- 1 root root 139049340 Apr 26 21:33 musicspace-large.n3
helix:musicspacedata das05r$ wc -l musicspace-large.n3
3092038 musicspace-large.n3
I'm using the following version of Virtuoso:
Virtuoso Open Source Edition (multi threaded)
Version 5.0.11.3039-pthreads as of Apr 25 2009
Compiled for Linux (x86_64-unknown-linux-gnu)
Any ideas?
Thanks,
Dan
--
Daniel Alexander Smith
IAM Group
School of Electronics and Computer Science
University of Southampton
das...@ecs.soton.ac.uk
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users