Jörn,

> On 17 Nov 2015, at 19:58, Jörn Hees <j_h...@cs.uni-kl.de> wrote:
> 
> Hi Hugh,
> 
> thanks for helping me with this...
> 
> Just for context:
> As you might have guessed i'm updating our research group's local linked data 
> endpoint and this guide:
> https://joernhees.de/blog/2014/11/10/setting-up-a-local-dbpedia-2014-mirror-with-virtuoso-7-1-0/
> 
> So my use-case is loading some pretty big RDF dumps that we use and analyse 
> in several research projects (btw. an equally big thank you guys for making 
> that possible with Virtuoso ;) ).
> After that initial loading the Virtuoso DB is basically never modified, 
> meaning there is mostly read-only access via SPARQL.
> 
> What puzzles me is that after import and several checkpoints and restarts, 
> just leaving the DB idle without any queries (see below) it seems to become 
> busy.
> I guess it does some kind of "re-organization" and i'd mostly like to find 
> out how i can tell it "do it now, take all resources you want, don't care if 
> anyone is waiting, admin override, full speed ;)".
> That would allow me to then have that static state of the DB which i can 
> back-up and replay if things go wrong or someone wants an old version, 
> leaving us with "ready to use" backups, and not such that first start some 
> lengthy "re-organization after mass import".
> 
> The mentioned "re-organization state" now seems to be over after leaving the 
> DB switched on and idle for the last couple of days.

[Hugh] Does your database have Full Text indexing enabled? That is a 
scheduled background task that would take time to complete on a newly loaded 
large database like yours, see:

        
http://docs.openlinksw.com/virtuoso/sparqlextensions.html#rdfsparqlrulefulltext 

> 
> Remaining comments inline...
> 
> 
>> On 17 Nov 2015, at 04:04, Hugh Williams <hwilli...@openlinksw.com> wrote:
>> 
>> I don’t see any indication of you having provided a copy of your 
>> “virtuoso.ini”, so can you please provide one?
> 
> I only had the following values in my last mail, but i also attached the full 
> virtuoso.ini below...
> 
>>> ;; values for 128 GB RAM
>>> MaxCheckpointRemap = 1250000
>>> NumberOfBuffers          = 10900000
>>> MaxDirtyBuffers          = 8000000
> 
> <virtuoso.ini>
> 
> 
>> So you have about 6 billion triples with 128 GB RAM available, which should 
>> be sufficient. 
>> 
>> Can you provide some details of database space consumption by querying the 
>> DB.DBA.SYS_INDEX_SPACE_STATS view as detailed at:
>> 
>>      http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#statusfunc
>> 
>> and also query the DB.DBA.sys_col_info to obtain information on column store 
>> space utilization as detailed at:
>> 
>>      http://docs.openlinksw.com/virtuoso/coredbengine.html#colstorespaceutil
> 
> 
> Note: as mentioned above the "re-organization state" seems to be over now and 
> i ran this command on a 0 % CPU idling DB:
> <db.dba.sys_index_space_stats.txt><db.dba.sys_col_info.txt>
> 
> 
>> The second status output after about 5 mins shows "10900000 buffers, 3662255 
>> used". What activities had been performed on the database up to that 
>> time, i.e. had you executed the typical query workload for the system, 
>> so that the database was fully warmed up to that point? Only 
>> about a third of the allocated buffers are in use.
> 
> That's the thing... i just started the DB and did nothing. (I can also 
> exclude requests by others as i had even pulled up a firewall only allowing 
> local traffic to make sure that nothing disturbs the import / anyone from our 
> group runs queries against a half-loaded endpoint.)
> I still have the exact snapshot of the DB after import that when started 
> showed the behaviour above in case you want that (~ 86 GB compressed).
> 
> 
>> The following errors in the log you asked about can be ignored, as they are 
>> not dangerous and are more for information, since the server has corrected the 
>> condition:
>> 
>>      04:32:47 cpt/lt unusual cond rltrx.c:1437
>>      15:01:58 The __sys_ext_map free pages incorrect 270 != 140 actually free
> 
> OK, good to know, i guess the wording just worried me a bit.
> 
> 
>> For the following errors:
>> 
>>      12:54:51 Checkpoint removed 2004 MB of remapped pages, leaving 0 MB. Duration 18.78 s. To save this time, increase MaxCheckpointRemap and/or set Unremap quota to 0 in ini file.
>> 
>> It is recommended that MaxCheckpointRemap is set to 1/4 of the database size, 
>> so what is yours set to? See:
>> 
>>      http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFPerformanceTuning
>>      http://docs.openlinksw.com/virtuoso/checkpoint.html#checkpointparams
>> 
> 
> Hmm, i think i might have overlooked this and set it to 1/8th instead of 
> 1/4th, but reading the docs on it again i'm actually asking myself if after 
> that massive import i should set it to 0 and run a checkpoint which just puts 
> all pages back where they come from.
> Is that maybe what i should be doing and am searching for?

[Hugh] That should not be necessary.
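
As a rough illustration of the 1/4 guideline quoted above, the arithmetic can be sketched in a few lines of Python. This is only a sketch under stated assumptions: it assumes MaxCheckpointRemap is counted in database pages of 8 KB each, the helper names are made up for this example, and the 400 GiB database size is a hypothetical figure, not a value from this thread. The buffer figures are the ones from the status output quoted earlier.

```python
# Sketch only: sizing arithmetic for virtuoso.ini values.
# Assumption: MaxCheckpointRemap is expressed in pages of 8 KB each.
PAGE_SIZE_BYTES = 8192


def max_checkpoint_remap(db_size_bytes: int, divisor: int = 4) -> int:
    """Return db_size / divisor, expressed in pages (hypothetical helper)."""
    total_pages = db_size_bytes // PAGE_SIZE_BYTES
    return total_pages // divisor


def buffer_usage(used: int, total: int) -> float:
    """Return the fraction of allocated buffers that are in use."""
    return used / total


if __name__ == "__main__":
    # Hypothetical 400 GiB database file; substitute the real on-disk size.
    db_size = 400 * 1024**3
    print("MaxCheckpointRemap =", max_checkpoint_remap(db_size))

    # Figures from the status output: "10900000 buffers, 3662255 used".
    print("buffers in use: {:.1%}".format(buffer_usage(3662255, 10900000)))
```

Passing divisor=8 gives the 1/8 setting mentioned above, which makes it easy to compare the two values for the same database size.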

Regards
Hugh

> 
> 
>> Given the size of the database, do you have striping enabled? Especially if 
>> you have multiple file systems the database can be striped across, this 
>> can improve the performance of IO operations to disk, see:
>> 
>>      http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ini_Striping
>>      http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#IOQS
>> 
>> What type of disks are being used: SSD, normal, or other?
> 
> No, currently no striping is enabled and the whole DB is on a single SSD 
> drive (sadly the only one in the server), but thanks for the tip...
> 
> 
> Cheers,
> Jörn
> 

------------------------------------------------------------------------------
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
