Re: [Virtuoso-users] speeding up SPARQL query

Bart Vandewoestyne Fri, 18 Apr 2014 03:14:23 -0700

On 2014-04-18 04:50, Hugh Williams wrote:
>
>> What else could I try?  Do you need any further info from my side?
>
> [Hugh] I note that you do not appear to have any of the parameters for the
> new v7 vectored execution set in your INI file, these being:
>
> MaxQueryMem            = 2G           ; memory allocated to query processor
> VectorSize             = 1000         ; initial parallel query vector (array of query operations) size
> MaxVectorSize          = 1000000      ; query vector size threshold.
> AdjustVectorSize       = 0
> ThreadsPerQuery        = 8
> AsyncQueueMaxThreads   = 10
>
> See the following documentation for details on use:
>
>       http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#confvectexec
>       http://docs.openlinksw.com/virtuoso/vexqrparl.html
>
> These should improve query performance, although probably not the relative 
> performance between the two builds.


Hugh,

Note that I did have

AdjustVectorSize = 1
MaxVectorSize = 3000000

in my config file, but next I'll try the following settings for my 
24-core machine (the output of nproc on Linux is 24):

VectorSize = 10000        ; The default is 10000, which is good for
                           ; most cases.
MaxQueryMem = 1G          ; default
AdjustVectorSize = 1      ; default
MaxVectorSize = 1000000   ; This can reach up to 4000000 but values in
                           ; excess of 1000000 have not been found
                           ; useful in practice.
ThreadsPerQuery = 24      ; The number of cores on the machine is a
                           ; reasonable default if running large queries.
AsyncQueueMaxThreads = 16 ; default (TODO: we have 24 cpus, so
                           ; experiment with other values!)

I'm either taking the default here or following the advice written in 
the documentation.  It's hard to understand exactly what all these 
parameters stand for, so if you have better suggestions for them, or 
formulas to calculate their optimal values, I would be glad to hear 
them!
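
To make the choices above reproducible on other machines, here is a minimal 
Python sketch that prints the same settings, with ThreadsPerQuery taken from 
the core count as the documentation suggests.  The function names are 
hypothetical, the other values are simply the ones listed above, and none of 
this is a tuning formula from OpenLink.

```python
import os


def vectored_exec_settings(cores=None):
    """Return the vectored-execution settings listed above as a dict.

    Only ThreadsPerQuery is derived (from the core count); everything
    else is the default or documented value quoted in this mail.
    """
    if cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        cores = os.cpu_count() or 1
    return {
        "VectorSize": 10000,
        "MaxQueryMem": "1G",
        "AdjustVectorSize": 1,
        "MaxVectorSize": 1000000,
        "ThreadsPerQuery": cores,
        "AsyncQueueMaxThreads": 16,
    }


def render_ini(settings):
    """Format the settings as 'Name = value' lines for pasting into the INI file."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())


if __name__ == "__main__":
    print(render_ini(vectored_exec_settings()))
```

On the 24-core machine this yields ThreadsPerQuery = 24, matching the value 
chosen by hand above.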


Besides that, I also played around with the IndexTreeMaps parameter. 
Query timing results for values of 64, 256 and 1024 can be found at

https://dl.dropboxusercontent.com/u/32340538/query_timings_IndexTreeMaps.png

As you can see, for one of my two worst queries, the result is clearly 
best with IndexTreeMaps=256, not 1024.  So it isn't always 'the higher, 
the better' as mentioned in the documentation...
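
For anyone who wants to repeat this kind of comparison, the sketch below 
shows one way to time a query in Python.  It is not the script behind the 
plot above: the endpoint URL assumes a local Virtuoso on its default port, 
the helper names are hypothetical, and the server has to be restarted by 
hand with each IndexTreeMaps value before a run.

```python
import statistics
import time
import urllib.parse
import urllib.request

# Assumption: a local Virtuoso instance serving SPARQL on its default port.
ENDPOINT = "http://localhost:8890/sparql"


def run_sparql(query, endpoint=ENDPOINT):
    """Send one query via a form-encoded POST and return the raw response body."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def time_query(run, query, repeats=5):
    """Call run(query) `repeats` times and return the median wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run(query)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

Calling time_query(run_sparql, my_query) once per IndexTreeMaps setting 
gives one number per setting; the median is used so that a single cold-cache 
run does not dominate the result.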

> If the query response times are still not fast enough, are you able to provide 
> a copy of the datasets such that we can load them locally for recreation?

I cannot provide you a copy of the dataset as I am bound by an NDA.  I 
will first try with the vectored execution parameters mentioned above 
and see if that brings any improvement.  If not, we're up for another 
iteration ;-)  In the meantime, feel free to point me to other 
parameters that I might still tweak.

Kind regards,
Bart

