Re: [Virtuoso-users] speeding up SPARQL query

Hugh Williams Fri, 18 Apr 2014 03:35:38 -0700

Hi Bart,

If you cannot provide the data, then the next best thing would be to get the 
"profile" for the problem queries, i.e. the compiler and execution plans, for 
analysis as detailed at:

        http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtTipsAndTricksAanalyzingSPARQLQuery
        http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#EfficientSQL
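
As a rough illustration of what collecting those plans can look like from 
isql (a sketch only, assuming a Virtuoso 7 build that provides the explain() 
and profile() SQL functions; the SPARQL text is a placeholder for one of the 
problem queries, and the pages linked above remain the reference for the 
exact procedure):

```sql
-- Assumption: run from the isql command-line client against the affected server.
-- Show long result columns in full so the plan text is not truncated.
set blobs on;

-- Compile-time plan only; the query is not executed.
-- The SELECT below is a placeholder, not one of the actual problem queries.
explain('sparql SELECT * WHERE { ?s ?p ?o } LIMIT 10');

-- Runs the query and returns the plan annotated with timings and row counts.
profile('sparql SELECT * WHERE { ?s ?p ?o } LIMIT 10');
```

Single quotes inside the SPARQL text need to be doubled, since the whole 
query is passed as an SQL string literal.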

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc.      //              http://www.openlinksw.com/
Weblog   -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter  -- http://twitter.com/OpenLink
Google+  -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

On 18 Apr 2014, at 11:13, Bart Vandewoestyne <bart.vandewoest...@telenet.be> 
wrote:

> On 2014-04-18 04:50, Hugh Williams wrote:
>> 
>>> What else could I try?  Do you need any further info from my side?
>> 
>> [Hugh] I note that you do not appear to have any of the parameters for the 
>> new v7 vectored execution set in your INI file, these being:
>> 
>> MaxQueryMem           = 2G           ; memory allocated to query processor
>> VectorSize            = 1000         ; initial parallel query vector (array of query operations) size
>> MaxVectorSize         = 1000000      ; query vector size threshold.
>> AdjustVectorSize      = 0
>> ThreadsPerQuery       = 8
>> AsyncQueueMaxThreads          = 10
>> 
>> See the following documentation for details on use:
>> 
>>      http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#confvectexec
>>      http://docs.openlinksw.com/virtuoso/vexqrparl.html
>> 
>> These should improve query performance, although probably not the relative 
>> performance between the two builds.
> 
> Hugh,
> 
> Note that I did have
> 
> AdjustVectorSize = 1
> MaxVectorSize = 3000000
> 
> in my config file, but I'll try next with the following settings for my 
> 24-core machine (the output of nproc on linux is 24):
> 
> VectorSize = 10000        ; The default is 10000, which is good for
>                           ; most cases.
> MaxQueryMem = 1G          ; default
> AdjustVectorSize = 1      ; default
> MaxVectorSize = 1000000   ; This can reach up to 4000000 but values in
>                           ; excess of 1000000 have not been found
>                           ; useful in practice.
> ThreadsPerQuery = 24      ; The number of cores on the machine is a
>                           ; reasonable default if running large queries.
> AsyncQueueMaxThreads = 16 ; default (TODO: we have 24 cpus, so
>                           ; experiment with other values!)
> 
> I'm either taking the default here, or following the advice written in 
> the documentation.  It's hard to understand what all these parameters 
> exactly stand for... so if you have better suggestions for these 
> parameters, or formulas to calculate their optimal value, I would be 
> glad to hear!
> 
> 
> Next to that, I also played around with the IndexTreeMaps parameter. 
> Query timing results for a value of 64, 256 and 1024 can be found at
> 
> https://dl.dropboxusercontent.com/u/32340538/query_timings_IndexTreeMaps.png
> 
> As you can see, for one of my two worst queries, the result is clearly 
> best with IndexTreeMaps=256, not 1024.  So it isn't always 'the higher, 
> the better' as mentioned in the documentation...
> 
>> If the query response times are still not fast enough, are you able to 
>> provide a copy of the datasets so that we can load them locally to reproduce the issue?
> 
> I cannot provide you a copy of the dataset as I am bound by an NDA.  I 
> will first try with the vectored execution parameters mentioned above 
> and see if that brings any improvement.  If not, we're up for another 
> iteration ;-)  In the meantime, feel free to point me to other 
> parameters that I might still tweak.
> 
> Kind regards,
> Bart
> 
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users

