On 2014-04-16 11:43, Hugh Williams wrote:
> Hi Bart,
>
> Your status(''); output shows that the default Buffers size of 20000 is
> being allocated on server startup and all quickly used due to the size
> of your dataset, with the "NumberOfBuffers" setting of 2000000 in the
> INI file not being pickup due to those leading spaces which should be
> removed and the server restarted:
>
>> 20000 buffers, 19878 used,
>
>> IndexTreeMaps = 64
>> NumberOfBuffers = 2000000
>> MaxDirtyBuffers = 1500000

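The leading-space problem described in the quoted message can be checked mechanically before restarting the server. The following is a minimal sketch, not a Virtuoso tool: `lint_ini` is a hypothetical helper, the sample text is illustrative (the original indentation was lost in the quote), and it assumes that `;` starts a comment line. It also reports keys set more than once in the same section, since the virtuoso.ini posted in this thread sets IndexTreeMaps twice.

```python
from collections import defaultdict


def lint_ini(text):
    """Return (indented, duplicates) for INI-style text.

    indented   -- list of (line_number, raw_line) for lines with leading whitespace
    duplicates -- dict mapping (section, key) to the line numbers where it is set
    """
    indented = []
    seen = defaultdict(list)
    section = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        stripped = raw.strip()
        # Skip blank lines and comment lines (assumption: ';' starts a comment).
        if not stripped or stripped.startswith(";"):
            continue
        # Flag any setting or section header that does not start in column 1.
        if raw[0] in " \t":
            indented.append((lineno, raw))
        if stripped.startswith("[") and stripped.endswith("]"):
            section = stripped[1:-1]
            continue
        if "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            seen[(section, key)].append(lineno)
    duplicates = {k: v for k, v in seen.items() if len(v) > 1}
    return indented, duplicates


# Illustrative sample only; the indentation here is an assumption.
sample = """[Parameters]
IndexTreeMaps = 256
  IndexTreeMaps = 64
  NumberOfBuffers = 2000000
  MaxDirtyBuffers = 1500000
"""

indented, duplicates = lint_ini(sample)
for lineno, raw in indented:
    print(f"line {lineno}: leading whitespace: {raw!r}")
for (section, key), lines in duplicates.items():
    print(f"[{section}] {key} is set on lines {lines}")
```

To run it against a real file, pass the file contents, for example `lint_ini(open("virtuoso.ini").read())`.
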
Thanks for pointing this out, Hugh! I did not know leading spaces were
not allowed in the virtuoso.ini file. I didn't come across this in the
documentation. Is it documented somewhere?

By using the correct values for NumberOfBuffers and MaxDirtyBuffers, I
was able to speed up my slowest query quite a bit. See the graph at
https://dl.dropboxusercontent.com/u/32340538/query_timings.png

The red bars are timings for queries using the defaults
NumberOfBuffers = 2000
MaxDirtyBuffers = 1200
The green ones are with the new values (calculated for 24 GB of free
memory).

As you can see, my previously worst query dropped from >140s to about
65s. What is now my worst query still takes about 100s. Neither 100s nor
65s is acceptable for our application, so we still want it faster.

I'm confident that further tweaking of my virtuoso.ini can still speed
things up. It is, however, difficult to know which parameters I can
fine-tune further. I already tried setting O_DIRECT = 1 instead of
O_DIRECT = 0, but that didn't help. My swappiness is set to 10 instead
of 60. What else could I try?

Below, you can find more information on my setup. If you need more info,
just let me know!

Kind regards,
Bart

---------------- operating system ------------------------------
Ubuntu 12.04.4 LTS
Linux hp-g7-02 3.2.0-59-generic #90-Ubuntu SMP Tue Jan 7 22:43:51 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

----------------- output of free -m ----------------------------
bart@hp-g7-02:~$ free -m
             total       used       free     shared    buffers     cached
Mem:         24101       5306      18794          0          3       4882
-/+ buffers/cache:        420      23681
Swap:          529         24        505

----------- output of status() while worst query is running ------------
SQL> status();
REPORT
VARCHAR
_______________________________________________________________________________

OpenLink Virtuoso Server
Version 07.00.3203-pthreads for Linux as of Mar 26 2014
Started on: 2014/04/17 14:12 GMT+120

Database Status:
  File size 152433590272, 18607616 pages, 5854115 free.
  2000000 buffers, 242406 used, 0 dirty 0 wired down, repl age 0 0 w. io 158 w/crsr.
  Disk Usage: 243428 reads avg 0 msec, 0% r 0% w last 5 s, 1087 writes,
    1098 read ahead, batch = 218. Autocompact 0 in 0 out, 0% saved.
Gate: 179 2nd in reads, 0 gate write waits, 0 in while read 0 busy scrap.
Log = /data/virtuoso7/1114/virtuoso.trx, 185 bytes
12753014 pages have been changed since last backup (in checkpoint state)
Current backup timestamp: 0x0000-0x00-0x00
Last backup date: unknown
Clients: 2 connects, max 2 concurrent
RPC: 23 calls, 2 pending, 2 max until now, 0 queued, 0 burst reads (0%), 0 second brk=16624304128
Checkpoint Remap 38 pages, 0 mapped back. 0 s atomic time.
    DB master 18607616 total 5854115 free 38 remap 0 mapped back
    temp 256 total 251 free

Lock Status: 0 deadlocks of which 0 2r1w, 0 waits,
    Currently 2 threads running 0 threads waiting 0 threads in vdb.
Pending:

Client 1111:1: Account: dba, 1167 bytes in, 754 bytes out, 1 stmts.
PID: 25685, OS: unix, Application: unknown, IP#: 127.0.0.1
Transaction status: PENDING, 1 threads.
Locks:

Client 1111:2: Account: dba, 1107 bytes in, 21127 bytes out, 1 stmts.
PID: 25689, OS: unix, Application: unknown, IP#: 127.0.0.1
Transaction status: PENDING, 1 threads.
Locks:

Running Statements:
 Time (msec) Text
         542 status()
       31685 #line 1 "temp.sparql" sparql # This query finds the first 15 items containing a

Hash indexes

45 Rows. -- 542 msec.

----------------- virtuoso version -----------------------------
Virtuoso Open Source Edition (Column Store) (multi threaded)
Version 7.0.0-rc2.3203-pthreads as of Mar 26 2014
Compiled for Linux (x86_64-unknown-linux-gnu)
Copyright (C) 1998-2013 OpenLink Software

------------- current worst query ----------------------------
PREFIX foo: <http://blabla_unimportant_blabla#>
SELECT ?id ?label ?term ?type
WHERE {
  {
    SELECT ?id
    WHERE {
      ?id foo:hasSearchTerm ?searchterm .
      ?searchterm bif:contains '"bar*"' .
    }
    GROUP BY ?id
    LIMIT 15
  }
  ?id foo:preferredLabel ?label ;
      disq:hasSearchTerm ?term ;
      a ?type
}

-------- current second worst query -------------------------
SELECT ?val (COUNT(DISTINCT ?id) as ?vc)
WHERE {
  ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?val ;
      ?property1 ?property_value1 ;
      ?property2 ?property_value2 .
  ?property_value1 bif:contains "'foofoo'" .
  ?property_value2 bif:contains "'barbar'" .
}
GROUP BY ?val
ORDER BY DESC(?vc)

----------------------------- virtuoso.ini ---------------------
[Database]
DatabaseFile = /data/virtuoso7/1114/virtuoso.db
ErrorLogFile = /data/virtuoso7/1114/virtuoso.log
LockFile = /data/virtuoso7/1114/virtuoso.lck
TransactionFile = /data/virtuoso7/1114/virtuoso.trx
xa_persistent_file = /data/virtuoso7/1114/virtuoso.pxa
ErrorLogLevel = 7
FileExtend = 200
MaxCheckpointRemap = 2000
Striping = 0
TempStorage = TempDatabase

[TempDatabase]
DatabaseFile = /data/virtuoso7/1114/virtuoso-temp.db
TransactionFile = /data/virtuoso7/1114/virtuoso-temp.trx
MaxCheckpointRemap = 2000
Striping = 0

[Parameters]
ServerPort = 1111
LiteMode = 0
DisableUnixSocket = 1
DisableTcpSocket = 0
ServerThreads = 20
CheckpointInterval = 60
O_DIRECT = 0
CaseMode = 2
MaxStaticCursorRows = 5000
CheckpointAuditTrail = 0
AllowOSCalls = 0
SchedulerInterval = 10
DirsAllowed = ., /opt/virtuoso/share/virtuoso/vad
ThreadCleanupInterval = 0
ThreadThreshold = 10
ResourcesCleanupInterval = 0
FreeTextBatchSize = 100000
SingleCPU = 0
VADInstallDir = /opt/virtuoso/share/virtuoso/vad/
PrefixResultNames = 0
RdfFreeTextRulesSize = 100
IndexTreeMaps = 256
MaxMemPoolSize = 200000000
PrefixResultNames = 0
MacSpotlight = 0
IndexTreeMaps = 64
NumberOfBuffers = 2000000
MaxDirtyBuffers = 1500000
AdjustVectorSize = 1
MaxVectorSize = 3000000

[HTTPServer]
ServerPort = 8890
ServerRoot = /opt/virtuoso/var/lib/virtuoso/vsp
ServerThreads = 20
DavRoot = DAV
EnabledDavVSP = 0
HTTPProxyEnabled = 0
TempASPXDir = 0
DefaultMailServer = localhost:25
ServerThreads = 10
MaxKeepAlives = 10
KeepAliveTimeout = 10
MaxCachedProxyConnections = 10
ProxyConnectionCacheTimeout = 15
HTTPThreadSize = 280000
HttpPrintWarningsInOutput = 0
Charset = UTF-8

[AutoRepair]
BadParentLinks = 0

[Client]
SQL_PREFETCH_ROWS = 100
SQL_PREFETCH_BYTES = 16000
SQL_QUERY_TIMEOUT = 0
SQL_TXN_TIMEOUT = 0

[VDB]
ArrayOptimization = 0
NumArrayParameters = 10
VDBDisconnectTimeout = 1000
KeepConnectionOnFixedThread = 0

[Replication]
ServerName = db-HP-G7-02
ServerEnable = 1
QueueMax = 50000

[Striping]
Segment1 = 100M, db-seg1-1.db, db-seg1-2.db
Segment2 = 100M, db-seg2-1.db

[Zero Config]
ServerName = virtuoso (HP-G7-02)

[Mono]

[URIQA]
DynamicLocal = 0
DefaultHost = localhost:8890

[SPARQL]
ResultSetMaxRows = 10000
MaxQueryExecutionTime = 600 ; in seconds
DefaultQuery = select distinct ?Concept where {[] a ?Concept} LIMIT 100
DeferInferenceRulesInit = 0 ; controls inference rules loading

[Plugins]
LoadPath = /opt/virtuoso/lib/virtuoso/hosting
Load1 = plain, wikiv
Load2 = plain, mediawiki
Load3 = plain, creolewiki

_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users
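
The message says the NumberOfBuffers and MaxDirtyBuffers values were "calculated for 24 GB of free memory" but does not show the calculation. The sketch below is one plausible reconstruction, not the official formula. It assumes a rule of thumb of giving about two thirds of free RAM to the buffer pool, about 8 KB of memory per buffer, and MaxDirtyBuffers at three quarters of NumberOfBuffers. The function name `suggest_buffers` and all its parameters are hypothetical.

```python
def suggest_buffers(free_ram_mb, bytes_per_buffer=8192):
    """Estimate (NumberOfBuffers, MaxDirtyBuffers) from free RAM in MB.

    Assumptions (rules of thumb, not taken from the message):
      - about 2/3 of free RAM goes to the buffer pool
      - each buffer costs roughly bytes_per_buffer bytes
      - MaxDirtyBuffers is 3/4 of NumberOfBuffers
    Integer arithmetic is used throughout to avoid rounding surprises.
    """
    free_bytes = free_ram_mb * 1024 * 1024
    usable_bytes = free_bytes * 2 // 3
    number_of_buffers = usable_bytes // bytes_per_buffer
    max_dirty_buffers = number_of_buffers * 3 // 4
    return number_of_buffers, max_dirty_buffers


# 24 GB expressed in MB, matching the figure quoted in the message.
buffers, dirty = suggest_buffers(24 * 1024)
print(f"NumberOfBuffers = {buffers}")
print(f"MaxDirtyBuffers = {dirty}")
```

Under these assumptions the estimate lands slightly above 2 million buffers, in the same range as the 2000000 / 1500000 pair used in the virtuoso.ini above. The real per-buffer overhead may be somewhat higher than 8 KB, so rounding down is the cautious choice.
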