Hi Hugh,

many thanks again for your reply, and for working at the weekend!
I created the index only on a copy of the original database, for some tests. So the indexes (STATISTICS DB.DBA.RDF_QUAD;) on all of these instances are the same, since they are all the default indexes. If you are interested in the output of STATISTICS DB.DBA.RDF_QUAD; for each Virtuoso instance, please see the zip file referenced below.

The triple counts of the instances are:
- Virtuoso instance running at port 8891: 17765873 triples
- Virtuoso instance running at port 8893: 27897291 triples
- Virtuoso instance running at port 8895: 168888956 triples
- Virtuoso instance running at port 8899: 72372256 triples

Right, I've already changed the ini files according to your suggestions:
- removing ServerThreads
- reducing MaxClientConnections to 200
- reducing MaxQueryMem
- setting AdjustVectorSize to 0

For profiling, I shrank the test queries so that it is hopefully easier to track down the problem. In doing so, I came to suspect that it is not the FILTERs that cause the long query evaluation time, but the size of the result set. If I run q1:

    SELECT ?x ?b WHERE {
      SERVICE <http://141.87.4.8:8891/sparql> {
        ?x <http://swrc.ontoware.org/ontology#edition> ?b .
      } .
    }

on, let's say, the Virtuoso instance running at port 8895, it finishes within 2 minutes. This is because the service call returns only 32562 results. But if I run q2:

    SELECT ?x ?b WHERE {
      SERVICE <http://141.87.4.8:8891/sparql> {
        ?x <http://swrc.ontoware.org/ontology#pages> ?b .
      } .
    }

where the service call returns 313421 results, the query evaluation on the same instance (running at port 8895) takes around 1 hour!

Do you have any explanation for that, or any suggestion for how such long evaluation times can be fixed? If you are still interested in the explain and profile plans of those queries, please find them at https://www.dropbox.com/s/ein0hxhtgk85ci9/analysis_files.zip?dl=0.
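One way to separate the cost of the remote endpoint from the cost of the SERVICE handling on port 8895 would be to send the inner patterns of q1 and q2 directly to the instance at port 8891 and time them. The following is only a minimal sketch in Python using the standard library: the helper names (`direct_query`, `request_url`, `fetch_timed`, `main`) are made up for illustration, it assumes the endpoint honours an `Accept: text/csv` header, and the row count is approximate (a literal containing a line break would be counted twice). The endpoint may also cap the number of rows it returns, so the counts should be checked against the 32562 and 313421 above.

```python
import time
import urllib.parse
import urllib.request

# Remote instance and vocabulary taken from q1/q2 above.
ENDPOINT = "http://141.87.4.8:8891/sparql"
SWRC = "http://swrc.ontoware.org/ontology#"


def direct_query(predicate: str) -> str:
    """Inner pattern of q1/q2, without the SERVICE wrapper."""
    return f"SELECT ?x ?b WHERE {{ ?x <{SWRC}{predicate}> ?b . }}"


def request_url(endpoint: str, query: str) -> str:
    """Build a SPARQL protocol GET URL carrying the query string."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def fetch_timed(url: str, timeout: float = 3600.0) -> tuple[int, float]:
    """Return (approximate row count, elapsed seconds) for one request."""
    # Assumption: the endpoint can serialise results as CSV.
    req = urllib.request.Request(url, headers={"Accept": "text/csv"})
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    elapsed = time.perf_counter() - start
    rows = max(len(body.splitlines()) - 1, 0)  # subtract the CSV header line
    return rows, elapsed


def main() -> None:
    """Time both predicates directly against the remote instance."""
    for predicate in ("edition", "pages"):
        rows, secs = fetch_timed(request_url(ENDPOINT, direct_query(predicate)))
        print(f"{predicate}: {rows} rows in {secs:.1f} s")
```

Calling `main()` from a machine that can reach the endpoint prints one line per predicate. If both direct requests return quickly, the time is probably being spent on the calling instance while it consumes the SERVICE result, not on the remote query itself.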
Please note that, in the analysis files, *_federated means the queries are as above and are evaluated on the Virtuoso instance running at port 8895, while *_local means the same queries (without SERVICE) evaluated on the Virtuoso instance running at port 8891.

Best regards
Andy

-----Original Message-----
From: Hugh Williams [mailto:hwilli...@openlinksw.com]
Sent: Sunday, 28 February 2016 22:21
To: Nolle, Andreas <no...@hs-albsig.de>
Cc: virtuoso-users@lists.sourceforge.net
Subject: Re: [Virtuoso-users] infrequent errors on parallel querying

Hi Andreas,

Generally there should be no need to create additional indexes, as the 2 full and 3 partial indexes have been found to be sufficient for most use cases, with only one use case encountered where an additional partial index was required, as detailed at:

http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtRDFPerformanceTuning#Index%20Scheme%20Selection

So which indexes (STATISTICS DB.DBA.RDF_QUAD;) exist on these various instances? Are they all the same, or do the instances have different indexes? The more indexes there are, the larger the database and hence the more memory is required to host it. Also, are the triple counts the same on all these instances?

Did you make any of the other INI file changes from my previous email? Certainly “AdjustVectorSize” should be set to 0.
You should also consider profiling the queries to determine whether the best query plan is being used, which can be done with the “profile()” function as detailed at:

http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtTipsAndTricksAanalyzingSPARQLQuery
http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#querylogging - Query Logging

Note that there are also some INI file parameters that can be set to control query optimisation, i.e. plan selection. If a bad plan is being chosen, these options can in some cases enable a better plan to be chosen, see:

http://virtuoso.openlinksw.com/dataspace/doc/dav/wiki/Main/VirtQueryOptDiagnostic

If you can provide query plans and database statistics, these can be analysed to try to determine the cause of the long-running queries …

It certainly would be interesting to see how the query plans and database stats vary between the instance where the query runs in msecs and the other where it runs in 40+ mins ...

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc. // http://www.openlinksw.com/

_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users