Hi Andreas,

Analysing your logs further I see:

1. On the remote (8891) instance that the SERVICE keyword references, I see many occurrences of the following in the log:

    07:11:34 WARNING: * Monitor: Many lock waits
    07:13:34 WARNING: * Monitor: Many lock waits
    07:15:43 WARNING: * Monitor: Many lock waits
    07:17:45 WARNING: * Monitor: Many lock waits
    07:19:34 WARNING: * Monitor: Should read for update because lock escalation from shared to exclusive fails frequently (1)
    07:19:34 WARNING: * Monitor: Locks are held for a long time
    07:19:46 WARNING: * Monitor: Many lock waits

and status shows 12 deadlocks and thousands of occurrences of the following messages in the output:

    Lock Status: 12 deadlocks of which 0 2r1w, 107 waits,
    Currently 10 threads running 7 threads waiting 0 threads in vdb.
    Pending:
    82435: IER 141.87.4.9
    212: IER 141.87.4.9
    208: IER 141.87.4.9
    213: IER 141.87.4.9
    138756: IER 141.87.4.9
    276: IER 141.87.4.9
    137220: IER 141.87.4.9
    . . .

The deadlocks are not a problem per se; it is more the many pending waits. Thus, as indicated in the performance tuning link (http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#ptune) sent previously, please provide the output of running the query:

    select top 50 * from sys_l_stat order by wait_msecs desc;

to see which table all the waits are pending on.

2. On the local instance (8895) that the queries are being run from, I see many occurrences of these errors in the logs:

    01:13:01 ERROR: No ext map for dp 3193852 in uncommitted blob cpt
    01:13:01 ERROR: No ext map for dp 3193853 in uncommitted blob cpt
    01:13:01 ERROR: No ext map for dp 3193854 in uncommitted blob cpt
    01:13:01 ERROR: No ext map for dp 3193855 in uncommitted blob cpt
    01:13:04 INFO: Checkpoint finished, log reused
    01:45:34 WARNING: * Monitor: Locks are held for a long time
    01:45:35 INFO: Free blob page refd start = 2319104 L=2319104
    01:45:35 ERROR: Blob starting L=2319104 inconsistent before delete. Not deleted
    01:45:35 INFO: Free blob page refd start = 2319105 L=2319105
    01:45:35 ERROR: Blob starting L=2319105 inconsistent before delete. Not deleted
    01:45:35 INFO: Free blob page refd start = 2319106 L=2319106
    01:45:35 ERROR: Blob starting L=2319106 inconsistent before delete. Not deleted
    01:45:35 INFO: Free blob page refd start = 2319107 L=2319107

Thus you should first run a database integrity check on the database:

    backup '/dev/null';

and then perform a +crash-dump and restore of the database to eliminate these errors, as detailed at:

http://docs.openlinksw.com/virtuoso/databaseadmsrv.html#diagnosingrepairing

Best Regards
Hugh Williams
Professional Services
OpenLink Software, Inc. // http://www.openlinksw.com/
Weblog -- http://www.openlinksw.com/blogs/
LinkedIn -- http://www.linkedin.com/company/openlink-software/
Twitter -- http://twitter.com/OpenLink
Google+ -- http://plus.google.com/100570109519069333827/
Facebook -- http://www.facebook.com/OpenLinkSoftware
Universal Data Access, Integration, and Management Technology Providers

> On 2 Mar 2016, at 14:30, Nolle, Andreas <no...@hs-albsig.de> wrote:
>
> Dear Kingsley,
>
> thanks for your reply.
>
> The overall triple count of each instance is:
> - Virtuoso instance running at port 8891: 17765873 triples
> - Virtuoso instance running at port 8893: 27897291 triples
> - Virtuoso instance running at port 8895: 168888956 triples
> - Virtuoso instance running at port 8899: 72372256 triples
>
> If you are interested in the ini files, you can find the current versions at
> https://www.dropbox.com/s/awdx744vomsb5vr/virtuoso_ini.zip?dl=0
>
> For profiling I just used the simple test queries, so that it is probably
> easier to find the problem.
> By doing so, I came to the assumption that it is especially the size of the
> result set that causes the really long query evaluation time.
>
> If I run q1:
>
>     SELECT ?x ?b
>     WHERE {
>       SERVICE <http://141.87.4.8:8891/sparql> {
>         ?x <http://swrc.ontoware.org/ontology#edition> ?b .
>       } .
>     }
>
> on, let's say, the Virtuoso instance running at port 8895, it finishes within
> 2 minutes. This is because the service call returns only 32562 results.
>
> But if I run q2:
>
>     SELECT ?x ?b
>     WHERE {
>       SERVICE <http://141.87.4.8:8891/sparql> {
>         ?x <http://swrc.ontoware.org/ontology#pages> ?b .
>       } .
>     }
>
> where the service call returns 313421 results, the query evaluation on the
> same instance (running at port 8895) takes around 1 hour!
>
> The corresponding explain and profile plans of those queries can be found at
> https://www.dropbox.com/s/ein0hxhtgk85ci9/analysis_files.zip?dl=0
> Please notice that *_federated means the queries are as above and are
> evaluated at the Virtuoso instance running at port 8895, and *_local means
> the evaluation of these queries (without SERVICE) at the Virtuoso instance
> running at port 8891.
>
> On analyzing the long evaluation times, especially for q2 in the federated
> case, I found out that if q2 is, for example, evaluated at the Virtuoso
> instance running at port 8895 and the Virtuoso instance running at port 8891
> is shut down after 5 minutes, the query evaluation still proceeds and ends
> after more than one hour with the same number of results (313421) as, e.g.,
> in the local evaluation.
>
> Please let me know if you need any other information.
>
> Best regards
> Andy
>
> _______________________________________________
> Virtuoso-users mailing list
> Virtuoso-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/virtuoso-users

Begin forwarded message:

From: Kingsley Idehen <kide...@openlinksw.com>
Subject: Re: [Virtuoso-users] infrequent errors on parallel querying
Date: 2 March 2016 at 13:28:22 GMT
To: virtuoso-users@lists.sourceforge.net

On 3/2/16 4:24 AM, Nolle, Andreas wrote:
> Hi Hugh,
>
> this was also my first guess, but the network is not the problem. This is
> definitely the case because I’ve evaluated q2 again
>
>     SELECT ?x ?b
>     WHERE {
>       SERVICE <http://141.87.4.8:8891/sparql> {
>         ?x <http://swrc.ontoware.org/ontology#pages> ?b .
>       } .
>     }
>
> at the Virtuoso instance running at port 8895 and SHUT DOWN the Virtuoso
> instance running at port 8891 after 5 minutes.
>
> The query evaluation still proceeded and ended after more than one hour
> with the same number of results (313421).
>
> It doesn’t matter whether the query is executed via isql or via the SPARQL
> interface (locally AND remotely).
>
> So for me it seems like a bug, because the Virtuoso at 8895 takes one hour
> despite the fact that no additional join or any other operation has to be
> done.
>
> Best
> Andy

Andy,

There are a lot of factors that could come into play here. If you haven't already done so, please provide query and result times for queries against each of the individual instances in this SPARQL-FED query. The goal would be to at least eliminate any local issues.

Also shed some light on the triple count for each of these instances.

Ultimately we have a network or query optimizer issue in play; we just need a productive route to determining which of these it is.

--
Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
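
A minimal sketch of how the per-instance query and result times requested above could be collected, assuming Python 3 is available on the machine running the test. The endpoint URL and the #pages predicate are taken from the thread; build_request, time_query and the fetch hook are hypothetical names introduced here for illustration and are not part of Virtuoso. The request follows the standard SPARQL Protocol (a GET with a query parameter and a JSON results Accept header).

```python
import json
import time
import urllib.parse
import urllib.request

# Endpoint and predicate come from the thread; the rest is illustrative.
ENDPOINT = "http://141.87.4.8:8891/sparql"
Q2 = """
SELECT ?x ?b
WHERE {
  ?x <http://swrc.ontoware.org/ontology#pages> ?b .
}
"""


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def time_query(endpoint, query, fetch=None):
    """Return (elapsed_seconds, row_count) for a single run of the query.

    `fetch` is a hypothetical hook that takes the request and returns the
    response body, so the timing logic can be exercised without a live
    server. By default the request is sent with urllib.
    """
    if fetch is None:
        def fetch(request):
            with urllib.request.urlopen(request) as response:
                return response.read()
    start = time.perf_counter()
    body = fetch(build_request(endpoint, query))
    elapsed = time.perf_counter() - start
    # SPARQL JSON results keep one entry per solution under results.bindings.
    rows = json.loads(body)["results"]["bindings"]
    return elapsed, len(rows)
```

Calling time_query(ENDPOINT, Q2) directly against each instance, and then the SERVICE form of the same query against the 8895 endpoint, would give directly comparable elapsed times and row counts for the local and federated cases.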
_______________________________________________
Virtuoso-users mailing list
Virtuoso-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/virtuoso-users