Yes, Alex, this is reproducible. I will check whether we can run Wireshark. Thank you.
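To line up slow calls with a packet capture, it may help to record when each slow call started. Below is a rough Python sketch, not something I have run against the cluster: the helper names (`timed_call`, `probe`, `fetch_ping`) and the 60-second "slow" threshold are my own assumptions, and the URL is simply the ping endpoint used in the curl calls quoted below.

```python
import time
import urllib.request
from datetime import datetime, timezone

# Ping endpoint taken from the curl calls in this thread.
PING_URL = "http://server1:8080/solr/COLL/admin/ping?distrib=true"
# Assumption: anything taking a minute or more counts as "slow".
SLOW_SECONDS = 60.0


def fetch_ping():
    """Issue one ping request and read the whole response.

    The client timeout (900 s, an assumption) is deliberately longer than the
    ~600 s stalls so that the stall itself is measured rather than cut short.
    """
    with urllib.request.urlopen(PING_URL, timeout=900) as resp:
        resp.read()


def timed_call(fetch):
    """Run fetch() once; return (elapsed_seconds, succeeded)."""
    start = time.monotonic()
    try:
        fetch()
        ok = True
    except Exception:
        ok = False
    return time.monotonic() - start, ok


def probe(fetch, attempts, pause_seconds, slow_seconds=SLOW_SECONDS):
    """Call fetch() repeatedly and collect the slow or failed attempts.

    Returns a list of (utc_start_iso, elapsed_seconds, succeeded) tuples.
    """
    slow = []
    for _ in range(attempts):
        started_at = datetime.now(timezone.utc).isoformat()
        elapsed, ok = timed_call(fetch)
        if elapsed >= slow_seconds or not ok:
            slow.append((started_at, elapsed, ok))
        time.sleep(pause_seconds)
    return slow
```

For example, `probe(fetch_ping, attempts=200, pause_seconds=5)` would return the UTC start time of every call that took a minute or more or failed outright, and those times can then be matched against the capture.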
On Mon, Aug 17, 2020 at 8:11 PM Alexandre Rafalovitch <arafa...@gmail.com> wrote:

> If this is reproducible, I would run Wireshark on the network and see what
> happens at packet level.
>
> Leaning towards firewall timing out and just starting to drop all packets.
>
> Regards,
> Alex
>
> On Mon., Aug. 17, 2020, 6:22 p.m. Susheel Kumar, <susheel2...@gmail.com> wrote:
>
> > Thanks for all the responses.
> >
> > Shawn, to your point: both ping and select intermittently take 600+
> > seconds to return. As you can see below, the first ping attempt was fine
> > and the second took a long time. Similarly, a couple of select calls
> > returned fine and then one suddenly took a long time. I'll try running
> > select with shards.info to see whether the problem is with any particular
> > shard, but solr.log on many of the shards has QTime > 600 s entries.
> >
> > Heap doesn't seem to be a problem, but I will take a look at all the
> > shards. I'll share top output as well.
> >
> > Thanks
> >
> >
> > Ping
> >
> > server65:/home/kumar # curl --location --request GET 'http://server1:8080/solr/COLL/admin/ping?distrib=true'
> > <?xml version="1.0" encoding="UTF-8"?>
> > <response>
> > <lst name="responseHeader"><bool name="zkConnected">true</bool><int name="status">0</int><int name="QTime">20</int><lst name="params"><str name="q">{!lucene}*:*</str><str name="distrib">true</str><str name="df">wordTokens</str><str name="preferLocalShards">false</str><str name="rows">10</str><str name="echoParams">all</str></lst></lst><str name="status">OK</str>
> > </response>
> > server65:/home/kumar # curl --location --request GET 'http://server1:8080/solr/COLL/admin/ping?distrib=true'
> > <?xml version="1.0" encoding="UTF-8"?>
> > <response>
> > <lst name="responseHeader"><bool name="zkConnected">true</bool><int name="status">0</int><int name="QTime">600123</int><lst name="params"><str name="q">{!lucene}*:*</str><str name="distrib">true</str><str name="df">wordTokens</str><str name="preferLocalShards">false</str><str name="rows">10</str><str name="echoParams">all</str></lst></lst><str name="status">OK</str>
> > </response>
> >
> > select
> >
> >
> > server67:/home/kumar # curl --location --request GET 'http://server1:8080/solr/COLL/select?indent=on&q=*:*&wt=json&rows=0'
> > {
> >   "responseHeader":{
> >     "zkConnected":true,
> >     "status":0,
> >     "QTime":13,
> >     "params":{
> >       "q":"*:*",
> >       "indent":"on",
> >       "rows":"0",
> >       "wt":"json"}},
> >   "response":{"numFound":62221186,"start":0,"maxScore":1.0,"docs":[]
> >   }}
> > server67:/home/kumar # curl --location --request GET 'http://server1:8080/solr/COLL/select?indent=on&q=*:*&wt=json&rows=0'
> > {
> >   "responseHeader":{
> >     "zkConnected":true,
> >     "status":0,
> >     "QTime":10,
> >     "params":{
> >       "q":"*:*",
> >       "indent":"on",
> >       "rows":"0",
> >       "wt":"json"}},
> >   "response":{"numFound":62221186,"start":0,"maxScore":1.0,"docs":[]
> >   }}
> > server67:/home/kumar # curl --location --request GET 'http://server1:8080/solr/COLL/select?indent=on&q=*:*&wt=json&rows=0'
> > {
> >   "responseHeader":{
> >     "zkConnected":true,
> >     "status":0,
> >     "QTime":18,
> >     "params":{
> >       "q":"*:*",
> >       "indent":"on",
> >       "rows":"0",
> >       "wt":"json"}},
> >   "response":{"numFound":63094900,"start":0,"maxScore":1.0,"docs":[]
> >   }}
> > server67:/home/kumar # curl --location --request GET 'http://server1:8080/solr/COLL/select?indent=on&q=*:*&wt=json&rows=0'
> > {
> >   "responseHeader":{
> >     "zkConnected":true,
> >     "status":0,
> >     "QTime":600093,
> >     "params":{
> >       "q":"*:*",
> >       "indent":"on",
> >       "rows":"0",
> >       "wt":"json"}},
> >   "response":{"numFound":62221186,"start":0,"maxScore":1.0,"docs":[]
> >   }}
> >
> > On Sat, Aug 15, 2020 at 1:41 PM Dominique Bejean <dominique.bej...@eolya.fr> wrote:
> >
> > > Hi,
> > >
> > > How long does it take to display the Solr console?
> > > What about CPU and iowait in top?
> > >
> > > You should start by eliminating network issues between your Solr nodes
> > > by testing with netcat on the Solr port.
> > > http://deice.daug.net/netcat_speed.html
> > >
> > > Dominique
> > >
> > > On Fri, Aug 14, 2020 at 11:40 PM, Susheel Kumar <susheel2...@gmail.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > One of our Solr 6.6.2 DR clusters (the CDCR target), which doesn't even
> > > > have any live search load, seems to be taking 600000 ms many times for
> > > > the ping / health check calls. Has anyone seen this before, or does
> > > > anyone have a suggestion about what could be wrong? The collection has
> > > > 8 shards/3 replicas and 64GB memory, and the index seems to fit in
> > > > memory. Solr log entries are below.
> > > >
> > > > solr.log.26:2020-08-13 14:03:20.827 INFO (qtp1775120226-46486) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.S.Request [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} hits=62569458 status=0 QTime=600113
> > > > solr.log.26:2020-08-13 14:03:20.827 WARN (qtp1775120226-46486) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.SolrCore slow: [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} hits=62569458 status=0 QTime=600113
> > > > solr.log.26:2020-08-13 14:03:20.827 INFO (qtp1775120226-46486) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.S.Request [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} status=0 QTime=600113
> > > > solr.log.26:2020-08-13 14:03:20.827 WARN (qtp1775120226-46486) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.SolrCore slow: [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} status=0 QTime=600113
> > > > solr.log.38:2020-08-08 15:01:45.640 INFO (qtp1775120226-46254) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.S.Request [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} hits=62221186 status=0 QTime=600092
> > > > solr.log.38:2020-08-08 15:01:45.640 WARN (qtp1775120226-46254) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.SolrCore slow: [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} hits=62221186 status=0 QTime=600092
> > > > solr.log.38:2020-08-08 15:01:45.640 INFO (qtp1775120226-46254) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.S.Request [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} status=0 QTime=600092
> > > > solr.log.38:2020-08-08 15:01:45.640 WARN (qtp1775120226-46254) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.SolrCore slow: [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} status=0 QTime=600092
> > > > solr.log.39:2020-08-08 13:20:12.117 INFO (qtp1775120226-46254) [c:COLL s:shard1 r:core_node19 x:COLL_shard1_replica1] o.a.s.c.S.Request [COLL_shard1_replica1] webapp=/solr path=/admin/ping params={distrib=true&_stateVer_=COLL:3032&wt=javabin&version=2} hits=63094900 status=0 QTime=600095
> > > >
> > > > server1:/home/kumar # curl --location --request GET 'http://server1:8080/solr/COLL/admin/ping?distrib=true'
> > > > <?xml version="1.0" encoding="UTF-8"?>
> > > > <response>
> > > > <lst name="responseHeader"><bool name="zkConnected">true</bool><int name="status">0</int><int name="QTime">600095</int><lst name="params"><str name="q">{!lucene}*:*</str><str name="distrib">true</str><str name="df">wordTokens</str><str name="preferLocalShards">false</str><str name="rows">10</str><str name="echoParams">all</str></lst></lst><str name="status">OK</str>
> > > > </response>
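The solr.log entries quoted above could also be summarized per core to see whether the ~600000 ms requests are tied to particular shards. This is a minimal Python sketch, assuming the log lines keep the layout shown in this thread (timestamp, level, `x:<core>]`, `path=`, `QTime=`). The regular expression and the function names `slow_requests` and `count_by_core` are illustrative and not part of Solr. Because each slow request appears several times in the log (INFO and WARN, with and without `hits=`), the sketch deduplicates on timestamp, core, path and QTime.

```python
import re
from collections import Counter

# Matches the pieces of a request log line as they appear in this thread.
# A leading "solr.log.NN:" prefix from grep is tolerated because search() is used.
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+(?P<level>[A-Z]+)\s"
    r".*?\bx:(?P<core>[^\]\s]+)\]"
    r".*?\bpath=(?P<path>\S+)"
    r".*?\bQTime=(?P<qtime>\d+)"
)


def slow_requests(lines, threshold_ms=600_000):
    """Return sorted, de-duplicated (timestamp, core, path, qtime_ms) tuples
    for every line whose QTime is at or above threshold_ms."""
    seen = set()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a request log line in the expected layout
        qtime = int(match.group("qtime"))
        if qtime >= threshold_ms:
            seen.add(
                (match.group("ts"), match.group("core"), match.group("path"), qtime)
            )
    return sorted(seen)


def count_by_core(requests):
    """Count slow requests per core name."""
    return Counter(core for _, core, _, _ in requests)
```

Feeding it the lines of solr.log* from each node (for example via `open(path)`) and printing `count_by_core(slow_requests(lines))` would show whether the slow entries cluster on a few cores or are spread across all of them.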