On Mon, May 4, 2020 at 5:43 PM Webster Homer <webster.ho...@milliporesigma.com> wrote:
>
> My company has several SolrCloud environments. In our most active cloud we
> are seeing outages that are related to GC pauses. We have about 10
> collections, of which 4 get a lot of traffic. The SolrCloud consists of 4
> nodes with 6 processors and an 11 GB heap (25 GB physical memory).
>
> I notice that the 4 nodes seem to do their garbage collection at almost the
> same time. That seems strange to me. I would expect them to be more staggered.
>
> This morning we had a GC pause that caused problems. During that time our
> application service was reporting "No live SolrServers available to handle
> this request".
>
> Between 3:55 and 3:56 AM all 4 nodes were having some amount of garbage
> collection pauses. For 2 of the nodes it was minor, for one it was 50%. For 3
> nodes it lasted until 3:57. However, the node with the worst impact didn't
> recover until 4 AM.
>
> How is it that all 4 nodes were in lock step doing GC? If they are all doing
> GC at the same time, it defeats the purpose of having redundant cloud servers.
> We switched from CMS to G1GC just this weekend.
>
> At this point in time we also saw that traffic to Solr was not well
> distributed. The application calls Solr using CloudSolrClient, which I thought
> did its own load balancing. We saw 10X more traffic going to one Solr node
> than all the others, then we saw it start hitting another node. All Solr
> queries come from our application.
>
> During this period of time I saw only 1 error message in the Solr log:
> ERROR (zkConnectionManagerCallback-8-thread-1) [ ] o.a.s.c.ZkController
> There was a problem finding the leader in
> zk:org.apache.solr.common.SolrException: Could not get leader props
>
> We are currently using Solr 7.7.2
> GC Tuning
> GC_TUNE="-XX:NewRatio=3 \
> -XX:SurvivorRatio=4 \
> -XX:TargetSurvivorRatio=90 \
> -XX:MaxTenuringThreshold=8 \
> -XX:+UseG1GC \
> -XX:MaxGCPauseMillis=250 \
> -XX:+ParallelRefProcEnabled"
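
One way to confirm whether the pauses really line up across nodes is to collect the pause start and end times from each node's GC log and intersect them. Below is a minimal Python sketch of that intersection step only. The node names and intervals are hypothetical, loosely modelled on the times in the message above, and extracting the intervals from the GC logs is not shown. It also assumes that the pauses listed for a single node do not overlap each other.

```python
from datetime import datetime


def all_nodes_paused(pauses_by_node):
    """Return the (start, end) windows during which every node was paused.

    pauses_by_node maps a node name to a list of (start, end) pairs.
    Assumption: intervals belonging to the same node do not overlap.
    """
    events = []
    for intervals in pauses_by_node.values():
        for start, end in intervals:
            events.append((start, 1))   # a pause begins
            events.append((end, -1))    # a pause ends
    # At equal timestamps, process ends (-1) before starts (+1) so that
    # intervals that merely touch are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))

    total = len(pauses_by_node)
    active = 0
    window_start = None
    windows = []
    for when, delta in events:
        active += delta
        if active == total and window_start is None:
            window_start = when
        elif active < total and window_start is not None:
            windows.append((window_start, when))
            window_start = None
    return windows


def t(hhmm):
    """Parse an HH:MM string; the date part is a placeholder and is ignored."""
    return datetime.strptime(hhmm, "%H:%M")


# Hypothetical intervals: three nodes paused 03:55-03:57, one until 04:00.
pauses = {
    "node1": [(t("03:55"), t("03:57"))],
    "node2": [(t("03:55"), t("03:57"))],
    "node3": [(t("03:55"), t("03:57"))],
    "node4": [(t("03:55"), t("04:00"))],
}

windows = all_nodes_paused(pauses)
for start, end in windows:
    print(f"all nodes paused from {start:%H:%M} to {end:%H:%M}")
```

If this reports non-empty windows for real log data, the nodes were paused simultaneously. An empty result would suggest the pauses were staggered and the outage has another cause.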