Not committing after the batch; we made sure that is turned off. autoCommit maxTime is set to 300000 ms (300 seconds), and openSearcher is set to true.
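
For readers following the thread, this is a sketch of how such a setting usually appears in solrconfig.xml. Only the maxTime and openSearcher values come from this message; the surrounding updateHandler element is assumed from the stock Solr configuration and is not copied from the poster's actual file.

```xml
<!-- Sketch only: the enclosing element is assumed from a stock solrconfig.xml -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- hard commit at most every 300000 ms (300 seconds, i.e. 5 minutes) -->
    <maxTime>300000</maxTime>
    <!-- open a new searcher on each hard commit so changes become visible -->
    <openSearcher>true</openSearcher>
  </autoCommit>
</updateHandler>
```

With openSearcher set to true, each of these hard commits opens and warms a new searcher, which is why the commit strategy matters for the questions raised below.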

On Sat, Nov 4, 2017 at 6:50 PM, Amrit Sarkar <sarkaramr...@gmail.com> wrote:

> Pretty much what Emir has stated. I want to know, when you saw;
>
> > all of this runs perfectly ok when indexing isn't happening. as soon as
> > we start "nrt" indexing one of the follower nodes goes down within 10 to 20
> > minutes.
>
> When you say "NRT" indexing, what is the commit strategy in indexing. With
> auto-commit so highly set, are you committing after batch, if yes, what's
> the number.
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>
> On Sat, Nov 4, 2017 at 2:47 PM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:
>
> > Hi Rick,
> > Do you see any errors in logs? Do you have any monitoring tool? Maybe you
> > can check heap and GC metrics around time when incident happened. It is not
> > large heap but some major GC could cause pause large enough to trigger some
> > snowball and end up with node in recovery state.
> > What is indexing rate you observe? Why do you have max warming searchers 5
> > (did you mean this with autowarmingsearchers?) when you commit every 5 min?
> > Why did you increase it - you seen errors with default 2? Maybe you commit
> > every bulk?
> > Do you see similar behaviour when you just do indexing without queries?
> >
> > Thanks,
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> > > On 4 Nov 2017, at 05:15, Rick Dig <teram...@gmail.com> wrote:
> > >
> > > hello all,
> > > we are trying to run solrcloud 6.6 in a production setting.
> > > here's our config and issue
> > > 1) 3 nodes, 1 shard, replication factor 3
> > > 2) all nodes are 16GB RAM, 4 core
> > > 3) Our production load is about 2000 requests per minute
> > > 4) index is fairly small, index size is around 400 MB with 300k documents
> > > 5) autocommit is currently set to 5 minutes (even though ideally we would
> > > like a smaller interval).
> > > 6) the jvm runs with 8 gb Xms and Xmx with CMS gc.
> > > 7) all of this runs perfectly ok when indexing isn't happening. as soon as
> > > we start "nrt" indexing one of the follower nodes goes down within 10 to 20
> > > minutes. from this point on the nodes never recover unless we stop
> > > indexing. the master usually is the last one to fall.
> > > 8) there are maybe 5 to 7 processes indexing at the same time with document
> > > batch sizes of 500.
> > > 9) maxRambuffersizeMB is 100, autowarmingsearchers is 5,
> > > 10) no cpu and / or oom issues that we can see.
> > > 11) cpu load does go fairly high 15 to 20 at times.
> > > any help or pointers appreciated
> > >
> > > thanks
> > > rick