Markus,

What does iostat(1) tell you?

Cheers -- Rick

On July 19, 2017 5:35:32 AM EDT, Markus Jelsma <markus.jel...@openindex.io> wrote:
>Hello,
>
>Another peculiarity here, our six node (2 shards / 3 replicas) cluster
>is going crazy after a good part of the day has passed. It starts
>eating CPU for no good reason and its latency goes up. Grafana graphs
>show the problem really well.
>
>After restarting 2/6 nodes, there is also quite a distinction in the
>VisualVM monitor views, and the VisualVM CPU sampler reports (sorted on
>self time (CPU)). The busy nodes are deeply red in
>o.a.h.impl.io.AbstractSessionInputBuffer.fillBuffer (as usual), the
>restarted nodes are not.
>
>The real distinction between busy and calm nodes is that busy nodes all
>have o.a.l.codecs.perfield.PerFieldPostingsFormat$FieldsReader.terms()
>as second to fillBuffer(). What are they doing?! Why? The calm nodes
>don't show this at all. Busy nodes all have o.a.l.codec stuff on top,
>restarted nodes don't.
>
>So, actually, I don't have a clue! Any, any ideas?
>
>Thanks,
>Markus
>
>Each replica is underpowered but performing really well after restart
>(and JVM warmup), 4 CPUs, 900M heap, 8 GB RAM, maxDoc 2.8 million,
>index size 18 GB.
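
One reason disk activity is worth checking: going by the figures quoted above, the index is considerably larger than the memory left over for the OS page cache. The sketch below is only a back-of-the-envelope estimate. The function name and the os_overhead_gb parameter are made up for illustration, 900M is rounded to 0.9 GB, and JVM off-heap memory is ignored, so the real fraction is probably somewhat lower.

```python
# Rough estimate: how much of the index can the OS page cache hold?
# RAM, heap and index size come from the quoted message; everything else
# (function name, os_overhead_gb, ignoring JVM off-heap use) is an assumption.

def cache_fit_fraction(ram_gb, heap_gb, index_gb, os_overhead_gb=0.0):
    """Return the fraction of the index that could fit in the page cache."""
    free_for_cache = max(ram_gb - heap_gb - os_overhead_gb, 0.0)
    return min(free_for_cache / index_gb, 1.0)

# 8 GB RAM, ~0.9 GB heap, 18 GB index per replica
fraction = cache_fit_fraction(ram_gb=8.0, heap_gb=0.9, index_gb=18.0)
print(f"At most {fraction:.0%} of the index fits in the page cache")
```

If a large share of reads has to go to disk, iostat would be expected to show it as sustained device utilization and wait times on the index volume. If iostat looks quiet on the busy nodes, the time spent in terms() is more likely pure CPU work than I/O wait.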
-- Sorry for being brief. Alternate email is rickleir at yahoo dot com