RE: 6.6 cloud starting to eat CPU after 8+ hours

Markus Jelsma Wed, 19 Jul 2017 03:56:39 -0700

Hello,

Not too much actually:


avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.55    0.00    0.25    0.03    0.95   88.22

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               3.26        78.34       218.67  188942841  527408404

These are all SSDs.
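
As a rough cross-check of the figures above, here is a minimal sketch that parses this one report and pulls out the numbers that matter. It assumes the output came from a plain iostat invocation without an interval argument, in which case the first report is averaged since boot; the helper name parse_iostat is hypothetical and only for illustration. Dividing the cumulative kB_read by kB_read/s gives the length of the averaging window, which shows how much a spike in the last few hours could be smoothed out.

```python
# Sample copied verbatim from the iostat report above.
SAMPLE = """\
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.55    0.00    0.25    0.03    0.95   88.22

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               3.26        78.34       218.67  188942841  527408404
"""


def parse_iostat(text):
    """Hypothetical helper: parse one plain iostat report into (cpu, devices)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpu, devices = {}, {}
    i = 0
    while i < len(lines):
        fields = lines[i].split()
        if fields[0] == "avg-cpu:":
            # Column names are on this line, values on the next one.
            values = lines[i + 1].split()
            cpu = dict(zip(fields[1:], map(float, values)))
            i += 2
        elif fields[0] == "Device:":
            cols = fields[1:]
            i += 1
            # Every following line is one device row until the next CPU block.
            while i < len(lines) and not lines[i].startswith("avg-cpu:"):
                row = lines[i].split()
                devices[row[0]] = dict(zip(cols, map(float, row[1:])))
                i += 1
        else:
            i += 1
    return cpu, devices


cpu, devices = parse_iostat(SAMPLE)
sda = devices["sda"]

# Cumulative kB read divided by the average read rate gives the
# averaging window in seconds (assumption: counters are since boot).
window_days = sda["kB_read"] / sda["kB_read/s"] / 86400

print(f"iowait: {cpu['%iowait']}%  user: {cpu['%user']}%")
print(f"averaging window: about {window_days:.1f} days")
```

Under that assumption the window works out to roughly four weeks, so these averages say little about the busy hours specifically. A report taken at an interval while the nodes are busy would be more telling.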

Thanks,
Markus

-----Original message-----
> From:Rick Leir <[email protected]>
> Sent: Wednesday 19th July 2017 12:48
> To: [email protected]
> Subject: Re: 6.6 cloud starting to eat CPU after 8+ hours
> 
> Markus, 
> What does iostat(1) tell you? Cheers -- Rick
> 
> On July 19, 2017 5:35:32 AM EDT, Markus Jelsma <[email protected]> 
> wrote:
> >Hello,
> >
> >Another peculiarity here: our six-node cluster (2 shards / 3 replicas)
> >goes crazy after a good part of the day has passed. It starts
> >eating CPU for no good reason and its latency goes up. Grafana graphs
> >show the problem really well.
> >
> >After restarting 2/6 nodes, there is also quite a distinction in the
> >VisualVM monitor views, and the VisualVM CPU sampler reports (sorted on
> >self time (CPU)). The busy nodes are deeply red in
> >o.a.h.impl.io.AbstractSessionInputBuffer.fillBuffer (as usual), the
> >restarted nodes are not.
> >
> >The real distinction between busy and calm nodes is that busy nodes all
> >have o.a.l.codecs.perfield.PerFieldPostingsFormat$FieldsReader.terms()
> >as second to fillBuffer(). What are they doing?! Why? The calm nodes
> >don't show this at all. Busy nodes all have o.a.l.codecs stuff on top,
> >restarted nodes don't.
> >
> >So, actually, I don't have a clue! Any, any ideas?
> >
> >Thanks,
> >Markus
> >
> >Each replica is underpowered but performs really well after a restart
> >(and JVM warmup): 4 CPUs, 900M heap, 8 GB RAM, maxDoc 2.8 million,
> >index size 18 GB.
> 
> -- 
> Sorry for being brief. Alternate email is rickleir at yahoo dot com
