Thanks guys! Jeff Jirsa helped me take a look, and I found a 10-second young GC
pause in the GC log:
3071128K->282000K(3495296K), 0.1144648 secs]
25943529K->23186623K(66409856K), 9.8971781 secs] [Times: user=2.33 sys=0.00, real=9.89 secs]
I'm trying to get a histogram or heap dump.
Thanks!
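For anyone who wants to scan a GC log for pauses like the one above, here is a minimal sketch. It assumes the log contains HotSpot-style "[Times: ... real=N secs]" entries like the excerpt quoted here; the function name and the 1-second threshold are made up for illustration.

```python
import re

# Matches the wall-clock ("real") time inside a "[Times: user=... sys=..., real=N secs]" entry.
REAL_TIME = re.compile(r"real=(\d+(?:\.\d+)?) secs")


def long_pauses(log_text, threshold_secs=1.0):
    """Return every 'real' pause in log_text that lasted at least threshold_secs."""
    pauses = (float(m.group(1)) for m in REAL_TIME.finditer(log_text))
    return [p for p in pauses if p >= threshold_secs]


# The log line quoted in this message, joined onto one line.
sample = (
    "25943529K->23186623K(66409856K), 9.8971781 secs] "
    "[Times: user=2.33 sys=0.00, real=9.89 secs]"
)
print(long_pauses(sample))
```

Correlating the timestamps of the pauses it finds with the times of the dropped messages would show whether GC is the cause.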
On Mo
The lion's share of your drops are from cross-node timeouts, which require
clock synchronization, so check that first. If your clocks are synced,
that means not only are you showing eager dropping based on time, but
despite the eager dropping you are still facing overload.
That local, non-gc paus
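To illustrate why cross-node timeouts depend on clock synchronization, here is a simplified model. It is not Cassandra's actual code: the function name, the 2000 ms timeout, and the 3000 ms skew are all hypothetical. The receiver compares the sender's timestamp against its own clock, so any clock offset between the two nodes is added to the apparent age of the message.

```python
def should_drop(sent_at_ms, received_at_ms, timeout_ms):
    """Drop a message whose apparent age exceeds the timeout.

    sent_at_ms comes from the sender's clock, received_at_ms from the
    receiver's clock, so skew between the two distorts the apparent age.
    """
    return received_at_ms - sent_at_ms > timeout_ms


TIMEOUT_MS = 2000        # hypothetical timeout
TRUE_AGE_MS = 500        # the message really spent 500 ms in flight
SKEW_MS = 3000           # hypothetical: receiver's clock runs 3 s ahead

# Clocks in sync: apparent age equals true age, message is processed.
print(should_drop(0, TRUE_AGE_MS, TIMEOUT_MS))

# Receiver's clock ahead: the same message looks 3.5 s old and is dropped.
print(should_drop(0, TRUE_AGE_MS + SKEW_MS, TIMEOUT_MS))
```

In this model a healthy message is dropped purely because of skew, which is why the clocks are worth checking before anything else.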
Dikang,
Did you take a look at the heap health on those nodes? A quick heap
histogram or dump would help you figure out whether it is related to a data
issue (wide rows or a bad data model) where a few nodes may be coming under
heap pressure and dropping messages.
Thanks,
Roopa
*Regards,*
*Roopa Tangirala*
Hi Dikang,
Do you have any GC logging or metrics you can correlate with the dropped
messages? A 13-second pause sounds like a bad GC pause.
Thanks,
Blake
On January 22, 2017 at 10:37:22 PM, Dikang Gu (dikan...@gmail.com) wrote:
Btw, the C* version is 2.2.5, with several backported patches.
On Sun, Jan 22, 2017 at 10:36 PM, Dikang Gu wrote:
> Hello there,
>
> We have a cluster of roughly 100 nodes, and I find that there are dropped
> messages on random nodes in the cluster, which cause error spikes and P99 latency
> spikes as wel
Complete information, including everything in tpstats, is available
for your monitoring systems via JMX. For production clusters, it is
essential you at least collect the JMX stats, if not alarm on various
problems (such as backed up stages).
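As a rough sketch of the kind of check a monitoring system could run for backed-up stages, the snippet below scans tpstats-style text for pools with pending tasks. It assumes each pool row is a name followed by five integer columns (active, pending, completed, blocked, all time blocked), which may differ between versions. The function name is hypothetical and the sample numbers are invented for illustration, not taken from a real cluster.

```python
def backed_up_stages(tpstats_text, max_pending=0):
    """Return {pool_name: pending} for pools whose pending count exceeds max_pending."""
    stages = {}
    for line in tpstats_text.splitlines():
        fields = line.split()
        # Assumed row layout: name, active, pending, completed, blocked, all-time blocked.
        # Header rows and other sections fail this test and are skipped.
        if len(fields) == 6 and all(f.isdigit() for f in fields[1:]):
            pending = int(fields[2])
            if pending > max_pending:
                stages[fields[0]] = pending
    return stages


# Invented sample in the assumed layout.
sample = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                    32      4096         123456         0                 0
ReadStage                         2         0          98765         0                 0
"""
print(backed_up_stages(sample))
```

The same counters are exposed over JMX, so a production setup would more likely read them there than parse command output.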
b
On Wed, Sep 22, 2010 at 6:47 AM, Carl Bruecken
wr
that's your cluster's way of telling you to set up monitoring
On Wed, Sep 22, 2010 at 8:47 AM, Carl Bruecken
wrote:
> On 9/22/10 9:37 AM, Jonathan Ellis wrote:
>>
>> it's easy to tell from tpstats which stage(s) are overloaded
>>
>> On Wed, Sep 22, 2010 at 8:29 AM, Carl Bruecken
>> wrote:
>>>
On 9/22/10 9:37 AM, Jonathan Ellis wrote:
it's easy to tell from tpstats which stage(s) are overloaded
On Wed, Sep 22, 2010 at 8:29 AM, Carl Bruecken
wrote:
With the current implementation, it's impossible to tell from the logs which
message types (verbs) were dropped. I read this was changed f
it's easy to tell from tpstats which stage(s) are overloaded
On Wed, Sep 22, 2010 at 8:29 AM, Carl Bruecken
wrote:
> With the current implementation, it's impossible to tell from the logs which
> message types (verbs) were dropped. I read this was changed for spamming,
> but I think the behavior shou