Re: delay/stall processing reads

2014-01-07 Thread Thunder Stumpges
Apologies, I sent that prior email too soon. I see exceptions in the log during nodetool repair like these: ERROR [AntiEntropySessions:1] 2014-01-06 21:46:45,339 RepairSession.java (line 278) [repair #179e84f0-775f-11e3-9c6a-538c9374b226] session completed with the following error org.apache.cassand

Re: delay/stall processing reads

2014-01-07 Thread Thunder Stumpges
Thanks Lee, I put these in place on one of the three servers, and while doing so I noticed in the logs on the other servers that there were many errors from our weekly nodetool repair of this form: On Thu, Jan 2, 2014 at 9:34 PM, Lee Mighdoll wrote: > Well from what I see in system.log it does

Re: delay/stall processing reads

2014-01-02 Thread Lee Mighdoll
> > Well from what I see in system.log it does not appear that GC aligns with > this delay. > Though it does seem like quite a few GCs take place. Here is my system.log > around the time of the delay: > It does sound like a lot of CMS runs - you'd like most of your garbage to be collected in new s

Re: delay/stall processing reads

2014-01-02 Thread Thunder Stumpges
Thanks Rob, Well from what I see in system.log it does not appear that GC aligns with this delay. Though it does seem like quite a few GCs take place. Here is my system.log around the time of the delay: INFO [ScheduledTasks:1] 2014-01-02 12:30:22,164 GCInspector.java (line 116) GC for Concurrent
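A rough sketch of how one might check whether GC activity lines up with a stall, assuming system.log lines carry the prefix shown in the quoted entry above (level, thread, timestamp, `GCInspector.java (line N)`). The helper names `gc_events` and `gc_near` and the 5-second window are hypothetical, and the text after the prefix is kept verbatim rather than parsed, since its exact layout is not shown here.

```python
import re
from datetime import datetime, timedelta

# Matches only the prefix visible in the quoted log line; the remainder
# of the message is captured as-is because its format is an unknown here.
GC_LINE = re.compile(
    r"^\s*(?P<level>\w+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"GCInspector\.java \(line \d+\)\s+(?P<rest>.*)$"
)


def gc_events(lines):
    """Yield (timestamp, message) for every GCInspector line found."""
    for line in lines:
        m = GC_LINE.match(line)
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            yield ts, m.group("rest")


def gc_near(lines, stall_time, window_seconds=5):
    """Return GC events logged within window_seconds of stall_time."""
    window = timedelta(seconds=window_seconds)
    return [
        (ts, rest)
        for ts, rest in gc_events(lines)
        if abs(ts - stall_time) <= window
    ]
```

To use it, read system.log into a list of lines and pass the time the load client recorded the stall, for example `gc_near(open("system.log"), datetime(2014, 1, 2, 12, 30, 25))`. An empty result suggests no GC was logged in that window, which would match the observation that GC does not align with the delay.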

Re: delay/stall processing reads

2014-01-02 Thread Robert Coli
On Thu, Jan 2, 2014 at 2:24 PM, Thunder Stumpges wrote: > Excuse my ignorance, but where would I look for the GC info? What logs > contain this? I will start looking for log files and more clues in them. > system.log contains some basic info, you can enable extended gc info via options to the JV

Re: delay/stall processing reads

2014-01-02 Thread Thunder Stumpges
Thanks Rob, we are using Cassandra 2.0.2, CQL3, native protocol. tpstats is nearly all zeros from what I can tell. Even running a load of 100 rps I can only ever see 1 or 2 in the active or pending counters, never anything in the blocked counter. Even in the "blocked all time" column it is zero in all cases

Re: delay/stall processing reads

2014-01-02 Thread Robert Coli
(D'oh, missed your details in the PS.. :D) I don't know whether the .NET client uses thrift or native protocol.. Re 2.0.2 in production : https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ =Rob On Thu, Jan 2, 2014 at 2:13 PM, Robert Coli wrote: > On Thu, Jan 2, 2014 a

Re: delay/stall processing reads

2014-01-02 Thread Robert Coli
On Thu, Jan 2, 2014 at 2:05 PM, Thunder Stumpges wrote: > I am seeing a read operation delay in our small (3 node) cluster where I > am testing. The "normal" latency for these operations is < 2ms as recorded > by our load client. This holds easily beyond several hundred qps. However > there are t

delay/stall processing reads

2014-01-02 Thread Thunder Stumpges
Hi all, I am seeing a read operation delay in our small (3 node) cluster where I am testing. The "normal" latency for these operations is < 2ms as recorded by our load client. This holds easily beyond several hundred qps. However there are times when all incoming queries (on a node-by-node basis)