Re: Intermittent long application pauses on nodes

2014-02-27 Thread Frank Ng
We have swap disabled. Can death by paging still happen? On Thu, Feb 27, 2014 at 11:32 AM, Benedict Elliott Smith < belliottsm...@datastax.com> wrote: > That sounds a lot like death by paging. > > > On 27 February 2014 16:29, Frank Ng wrote: > >> I just caught th

Re: Intermittent long application pauses on nodes

2014-02-27 Thread Frank Ng
ker Total (ms): Min: 163.8, Avg: 163.8, Max: 163.8, Diff: >>>>>> 0.0, Sum: 327.6] >>>>>> [GC Worker End (ms): Min: 222346382.1, Avg: 222346382.1, Max: >>>>>> 222346382.1, Diff: 0.0] >>>>>> [Code Root Fixup: 0.0 ms] >>
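The G1 worker lines quoted in this snippet follow a bracketed `[Name (ms): Min: ..., Avg: ..., Max: ..., Diff: ...]` shape, with mail quote markers (`>>>>>>`) interleaved by the archive. A minimal sketch of pulling those numbers out for comparison across pauses, assuming that shape holds for the rest of the log; the function name `parse_worker_stats` is hypothetical and not part of any Cassandra or JVM tooling:

```python
import re

# Matches the bracketed worker-stat shape seen in the quoted log, e.g.
# "[GC Worker End (ms): Min: 1.0, Avg: 1.0, Max: 1.0, Diff: 0.0]".
# Assumption: every stat of interest carries the "(ms):" unit marker.
STAT_RE = re.compile(r"\[(?P<name>[^\[\]:]+?) \(ms\):(?P<body>[^\[\]]*)\]")
FIELD_RE = re.compile(r"(\w+):\s*(-?\d+(?:\.\d+)?)")
QUOTE_RE = re.compile(r">+\s*")  # mail quote markers left in by the archive


def parse_worker_stats(text):
    """Return {stat name: {field: float}} for each complete bracketed entry."""
    cleaned = QUOTE_RE.sub("", text)
    stats = {}
    for match in STAT_RE.finditer(cleaned):
        fields = {key: float(val) for key, val in FIELD_RE.findall(match.group("body"))}
        stats[match.group("name").strip()] = fields
    return stats
```

Entries that lost their opening bracket to truncation (like the "ker Total" fragment above) and lines without the `(ms):` marker (like "Code Root Fixup") are skipped rather than guessed at.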

Re: Intermittent long application pauses on nodes

2014-02-14 Thread Frank Ng
Sorry, I have not had a chance to file a JIRA ticket. We have not been able to resolve the issue. But since Joel mentioned that upgrading to Cassandra 2.0.X solved it for them, we may need to upgrade. We are currently on Java 1.7 and Cassandra 1.2.8. On Thu, Feb 13, 2014 at 12:40 PM, Keith Wri

Re: Intermittent long application pauses on nodes

2014-02-03 Thread Frank Ng
+LogVMOutput > > I never figured out what kills stdout for C*. It's a library we depend on, > didn't try too hard to figure out which one. > > > On 29 January 2014 21:07, Frank Ng wrote: > >> Benedict, >> Thanks for the advice. I've tried turning on

Re: Intermittent long application pauses on nodes

2014-01-29 Thread Frank Ng
s; the count is the number of safepoints to aggregate into one log > message) > > > 52s is a very extreme pause, and I would be surprised if revoke bias could > cause this. I wonder if the VM is swapping out. > > > > On 29 January 2014 19:02, Frank Ng wrote: > >

Re: Intermittent long application pauses on nodes

2014-01-29 Thread Frank Ng
g the safepoint. On Wed, Jan 29, 2014 at 1:20 PM, Shao-Chuan Wang < shaochuan.w...@bloomreach.com> wrote: > We had similar latency spikes when pending compactions couldn't keep up or > repair/streaming was taking too many cycles. > > > On Wed, Jan 29, 2014 at 10:07 AM, Frank N

Intermittent long application pauses on nodes

2014-01-29 Thread Frank Ng
All, We've been having intermittent long application pauses (version 1.2.8) and are not sure if it's a Cassandra bug. During these pauses, there are dropped messages in the Cassandra log file, and the node sees other nodes as down. We've turned on GC logging and the following is an example o

Re: Fat Client Commit Log

2012-06-25 Thread Frank Ng
loper > @aaronmorton > http://www.thelastpickle.com > > On 23/06/2012, at 8:07 AM, Frank Ng wrote: > > Hi All, > > We are using the Fat Client and notice that there are files written to the > commit log directory on the Fat Client. Does anyone know what these files > are stori

Fat Client Commit Log

2012-06-22 Thread Frank Ng
Hi All, We are using the Fat Client and notice that there are files written to the commit log directory on the Fat Client. Does anyone know what these files are storing? Are these hinted handoff data? The Fat Client has no files in the data directory, as expected. thanks

Re: repair waiting for something

2012-04-26 Thread Frank Ng
I am having the same issue in 1.0.7 with leveled compaction. It seems that the repair is flaky. It either completes relatively fast in a TEST environment (7 minutes) or gets stuck trying to receive a merkle tree from a peer that is already sending it the merkle tree. Only solution is to restart c

Re: user Digest of: get.23021

2012-04-26 Thread Frank Ng
b.26.1335463902287; > Thu, 26 Apr 2012 11:11:42 -0700 (PDT) > Received: by 10.60.143.102 with HTTP; Thu, 26 Apr 2012 11:11:42 -0700 (PDT) > Date: Thu, 26 Apr 2012 14:11:42 -0400 > Message-ID: < > caal7ocavuw1rtaqwlddzbnzosv7-qxqfhot7w6uj8q08m03...@mail.gmail.c

Re: Repair Process Taking too long

2012-04-12 Thread Frank Ng
Thanks for the clarification. I'm running repairs as in case 2 (to avoid deleted data coming back). On Thu, Apr 12, 2012 at 10:59 AM, Sylvain Lebresne wrote: > On Thu, Apr 12, 2012 at 4:06 PM, Frank Ng wrote: > > I also noticed that if I use the -pr option, the repair pro

Re: Repair Process Taking too long

2012-04-12 Thread Frank Ng
I also noticed that if I use the -pr option, the repair process went down from 30 hours to 9 hours. Is the -pr option safe to use if I want to run repair processes in parallel on nodes that are not replication peers? thanks On Thu, Apr 12, 2012 at 12:06 AM, Frank Ng wrote: > Thank you

Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
l building the merkle hash tree. > > Look at nodetool netstats . Is it streaming data ? If so all hash trees > have been calculated. > > Cheers > > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > >

Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
when you think you should be balanced and repair never ends (I think there > is a 48 hour timeout). > > > On Tuesday, April 10, 2012, Frank Ng wrote: > >> I am not using tier-sized compaction. >> >> >> On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone wrote:

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
streaming a lot of ranges? > zgrep -E "(Performing streaming repair|out of sync)" > > > On Tue, Apr 10, 2012 at 9:45 AM, Igor wrote: > >> On 04/10/2012 07:16 PM, Frank Ng wrote: >> >> Short answer - yes. >> But you are asking wrong question. >> >

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
ompaction on any of the column families that > hold a lot of your data? > > Do your cassandra logs say you are streaming a lot of ranges? > zgrep -E "(Performing streaming repair|out of sync)" > > > On Tue, Apr 10, 2012 at 9:45 AM, Igor wrote: > >> On 04/10/
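The zgrep suggestion quoted above can also be done as a small script that tallies each phrase separately, which may help tell whether a slow repair is dominated by streaming. This is a rough sketch only: the two phrases are taken verbatim from the quoted command, while the helper names (`open_log`, `count_repair_lines`) and the choice to treat a `.gz` suffix as a compressed log are assumptions made for illustration.

```python
import gzip
import re
from pathlib import Path

# Same alternation as the quoted zgrep -E expression.
PATTERN = re.compile(r"Performing streaming repair|out of sync")


def open_log(path):
    """Open a log file as text, transparently handling .gz rotations."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")


def count_repair_lines(paths):
    """Count lines containing each phrase across the given log files."""
    counts = {"Performing streaming repair": 0, "out of sync": 0}
    for path in paths:
        with open_log(path) as handle:
            for line in handle:
                match = PATTERN.search(line)
                if match:
                    # A line is attributed to the first phrase found in it.
                    counts[match.group(0)] += 1
    return counts
```

Pass it the paths of the current and rotated Cassandra logs on the node running the repair; a high "out of sync" count relative to the number of ranges would suggest a lot of data is actually being streamed.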

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
h part of repair process is slow - > network streams or verify compactions. use nodetool netstats or > compactionstats. > > > On 04/10/2012 05:16 PM, Frank Ng wrote: > >> Hello, >> >> I am on Cassandra 1.0.7. My repair processes are taking over 30 hours to >>

Repair Process Taking too long

2012-04-10 Thread Frank Ng
Hello, I am on Cassandra 1.0.7. My repair processes are taking over 30 hours to complete. Is it normal for the repair process to take this long? I wonder if it's because I am using the ext3 file system. thanks