Re: Repair Process Taking too long

2012-05-22 Thread aaron morton
It repairs the ranges they have in common. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 20/05/2012, at 4:05 PM, Raj N wrote: > Can I infer from this that if I have 3 replicas, then running repair without > -pr on 1 node will repair th

Re: Repair Process Taking too long

2012-05-19 Thread Raj N
Can I infer from this that if I have 3 replicas, then running repair without -pr on 1 node will repair the other 2 replicas as well? -Raj On Sat, Apr 14, 2012 at 2:54 AM, Zhu Han wrote: > > On Sat, Apr 14, 2012 at 1:57 PM, Igor wrote: > >> Hi! >> >> What is the difference between 'repair' and

Re: Repair Process Taking too long

2012-04-13 Thread Zhu Han
On Sat, Apr 14, 2012 at 1:57 PM, Igor wrote: > Hi! > > What is the difference between 'repair' and '-pr repair'? Does simple repair > touch all token ranges (for all nodes) and -pr touch only the range for which > the given node is responsible? > > -pr only touches the primary range of the node. If you execute
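
A minimal sketch of the distinction discussed above, assuming a simple ring where the range owned by node i is also replicated on the next RF-1 nodes clockwise (SimpleStrategy-style placement is an assumption here, as are the function names, which are hypothetical and not part of Cassandra or nodetool). It lists which primary ranges a repair started on one node would cover with and without -pr.

```python
def ranges_stored(node, num_nodes, rf):
    """Return the owners of every range that `node` holds a replica of.

    Assumption: the range owned by node i is replicated on the next
    rf - 1 nodes clockwise, so `node` stores its own range plus the
    ranges of the rf - 1 nodes before it on the ring.
    """
    return [(node - k) % num_nodes for k in range(rf)]


def ranges_repaired(node, num_nodes, rf, primary_only):
    """Ranges covered by a repair started on `node`.

    primary_only=True models 'repair -pr' (only the node's own range);
    primary_only=False models a plain repair (every range it replicates).
    """
    if primary_only:
        return [node]
    return ranges_stored(node, num_nodes, rf)


def total_range_repairs(num_nodes, rf, primary_only):
    """Number of range repairs when every node in the ring runs repair once."""
    return sum(
        len(ranges_repaired(n, num_nodes, rf, primary_only))
        for n in range(num_nodes)
    )
```

With the 12 nodes and RF of 3 mentioned elsewhere in this thread, running a plain repair on every node covers each range three times, while -pr on every node covers each range once. Under these assumptions that is roughly a threefold difference in work, which is in the same ballpark as the reported drop from 30 hours to 9 hours.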

Re: Repair Process Taking too long

2012-04-13 Thread Igor
Hi! What is the difference between 'repair' and '-pr repair'? Does simple repair touch all token ranges (for all nodes) and -pr touch only the range for which the given node is responsible? On 04/12/2012 05:59 PM, Sylvain Lebresne wrote: On Thu, Apr 12, 2012 at 4:06 PM, Frank Ng wrote: I also noticed tha

Re: Repair Process Taking too long

2012-04-12 Thread Frank Ng
Thanks for the clarification. I'm running repairs as in case 2 (to avoid deleted data coming back). On Thu, Apr 12, 2012 at 10:59 AM, Sylvain Lebresne wrote: > On Thu, Apr 12, 2012 at 4:06 PM, Frank Ng wrote: > > I also noticed that if I use the -pr option, the repair process went down > > from

Re: Repair Process Taking too long

2012-04-12 Thread Sylvain Lebresne
On Thu, Apr 12, 2012 at 4:06 PM, Frank Ng wrote: > I also noticed that if I use the -pr option, the repair process went down > from 30 hours to 9 hours. Is the -pr option safe to use if I want to run > repair processes in parallel on nodes that are not replication peers? There are pretty much two

Re: Repair Process Taking too long

2012-04-12 Thread Frank Ng
I also noticed that if I use the -pr option, the repair process went down from 30 hours to 9 hours. Is the -pr option safe to use if I want to run repair processes in parallel on nodes that are not replication peers? thanks On Thu, Apr 12, 2012 at 12:06 AM, Frank Ng wrote: > Thank you for conf

Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
Thank you for confirming that the per node data size is most likely causing the long repair process. I have tried a repair on smaller column families and it was significantly faster. On Wed, Apr 11, 2012 at 9:55 PM, aaron morton wrote: > If you have 1TB of data it will take a long time to repair

Re: Repair Process Taking too long

2012-04-11 Thread aaron morton
If you have 1TB of data it will take a long time to repair. Every bit of data has to be read and a hash generated. This is one of the reasons we often suggest that around 300 to 400GB per node is a good load in the general case. Look at nodetool compactionstats. Is there a validation compaction
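
As a rough back-of-the-envelope illustration of why per-node data size dominates, the sketch below estimates how long it takes just to read a node's data once at a fixed rate. The function name and the 16 MB/s figure used in the note after the code are hypothetical values chosen for illustration, not measurements from this cluster or confirmed Cassandra settings.

```python
def estimate_read_hours(data_bytes, throughput_mb_per_sec):
    """Hours needed to read `data_bytes` once at a steady throughput.

    This only models the sequential read that hashing requires; it
    ignores streaming, CPU cost, and competing compactions.
    """
    bytes_per_sec = throughput_mb_per_sec * 1024 * 1024
    return data_bytes / bytes_per_sec / 3600


ONE_TB = 1024 ** 4          # 1 TB, as in the load described in the thread
GB_350 = 350 * 1024 ** 3    # midpoint of the suggested 300 to 400GB range
```

Calling `estimate_read_hours(ONE_TB, 16)` gives a little over 18 hours for a single pass at an assumed 16 MB/s, while `estimate_read_hours(GB_350, 16)` gives a little over 6 hours. The absolute numbers depend entirely on the assumed rate; the point is that the time scales linearly with the amount of data per node.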

Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
Can you expand further on your issue? Were you using RandomPartitioner? thanks On Tue, Apr 10, 2012 at 5:35 PM, David Leimbach wrote: > I had this happen when I had really poorly generated tokens for the ring. > Cassandra seems to accept numbers that are too big. You get hot spots > when you

Re: Repair Process Taking too long

2012-04-10 Thread David Leimbach
I had this happen when I had really poorly generated tokens for the ring. Cassandra seems to accept numbers that are too big. You get hot spots when you think you should be balanced and repair never ends (I think there is a 48 hour timeout). On Tuesday, April 10, 2012, Frank Ng wrote: > I am no
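
For reference, evenly spaced initial tokens can be sketched as follows, assuming RandomPartitioner (which the thread does not confirm) and its token space of 0 to 2**127. The helper names are hypothetical; this is an illustration of the usual i * 2**127 / N spacing, not a tool shipped with Cassandra.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def balanced_tokens(num_nodes):
    """Evenly spaced tokens for `num_nodes` nodes: i * 2**127 // N."""
    return [i * RING_SIZE // num_nodes for i in range(num_nodes)]


def out_of_range(tokens):
    """Return the tokens that fall outside the assumed 0 .. 2**127 - 1 space."""
    return [t for t in tokens if not 0 <= t < RING_SIZE]
```

`balanced_tokens(12)` yields twelve tokens spaced 2**127 // 12 apart, and `out_of_range` flags any hand-entered value that is too big for the assumed ring, the kind of mistake described above.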

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I am not using size-tiered compaction. On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone wrote: > Data size, number of nodes, RF? > > Are you using size-tiered compaction on any of the column families that > hold a lot of your data? > > Do your cassandra logs say you are streaming a lot of ranges

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I have 12 nodes with approximately 1TB load per node. The RF is 3. I am considering moving to ext4. I checked the ranges and the numbers go from 1 to the 9000s. On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone wrote: > Data size, number of nodes, RF? > > Are you using size-tiered compaction

Re: Repair Process Taking too long

2012-04-10 Thread Igor
also - JVM heap size, and anything related to memory pressure On 04/10/2012 07:56 PM, Jonathan Rhone wrote: Data size, number of nodes, RF? Are you using size-tiered compaction on any of the column families that hold a lot of your data? Do your cassandra logs say you are streaming a lot of r

Re: Repair Process Taking too long

2012-04-10 Thread Jonathan Rhone
Data size, number of nodes, RF? Are you using size-tiered compaction on any of the column families that hold a lot of your data? Do your cassandra logs say you are streaming a lot of ranges? zgrep -E "(Performing streaming repair|out of sync)" On Tue, Apr 10, 2012 at 9:45 AM, Igor wrote: > O

Re: Repair Process Taking too long

2012-04-10 Thread Igor
On 04/10/2012 07:16 PM, Frank Ng wrote: Short answer - yes. But you are asking the wrong question. I think both processes are taking a while. When it starts up, netstats and compactionstats show nothing. Anyone out there successfully using ext3 and their repair processes are faster than this?

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I think both processes are taking a while. When it starts up, netstats and compactionstats show nothing. Anyone out there successfully using ext3 and their repair processes are faster than this? On Tue, Apr 10, 2012 at 10:42 AM, Igor wrote: > Hi > > You can check with nodetool which part of r

Re: Repair Process Taking too long

2012-04-10 Thread Igor
Hi. You can check with nodetool which part of the repair process is slow: network streams or validation compactions. Use nodetool netstats or compactionstats. On 04/10/2012 05:16 PM, Frank Ng wrote: Hello, I am on Cassandra 1.0.7. My repair processes are taking over 30 hours to complete. Is it

Repair Process Taking too long

2012-04-10 Thread Frank Ng
Hello, I am on Cassandra 1.0.7. My repair processes are taking over 30 hours to complete. Is it normal for the repair process to take this long? I wonder if it's because I am using the ext3 file system. thanks