On Wed, Oct 23, 2013 at 5:23 AM, java8964 java8964 wrote:
> We enabled the major repair on every node every 7 days.
>
This is almost certainly the cause of your many duplicates.
If you don't DELETE heavily, consider changing gc_grace_seconds to 34 days
and then doing a repair on the first of the month.
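For example (a sketch only; the keyspace and table names below are
placeholders, and 34 days = 2937600 seconds):

    # raise the tombstone grace period to 34 days on the affected table
    echo "ALTER TABLE my_keyspace.event_activities WITH gc_grace_seconds = 2937600;" | cqlsh

    # then repair each node once a month instead of weekly, e.g. via cron:
    0 3 1 * * /usr/bin/nodetool repair my_keyspace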
Date: Tue, 22 Oct 2013 17:52:24 -0700
Subject: Re: Questions related to the data in SSTable files
From: rc...@eventbrite.com
To: user@cassandra.apache.org
On Tue, Oct 22, 2013 at 5:17 PM, java8964 java8964 wrote:
> Any way I can verify how often the system is being "repaired"? I can ask
> another group who maintains the Cassandra cluster. But do you mean that
> even the failed writes will be stored in the SSTable files?
>
"repair" sessions are logged i
So the failed writes would be stored the same way as the regular good data,
first in the memtable and then in the SSTable files.
Yong
Date: Tue, 22 Oct 2013 14:50:07 -0700
Subject: Re: Questions related to the data in SSTable files
From: rc...@eventbrite.com
To: user@cassandra.apache.org
On Tue, Oct 22, 2013 at 2:29 PM, java8964 java8964 wrote:
> 1) In the data of the full snapshot, I see more than 10% duplicated data.
> What I mean by duplication is that there are event_activities rows with
> the same (entity_1_id, entity_2_id, entity_3_id, entity_4_id,
> created_on_timestamp, column_timestamp) values.
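One quick way to quantify this, assuming the rows have been exported to a
tab-separated file (events.tsv is a hypothetical name) with those six key
columns first:

    # count key tuples that appear more than once in the export
    cut -f1-6 events.tsv | sort | uniq -d | wc -l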
Hi, I have some questions related to the data in the SSTable files.
Our production environment has 36 boxes, so in theory 12 of them make up one
complete group of the data without replication.
Right now, I got all the SSTable files from 12 nodes of the cluster (based on
my understanding, these 12 nodes together hold one complete copy of the data).
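If you want to double-check which nodes hold the replicas of a given row,
nodetool can report that directly (the keyspace, table, and key below are
placeholders, not names from this thread):

    # list the replica endpoints that own one row key
    nodetool getendpoints my_keyspace event_activities some_row_key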