Re: Repair completes successfully but data is still inconsistent

2014-12-01 Thread Robert Coli
On Thu, Nov 27, 2014 at 2:38 AM, André Cruz wrote:
> On 26 Nov 2014, at 19:07, Robert Coli wrote:
> > Yes. Do you know if 5748 was created as a result of compaction or via a flush from a memtable?
>
> It was the result of a compaction:
Ok, so in theory if you had the input SSTables to t

Re: Repair completes successfully but data is still inconsistent

2014-11-27 Thread André Cruz
On 26 Nov 2014, at 19:07, Robert Coli wrote:
> Yes. Do you know if 5748 was created as a result of compaction or via a flush from a memtable?
It was the result of a compaction:
INFO [CompactionExecutor:22422] 2014-11-13 13:08:41,926 CompactionTask.java (line 262) Compacted 2 sstables to

Re: Repair completes successfully but data is still inconsistent

2014-11-26 Thread Robert Coli
On Wed, Nov 26, 2014 at 10:17 AM, André Cruz wrote:
> Of these, the row in question was present on:
> Disco-NamespaceFile2-ic-5337-Data.db - tombstone column
> Disco-NamespaceFile2-ic-5719-Data.db - no trace of that column
> Disco-NamespaceFile2-ic-5748-Data.db - live column with original timesta
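The three SSTable states listed above can be turned into a small model of how row fragments are reconciled. This is only a sketch under stated assumptions, not Cassandra's code: the `Cell` type, the `reconcile` helper and the timestamps 100 and 200 are all hypothetical, and the only rule modelled is "highest write timestamp wins, a tombstone wins a tie". It shows why a live column in one SSTable stays hidden while a newer tombstone exists in another, and becomes visible again once that tombstone is no longer part of the merge.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical, simplified view of one column fragment in one SSTable."""
    timestamp: int           # write timestamp (illustrative values only)
    value: Optional[str]     # None for a tombstone
    is_tombstone: bool = False


def reconcile(fragments: Iterable[Optional[Cell]]) -> Optional[Cell]:
    """Pick the winning fragment: highest timestamp, tombstone wins a tie."""
    present = [c for c in fragments if c is not None]
    if not present:
        return None
    return max(present, key=lambda c: (c.timestamp, c.is_tombstone))


# Fragments modelled on the three SSTables in the message (timestamps invented).
tombstone_5337 = Cell(timestamp=200, value=None, is_tombstone=True)
absent_5719 = None  # no trace of the column in this SSTable
live_5748 = Cell(timestamp=100, value="original", is_tombstone=False)

# While the tombstone takes part in the merge, the column reads as deleted.
with_tombstone = reconcile([tombstone_5337, absent_5719, live_5748])

# If the tombstone is dropped but the older live column is not, it reappears.
without_tombstone = reconcile([absent_5719, live_5748])

print(with_tombstone.is_tombstone, without_tombstone.value)
```

In this model the node that still returns the column behaves like the second call: an old live fragment is being merged without the tombstone that used to shadow it.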

Re: Repair completes successfully but data is still inconsistent

2014-11-26 Thread André Cruz
On 24 Nov 2014, at 18:54, Robert Coli wrote:
> But for any given value on any given node, you can verify the value it has in 100% of SStables... that's what both the normal read path and repair should do when reconciling row fragments into the materialized row? Hard to understand a cas

Re: Repair completes successfully but data is still inconsistent

2014-11-24 Thread Robert Coli
On Mon, Nov 24, 2014 at 10:39 AM, André Cruz wrote:
> This data does not use TTLs. What other reason could there be for a mask?
> If I connect using cassandra-cli to that specific node, which becomes the coordinator, is it guaranteed to not ask another node when CL is ONE and it contains that

Re: Repair completes successfully but data is still inconsistent

2014-11-24 Thread André Cruz
On 21 Nov 2014, at 19:01, Robert Coli wrote:
> 2- Why won’t repair propagate this column value to the other nodes? Repairs have run everyday and the value is still missing on the other nodes.
>
> No idea. Are you sure it's not expired via TTL or masked in some other way? When you ask tha

Re: Repair completes successfully but data is still inconsistent

2014-11-21 Thread Robert Coli
On Fri, Nov 21, 2014 at 3:11 AM, André Cruz wrote:
> Can it be that they were all in the middle of a compaction (Leveled compaction) and the new sstables were written but the old ones were not deleted? Will Cassandra blindly pick up old and new sstables when it restarts?
Yes. https://is

Re: Repair completes successfully but data is still inconsistent

2014-11-21 Thread André Cruz
On 19 Nov 2014, at 19:53, Robert Coli wrote:
> My hunch is that you originally triggered this by picking up some obsolete SSTables during the 1.2 era. Probably if you clean up the existing zombies you will not encounter them again, unless you encounter another "obsolete sstables marked

Re: Repair completes successfully but data is still inconsistent

2014-11-19 Thread Robert Coli
On Wed, Nov 19, 2014 at 5:18 AM, André Cruz wrote:
> Each node has 4-9 of these exceptions as it is going down after being drained. It seems Cassandra was trying to delete an sstable. Can this be related?
That seems plausible, though the versions of the files you indicate have the versions

Re: Repair completes successfully but data is still inconsistent

2014-11-19 Thread André Cruz
On 19 Nov 2014, at 11:37, André Cruz wrote:
> All the nodes were restarted on 21-23 October, for the upgrade (1.2.16 -> 1.2.19) I mentioned. The delete happened after. I should also point out that we were experiencing problems related to CASSANDRA-4206 and CASSANDRA-7808.
Another possibl

Re: Repair completes successfully but data is still inconsistent

2014-11-19 Thread André Cruz
On 19 Nov 2014, at 00:43, Robert Coli wrote:
> @OP : can you repro if you run a major compaction between the deletion and the tombstone collection?
This happened in production and, AFAIK, for the first time in a system that has been running for 2 years. We have upgraded the Cassandra versi

Re: Repair completes successfully but data is still inconsistent

2014-11-18 Thread Robert Coli
On Tue, Nov 18, 2014 at 12:46 PM, Michael Shuler wrote:
> `nodetool cleanup` also looks interesting as an option.
I don't understand why cleanup or scrub would help with a case where data is being un-tombstoned.
"
1 November - column is deleted - gc_grace_period is 10 days
8 November - all 3 r

Re: Repair completes successfully but data is still inconsistent

2014-11-18 Thread Michael Shuler
`nodetool cleanup` also looks interesting as an option. -- Michael

Re: Repair completes successfully but data is still inconsistent

2014-11-18 Thread André Cruz
On 18 Nov 2014, at 01:08, Michael Shuler wrote:
> André, does `nodetool gossipinfo` show all the nodes in schema agreement?
Yes:
$ nodetool -h XXX.XXX.XXX.XXX gossipinfo |grep -i schema
SCHEMA:8ef63726-c845-3565-9851-91c0074a9b5e
SCHEMA:8ef63726-c845-3565-9851-91c0074a9b5e
SCHEMA:8ef
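For anyone who wants to script the schema-agreement check above rather than eyeball the grep output, here is a minimal sketch. It assumes the text piped in contains one `SCHEMA:<uuid>` line per node, as in the output quoted above; the `schema_versions` helper and the three-line sample are illustrative and not part of nodetool.

```python
def schema_versions(gossipinfo_output: str) -> set:
    """Collect the distinct schema version IDs from SCHEMA:<uuid> lines."""
    versions = set()
    for line in gossipinfo_output.splitlines():
        line = line.strip()
        if line.upper().startswith("SCHEMA:"):
            versions.add(line.split(":", 1)[1].strip())
    return versions


# Illustrative sample: three nodes reporting the version shown in the message.
sample = """\
SCHEMA:8ef63726-c845-3565-9851-91c0074a9b5e
SCHEMA:8ef63726-c845-3565-9851-91c0074a9b5e
SCHEMA:8ef63726-c845-3565-9851-91c0074a9b5e
"""

versions = schema_versions(sample)
in_agreement = len(versions) == 1
print("schema agreement:", in_agreement)
```

More than one distinct value in the set would mean at least one node is on a different schema version.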

Re: Repair completes successfully but data is still inconsistent

2014-11-17 Thread Michael Shuler
On 11/17/2014 05:22 AM, André Cruz wrote: I have checked the logs of the 3 replicas for that period and nothing really jumps out. Still, repairs have been running daily, the log reports that the CF is synced, and as of this moment one of the replicas still returns the zombie column so they don’t

Re: Repair completes successfully but data is still inconsistent

2014-11-17 Thread André Cruz
On 14 Nov 2014, at 18:44, André Cruz wrote:
> On 14 Nov 2014, at 18:29, Michael Shuler wrote:
>> On 11/14/2014 12:12 PM, André Cruz wrote:
>>> Some extra info. I checked the backups and on the 8th of November, all 3 replicas had the tombstone of the deleted column. So:
>>>
>>> 1 Nove

Re: Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
On 14 Nov 2014, at 18:29, Michael Shuler wrote:
> On 11/14/2014 12:12 PM, André Cruz wrote:
>> Some extra info. I checked the backups and on the 8th of November, all 3 replicas had the tombstone of the deleted column. So:
>>
>> 1 November - column is deleted - gc_grace_period is 10 days
>>

Re: Repair completes successfully but data is still inconsistent

2014-11-14 Thread Michael Shuler
On 11/14/2014 12:12 PM, André Cruz wrote:
Some extra info. I checked the backups and on the 8th of November, all 3 replicas had the tombstone of the deleted column. So:
1 November - column is deleted - gc_grace_period is 10 days
8 November - all 3 replicas have tombstone
13/14 November - column

Re: Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
Some extra info. I checked the backups and on the 8th of November, all 3 replicas had the tombstone of the deleted column. So:
1 November - column is deleted - gc_grace_period is 10 days
8 November - all 3 replicas have tombstone
13/14 November - column/tombstone is gone on 2 replicas, 3rd replic
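The dates in this timeline can be checked with a few lines of arithmetic. This is a rough sketch that works at day granularity and assumes the year is 2014 (from the message date); Cassandra itself tracks the grace period in seconds, and the helper name `tombstone_purgeable_from` is made up for illustration.

```python
from datetime import date, timedelta


def tombstone_purgeable_from(deleted_on: date, gc_grace_days: int) -> date:
    """Earliest day on which compaction may drop the tombstone."""
    return deleted_on + timedelta(days=gc_grace_days)


deleted_on = date(2014, 11, 1)   # column deleted
gc_grace_days = 10               # grace period quoted in the message

purgeable = tombstone_purgeable_from(deleted_on, gc_grace_days)

backup_day = date(2014, 11, 8)   # all 3 replicas still had the tombstone
zombie_day = date(2014, 11, 13)  # column/tombstone gone on 2 replicas

print(purgeable)                 # 2014-11-11
print(backup_day < purgeable)    # True: tombstone still within grace period
print(zombie_day >= purgeable)   # True: tombstone was eligible for removal
```

So the tombstone seen in the 8 November backup was still inside the grace period, while the 13/14 November observations fall after the point where compaction was allowed to drop it.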

Repair completes successfully but data is still inconsistent

2014-11-14 Thread André Cruz
Hello. So, I have detected a data inconsistency between my nodes:
(Consistency level is ONE)
[disco@Disco] get NamespaceFile2['45fc8996-41bc-429b-a382-5da9294eb59c:/XXXDIRXXX']['XXXFILEXXX'];
Value was not found
Elapsed time: 48 msec(s).
[disco@Disco] get NamespaceFile2['45fc8996-41bc-429b-a382