Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+
Hello,

Current latest Cassandra version (3.11.0, possibly also 3.0.12+) has a race
condition that causes Cassandra to create broken sstables (stats file in
sstables to be precise).

Bug described here:
https://issues.apache.org/jira/browse/CASSANDRA-13752

This change might be causing it (but not sure):
https://issues.apache.org/jira/browse/CASSANDRA-13038

Other related issues:
https://issues.apache.org/jira/browse/CASSANDRA-13718
https://issues.apache.org/jira/browse/CASSANDRA-13756

I would not recommend using 3.11.0 nor upgrading to 3.0.12 or higher before
this is fixed.

Cheers,
Hannu
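For anyone who wants to check a fleet inventory against the advisory above, here is a minimal sketch in Python. The version ranges come straight from the message (3.11.0+ and, possibly, 3.0.12+). The thread names no fixed release, so both ranges are left open-ended here, which is an assumption. The names `ADVISED_AGAINST`, `parse_version` and `is_advised_against` are hypothetical and not part of any Cassandra tooling.

```python
import re

# Ranges taken from the advisory above. No fixed release is named in the
# thread, so both ranges are left open-ended (an assumption of this sketch).
# Maps (major, minor) -> first patch release the advisory warns about.
ADVISED_AGAINST = {
    (3, 11): 0,   # 3.11.0 and later 3.11.x
    (3, 0): 12,   # 3.0.12 and later 3.0.x (described as "possibly" affected)
}


def parse_version(version):
    """Parse 'major.minor.patch' (any suffix such as '-SNAPSHOT' is ignored)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_advised_against(version):
    """Return True if the version falls in a range the advisory warns about."""
    major, minor, patch = parse_version(version)
    first_bad_patch = ADVISED_AGAINST.get((major, minor))
    return first_bad_patch is not None and patch >= first_bad_patch
```

With these assumptions, `is_advised_against("3.0.11")` evaluates to `False` and `is_advised_against("3.11.0")` to `True`. The table should be tightened once a fixed release is known.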
Re: CASSANDRA-9472 Reintroduce off heap memtables - patch to 3.0
Hi Andrew,

Do you mind sharing the backport patch? We're very interested in that,
20-30% improvement sounds great to us.

Thanks,
Jay

On 7/27/17 11:52 PM, Andrew Whang wrote:
> Yes, seeing latency improvement after backporting 9472 to 3.0.13. We are
> measuring p99 latency, thus moving objects off heap improved gc stalls,
> which directly affects our read/write p99 latency.
>
> On Thu, Jul 27, 2017 at 10:54 PM, Jeff Jirsa wrote:
>
>> This is after you backported 9472 to 3.0?
>>
>> --
>> Jeff Jirsa
>>
>>> On Jul 27, 2017, at 10:33 PM, Andrew Whang wrote:
>>>
>>> Jay,
>>>
>>> We see ~20% write latency improvement on 3.0.13 in a write-heavy workload,
>>> using offheap_objects. offheap_buffers only offered minimal improvement.
>>>
>>> On Thu, Jul 27, 2017 at 10:06 PM, Jay Zhuang wrote:
>>>
>>>> Hi Andrew,
>>>>
>>>> Do you see performance gain from reintroducing off-heap memtables for
>>>> 3.0.x? When we were on 2.2.x we saw big improvements from enabling
>>>> off-heap memtables.
>>>>
>>>> Thanks,
>>>> Jay
>>>>
>>>>> On 7/27/17 9:37 PM, Andrew Whang wrote:
>>>>> I'm wondering if anyone has been able to patch CASSANDRA-9472 to 3.0,
>>>>> without breaking unit tests. The patch was introduced in 3.4, but 3.0.x
>>>>> contains unit tests and code from later 3.x releases, which makes debugging
>>>>> unit test failures difficult - i.e. SSTableCorruptionDetectionTest, which
>>>>> was introduced in 3.7 and is found in 3.0.14, but not in 3.4.

To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org
Re: Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+
We've been using 3.0.12+ for a few months and haven't seen an issue like
that. Do we know what could trigger the problem? Or is 3.0.x really
impacted?

Thanks,
Jay

On 8/28/17 6:02 AM, Hannu Kröger wrote:
> I would not recommend using 3.11.0 nor upgrading to 3.0.12 or higher before
> this is fixed.
Re: Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+
For what it's worth, I don't think this impacts 3.0 without adding some other
code change (the reporter of the bug on 3.0 had added custom metrics that
exposed a concurrency issue).

We're looking at it on 3.11. I think 13038 made it far more likely to occur,
but I think it could have happened pre-13038 as well (would take some serious
luck with your deletion time distribution though - the rounding in 13038 does
make it more likely, but the race was already there).

--
Jeff Jirsa

> On Aug 28, 2017, at 8:24 PM, Jay Zhuang wrote:
>
> We're using 3.0.12+ for a few months and haven't seen the issue like
> that. Do we know what could trigger the problem? Or is 3.0.x really
> impacted?
Re: Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+
I shouldn't actually say I don't think it can happen on 3.0 - I haven't seen
this happen on 3.0 without some other code change to enable it, but like I
said, we're still investigating.

--
Jeff Jirsa

> On Aug 28, 2017, at 8:30 PM, Jeff Jirsa wrote:
>
> For what it's worth, I don't think this impacts 3.0 without adding some other
> code change (the reporter of the bug on 3.0 had added custom metrics that
> exposed a concurrency issue).
Re: CASSANDRA-9472 Reintroduce off heap memtables - patch to 3.0
Hi Jay,

Here's the backport to 3.0.14:
https://github.com/whangsf/cassandra/commit/8db2e3ed412e42fed1da2d85ee7d086edcc8ae4c

This should pass all unit tests, but please let me know if you have any
issues.

Thanks,
Andrew

On Mon, Aug 28, 2017 at 7:35 PM, Jay Zhuang wrote:
> Hi Andrew,
>
> Do you mind sharing the backport patch? We're very interested in that,
> 20-30% improvement sounds great to us.
>
> Thanks,
> Jay