Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+

2017-08-28 Thread Hannu Kröger
Hello,

The latest Cassandra version (3.11.0, and possibly 3.0.12+) has a race
condition that causes Cassandra to create broken sstables (the stats file in
the sstable, to be precise).

Bug described here:
https://issues.apache.org/jira/browse/CASSANDRA-13752

This change might be causing it (but not sure):
https://issues.apache.org/jira/browse/CASSANDRA-13038

Other related issues:
https://issues.apache.org/jira/browse/CASSANDRA-13718
https://issues.apache.org/jira/browse/CASSANDRA-13756

I would not recommend using 3.11.0 or upgrading to 3.0.12 or higher until
this is fixed.

Cheers,
Hannu


Re: CASSANDRA-9472 Reintroduce off heap memtables - patch to 3.0

2017-08-28 Thread Jay Zhuang
Hi Andrew,

Do you mind sharing the backport patch? We're very interested in that,
20-30% improvement sounds great to us.

Thanks,
Jay

On 7/27/17 11:52 PM, Andrew Whang wrote:
> Yes, seeing latency improvement after backporting 9472 to 3.0.13. We are
> measuring p99 latency, thus moving objects off heap improved gc stalls,
> which directly affects our read/write p99 latency.
> 
> On Thu, Jul 27, 2017 at 10:54 PM, Jeff Jirsa wrote:
> 
>> This is after you backported 9472 to 3.0?
>>
>> --
>> Jeff Jirsa
>>
>>
>>> On Jul 27, 2017, at 10:33 PM, Andrew Whang wrote:
>>>
>>> Jay,
>>>
>>> We see ~20% write latency improvement on 3.0.13 in a write-heavy workload,
>>> using offheap_objects. offheap_buffers only offered minimal improvement.
>>>
>>> On Thu, Jul 27, 2017 at 10:06 PM, Jay Zhuang wrote:
>>>
>>>> Hi Andrew,
>>>>
>>>> Do you see performance gain from reintroducing off-heap memtables for
>>>> 3.0.x? When we were on 2.2.x we saw big improvements from enabling
>>>> off-heap memtables.
>>>>
>>>> Thanks,
>>>> Jay
>>>>
>>>>> On 7/27/17 9:37 PM, Andrew Whang wrote:
>>>>> I'm wondering if anyone has been able to patch CASSANDRA-9472 to 3.0,
>>>>> without breaking unit tests. The patch was introduced in 3.4, but 3.0.x
>>>>> contains unit tests and code from later 3.x releases, which makes debugging
>>>>> unit test failures difficult - i.e. SSTableCorruptionDetectionTest, which
>>>>> was introduced in 3.7 and is found in 3.0.14, but not in 3.4.
>>>>>

 -
 To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
 For additional commands, e-mail: dev-h...@cassandra.apache.org





Re: Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+

2017-08-28 Thread Jay Zhuang
We've been using 3.0.12+ for a few months and haven't seen an issue like
that. Do we know what could trigger the problem? Or is 3.0.x really
impacted?

Thanks,
Jay

On 8/28/17 6:02 AM, Hannu Kröger wrote:
> Hello,
> 
> Current latest Cassandra version (3.11.0, possibly also 3.0.12+) has a race
> condition that causes Cassandra to create broken sstables (stats file in
> sstables to be precise).
> 
> Bug described here:
> https://issues.apache.org/jira/browse/CASSANDRA-13752
> 
> This change might be causing it (but not sure):
> https://issues.apache.org/jira/browse/CASSANDRA-13038
> 
> Other related issues:
> https://issues.apache.org/jira/browse/CASSANDRA-13718
> https://issues.apache.org/jira/browse/CASSANDRA-13756
> 
> I would not recommend using 3.11.0 nor upgrading to 3.0.12 or higher before
> this is fixed.
> 
> Cheers,
> Hannu
> 




Re: Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+

2017-08-28 Thread Jeff Jirsa
For what it's worth, I don't think this impacts 3.0 without adding some other 
code change (the reporter of the bug on 3.0 had added custom metrics that 
exposed a concurrency issue).

We're looking at it on 3.11. I think 13038 made it far more likely to occur, 
but I think it could have happened pre-13038 as well (would take some serious 
luck with your deletion time distribution though - the rounding in 13038 does 
make it more likely, but the race was already there). 

-- 
Jeff Jirsa


> On Aug 28, 2017, at 8:24 PM, Jay Zhuang wrote:
> 
> We're using 3.0.12+ for a few months and haven't seen the issue like
> that. Do we know what could trigger the problem? Or is 3.0.x really
> impacted?
> 
> Thanks,
> Jay
> 
>> On 8/28/17 6:02 AM, Hannu Kröger wrote:
>> Hello,
>> 
>> Current latest Cassandra version (3.11.0, possibly also 3.0.12+) has a race
>> condition that causes Cassandra to create broken sstables (stats file in
>> sstables to be precise).
>> 
>> Bug described here:
>> https://issues.apache.org/jira/browse/CASSANDRA-13752
>> 
>> This change might be causing it (but not sure):
>> https://issues.apache.org/jira/browse/CASSANDRA-13038
>> 
>> Other related issues:
>> https://issues.apache.org/jira/browse/CASSANDRA-13718
>> https://issues.apache.org/jira/browse/CASSANDRA-13756
>> 
>> I would not recommend using 3.11.0 nor upgrading to 3.0.12 or higher before
>> this is fixed.
>> 
>> Cheers,
>> Hannu
>> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 




Re: Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+

2017-08-28 Thread Jeff Jirsa
I shouldn't actually say I don't think it can happen on 3.0 - I haven't seen 
this happen on 3.0 without some other code change to enable it, but like I 
said, we're still investigating. 

-- 
Jeff Jirsa


> On Aug 28, 2017, at 8:30 PM, Jeff Jirsa wrote:
> 
> For what it's worth, I don't think this impacts 3.0 without adding some other 
> code change (the reporter of the bug on 3.0 had added custom metrics that 
> exposed a concurrency issue).
> 
> We're looking at it on 3.11. I think 13038 made it far more likely to occur, 
> but I think it could have happened pre-13038 as well (would take some serious 
> luck with your deletion time distribution though - the rounding in 13038 does 
> make it more likely, but the race was already there). 
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Aug 28, 2017, at 8:24 PM, Jay Zhuang wrote:
>> 
>> We're using 3.0.12+ for a few months and haven't seen the issue like
>> that. Do we know what could trigger the problem? Or is 3.0.x really
>> impacted?
>> 
>> Thanks,
>> Jay
>> 
>>> On 8/28/17 6:02 AM, Hannu Kröger wrote:
>>> Hello,
>>> 
>>> Current latest Cassandra version (3.11.0, possibly also 3.0.12+) has a race
>>> condition that causes Cassandra to create broken sstables (stats file in
>>> sstables to be precise).
>>> 
>>> Bug described here:
>>> https://issues.apache.org/jira/browse/CASSANDRA-13752
>>> 
>>> This change might be causing it (but not sure):
>>> https://issues.apache.org/jira/browse/CASSANDRA-13038
>>> 
>>> Other related issues:
>>> https://issues.apache.org/jira/browse/CASSANDRA-13718
>>> https://issues.apache.org/jira/browse/CASSANDRA-13756
>>> 
>>> I would not recommend using 3.11.0 nor upgrading to 3.0.12 or higher before
>>> this is fixed.
>>> 
>>> Cheers,
>>> Hannu
>>> 
>> 
>> 




Re: CASSANDRA-9472 Reintroduce off heap memtables - patch to 3.0

2017-08-28 Thread Andrew Whang
Hi Jay,

Here's the backport to 3.0.14 -
https://github.com/whangsf/cassandra/commit/8db2e3ed412e42fed1da2d85ee7d086edcc8ae4c.
This should pass all unit tests, but please let me know if you have any
issues.

Thanks,
Andrew

On Mon, Aug 28, 2017 at 7:35 PM, Jay Zhuang wrote:

> Hi Andrew,
>
> Do you mind sharing the backport patch? We're very interested in that,
> 20-30% improvement sounds great to us.
>
> Thanks,
> Jay
>
> On 7/27/17 11:52 PM, Andrew Whang wrote:
> > Yes, seeing latency improvement after backporting 9472 to 3.0.13. We are
> > measuring p99 latency, thus moving objects off heap improved gc stalls,
> > which directly affects our read/write p99 latency.
> >
> > On Thu, Jul 27, 2017 at 10:54 PM, Jeff Jirsa wrote:
> >
> >> This is after you backported 9472 to 3.0?
> >>
> >> --
> >> Jeff Jirsa
> >>
> >>
> >>> On Jul 27, 2017, at 10:33 PM, Andrew Whang wrote:
> >>>
> >>> Jay,
> >>>
> >>> We see ~20% write latency improvement on 3.0.13 in a write-heavy workload,
> >>> using offheap_objects. offheap_buffers only offered minimal improvement.
> >>>
> >>> On Thu, Jul 27, 2017 at 10:06 PM, Jay Zhuang wrote:
> >>>
> >>>> Hi Andrew,
> >>>>
> >>>> Do you see performance gain from reintroducing off-heap memtables for
> >>>> 3.0.x? When we were on 2.2.x we saw big improvements from enabling
> >>>> off-heap memtables.
> >>>>
> >>>> Thanks,
> >>>> Jay
> >>>>
> >>>>> On 7/27/17 9:37 PM, Andrew Whang wrote:
> >>>>> I'm wondering if anyone has been able to patch CASSANDRA-9472 to 3.0,
> >>>>> without breaking unit tests. The patch was introduced in 3.4, but 3.0.x
> >>>>> contains unit tests and code from later 3.x releases, which makes debugging
> >>>>> unit test failures difficult - i.e. SSTableCorruptionDetectionTest, which
> >>>>> was introduced in 3.7 and is found in 3.0.14, but not in 3.4.
> >>>>>