Added one nit to the PR. Otherwise, this is awesome :)

On Wed, Nov 15, 2023 at 11:01 AM Jordan West <jw...@apache.org> wrote:

> I would also like to back this proposal. We changed this default because
> several incidents occurred while running with the default of auto. There are
> rare cases where auto/mmap is the better option, but as a default,
> mmap_index_only is safer.
>
> On Wed, Nov 15, 2023 at 6:35 AM Paulo Motta <pa...@apache.org> wrote:
>
>> Hi,
>>
>> I would like to get back to this. I proposed this default configuration
>> change on the user list ~1 month ago and there were no comments [1].
>>
>> I created CASSANDRA-19021 [2] to make the proposed change, and Stefan
>> kindly submitted a patch. CI is looking good.
>>
>> Any objections to making this change in 5.0? If not, we will merge in 24
>> hours.
>>
>> Thanks,
>>
>> Paulo
>>
>> [1] - https://lists.apache.org/thread/w0gkdj7fhylycqwmd73p0kfck7jr8qth
>> [2] - https://issues.apache.org/jira/browse/CASSANDRA-19021
>>
>> On Wed, Sep 6, 2023 at 5:12 PM Paulo Motta <pauloricard...@gmail.com>
>> wrote:
>>
>>> > I wonder why disk_access_mode property is not in cassandra.yaml
>>> (looking into trunk right now)
>>>
>>> I think there's a prehistoric reason why it was removed but I can't
>>> remember right now.
>>>
>>> > Do you all think we can add it there with brief explanation what each
>>> option does?
>>>
>>> We could reinclude it as long as we provide a clear recommendation on
>>> when to change from the default since this is an advanced setting which
>>> should be rarely changed. But I still think we should provide a more
>>> stable/foolproof default (mmap_index_only) since the current default (mmap)
>>> is known to cause instability in some scenarios.
>>>
>>> Also, there is a technicality with changing the default: if we change the
>>> "auto" behavior from mmap to mmap_index_only, this may affect users relying
>>> on the default "mmap" behavior. I'm not sure of the best way to address
>>> that. Is a big NEWS note sufficient? Even though users are expected to read
>>> NEWS when upgrading, we know well that not all users read it.
>>>
>>> > Shall we also share this thread with @user?
>>>
>>> Thanks Ekaterina! If we decide to change the default we can run this
>>> through the user@ list to see what the user community thinks.
>>>
>>> On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova <
>>> e.dimitr...@gmail.com> wrote:
>>>
>>>> Thanks for starting this discussion, Paulo!
>>>>
>>>> Shall we also share this thread with @user?
>>>>
>>>> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas <sc...@paradoxica.net>
>>>> wrote:
>>>>
>>>>> Supportive of switching the default to mmap_index_only as well.
>>>>>
>>>>> I don’t have numbers handy to share, but my experience has been
>>>>> significantly lower read latency and I wouldn’t run with auto. I’ve also
>>>>> not observed substantial heap pressure after switching - it was strictly
>>>>> an improvement.
>>>>>
>>>>> - Scott
>>>>>
>>>>> —
>>>>> Mobile
>>>>>
>>>>> On Sep 6, 2023, at 8:50 AM, Paulo Motta <pauloricard...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I've been bitten by OOMs with disk_access_mode:auto/mmap that were
>>>>> fixed by changing to disk_access_mode:mmap_index_only. In a particular
>>>>> benchmark I got 5x more read throughput on 3.11.x with disk_access_mode:
>>>>> mmap_index_only vs disk_access_mode: auto/mmap.
>>>>>
>>>>> Changing disk_access_mode to mmap_index_only seems to be a common
>>>>> recommendation on forums[1][2][3][4] and slack (find by searching
>>>>> disk_access_mode in the #cassandra channel on
>>>>> https://the-asf.slack.com/).
>>>>>
>>>>> It's not clear to me when using the default
>>>>> disk_access_mode:auto/mmap is beneficial, perhaps only when the read set
>>>>> fits in memory? Mick seems to think, on CASSANDRA-15531 [5], that
>>>>> mmap_index_only has a higher heap cost and should only be used when
>>>>> warranted. However, it's not uncommon to see people being bitten by OOMs
>>>>> or lower read performance due to the default disk_access_mode, so it makes
>>>>> me think it's not the best foolproof default.
>>>>>
>>>>> Should we consider changing the default "auto" behavior of
>>>>> "disk_access_mode" to be "mmap_index_only" instead of "mmap" in 5.0, since
>>>>> it's likely safer and perhaps more performant?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Paulo
>>>>>
>>>>> [1]
>>>>> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
>>>>> [2] https://phabricator.wikimedia.org/T137419
>>>>> [3] https://stackoverflow.com/a/55975471
>>>>> [4]
>>>>> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
>>>>> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>>>>>
>>>>>
