> I wonder why disk_access_mode property is not in cassandra.yaml (looking
> into trunk right now)

I think there's a prehistoric reason why it was removed but I can't
remember right now.

> Do you all think we can add it there with brief explanation what each
option does?

We could re-include it as long as we provide a clear recommendation on when
to change from the default, since this is an advanced setting that should
rarely be changed. But I still think we should provide a more
stable/foolproof default (mmap_index_only) since the current default (mmap)
is known to cause instability in some scenarios.
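
For illustration, a rough sketch of what such an entry might look like in
cassandra.yaml. The comment wording is my own summary of how the options are
commonly described, not text from the project, so treat the descriptions as
assumptions to be checked against the code before anything like this ships:

```yaml
# Advanced setting: how SSTable files are read from disk.
# (Descriptions below are an assumed summary, not official documentation.)
#   auto            - current default; behaves as mmap on 64-bit JVMs
#   mmap            - memory-map both data and index files
#   mmap_index_only - memory-map index files only; read data files with buffered I/O
#   standard        - no memory-mapping; buffered I/O for all files
disk_access_mode: mmap_index_only
```

Setting the value explicitly like this would also shield a node from any
future change to what "auto" resolves to.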

Also, there is a technicality with changing the default: if we change the
"auto" behavior from mmap to mmap_index_only, this may affect users relying
on the default "mmap" behavior. I'm not sure of the best way to address that.
Is a big NEWS note sufficient? Even though users are expected to read NEWS
when upgrading, we know well that not all users read it.

> Shall we also share this thread with @user?

Thanks Ekaterina! If we decide to change the default we can run this
through the user@ list to see what the user community thinks.

On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova <e.dimitr...@gmail.com>
wrote:

> Thanks for starting this discussion, Paulo!
>
> Shall we also share this thread with @user?
>
> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas <sc...@paradoxica.net>
> wrote:
>
>> Supportive of switching the default to mmap_index_only as well.
>>
>> I don’t have numbers handy to share, but my experience has been
>> significantly lower read latency and I wouldn’t run with auto. I’ve also
>> not observed substantial heap pressure after switching - it was strictly an
>> improvement.
>>
>> - Scott
>>
>> —
>> Mobile
>>
>> On Sep 6, 2023, at 8:50 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
>>
>> 
>>
>> Hi,
>>
>> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
>> by changing to disk_access_mode:mmap_index_only. In a particular benchmark
>> I got 5x more read throughput on 3.11.x with disk_access_mode:
>> mmap_index_only vs disk_access_mode: auto/mmap.
>>
>> Changing disk_access_mode to mmap_index_only seems to be a common
>> recommendation on forums[1][2][3][4] and slack (find by searching
>> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/
>> ).
>>
>> It's not clear to me when using the default disk_access_mode:auto/mmap is
>> beneficial, perhaps only when the read set fits in memory? Mick seems to
>> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
>> and should be only used when warranted. However it's not uncommon to see
>> people being bitten with OOMs or lower read performance due to the default
>> disk_access_mode, so it makes me think it's not the best fool-proof default.
>>
>> Should we consider changing default "auto" behavior of "disk_access_mode"
>> to be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer
>> and perhaps more performant?
>>
>> Thanks,
>>
>> Paulo
>>
>> [1]
>> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
>> [2] https://phabricator.wikimedia.org/T137419
>> [3] https://stackoverflow.com/a/55975471
>> [4]
>> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
>> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531
>>
>>
