> I wonder why disk_access_mode property is not in cassandra.yaml (looking into trunk right now)

I think there's a prehistoric reason why it was removed but I can't remember right now.

> Do you all think we can add it there with brief explanation what each option does?

We could reinclude it as long as we provide a clear recommendation on when to change from the default since this is an advanced setting which should be rarely changed. But I still think we should provide a more stable/foolproof default (mmap_index_only) since the current default (mmap) is known to cause instability in some scenarios.

Also there is a technicality with changing the default, if we change the "auto" behavior from mmap to mmap_index_only this may affect users relying on the default "mmap" behavior. Not sure the best way to address that, is a big NEWS note sufficient? Even though users are expected to read NEWS when upgrading we know well not all users read it.

> Shall we also share this thread with @user?

Thanks Ekaterina! If we decide to change the default we can run this through the user@ list to see what the user community thinks.

On Wed, Sep 6, 2023 at 4:45 PM Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:

> Thanks for starting this discussion, Paulo!
>
> Shall we also share this thread with @user?
>
> On Wed, 6 Sep 2023 at 16:35, C. Scott Andreas <sc...@paradoxica.net>
> wrote:
>
>> Supportive of switching the default to mmap_index_only as well.
>>
>> I don’t have numbers handy to share, but my experience has been
>> significantly lower read latency and I wouldn’t run with auto. I’ve also
>> not observed substantial heap pressure after switching - it was strictly an
>> improvement.
>>
>> - Scott
>>
>> —
>> Mobile
>>
>> On Sep 6, 2023, at 8:50 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
>>
>> Hi,
>>
>> I've been bitten by OOMs with disk_access_mode:auto/mmap that were fixed
>> by changing to disk_access_mode:mmap_index_only. In a particular benchmark
>> I got 5x more read throughput on 3.11.x with disk_access_mode:
>> mmap_index_only vs disk_access_mode: auto/mmap.
>>
>> Changing disk_access_mode to mmap_index_only seems to be a common
>> recommendation on forums[1][2][3][4] and slack (find by searching
>> disk_access_mode in the #cassandra channel on https://the-asf.slack.com/
>> ).
>>
>> It's not clear to me when using the default disk_access_mode:auto/mmap is
>> beneficial, perhaps only when the read set fits in memory? Mick seems to
>> think on CASSANDRA-15531 [5], that mmap_index_only has a higher heap cost
>> and should be only used when warranted. However it's not uncommon to see
>> people being bitten with OOMs or lower read performance due to the default
>> disk_access_mode, so it makes me think it's not the best fool-proof default.
>>
>> Should we consider changing default "auto" behavior of "disk_access_mode"
>> to be "mmap_index_only" instead of "mmap" in 5.0 since it's likely safer
>> and perhaps more performant?
>>
>> Thanks,
>>
>> Paulo
>>
>> [1]
>> https://stackoverflow.com/questions/72272035/troubleshooting-and-fixing-cassandra-oom-issue
>> [2] https://phabricator.wikimedia.org/T137419
>> [3] https://stackoverflow.com/a/55975471
>> [4]
>> https://support.datastax.com/s/article/FAQ-Use-of-disk-access-mode-in-DSE-51-and-earlier
>> [5] https://issues.apache.org/jira/browse/CASSANDRA-15531