I would add that, by changing the default to 0, we can skip all of the "special" tolerance logic that almost no customers use. With a default of 1, we go into this logic on every reading above threshold, even when customers have not explicitly told us to "tolerate" an eviction or critical state change. I am in favor of changing the default to 0, and I doubt any customers would even notice that the behavior has changed. I would also suggest that tolerating 1 critical reading delays the behaviors GemFire triggers when above critical, which could make us more vulnerable to OOMEs than transitioning state immediately.
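To make the difference between the two defaults concrete, here is a minimal Java sketch of the tolerance counter as the thread describes it. It is not the actual HeapMemoryMonitor code: the class name ToleranceSketch, the method onReading, and the byte values are made up for illustration, and reading gemfire.memoryEventTolerance as a JVM system property is an assumption.

```java
// Hypothetical sketch of the event tolerance counter; NOT the real
// HeapMemoryMonitor implementation. All names here are illustrative.
final class ToleranceSketch {
    private final int tolerance;       // N: consecutive high readings to ignore
    private final long thresholdBytes; // eviction or critical threshold
    private int consecutiveAbove = 0;  // current run of readings above threshold

    ToleranceSketch(int tolerance, long thresholdBytes) {
        this.tolerance = tolerance;
        this.thresholdBytes = thresholdBytes;
    }

    /** Returns true when this reading should cause a state transition. */
    boolean onReading(long heapUsedBytes) {
        if (heapUsedBytes <= thresholdBytes) {
            consecutiveAbove = 0; // a normal reading resets the run
            return false;
        }
        consecutiveAbove++;
        // With tolerance 0 the first high reading transitions immediately;
        // with tolerance 1 a single outlier is ignored.
        return consecutiveAbove > tolerance;
    }

    public static void main(String[] args) {
        // Assumption: the parameter is supplied as a system property, e.g.
        // -Dgemfire.memoryEventTolerance=1; 0 is the proposed default.
        int configured = Integer.getInteger("gemfire.memoryEventTolerance", 0);
        ToleranceSketch monitor = new ToleranceSketch(configured, 100L);
        System.out.println("tolerance=" + configured
            + " transition on first high reading: " + monitor.onReading(120L));
    }
}
```

With a tolerance of 1, the first reading above threshold returns false and only the second consecutive one returns true. With 0, the counter never delays anything, which is the extra logic we would no longer be exercising by default.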
My 2 cents. Thanks for the email Ryan.

On Tue, Jan 22, 2019 at 10:22 AM Ryan McMahon <rmcma...@pivotal.io> wrote:

> Hi all,
>
> I am currently fixing a bug
> <https://issues.apache.org/jira/browse/GEODE-6304> with the
> HeapMemoryMonitor event tolerance feature, and came across a decision that
> I thought would be more appropriate for the Geode dev list.
>
> For those familiar with the feature, we are proposing that the default
> gemfire.memoryEventTolerance config parameter value is changed from 1 to 0
> so state transitions from normal to eviction or critical occur immediately
> after reading a single heap-used-bytes event above threshold. If you are
> unfamiliar with the feature, read on.
>
> The memory event tolerance feature addresses issues with some JVM distros
> that result in sporadic, erroneously high heap-bytes-used readings. The
> feature was introduced to address this issue in the JRockit JVM, but it has
> been found that other JVM distros are susceptible to this problem as well.
>
> The feature prevents an "unexpected" state transition from a normal state
> to an eviction or critical state by requiring N (configurable) consecutive
> heap-used-byte events above threshold before changing states. The current
> default configuration is N = 5 for JRockit and N = 1 for all other JVMs.
> In a non-JRockit JVM, this configuration permits a single event above
> threshold WITHOUT causing a state transition. In other words, by default,
> we allow for a single bad outlier heap-used-bytes reading without going
> into an eviction or critical state.
>
> As part of this bug fix (which involves a failure to reset the tolerance
> counter under some conditions), we opted to remove the special handling for
> JRockit because JRockit is no longer supported. After removing the JRockit
> handling, we started re-evaluating if a default value of 1 is appropriate
> for all other JVMs.
> We are considering changing the default to 0, so state
> transitions would occur immediately if an event above the threshold is
> received. If a user is facing one of these problematic JVMs, they can then
> change the gemfire.memoryEventTolerance config parameter to increase the
> tolerance. Our concern is that the default today is potentially masking
> bad heap readings without the user ever knowing.
>
> To summarize, if we change the default from 1 to 0 it would potentially be
> a change in behavior in that we would no longer be masking a single bad
> heap-used-bytes reading i.e. no longer permitting a single outlier without
> changing states. The user can then decide whether to configure a non-zero
> tolerance to address the situation. Any thoughts on this change in
> behavior?
>
> Thanks,
> Ryan

-- 
David Wisler | GemFire Support Product Manager