[ https://issues.apache.org/jira/browse/GEODE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848824#comment-15848824 ]
ASF subversion and git services commented on GEODE-1672:
---------------------------------------------------------

Commit e606f3e6ec0828f5fc30e20a9dbdf3aa8c3c8620 in geode's branch
refs/heads/develop from [~agingade]
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=e606f3e ]

GEODE-1672: Disabled recovering values for LRU region during startup.

When recovering persistent files, the system stores the values in temporary
maps (for regions) using a background thread. Because these maps are not
actual regions, they are not considered/included for LRU eviction, which can
cause the system to run out of memory. The problem is fixed by skipping value
recovery for LRU regions. A new system property, "disk.recoverLruValues", is
added to support reading values for LRU regions.

> When amount of overflowed persisted data exceeds heap size startup may run
> out of memory
> ----------------------------------------------------------------------------------------
>
>                 Key: GEODE-1672
>                 URL: https://issues.apache.org/jira/browse/GEODE-1672
>             Project: Geode
>          Issue Type: Bug
>          Components: persistence
>            Reporter: Darrel Schneider
>
> Basically, when the amount of data overflowed approaches the heap size, such
> that the total amount of data is very close to or actually surpasses your
> total tenured heap, it is possible that you will not be able to restart.
> The algorithm during recovery of oplogs/buckets is such that we don't
> "evict" in the normal sense as data fills the heap during the early stages
> of recovery, prior to creating the regions. When the data is first created
> in the heap, it is not yet officially in the region.
> At any rate, during this early phase of recovery, or during the subsequent
> phase where eviction is working as usual, it is possible that the total
> data, or an early imbalance of buckets prior to the opportunity to
> rebalance, causes us to surpass the critical threshold, which will kill us
> before successful startup.
> To reproduce, you could have one region with tons of data that evicts and
> overflows with persistence. Call it R1. Then another region with persistence
> that does not evict. Call it R2. (A sketch of this setup follows at the end
> of this message.)
> List R1 first in the cache.xml file. Start running the system and add data
> over time until you have overflowed tons of data approaching the heap size
> in the evicting region, and also have enough data in the R2 region.
> Once you have filled these regions with enough data, overflowed enough to
> disk, and persisted the other region, shut down and then attempt to restart.
> If you put enough data in, you will hit the critical threshold before being
> able to complete startup.
> You can work around this issue by configuring Geode to not recover values,
> by setting this system property: -Dgemfire.disk.recoverValues=false
> Values will not be faulted into memory until a read operation is done on
> that value's key.
> If you have regions that do not use overflow and some that do, then another
> workaround is to create the regions that do not use overflow first.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
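A minimal sketch of the reproduction setup described above, assuming the
org.apache.geode Java API. The region shortcuts, class name, entry counts, and
value sizes below are illustrative assumptions; the report itself only says
that R1 is persistent with eviction/overflow and R2 is persistent without
eviction, and that the workaround is the -Dgemfire.disk.recoverValues=false
JVM flag at restart.

    // Illustrative only: shortcuts, names, and sizes are assumptions, not part of the report.
    // The workaround from the description is the JVM flag: -Dgemfire.disk.recoverValues=false
    import org.apache.geode.cache.Cache;
    import org.apache.geode.cache.CacheFactory;
    import org.apache.geode.cache.Region;
    import org.apache.geode.cache.RegionShortcut;

    public class Geode1672Repro {
      public static void main(String[] args) {
        Cache cache = new CacheFactory().create();

        // R1: persistent region whose entries are evicted and overflowed to disk
        Region<Integer, byte[]> r1 = cache
            .<Integer, byte[]>createRegionFactory(RegionShortcut.PARTITION_PERSISTENT_OVERFLOW)
            .create("R1");

        // R2: persistent region that does not evict
        Region<Integer, byte[]> r2 = cache
            .<Integer, byte[]>createRegionFactory(RegionShortcut.PARTITION_PERSISTENT)
            .create("R2");

        // Load R1 until the overflowed data approaches the heap size, and put a
        // significant amount into R2 as well; then shut down and restart to
        // reproduce the out-of-memory condition during recovery.
        for (int i = 0; i < 1_000_000; i++) {
          r1.put(i, new byte[1024]);
          r2.put(i, new byte[1024]);
        }

        cache.close();
      }
    }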