Barry Oglesby created GEODE-6564:
------------------------------------

             Summary: Clearing a replicated region with expiration causes a memory leak
                 Key: GEODE-6564
                 URL: https://issues.apache.org/jira/browse/GEODE-6564
             Project: Geode
          Issue Type: Bug
          Components: regions
            Reporter: Barry Oglesby

Clearing a replicated region with expiration causes a memory leak

Both the RegionEntries and EntryExpiryTasks are still live after loading entries into the region and then clearing it.

Server Startup:
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:         29856        2797840  [C
   4:          2038         520600  [B
Total        187711       10089624
{noformat}

Load 100 entries with 600k payload (representing a session):
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:          2496       60666440  [B
   2:         30157        2828496  [C
  73:           100           7200  org.apache.geode.internal.cache.entries.VersionedStatsRegionEntryHeapStringKey1
  93:           100           4800  org.apache.geode.internal.cache.EntryExpiryTask
Total        190737       70240472
{noformat}

Clear region:
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:          2398       60505944  [B
   2:         30448        2849456  [C
  74:           100           7200  org.apache.geode.internal.cache.entries.VersionedStatsRegionEntryHeapStringKey1
 100:           100           4800  org.apache.geode.internal.cache.EntryExpiryTask
Total        192199       70373048
{noformat}

Load and clear another 100 entries:
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:          2503      120511688  [B
   2:         30506        2854384  [C
  46:           200          14400  org.apache.geode.internal.cache.entries.VersionedStatsRegionEntryHeapStringKey1
  61:           200           9600  org.apache.geode.internal.cache.EntryExpiryTask
Total        193272      130421432
{noformat}

Load and clear another 100 entries:
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:          2600      180517240  [B
   2:         30562        2859584  [C
  33:           300          21600  org.apache.geode.internal.cache.entries.VersionedStatsRegionEntryHeapStringKey1
  47:           300          14400  org.apache.geode.internal.cache.EntryExpiryTask
Total        194310      190468176
{noformat}

A heap dump shows the VersionedStatsRegionEntryHeapStringKey1 instances are referenced by the DistributedRegion entryExpiryTasks:
{noformat}
--> org.apache.geode.internal.cache.DistributedRegion@0x76adbbb88 (816 bytes) (field entryExpiryTasks:)
--> java.util.concurrent.ConcurrentHashMap@0x76adbc028 (100 bytes) (field table:)
--> [Ljava.util.concurrent.ConcurrentHashMap$Node;@0x76ee85358 (4112 bytes) (Element 276 of [Ljava.util.concurrent.ConcurrentHashMap$Node;@0x76ee85358:)
--> java.util.concurrent.ConcurrentHashMap$Node@0x76edc4e20 (44 bytes) (field next:)
--> java.util.concurrent.ConcurrentHashMap$Node@0x76edc32f0 (44 bytes) (field key:)
--> org.apache.geode.internal.cache.entries.VersionedStatsRegionEntryHeapStringKey1@0x76edc3210 (86 bytes)
{noformat}

LocalRegion.cancelAllEntryExpiryTasks is called when the region is cleared:
{noformat}
java.lang.Exception: Stack trace
	at java.lang.Thread.dumpStack(Thread.java:1333)
	at org.apache.geode.internal.cache.LocalRegion.cancelAllEntryExpiryTasks(LocalRegion.java:8202)
	at org.apache.geode.internal.cache.LocalRegion.clearRegionLocally(LocalRegion.java:9094)
	at org.apache.geode.internal.cache.DistributedRegion.cmnClearRegion(DistributedRegion.java:1962)
	at org.apache.geode.internal.cache.LocalRegion.basicClear(LocalRegion.java:8998)
	at org.apache.geode.internal.cache.DistributedRegion.basicClear(DistributedRegion.java:1939)
	at org.apache.geode.internal.cache.LocalRegion.basicBridgeClear(LocalRegion.java:8988)
	at org.apache.geode.internal.cache.tier.sockets.command.ClearRegion.cmdExecute(ClearRegion.java:123)
{noformat}

But it doesn't clear the entryExpiryTasks map:
{noformat}
LocalRegion.clearRegionLocally before cancelAllEntryExpiryTasks entryExpiryTasks=100
LocalRegion.clearRegionLocally after cancelAllEntryExpiryTasks entryExpiryTasks=100
{noformat}

As a test, I added this call to the bottom of the cancelAllEntryExpiryTasks method:
{noformat}
this.entryExpiryTasks.clear();
{noformat}

This addressed the leak:
{noformat}
Server Startup:
Total        182414        9855616

Load/Clear 1:
Total        191049       10315832

Load/Clear 2:
Total        191978       10329664

Load/Clear 3:
Total        192638       10360360
{noformat}

As a work-around, a Function that clears the region by using removeAll on batches of keys also addresses the leak:
{noformat}
Server Startup:
Total        182297        9849312

Load/Clear 1:
Total        185932       10019248

Load/Clear 2:
Total        191855       10278816

Load/Clear 3:
Total        192511       10313168

Load/Clear 4:
Total        193424       10352008
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
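
The retention pattern described in this report can be illustrated outside Geode with a minimal, self-contained sketch. It is not Geode code: the class ExpiryLeakSketch, its Entry and ExpiryTask types, and the clearTaskMap flag are all hypothetical stand-ins. It only assumes what the heap dump shows, namely that the task map is keyed by the entry object, so cancelling tasks without emptying the map keeps every entry (and its payload) reachable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical, simplified model of a region that tracks one expiry task per entry. */
public class ExpiryLeakSketch {

    /** Stand-in for a region entry; the payload is what stays reachable when leaked. */
    static final class Entry {
        final String key;
        final byte[] value;
        Entry(String key, byte[] value) { this.key = key; this.value = value; }
    }

    /** Stand-in for an expiry task; only cancellation is modelled. */
    static final class ExpiryTask {
        private volatile boolean cancelled;
        void cancel() { cancelled = true; }
        boolean isCancelled() { return cancelled; }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    // Keyed by the entry object, mirroring the reference chain in the heap dump.
    private final Map<Entry, ExpiryTask> entryExpiryTasks = new ConcurrentHashMap<>();
    // Assumption for illustration: toggles the one-line change tested in the report.
    private final boolean clearTaskMap;

    ExpiryLeakSketch(boolean clearTaskMap) { this.clearTaskMap = clearTaskMap; }

    void put(String key, byte[] value) {
        Entry entry = new Entry(key, value);
        Entry old = entries.put(key, entry);
        if (old != null) {
            // Replacing an entry releases the task that tracked the old one.
            ExpiryTask stale = entryExpiryTasks.remove(old);
            if (stale != null) stale.cancel();
        }
        entryExpiryTasks.put(entry, new ExpiryTask());
    }

    void clear() {
        cancelAllEntryExpiryTasks();
        entries.clear();
    }

    private void cancelAllEntryExpiryTasks() {
        for (ExpiryTask task : entryExpiryTasks.values()) {
            task.cancel();
        }
        if (clearTaskMap) {
            // Without this, cancelled tasks and their entry keys remain in the map.
            entryExpiryTasks.clear();
        }
    }

    int trackedTaskCount() { return entryExpiryTasks.size(); }

    /** Loads and clears the model region several times, then reports tasks still tracked. */
    static int tasksAfterLoadAndClear(boolean clearTaskMap, int rounds, int perRound) {
        ExpiryLeakSketch region = new ExpiryLeakSketch(clearTaskMap);
        for (int round = 0; round < rounds; round++) {
            for (int i = 0; i < perRound; i++) {
                region.put("session-" + round + "-" + i, new byte[16]);
            }
            region.clear();
        }
        return region.trackedTaskCount();
    }

    public static void main(String[] args) {
        System.out.println("without map clear: " + tasksAfterLoadAndClear(false, 3, 100)); // 300
        System.out.println("with map clear:    " + tasksAfterLoadAndClear(true, 3, 100));  // 0
    }
}
```

With three load/clear rounds of 100 entries, the model without the map clear still tracks 300 tasks, the same count the histogram shows for EntryExpiryTask after the third round. Whether clearing the map at that point is safe under concurrent puts in the real LocalRegion is not something this sketch addresses.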