[ https://issues.apache.org/jira/browse/GEODE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537839#comment-17537839 ]
ASF subversion and git services commented on GEODE-10286:
---------------------------------------------------------
Commit 5936977165efa969368d35400e8f521d7bf0add9 in geode's branch
refs/heads/support/1.15 from Jinmei Liao
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=5936977165 ]
GEODE-10286: handle CancelException in PersistenceAdvisor.close (#7677)
(cherry picked from commit e1860051f978cbd02d2bccd648175a7b79252f75)
> cache close in response to a forced disconnect with persistent regions may
> skip some cleanup
> ---------------------------------------------------------------------------------------------
>
> Key: GEODE-10286
> URL: https://issues.apache.org/jira/browse/GEODE-10286
> Project: Geode
> Issue Type: Bug
> Components: core
> Reporter: Darrel Schneider
> Assignee: Jinmei Liao
> Priority: Major
> Labels: needsTriage, pull-request-available
> Fix For: 1.16.0
>
>
> During a cache close, persistent regions may not clean up as much as they
> should. When the PersistenceAdvisor is closed, a CancelException is not
> handled, which causes the remaining parts of the close to be skipped. I think
> the place to handle it is
> DistributedRegion.distributedRegionCleanup(DistributedRegion.java:2564). Here
> is an exception showing what it looks like when this happens:
> {noformat}
> org.apache.geode.distributed.DistributedSystemDisconnectedException: Distribution manager on rs-RunItNow-ZH1504a1i3xlarge-hydra-client-10(dataStoregemfire2_host1_421:421)<ec><v22>:41004 started at Wed Mar 23 17:11:48 PDT 2022: Member isn't responding to heartbeat requests, caused by org.apache.geode.ForcedDisconnectException: Member isn't responding to heartbeat requests
> at org.apache.geode.distributed.internal.ClusterDistributionManager$Stopper.generateCancelledException(ClusterDistributionManager.java:2893)
> at org.apache.geode.distributed.internal.InternalDistributedSystem$Stopper.generateCancelledException(InternalDistributedSystem.java:1177)
> at org.apache.geode.CancelCriterion.checkCancelInProgress(CancelCriterion.java:83)
> at org.apache.geode.distributed.internal.ClusterElderManager.getElderId(ClusterElderManager.java:76)
> at org.apache.geode.distributed.internal.ClusterDistributionManager.getElderId(ClusterDistributionManager.java:2085)
> at org.apache.geode.distributed.internal.locks.DLockService.getElderId(DLockService.java:254)
> at org.apache.geode.distributed.internal.locks.DLockService.notLockGrantorId(DLockService.java:824)
> at org.apache.geode.distributed.internal.locks.DLockService.unlock(DLockService.java:1807)
> at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.releaseTieLock(PersistenceAdvisorImpl.java:1181)
> at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.close(PersistenceAdvisorImpl.java:1158)
> at org.apache.geode.internal.cache.DistributedRegion.distributedRegionCleanup(DistributedRegion.java:2564)
> at org.apache.geode.internal.cache.DistributedRegion.postDestroyRegion(DistributedRegion.java:2657)
> at org.apache.geode.internal.cache.LocalRegion.recursiveDestroyRegion(LocalRegion.java:2732)
> at org.apache.geode.internal.cache.LocalRegion.basicDestroyRegion(LocalRegion.java:6241)
> at org.apache.geode.internal.cache.DistributedRegion.basicDestroyRegion(DistributedRegion.java:1834)
> at org.apache.geode.internal.cache.LocalRegion.handleCacheClose(LocalRegion.java:7320)
> at org.apache.geode.internal.cache.DistributedRegion.handleCacheClose(DistributedRegion.java:2691)
> at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2308)
> at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2154)
> at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1538)
> at org.apache.geode.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2545)
> at org.apache.geode.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2408)
> at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1254)
> at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2329)
> at org.apache.geode.distributed.internal.membership.gms.GMSMembership.uncleanShutdown(GMSMembership.java:1190)
> at org.apache.geode.distributed.internal.membership.gms.GMSMembership$ManagerImpl.lambda$uncleanShutdownDS$0(GMSMembership.java:1793)
> at java.base/java.lang.Thread.run(Thread.java:833)
> Caused by: org.apache.geode.ForcedDisconnectException: Member isn't responding to heartbeat requests
> at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.membershipFailure(ClusterDistributionManager.java:2319)
> ... 3 more
> {noformat}
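
For illustration only, here is a minimal sketch of the pattern the description asks for: catch the cancellation thrown while closing the advisor so that the rest of the region cleanup still runs. This is not the actual patch from #7677. The names CloseSketch, Advisor, and cleanup are hypothetical, and the nested CancelException is a simplified local stand-in for org.apache.geode.CancelException so that the example compiles without Geode on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseSketch {
    // Simplified local stand-in for org.apache.geode.CancelException (assumption:
    // only its role as an unchecked "shutdown in progress" signal matters here).
    static class CancelException extends RuntimeException {
        CancelException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for the persistence advisor; close() may throw
    // CancelException when the distributed system is already shutting down.
    interface Advisor {
        void close();
    }

    // Runs the cleanup steps in order. A CancelException from the advisor is
    // caught so the steps after it are not skipped.
    static List<String> cleanup(Advisor advisor) {
        List<String> steps = new ArrayList<>();
        try {
            advisor.close();
            steps.add("advisor closed");
        } catch (CancelException e) {
            // Shutdown is already in progress: record it and keep going.
            steps.add("advisor close cancelled");
        }
        // Placeholder for whatever cleanup follows the advisor close.
        steps.add("remaining cleanup");
        return steps;
    }

    public static void main(String[] args) {
        List<String> steps = cleanup(() -> {
            throw new CancelException("distribution manager stopped");
        });
        System.out.println(steps);
    }
}
```

Whether the catch belongs inside the advisor's close or around the call in distributedRegionCleanup is a design choice this sketch does not settle; it only shows that handling the exception at either point lets the later steps run.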