[ https://issues.apache.org/jira/browse/GEODE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16692383#comment-16692383 ]
Kirk Lund commented on GEODE-5547:
----------------------------------

In the middle of the method LocalRegion.basicDestroyRegion is the invocation of InternalDistributedSystem.handleResourceEvent:
{noformat}
if (!isInternalRegion()) {
  InternalDistributedSystem system = this.cache.getInternalDistributedSystem();
  system.handleResourceEvent(ResourceEvent.REGION_REMOVE, this);
}
{noformat}
I think something would have to prevent this block from being reached or executed when closing the region, or maybe even cause an early return from system.handleResourceEvent without doing anything:
{noformat}
public void handleResourceEvent(ResourceEvent event, Object resource) {
  if (disableManagement) {
    return;
  }
  if (resourceListeners.size() == 0) {
    return;
  }
  notifyResourceEventListeners(event, resource);
}
{noformat}
disableManagement is final and should be false for this test (it's a management test). resourceListeners is not final but it is a CopyOnWriteArrayList, so unless checking the size() found a null value this should be ok. Since the manager received a create region notification, I'm going to say this was not null.

Beyond that, InternalDistributedSystem.notifyResourceEventListeners would have thrown or logged a warning unless ManagementListener threw CancelException, which it shouldn't do based on the info level logs from stdout.

That leaves the small chance of some subtle bug (probably a race condition or concurrency bug) in ManagementListener.java. The first line of ManagementListener.shouldProceed is:
{noformat}
InternalDistributedSystem.getConnectedInstance();
{noformat}
...looks weird but the point of this line is to check for cancelation. It will either return (and do nothing) or, if the cache is being closed for any reason, it'll throw a CancelException.
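To make those paths concrete, here is a minimal, self-contained Java sketch of the three ways a REGION_REMOVE event could fail to reach the listener: management disabled, no listeners registered, or a cancellation check throwing inside the listener. Every Sketch* name is a hypothetical stand-in and not a Geode class, and the quiet handling of the cancel exception is an assumption taken from the description above, not from the actual source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

// Small driver that exercises each path of the sketch.
class ResourceEventSketch {
  public static void main(String[] args) {
    SketchDistributedSystem system = new SketchDistributedSystem(false);
    List<String> received = new ArrayList<>();

    // No listener registered yet: the event is skipped by the size() guard.
    System.out.println(system.handleResourceEvent("REGION_REMOVE", "/MANAGEMENT_FIXED_PR"));

    // The listener checks for cancellation first, like shouldProceed is described as doing.
    system.addResourceListener((event, resource) -> {
      system.checkCancelInProgress();
      received.add(event);
    });
    System.out.println(system.handleResourceEvent("REGION_REMOVE", "/MANAGEMENT_FIXED_PR"));

    // Simulate a concurrent close: the same event is now dropped without a warning.
    system.closing.set(true);
    System.out.println(system.handleResourceEvent("REGION_REMOVE", "/MANAGEMENT_FIXED_PR"));
    System.out.println("events received by listener: " + received.size());
  }
}

// Hypothetical stand-in for a cancellation exception; not Geode's CancelException.
class SketchCancelException extends RuntimeException {
  SketchCancelException(String message) {
    super(message);
  }
}

// Hypothetical listener interface, loosely modeled on a resource event listener.
interface SketchResourceListener {
  void handleEvent(String event, Object resource);
}

// Simplified model of the dispatch path discussed in the comment.
class SketchDistributedSystem {
  private final boolean disableManagement;
  private final List<SketchResourceListener> resourceListeners = new CopyOnWriteArrayList<>();
  final AtomicBoolean closing = new AtomicBoolean(false);

  SketchDistributedSystem(boolean disableManagement) {
    this.disableManagement = disableManagement;
  }

  void addResourceListener(SketchResourceListener listener) {
    resourceListeners.add(listener);
  }

  // Returns normally when the system is usable, throws when it is closing.
  void checkCancelInProgress() {
    if (closing.get()) {
      throw new SketchCancelException("cache is closing");
    }
  }

  // Returns a label for the path taken so that each outcome is observable.
  String handleResourceEvent(String event, Object resource) {
    if (disableManagement) {
      return "skipped: management disabled";
    }
    if (resourceListeners.size() == 0) {
      return "skipped: no listeners";
    }
    try {
      for (SketchResourceListener listener : resourceListeners) {
        listener.handleEvent(event, resource);
      }
      return "delivered";
    } catch (SketchCancelException e) {
      // Assumption: cancellation is swallowed quietly, so nothing is logged.
      return "dropped: " + e.getMessage();
    }
  }
}
```

In this sketch only the third path depends on timing, which is why a close racing with the region destroy is the one candidate that would leave no trace in the logs.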
Again, based on the lines of the test and the info level logs from stdout, it really doesn't look like it would have a thread invoking Cache.close or DistributedSystem.disconnect -- but if it did then that would explain the failure.

So far, I can't reproduce the failure and it's not obvious what changes to the test would help prevent this, since there's no action that would be closing the Cache or DistributedSystem in vm3 while vm0 awaits the JMX notifications.

> RegionManagementDUnitTest > testFixedPRRegionMBean FAILED
> ---------------------------------------------------------
>
>                 Key: GEODE-5547
>                 URL: https://issues.apache.org/jira/browse/GEODE-5547
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Jinmei Liao
>            Assignee: Kirk Lund
>            Priority: Minor
>              Labels: Flaky, pull-request-available, swat
>             Fix For: 1.7.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> recent failure:
> https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/231#L5b5be314:763
> {noformat}
> org.apache.geode.management.RegionManagementDUnitTest > testFixedPRRegionMBean FAILED
> {noformat}
> {noformat}
> org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.test.dunit.NamedRunnable.run in VM 0 running on Host 0f4308c1376f with 4 VMs
>     at org.apache.geode.test.dunit.VM.invoke(VM.java:443)
>     at org.apache.geode.test.dunit.VM.invoke(VM.java:412)
>     at org.apache.geode.test.dunit.VM.invoke(VM.java:343)
>     at org.apache.geode.management.RegionManagementDUnitTest.verifyMemberNotifications(RegionManagementDUnitTest.java:565)
>     at org.apache.geode.management.RegionManagementDUnitTest.testFixedPRRegionMBean(RegionManagementDUnitTest.java:209)
>
>     Caused by:
>     org.awaitility.core.ConditionTimeoutException: Condition defined as a lambda expression in org.apache.geode.management.RegionManagementDUnitTest that uses int
>     Expected size:<6> but was:<4> in:
>     <[javax.management.Notification[source=172.17.0.28(265)<v46>-32770][type=gemfire.distributedsystem.cache.region.created][message=Region Created With Name /MANAGEMENT_FIXED_PR],
>     javax.management.Notification[source=172.17.0.28(276)<v47>-32771][type=gemfire.distributedsystem.cache.region.created][message=Region Created With Name /MANAGEMENT_FIXED_PR],
>     javax.management.Notification[source=172.17.0.28(265)<v46>-32770][type=gemfire.distributedsystem.cache.region.closed][message=Region Destroyed/Closed With Name /MANAGEMENT_FIXED_PR],
>     javax.management.Notification[source=172.17.0.28(276)<v47>-32771][type=gemfire.distributedsystem.cache.region.closed][message=Region Destroyed/Closed With Name /MANAGEMENT_FIXED_PR]]> within 2 minutes.
>         at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:104)
>         at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:117)
>         at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:32)
>         at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:809)
>         at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:648)
>         at org.apache.geode.management.RegionManagementDUnitTest.lambda$verifyMemberNotifications$3d0515b3$1(RegionManagementDUnitTest.java:566)
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)