[ 
https://issues.apache.org/jira/browse/GEODE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16692383#comment-16692383
 ] 

Kirk Lund commented on GEODE-5547:
----------------------------------

In the middle of the method LocalRegion.basicDestroyRegion is the invocation of 
InternalDistributedSystem.handleResourceEvent:
{noformat}
          if (!isInternalRegion()) {
            InternalDistributedSystem system = this.cache.getInternalDistributedSystem();
            system.handleResourceEvent(ResourceEvent.REGION_REMOVE, this);
          }
{noformat}
For the notification to go missing, I think something would have to prevent 
this block from being reached or executed when closing the region, or 
system.handleResourceEvent would have to return early without doing anything:
{noformat}
  public void handleResourceEvent(ResourceEvent event, Object resource) {
    if (disableManagement) {
      return;
    }
    if (resourceListeners.size() == 0) {
      return;
    }
    notifyResourceEventListeners(event, resource);
  }
{noformat}
disableManagement is final and should be false for this test (it's a management 
test). resourceListeners is not final, but it is a CopyOnWriteArrayList, so the 
size() check should be fine unless the field itself were null. Since the 
manager received a create region notification, I'm going to say it was not 
null.
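
To make the two guards concrete, here is a minimal, self-contained sketch of the same early-return shape. It is not the Geode source: the class name, the Listener interface, and the boolean return value are all made up for illustration, and events are plain strings rather than ResourceEvent values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ResourceEventGuardSketch {
  /** Hypothetical stand-in for a resource event listener. */
  public interface Listener {
    void handleEvent(String event, Object resource);
  }

  // Final, like disableManagement in the comment above.
  private final boolean disableManagement;
  // CopyOnWriteArrayList: size()/isEmpty() and iteration are safe under concurrent adds.
  private final List<Listener> resourceListeners = new CopyOnWriteArrayList<>();

  public ResourceEventGuardSketch(boolean disableManagement) {
    this.disableManagement = disableManagement;
  }

  public void addListener(Listener listener) {
    resourceListeners.add(listener);
  }

  /** Returns true only if the event got past both guards (return value is illustrative). */
  public boolean handleResourceEvent(String event, Object resource) {
    if (disableManagement) {
      return false; // guard 1: management disabled
    }
    if (resourceListeners.isEmpty()) {
      return false; // guard 2: nobody registered
    }
    for (Listener listener : resourceListeners) {
      listener.handleEvent(event, resource);
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> seen = new ArrayList<>();
    ResourceEventGuardSketch system = new ResourceEventGuardSketch(false);
    System.out.println(system.handleResourceEvent("REGION_REMOVE", "/r")); // false
    system.addListener((event, resource) -> seen.add(event + " " + resource));
    System.out.println(system.handleResourceEvent("REGION_REMOVE", "/r")); // true
    System.out.println(seen); // [REGION_REMOVE /r]
  }
}
```

In this sketch, once a listener has seen one event (the create), neither guard can swallow a later event (the remove) unless the listener list is emptied in between, which is the reasoning applied above.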

Beyond that, InternalDistributedSystem.notifyResourceEventListeners would have 
thrown or logged a warning unless ManagementListener threw a CancelException, 
which it shouldn't have done judging by the info-level logs from stdout.

That leaves the small chance of some subtle bug (probably a race condition or 
concurrency bug) in ManagementListener.java. The first line of 
ManagementListener.shouldProceed is:
{noformat}
    InternalDistributedSystem.getConnectedInstance();
{noformat}
This looks weird, but the point of the line is to check for cancellation. It 
either returns (and does nothing) or, if the cache is being closed for any 
reason, throws a CancelException. Again, based on the lines of the test and 
the info-level logs from stdout, it really doesn't look like any thread was 
invoking Cache.close or DistributedSystem.disconnect, but if one was, that 
would explain the failure.
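
As a rough illustration of how that would lose a notification without any log output, here is a small sketch of the pattern. Every name in it is hypothetical (CancelledException stands in for CancelException, checkNotCancelled for the getConnectedInstance call, notifyListeners for notifyResourceEventListeners), and it assumes, per the reasoning above, that the notifying code swallows the cancel exception silently.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CancelCheckSketch {
  /** Hypothetical stand-in for a cancellation exception. */
  public static class CancelledException extends RuntimeException {
    public CancelledException(String message) {
      super(message);
    }
  }

  // Set to true by whichever thread starts closing the cache (assumption for this sketch).
  public static volatile boolean closing = false;

  // Events that actually reached the listener body.
  public static final List<String> delivered = new CopyOnWriteArrayList<>();

  /** Returns normally when connected; throws if a close is in progress. */
  public static void checkNotCancelled() {
    if (closing) {
      throw new CancelledException("cache is closing");
    }
  }

  /** Listener body: the cancellation check is its first statement. */
  public static void handleEvent(String event) {
    checkNotCancelled();
    delivered.add(event);
  }

  /** Notifier: assumed to treat cancellation as expected and drop the event quietly. */
  public static void notifyListeners(String event) {
    try {
      handleEvent(event);
    } catch (CancelledException e) {
      // No warning, no rethrow: the event is simply lost.
    }
  }

  public static void main(String[] args) {
    notifyListeners("REGION_CREATE");
    closing = true; // simulates a concurrent Cache.close / disconnect
    notifyListeners("REGION_REMOVE");
    System.out.println(delivered); // [REGION_CREATE]
  }
}
```

The create event is delivered and the remove event vanishes with nothing logged, which matches the symptom, but only if some thread really did begin a close in between.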

So far, I can't reproduce the failure, and it's not obvious what changes to the 
test would help prevent it, since no action closes the Cache or 
DistributedSystem in vm3 while vm0 awaits the JMX notifications.
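
For anyone trying to reproduce the waiting side locally, the sketch below mimics it with plain javax.management classes and a hand-rolled polling loop in place of Awaitility. The member names, messages, and the awaitSize helper are invented for illustration; only the two notification type strings come from the failure output quoted below.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;

public class NotificationAwaitSketch {
  public static final String CREATED = "gemfire.distributedsystem.cache.region.created";
  public static final String CLOSED = "gemfire.distributedsystem.cache.region.closed";

  /** Polls until the list holds at least `expected` items or the timeout elapses. */
  public static boolean awaitSize(List<?> list, int expected, long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (list.size() < expected) {
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    NotificationBroadcasterSupport broadcaster = new NotificationBroadcasterSupport();
    List<Notification> received = new CopyOnWriteArrayList<>();
    broadcaster.addNotificationListener((n, handback) -> received.add(n), null, null);

    // Only two hypothetical members report a create/close pair, so four notifications arrive.
    long sequence = 0;
    for (String member : new String[] {"member-a", "member-b"}) {
      broadcaster.sendNotification(new Notification(CREATED, member, sequence++, "created"));
      broadcaster.sendNotification(new Notification(CLOSED, member, sequence++, "closed"));
    }

    System.out.println(awaitSize(received, 6, 200)); // false
    System.out.println(received.size()); // 4
  }
}
```

Waiting for six with only four delivered times out in the same way as the quoted failure, so the open question is which notifications were never sent rather than whether the wait was long enough.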

> RegionManagementDUnitTest > testFixedPRRegionMBean FAILED
> ---------------------------------------------------------
>
>                 Key: GEODE-5547
>                 URL: https://issues.apache.org/jira/browse/GEODE-5547
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Jinmei Liao
>            Assignee: Kirk Lund
>            Priority: Minor
>              Labels: Flaky, pull-request-available, swat
>             Fix For: 1.7.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> recent failure: 
> https://concourse.apachegeode-ci.info/teams/main/pipelines/develop/jobs/DistributedTest/builds/231#L5b5be314:763
> {noformat}
> org.apache.geode.management.RegionManagementDUnitTest > testFixedPRRegionMBean FAILED
> {noformat}
> {noformat}
> org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.test.dunit.NamedRunnable.run in VM 0 running on Host 0f4308c1376f with 4 VMs
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:443)
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:412)
>  at org.apache.geode.test.dunit.VM.invoke(VM.java:343)
>  at org.apache.geode.management.RegionManagementDUnitTest.verifyMemberNotifications(RegionManagementDUnitTest.java:565)
>  at org.apache.geode.management.RegionManagementDUnitTest.testFixedPRRegionMBean(RegionManagementDUnitTest.java:209)
>    
>  Caused by:
>  org.awaitility.core.ConditionTimeoutException: Condition defined as a lambda expression in org.apache.geode.management.RegionManagementDUnitTest that uses int
>  Expected size:<6> but was:<4> in:
>  <[javax.management.Notification[source=172.17.0.28(265)<v46>-32770][type=gemfire.distributedsystem.cache.region.created][message=Region Created With Name /MANAGEMENT_FIXED_PR],
>  javax.management.Notification[source=172.17.0.28(276)<v47>-32771][type=gemfire.distributedsystem.cache.region.created][message=Region Created With Name /MANAGEMENT_FIXED_PR],
>  javax.management.Notification[source=172.17.0.28(265)<v46>-32770][type=gemfire.distributedsystem.cache.region.closed][message=Region Destroyed/Closed With Name /MANAGEMENT_FIXED_PR],
>  javax.management.Notification[source=172.17.0.28(276)<v47>-32771][type=gemfire.distributedsystem.cache.region.closed][message=Region Destroyed/Closed With Name /MANAGEMENT_FIXED_PR]]> within 2 minutes.
>  at org.awaitility.core.ConditionAwaiter.await(ConditionAwaiter.java:104)
>  at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:117)
>  at org.awaitility.core.AssertionCondition.await(AssertionCondition.java:32)
>  at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:809)
>  at org.awaitility.core.ConditionFactory.until(ConditionFactory.java:648)
>  at org.apache.geode.management.RegionManagementDUnitTest.lambda$verifyMemberNotifications$3d0515b3$1(RegionManagementDUnitTest.java:566)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
