[ 
https://issues.apache.org/jira/browse/GEODE-7739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248207#comment-17248207
 ] 

ASF GitHub Bot commented on GEODE-7739:
---------------------------------------

kirklund commented on a change in pull request #5778:
URL: https://github.com/apache/geode/pull/5778#discussion_r541332662



##########
File path: 
geode-core/src/main/java/org/apache/geode/management/internal/ManagementCacheListener.java
##########
@@ -104,13 +101,19 @@ public void afterUpdate(EntryEvent<String, Object> event) {
       if (logger.isDebugEnabled()) {
         logger.debug("Proxy Update failed for {} with exception {}", objectName, e.getMessage(), e);
       }
-
     }
+  }
 
+  private void blockUntilReady() {
+    try {
+      readyForEvents.await();
+    } catch (InterruptedException e) {
+      Thread.interrupted();

Review comment:
       Another option, which might be more correct and matches most of the 
Geode code more closely:
   ```
   boolean interrupted = false;
   try {
     readyForEvents.await();
   } catch (InterruptedException e) {
     interrupted = true;
     getCancelCriterion().checkCancelInProgress(e);
     throw new RuntimeException(e);
   } finally {
     if (interrupted) {
       Thread.currentThread().interrupt();
     }
   }
   ```
   The effect is that if the thread is interrupted, we prefer to throw a Cache 
closed exception or a DistributedSystem disconnected exception first. If 
neither condition applies, we throw a new RuntimeException as a last resort. 
Finally, we restore the thread's interrupt flag, because we did not actually 
handle the interrupt locally by halting the Runnable that the thread is 
executing. The code essentially passes both a runtime exception and the 
interrupt flag back to the calling code and lets it deal with them.
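
   For illustration, here is a minimal standalone sketch of the same pattern 
outside of Geode. It is not the actual ManagementCacheListener code: the class 
name `BlockUntilReadySketch` and the `cancelCheck` Runnable are hypothetical, 
with `cancelCheck` standing in for 
`getCancelCriterion().checkCancelInProgress(e)` and assumed to throw an 
unchecked exception when shutdown is in progress.

```java
import java.util.concurrent.CountDownLatch;

class BlockUntilReadySketch {
  private final CountDownLatch readyForEvents = new CountDownLatch(1);

  // Hypothetical stand-in for getCancelCriterion().checkCancelInProgress(e):
  // assumed to throw an unchecked exception if shutdown is in progress.
  private final Runnable cancelCheck;

  BlockUntilReadySketch(Runnable cancelCheck) {
    this.cancelCheck = cancelCheck;
  }

  // Releases any thread waiting in blockUntilReady().
  void markReady() {
    readyForEvents.countDown();
  }

  void blockUntilReady() {
    boolean interrupted = false;
    try {
      readyForEvents.await();
    } catch (InterruptedException e) {
      interrupted = true;
      // Prefer a shutdown-specific exception if one applies.
      cancelCheck.run();
      // Otherwise fall back to a generic unchecked exception.
      throw new RuntimeException(e);
    } finally {
      if (interrupted) {
        // The interrupt was not handled here, so restore the flag for the caller.
        Thread.currentThread().interrupt();
      }
    }
  }

  public static void main(String[] args) {
    // No-op cancel check: no shutdown is in progress in this demo.
    BlockUntilReadySketch sketch = new BlockUntilReadySketch(() -> {});

    // Interrupt the current thread before waiting so await() throws immediately.
    Thread.currentThread().interrupt();
    try {
      sketch.blockUntilReady();
    } catch (RuntimeException e) {
      System.out.println("cause: " + e.getCause().getClass().getSimpleName());
    }
    // Thread.interrupted() reports the flag and clears it again.
    System.out.println("interrupt flag restored: " + Thread.interrupted());
  }
}
```

   Setting `interrupted` in the catch block and restoring the flag in `finally` 
means the flag is restored whichever exception ends up being thrown, so the 
caller sees both the exception and the interrupt.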





> JMX managers may fail to federate mbeans for other members
> ----------------------------------------------------------
>
>                 Key: GEODE-7739
>                 URL: https://issues.apache.org/jira/browse/GEODE-7739
>             Project: Geode
>          Issue Type: Bug
>          Components: jmx
>            Reporter: Kirk Lund
>            Assignee: Kirk Lund
>            Priority: Major
>              Labels: GeodeOperationAPI, pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> JMX Manager may fail to federate one or more MXBeans for other members 
> because of a race condition during startup. When ManagementCacheListener is 
> first constructed, it is in a state that will ignore all callbacks because 
> the field readyForEvents is false.
> ----
> Debugging with JMXMBeanReconnectDUnitTest revealed this bug.
> The test starts two locators, each with a JMX manager configured and started. 
> Locator1 always has all of locator2's mbeans, but locator2 is intermittently 
> missing the personal mbeans of locator1. 
> I think this is caused by some sort of race condition in the code that 
> creates the monitoring regions for other members in locator2.
> It's possible that a JMX manager that hits this bug might also fail to have 
> mbeans for servers as well as for other locators, but I haven't seen a test 
> case for this scenario.
> The exposure of this bug means that a user running more than one locator 
> might have a locator that is missing one or more mbeans for the cluster.
> ----
> Studying the JMX code also reveals the existence of *GEODE-8012*.



