[ https://issues.apache.org/jira/browse/GEODE-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17252945#comment-17252945 ]
ASF GitHub Bot commented on GEODE-7861: --------------------------------------- bschuchardt commented on a change in pull request #5839: URL: https://github.com/apache/geode/pull/5839#discussion_r546773728 ########## File path: geode-membership/src/integrationTest/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java ########## @@ -1441,24 +1435,74 @@ public void testCoordinatorFindRequestSuccess() throws Exception { public void testCoordinatorFindRequestFailure() throws Exception { try { initMocks(false); - HashSet<MemberIdentifier> registrants = new HashSet<>(); - registrants.add(mockMembers[0]); - FindCoordinatorResponse fcr = new FindCoordinatorResponse(mockMembers[0], mockMembers[0], - false, null, registrants, false, true, null); + mockRequestToServer(eq(new HostAndPort("localhost", 12346))); GMSMembershipView view = createView(); JoinResponseMessage jrm = new JoinResponseMessage(mockMembers[0], view, 0); gmsJoinLeave.setJoinResponseMessage(jrm); - when(locatorClient.requestToServer(eq(new HostAndPort("localhost", 12346)), - isA(FindCoordinatorRequest.class), anyInt(), anyBoolean())) - .thenReturn(fcr); - - assertFalse("Should not be able to join ", gmsJoinLeave.join()); + assertThatThrownBy(() -> gmsJoinLeave.join()) + .isInstanceOf(MembershipConfigurationException.class); } finally { - } } + @Test + public void testJoinFailureWhenSleepInterrupted() throws Exception { + initMocks(false); + mockRequestToServer(isA(HostAndPort.class)); + + when(mockConfig.getMemberTimeout()).thenReturn(100L); + when(mockConfig.getJoinTimeout()).thenReturn(1000L); + + GMSJoinLeave spyGmsJoinLeave = spy(gmsJoinLeave); + when(spyGmsJoinLeave.hasCoordinatorJoinedCluster(-1, GMSJoinLeave.JOIN_RETRY_SLEEP)) + .thenThrow(new InterruptedException()); Review comment: I like it - using a "when" to throw an InterruptedException ########## File path: geode-membership/src/main/java/org/apache/geode/distributed/internal/membership/gms/GMSMembership.java ########## @@ -565,12 +565,7 @@ private void join() throws MemberStartupException { this.isJoining = true; // added for bug #44373 // connect - boolean ok = services.getJoinLeave().join(); - - if (!ok) { - throw new MembershipConfigurationException("Unable to join the distributed system. " - + "Operation either timed out, was stopped or Locator does not exist."); - } + services.getJoinLeave().join(); Review comment: nice change to make this a "void" method that throws an exception if it doesn't succeed in joining the cluster ########## File path: geode-membership/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java ########## @@ -428,6 +432,24 @@ public boolean join() throws MemberStartupException { } } + boolean hasCoordinatorJoinedCluster(int viewId, long retrySleep) Review comment: How about naming this something indicating that a sleep is going to be performed, like "pauseIfThereIsNoCoordinator"? ########## File path: geode-membership/src/main/java/org/apache/geode/distributed/internal/membership/gms/interfaces/JoinLeave.java ########## @@ -29,7 +29,7 @@ * joins the distributed system and returns true if successful, false if not. Throws Review comment: This comment needs to be changed to reflect that the method now throws an exception if it's unable to join the cluster. The comment should say which exception is thrown. Should the exception be changed to be a checked exception instead of a runtime exception? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve error reporting when a member is unable to contact a locator > -------------------------------------------------------------------- > > Key: GEODE-7861 > URL: https://issues.apache.org/jira/browse/GEODE-7861 > Project: Geode > Issue Type: Improvement > Components: membership > Reporter: Dan Smith > Assignee: Kamilla Aslami > Priority: Major > Labels: pull-request-available > > When a member is unable to contact a join the system due to a failure to > contact a locator, we could be a lot more specific, which would help users > debug issues in their environment. We currently print out > {noformat} > Unable to join the distributed system. Operation either timed out, was > stopped or Locator does not exist. > {noformat} > We should include the underlying exception from the last locator we tried to > contact as a cause, so that users can see why it failed (DNS error, ???). > Perhaps also the list of locators we tried to contact. -- This message was sent by Atlassian Jira (v8.3.4#803005)