[ 
https://issues.apache.org/jira/browse/GEODE-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138813#comment-17138813
 ] 

ASF GitHub Bot commented on GEODE-8241:
---------------------------------------

aaronlindsey commented on a change in pull request #5236:
URL: https://github.com/apache/geode/pull/5236#discussion_r441823437



##########
File path: 
geode-membership/src/integrationTest/java/org/apache/geode/distributed/internal/membership/gms/MembershipIntegrationTest.java
##########
@@ -172,6 +181,92 @@ public void 
secondMembershipCanJoinUsingTheSecondLocatorToStart()
     stop(locator2, locator1);
   }
 
+  @Test
+  public void secondMembershipPausesForLocatorWaitTime()
+      throws IOException, MemberStartupException {
+
+    /*
+     * Start a locator for the coordinator (membership) so we have a port for 
it.
+     *
+     * Its locator-wait-time is set to 0 so it eventually (soon after 
membership is started) forms a
+     * distributed system and becomes a coordinator.
+     */
+
+    final MembershipLocator<MemberIdentifier> coordinatorLocator = 
createLocator(0);
+    coordinatorLocator.start();
+    final int coordinatorLocatorPort = coordinatorLocator.getPort();
+
+    final Membership<MemberIdentifier> coordinatorMembership =
+        createMembership(coordinatorLocator, coordinatorLocatorPort);
+
+    /*
+     * We have not even started the membership yet — connection attempts will 
certainly fail until
+     * we do. This is a bit like the locator (host) not being present in DNS 
(yet).
+     */
+
+    /*
+     * Start a second locator and membership trying to join via the 
coordinator (membership) that
+     * hasn't yet started behind the port.
+     *
+     * Set its locator-wait-time so it'll not become a coordinator right away, 
allowing time for the
+     * other member to start and become a coordinator.
+     *
+     * Calculate the locator-wait-time to be greater than the minimum wait 
time for connecting to a
+     * locator.
+     */
+
+    final MembershipLocator<MemberIdentifier> lateJoiningLocator = 
createLocator(0);
+    lateJoiningLocator.start();
+    final int lateJoiningLocatorPort = lateJoiningLocator.getPort();
+
+    final int[] lateJoiningMembershipLocatorPorts =
+        new int[] {coordinatorLocatorPort, lateJoiningLocatorPort};
+
+    final Duration minimumJoinWaitTime = Duration
+        .ofMillis(JOIN_RETRY_SLEEP + FIND_LOCATOR_RETRY_SLEEP) // amount of 
sleep time per retry
+        .multipliedBy(lateJoiningMembershipLocatorPorts.length * 2); // 
expected number of retries
+    final int locatorWaitTime = (int) (3 * minimumJoinWaitTime.getSeconds());
+
+    final MembershipConfig lateJoiningMembershipConfig =
+        createMembershipConfig(true, locatorWaitTime, 
lateJoiningMembershipLocatorPorts);
+    final Membership<MemberIdentifier> lateJoiningMembership =
+        createMembership(lateJoiningMembershipConfig, lateJoiningLocator);
+
+    CompletableFuture<Void> lateJoiningMembershipStartup = 
executorServiceRule.runAsync(() -> {
+      try {
+        start(lateJoiningMembership);
+      } catch (MemberStartupException e) {
+        throw new RuntimeException(e);
+      }
+    });
+
+    /*
+     * Now start the coordinator (membership), after waiting longer than the 
minimum wait time for
+     * connecting to a locator but shorter than the locator-wait-time.
+     */
+
+    CompletableFuture<Void> coordinatorMembershipStartup = 
executorServiceRule.runAsync(() -> {
+      try {
+        Thread.sleep(2 * minimumJoinWaitTime.toMillis());

Review comment:
       I added a comment to the code in the latest commit.

##########
File path: 
geode-membership/src/integrationTest/java/org/apache/geode/distributed/internal/membership/gms/MembershipIntegrationTest.java
##########
@@ -172,6 +181,92 @@ public void 
secondMembershipCanJoinUsingTheSecondLocatorToStart()
     stop(locator2, locator1);
   }
 
+  @Test
+  public void secondMembershipPausesForLocatorWaitTime()
+      throws IOException, MemberStartupException {
+
+    /*
+     * Start a locator for the coordinator (membership) so we have a port for 
it.
+     *
+     * Its locator-wait-time is set to 0 so it eventually (soon after 
membership is started) forms a
+     * distributed system and becomes a coordinator.
+     */
+
+    final MembershipLocator<MemberIdentifier> coordinatorLocator = 
createLocator(0);
+    coordinatorLocator.start();
+    final int coordinatorLocatorPort = coordinatorLocator.getPort();
+
+    final Membership<MemberIdentifier> coordinatorMembership =
+        createMembership(coordinatorLocator, coordinatorLocatorPort);
+
+    /*
+     * We have not even started the membership yet — connection attempts will 
certainly fail until
+     * we do. This is a bit like the locator (host) not being present in DNS 
(yet).
+     */
+
+    /*
+     * Start a second locator and membership trying to join via the 
coordinator (membership) that
+     * hasn't yet started behind the port.
+     *
+     * Set its locator-wait-time so it'll not become a coordinator right away, 
allowing time for the
+     * other member to start and become a coordinator.
+     *
+     * Calculate the locator-wait-time to be greater than the minimum wait 
time for connecting to a
+     * locator.
+     */
+
+    final MembershipLocator<MemberIdentifier> lateJoiningLocator = 
createLocator(0);
+    lateJoiningLocator.start();
+    final int lateJoiningLocatorPort = lateJoiningLocator.getPort();
+
+    final int[] lateJoiningMembershipLocatorPorts =
+        new int[] {coordinatorLocatorPort, lateJoiningLocatorPort};
+
+    final Duration minimumJoinWaitTime = Duration
+        .ofMillis(JOIN_RETRY_SLEEP + FIND_LOCATOR_RETRY_SLEEP) // amount of 
sleep time per retry
+        .multipliedBy(lateJoiningMembershipLocatorPorts.length * 2); // 
expected number of retries
+    final int locatorWaitTime = (int) (3 * minimumJoinWaitTime.getSeconds());

Review comment:
       I added a comment to the code in the latest commit.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Locator does not observe locator-wait-time
> ------------------------------------------
>
>                 Key: GEODE-8241
>                 URL: https://issues.apache.org/jira/browse/GEODE-8241
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Aaron Lindsey
>            Assignee: Aaron Lindsey
>            Priority: Major
>
> In the case where a locator starts up and is unable to connect to any other 
> locators, it may decide to become the membership coordinator even if 
> locator-wait-time has not elapsed.
> The following conditional from GMSJoinLeave.java causes the issue. There 
> should be an additional check for locator-wait-time before becoming 
> coordinator.
> {code:java}
>             if (state.joinedMembersContacted <= 0 &&
>                 (tries >= minimumRetriesBeforeBecomingCoordinator ||
>                     state.locatorsContacted >= locators.size())) {
>               synchronized (viewInstallationLock) {
>                 becomeCoordinator();
>               }
>               return true;
>             }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to