[
https://issues.apache.org/jira/browse/GEODE-8589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17547029#comment-17547029
]
Geode Integration commented on GEODE-8589:
------------------------------------------
Seen in [integration-test-openjdk11
#395|https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/integration-test-openjdk11/builds/395]
... see [test
results|http://files.apachegeode-ci.info/builds/apache-develop-main/1.16.0-build.0310/test-results/integrationTest/1654277786/]
or download
[artifacts|http://files.apachegeode-ci.info/builds/apache-develop-main/1.16.0-build.0310/test-artifacts/1654277786/integrationtestfiles-openjdk11-1.16.0-build.0310.tgz].
> make locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted test
> deterministic
> -----------------------------------------------------------------------------------
>
> Key: GEODE-8589
> URL: https://issues.apache.org/jira/browse/GEODE-8589
> Project: Geode
> Issue Type: Improvement
> Components: membership
> Reporter: Bill Burcham
> Priority: Major
> Labels: membership, pull-request-available
>
> A recent commit added a new test:
> {{MembershipIntegrationTest.locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}}
> That's a good test. But it would be better if it ran faster and was less
> susceptible to timing issues. The problem is that the logic we are trying to
> test, {{GMSJoinLeave.leave()}} uses {{System.currentTimeMillis()}} to get the
> current time and it uses {{Thread.sleep()}} to sleep:
> {code:java}
> long now = System.currentTimeMillis();
> ...
> if (state.joinedMembersContacted <= 0 && ((now >=
> locatorGiveUpTime &&
> tries >= minimumRetriesBeforeBecomingCoordinator) ||
> state.locatorsContacted >= locators.size())) {
> ...
> if (System.currentTimeMillis() > giveupTime) {
> break;
> }
> ...
> Thread.sleep(retrySleep);
> ...
> giveupTime = System.currentTimeMillis() + timeout;
> ...
> if (!this.isJoined) {
> logger.debug("giving up attempting to join the distributed system
> after "
> + (System.currentTimeMillis() - startTime) + "ms");
> }
> ...
> if (!this.isJoined && state.hasContactedAJoinedLocator) {
> throw new MemberStartupException("Unable to join the distributed
> system in "
> + (System.currentTimeMillis() - startTime) + "ms");
> }
> {code}
> The opportunity is to _inject_ objects that handle these two functions (time
> keeping and sleeping). This will enable us to artifically control these in
> our test! Sleeping doesn't have to take any time at all. And we can make time
> pass as quickly as we want to.
> For an example of how this can be done, see [PR
> #5422|https://github.com/apache/geode/pull/5422] for GEODE-6950.
> This particular test
> ({{locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}}) is a
> little bit more involved than that one in that more objects are involved. In
> addition to {{GMSJoinLeave}}, other classes involved in the test also need
> current time and need to sleep.
> When this ticket is complete:
> * functional interfaces {{MillisecondProvider}} and {{Sleeper}}, currently
> defined inside {{PrimaryHandler}} will be moved higher in the (internal)
> hierarchy for use by other membership classes
> * {{GMSJoinLeave}} will take a {{MillisecondProvider}} and {{Sleeper}} in its
> constructor and will delegate to those instead of calling
> {{System.currentTimeMillis()}} and {{Thread.sleep()}} directly
> * TBD other classes may require injection of {{MillisecondProvider}} and
> {{Sleeper}}
> * TBD other changes may be necessary in cases where collaborating classes
> currently spin up threads or synchronize between threads e.g. calling
> {{wait()}}
> * {{locatorsStopWaitingForLocatorWaitTimeIfAllLocatorsContacted()}} will no
> longer employ {{Thread.sleep()}} nor will it call
> {{CompletableFuture.get(long timeout, TimeUnit unit)}}—instead it will
> operate deterministically (ideally in the same thread but not necessarily)
> with respect to the unit under test.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)