[ https://issues.apache.org/jira/browse/GEODE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486926#comment-17486926 ]
ASF GitHub Bot commented on GEODE-10017:
----------------------------------------

gaussianrecurrence opened a new pull request #919:
URL: https://github.com/apache/geode-native/pull/919

- Due to a rare race condition described in the Jira issue, test cases (TCs) that need to restart one or more servers could end up getting stuck.
- This change introduces a mechanism to verify that the server process has actually stopped.
- Also removed the sleeps that had been put in place in some TCs as a workaround.
- Modified all of the new integration test (IT) TCs that restart servers to use the above-mentioned mechanism, so the race condition is avoided.

> Fix new ITs unstability for TCs that involve members restart
> ------------------------------------------------------------
>
>                 Key: GEODE-10017
>                 URL: https://issues.apache.org/jira/browse/GEODE-10017
>             Project: Geode
>          Issue Type: Bug
>          Components: native client
>            Reporter: Mario Salazar de Torres
>            Assignee: Mario Salazar de Torres
>            Priority: Major
>              Labels: needsTriage
>
> *GIVEN* an integration TC in which a server needs to be restarted
> *WHEN* the server is starting up again
> *THEN* it might happen that the TC gets stuck
> ----
> *Additional information.* This issue does not always happen, and I've seen
> it happen more frequently with the latest version of Geode server (1.15.0).
> Some examples of affected TCs are:
> * RegisterKeysTest.RegisterKeySetAndClusterRestart
> * PartitionRegionWithRedundancyTest.putgetWithSingleHop
> * ···
> Also, this is normally the execution flow for the TCs that get stuck:
> # Set up cluster
> # Do TC-specific ops
> # Stop server(s)/Cluster shutdown
> # Start server(s)
> In all cases, the server gets stuck at step 4.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
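
The stop-then-verify idea from the pull request (confirm the server process is really gone between steps 3 and 4, instead of sleeping for a fixed time) can be sketched roughly as follows. This is an illustration only, not the code from the pull request: the names `wait_until_stopped` and `restart_server`, the `stop`, `is_running` and `start` callables, and the timeout values are all hypothetical, and the geode-native test framework itself is written in C++.

```python
import time


def wait_until_stopped(is_running, timeout_s=30.0, poll_interval_s=0.1):
    """Poll is_running() until it returns False or timeout_s elapses.

    Returns True if the process was observed stopped, False on timeout.
    (Hypothetical helper, not part of any Geode API.)
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if not is_running():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


def restart_server(stop, is_running, start, timeout_s=30.0):
    """Stop the server, confirm the process has exited, then start it again.

    stop, is_running and start are caller-supplied callables (assumptions
    made for this sketch); start is never called while the old process is
    still alive.
    """
    stop()
    if not wait_until_stopped(is_running, timeout_s):
        raise TimeoutError("server process still running after stop")
    start()
```

For a server launched through `subprocess.Popen`, `is_running` could be `lambda: proc.poll() is None`. Polling an explicit condition with a deadline, rather than a fixed sleep, means the restart proceeds as soon as the old process is gone and fails loudly if it never exits.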