[ https://issues.apache.org/jira/browse/GEODE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce Schuchardt updated GEODE-5546:
------------------------------------
    Affects Version/s: 1.6.0

> auto-reconnecting member reuses old address including vmViewId
> --------------------------------------------------------------
>
>                 Key: GEODE-5546
>                 URL: https://issues.apache.org/jira/browse/GEODE-5546
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>    Affects Versions: 1.6.0
>            Reporter: Bruce Schuchardt
>            Assignee: Bruce Schuchardt
>            Priority: Major
>
> During network-down testing I found that if I restore the network immediately
> after all "losing side" servers go into auto-reconnect that sometimes they
> receive a view-preparation message from the surviving cluster that holds
> their old membership ID. They use this ID instead of waiting for a valid new
> ID and end up being shut down as rogue processes.
>
> For instance, this process used to have an identifier with <v3> before it
> went into auto-reconnect. When it tried to rejoin it ended up using that
> same identifier due to receiving a view-preparation message holding it:
>
> [info 2018/07/28 22:17:14.588 PDT
> gemfire1_rs-FullRegression29040205a1i3xlarge-hydra-client-18_15643
> <ReconnectThread> tid=0x2d2] Attempting to join the distributed system
> through coordinator
> 10.32.110.93(gemfire6_rs-FullRegression29040205a1i3xlarge-hydra-client-50_13624:13624:locator)<ec><v1>:1024
> using address
> 10.32.108.125(gemfire1_rs-FullRegression29040205a1i3xlarge-hydra-client-18_15643:15643)<v3>:1026
>
> In this run it then proceeded to hang trying to send startup messages to the
> cluster. Cluster members rejected all of its attempts to contact them but
> were also unsuccessful in getting the rogue process to shut down.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)