[ 
https://issues.apache.org/jira/browse/GEODE-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce Schuchardt updated GEODE-5546:
------------------------------------
    Affects Version/s: 1.6.0

> auto-reconnecting member reuses old address including vmViewId
> --------------------------------------------------------------
>
>                 Key: GEODE-5546
>                 URL: https://issues.apache.org/jira/browse/GEODE-5546
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>    Affects Versions: 1.6.0
>            Reporter: Bruce Schuchardt
>            Assignee: Bruce Schuchardt
>            Priority: Major
>
> During network-down testing I found that if I restore the network immediately 
> after all "losing side" servers go into auto-reconnect, they sometimes 
> receive a view-preparation message from the surviving cluster that holds 
> their old membership ID.  They use this ID instead of waiting for a valid new 
> ID and end up being shut down as rogue processes.
> For instance, this process had an identifier with view ID <v3> before it 
> went into auto-reconnect.  When it tried to rejoin, it ended up using that 
> same identifier because it received a view-preparation message holding it:
> [info 2018/07/28 22:17:14.588 PDT 
> gemfire1_rs-FullRegression29040205a1i3xlarge-hydra-client-18_15643 
> <ReconnectThread> tid=0x2d2] Attempting to join the distributed system 
> through coordinator 
> 10.32.110.93(gemfire6_rs-FullRegression29040205a1i3xlarge-hydra-client-50_13624:13624:locator)<ec><v1>:1024
>  using address 
> 10.32.108.125(gemfire1_rs-FullRegression29040205a1i3xlarge-hydra-client-18_15643:15643)<v3>:1026
> In this run it then hung while trying to send startup messages to the 
> cluster.  Cluster members rejected all of its attempts to contact them but 
> were also unable to get the rogue process to shut down.
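
The failure described above suggests a guard on the joining side: an ID offered in a view-preparation message should be ignored if it is the same one the member held before it went into auto-reconnect. The following is only a minimal sketch of that idea, not Geode's actual fix or API. The names StaleIdGuard, MemberId and shouldAdopt are hypothetical, and the model of an ID as host, port and vmViewId is a simplifying assumption.

```java
public class StaleIdGuard {

    /** Hypothetical, simplified stand-in for a membership ID (not a Geode class). */
    static final class MemberId {
        final String host;
        final int port;
        final int vmViewId; // a negative value means "no view ID assigned yet"

        MemberId(String host, int port, int vmViewId) {
            this.host = host;
            this.port = port;
            this.vmViewId = vmViewId;
        }

        boolean sameAddress(MemberId other) {
            return host.equals(other.host) && port == other.port;
        }
    }

    /**
     * Decides whether a reconnecting member may adopt the ID offered to it in a
     * view-preparation message. The offer is rejected when it carries no view ID,
     * or when it matches the pre-reconnect address and its view ID is not newer
     * than the one the member held before reconnecting (assumed rule).
     */
    static boolean shouldAdopt(MemberId preReconnectId, MemberId offered) {
        if (offered.vmViewId < 0) {
            return false; // nothing assigned yet, keep waiting
        }
        if (preReconnectId != null
                && offered.sameAddress(preReconnectId)
                && offered.vmViewId <= preReconnectId.vmViewId) {
            return false; // stale ID from the membership the process already left
        }
        return true;
    }

    public static void main(String[] args) {
        // Address and view ID taken from the log line in the report.
        MemberId old = new MemberId("10.32.108.125", 1026, 3);

        // Same address, same <v3>: stale, so the member should keep waiting.
        System.out.println(shouldAdopt(old, new MemberId("10.32.108.125", 1026, 3))); // false

        // Same address with a newer view ID (7 is an illustrative value): acceptable.
        System.out.println(shouldAdopt(old, new MemberId("10.32.108.125", 1026, 7))); // true
    }
}
```

In this sketch a rejected offer simply means the member keeps waiting for a later message that carries a freshly assigned ID, which is the behavior the report says was missing.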


