[ https://issues.apache.org/jira/browse/GEODE-9906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534595#comment-17534595 ]
Hale Bales commented on GEODE-9906:
-----------------------------------

Hi [~mb1977], thank you for raising this issue. Given that it is on such an old version of Geode, I am going to close this ticket. If you can, please try to reproduce this issue on develop or the most recent release and reopen the issue if it is still a problem. Please reach out on the dev list if you have issues using the most recent version. Thanks!

> Unable to reconnect a node after SO patching "15 seconds have elapsed while waiting for replies"
> ------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-9906
>                 URL: https://issues.apache.org/jira/browse/GEODE-9906
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Marco Baldessari
>            Priority: Major
>
> I have a cluster consisting of 4 nodes in total, 3 servers and 1 management node, working properly.
> At the beginning of the month we planned to patch the OS, and we started from the first server node with this procedure:
> - Stop service
> - OS patching
> - Server restart
> - Start service
> The service on the first patched node, named "serverA", fails to restart with this error:
>
> Log entries cluster join:
> serverA:
> | INFO | region-dm-12 | ache.geode.internal.tcp.Connection | --> Connection: shared=true ordered=false failed to connect to peer 10.237.110.195( Server serverB:9993)<ec><v127>:1024 because: java.net.ConnectException: Connection timed out (Connection timed out)
> | WARN | region-dm-12 | ache.geode.internal.tcp.Connection | --> Connection: Attempting reconnect to peer 10.237.110.195( Server serverB:9993)<ec><v127>:1024
>
> ServerMgmt:
> | WARN | pool-3-thread-1 | tributed.internal.ReplyProcessor21 | --> 15 seconds have elapsed while waiting for replies: <CreateRegionProcessor$CreateRegionReplyProcessor 44180 waiting for 1 replies from [10.237.110.194( Server serverA:632)<ec><v174>:1024]> on 10.237.110.225( Management:6033)<ec><v111>:1024 whose current membership list is: [[10.237.110.196( Server serverC:16805)<ec><v136>:1024, 10.237.110.225( Management:6033)<ec><v111>:1024, 10.237.110.195( Server serverB:9993)<ec><v127>:1024, 10.237.110.194( Server serverA:632)<ec><v174>:1024]]
>
> The connection between the systems was verified with tcpdump; UDP 1024 is running fine.
>
> We have tried redeploying the service and made numerous attempts, but we always get the same error during startup.
> Any ideas? Thank you.
> Marco.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)