[
https://issues.apache.org/jira/browse/GEODE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kirk Lund updated GEODE-2125:
-----------------------------
Issue Type: Bug (was: Improvement)
> GFSH cannot get information about Locators that go into reconnect mode
> ----------------------------------------------------------------------
>
> Key: GEODE-2125
> URL: https://issues.apache.org/jira/browse/GEODE-2125
> Project: Geode
> Issue Type: Bug
> Components: management
> Affects Versions: 1.0.0-incubating
> Reporter: Kirk Lund
> Assignee: Kirk Lund
> Attachments: locator_failure-logs.txt, thread_dump.txt
>
>
> If the Locator is started from GFSH and the cluster's only server is killed,
> network partition detection will initiate forceDisconnect in the Locator and
> leave it in reconnect mode. To the User it will appear that the Locator
> crashed and GFSH lost connection:
> {noformat}
> gfsh>
> No longer connected to 192.168.1.72[1099].
> {noformat}
> During the time in which the Locator is in reconnect mode, the User cannot
> connect via GFSH, nor can they issue status or stop commands against it:
> {noformat}
> $ cd locator1
> $ cat vf.gf.locator.pid
> 33959
> $ ps 33959
> PID TT STAT TIME COMMAND
> 33959 s001 S 0:19.97 /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Co
> {noformat}
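The manual check above (read the pid file, then confirm the process still exists) can be scripted. This is only a minimal sketch, assuming a POSIX shell and a pid file that holds a single process id as in the transcript; the function name locator_pid_alive is hypothetical and is not part of GFSH or Geode.

```shell
#!/bin/sh
# Hypothetical helper (not part of GFSH or Geode): reports whether the
# process recorded in a pid file is still alive.
# Returns 0 if alive, 1 if not, 2 if the pid file is missing or malformed.
locator_pid_alive() {
    pid_file="$1"
    # A missing or unreadable pid file means there is nothing to check.
    [ -r "$pid_file" ] || return 2
    pid=$(cat "$pid_file")
    # Reject anything that is not a plain decimal number.
    case "$pid" in
        ''|*[!0-9]*) return 2 ;;
    esac
    # kill -0 sends no signal; it only tests that the process exists
    # and that the caller is permitted to signal it.
    kill -0 "$pid" 2>/dev/null
}
```

For example, `locator_pid_alive locator1/vf.gf.locator.pid && echo "Locator process is still running"` would confirm that the JVM is alive even though GFSH cannot reach it.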
> In GFSH:
> {noformat}
> gfsh>connect --locator=localhost[10334]
> Connecting to Locator at [host=localhost, port=10334] ..
> Connection refused
> gfsh>status locator --pid=33959
> null
> gfsh>status locator --dir=locator1
> null
> gfsh>stop locator --dir=locator1
> Locator in /Users/klund/dev/geode/locator1 on null is currently not responding.
> gfsh>stop locator --pid=33959
> Locator in /Users/klund/dev/geode on null is currently not responding.
> {noformat}
> If a Locator has GFSH connected then it should notify GFSH that it is going
> to forceDisconnect and go into reconnect mode. Then GFSH can notify the User
> so the User is not surprised.
> In addition, GFSH status and stop commands should be modified to be able to
> talk to a Locator in reconnect mode. GFSH start could also be modified to
> report that the Locator is running in reconnect mode instead of reporting a
> hung process in the Locator's directory.
> Attachments:
> * The Locator log file is attached as locator_failure-logs.txt
> * The Locator thread dump (via jstack) AFTER it has shut down due to
> forceDisconnect is attached as thread_dump.txt
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)