Hi Andrew!
Andrew Beekhof wrote:
[snip]
No, I don't mean that.
Take on-fail=stop for example...
If we detect the resource failed, we stop it. (so far so good).
However, now that it's stopped, the failed operation is no longer
considered and we "forget" that the resource is supposed to _stay_ stopped.
Do you mean that a resource which was stopped because of a failed action
would, on the contrary, start again?
With on_fail="standby", it certainly seems that we "forget" the node is supposed
to stay in standby.
The status that crm_mon shows for the failed node changes like this:
"online" --[resource failed]--> "standby" --[resource restart(F/O)]--> "online".
Then I tried to reproduce the same behavior with on_fail="stop",
but I couldn't.
The resource stopped by on_fail="stop" stayed stopped even after I deleted its fail-count.
Only when I restarted the Heartbeat service on the failed node
did the resource finally restart (on another node)...
The solution is to check the "old" operations for this sort of condition.
It sounds like a large-scale modification...
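
To make sure I understand the idea of checking the "old" operations, here is a
rough sketch in Python. It is not Pacemaker code: OpRecord, must_stay_stopped
and must_stay_standby are hypothetical names, and the history list is only an
assumed model of a resource's recorded operations. It contrasts looking only at
the latest operation with scanning the whole history for a failure whose
on_fail setting should still apply.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OpRecord:
    """One entry in a resource's operation history (hypothetical model)."""
    name: str      # operation name, e.g. "monitor" or "stop"
    failed: bool   # whether this operation failed
    on_fail: str   # configured reaction, e.g. "stop", "standby", "restart"


def must_stay_stopped(history: List[OpRecord]) -> bool:
    """True if ANY recorded operation failed with on_fail="stop"."""
    return any(op.failed and op.on_fail == "stop" for op in history)


def must_stay_standby(history: List[OpRecord]) -> bool:
    """True if ANY recorded operation failed with on_fail="standby"."""
    return any(op.failed and op.on_fail == "standby" for op in history)


# A monitor failure followed by the successful stop it triggered.
history = [
    OpRecord("monitor", failed=True, on_fail="stop"),
    OpRecord("stop", failed=False, on_fail="ignore"),
]

# Looking only at the latest operation "forgets" the earlier failure.
latest = history[-1]
latest_only = latest.failed and latest.on_fail == "stop"

print(latest_only)                  # False
print(must_stay_stopped(history))   # True
```

In this model, cleaning up the failed entry from the history would be what lets
the resource start again; whether the real implementation should behave that
way is an assumption on my side.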
Regards,
Satomi TANIGUCHI
_______________________________________________
Pacemaker mailing list
[email protected]
http://list.clusterlabs.org/mailman/listinfo/pacemaker