>>> Ryan Thomas <[email protected]> schrieb am 21.08.2018 um 17:38 in Nachricht <cae_qajk4gnnablwa-wep-x_4nu640y8dhbp3d3qcz_rdqc7...@mail.gmail.com>: > You could accomplish this be creating a custom RA which normally acts as a > pass-through and calls the "real" RA. However, it intercepts "monitor" > actions, checks nfs, and if nfs is down it returns success, otherwise it > passes though the monitor action to the real RA. If nfs fails the monitor > action is in-flight, the customer RA can intercept the failure, check if > nfs is down, and if so change the failure to a success.
Hi!

This sounds like an interesting approach, but I wonder how to avoid a
monitoring timeout, i.e. what value to return when NFS is down. I'm missing a
return value like
CANNOT_CHECK_AT_THE_MOMENT_SO_PLEASE_ASSUME_RESOURCE_STILL_HAS_ITS_LAST_STATE
;-) Unless I can return such a value, the wrapper RA will have to wait
(possibly causing a timeout). OK, the wrapper RA could cache its last return
value and reuse that when NFS is down (a rough sketch of that follows below
the quoted message).

Regards,
Ulrich

>
> On Mon, Aug 20, 2018 at 3:51 AM Ulrich Windl <
> [email protected]> wrote:
>
>> Hi!
>>
>> I wonder whether it's possible to run a monitoring op only if some
>> specific resource is up.
>> Background: We have some resource that runs fine without NFS, but the
>> start, stop and monitor operations will just hang if NFS is down. In
>> effect the monitor operation will time out, the cluster will try to
>> recover, calling the stop operation, which in turn will time out, making
>> things worse (i.e. causing a node fence).
>>
>> So my idea was to pause the monitoring operation while NFS is down (NFS
>> itself is controlled by the cluster and should recover "rather soon" TM).
>>
>> Is that possible?
>> And before you ask: No, I have not written the RA that has the problem; a
>> multi-million-dollar company wrote it. (Years before, I had written a
>> monitor for HP-UX's cluster that did not have this problem, even though
>> the configuration files were read from NFS. It's not magic: just
>> periodically copy them to shared memory, and read the config from shared
>> memory.)
>>
>> Regards,
>> Ulrich
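
For illustration, the caching idea mentioned above might look roughly like
this (again, the paths, the timeout and the state file are assumptions):

#!/bin/sh
# Wrapper variant: while NFS is down, replay the last result obtained from
# the real RA instead of unconditionally reporting success.

REAL_RA="/usr/lib/ocf/resource.d/vendor/realRA"
NFS_DIR="/import"
STATE_FILE="/run/realRA-last-monitor-rc"   # cached result, cleared on reboot

nfs_is_up() {
    timeout 5 ls "$NFS_DIR" >/dev/null 2>&1
}

wrapped_monitor() {
    if nfs_is_up; then
        "$REAL_RA" monitor
        rc=$?
        echo "$rc" > "$STATE_FILE"    # remember the last genuine monitor result
        return $rc
    fi
    # NFS is down: reuse the cached result; fall back to success if none exists yet
    [ -r "$STATE_FILE" ] && return "$(cat "$STATE_FILE")"
    return 0
}

if [ "$1" = "monitor" ]; then
    wrapped_monitor
    exit $?
fi
exec "$REAL_RA" "$@"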
