Hi Dejan,
I have reconsidered and concluded that I agree with you.
To realize this function, it should, in the end, be implemented in lrmd.
A new attribute in cib.xml is then necessary.
I would like to discuss this.

You wrote before:
> > The b) part has already been discussed on the list and it's
> > supposed to be implemented in lrmd. I still don't have the API
> > defined, but thought about something like
> >
> > max-total-failures (how many times a monitor may fail)
> > max-consecutive-failures (how many times in a row a monitor may fail)
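To make sure we mean the same thing, this is how I understand the two
limits. This is just a sketch, not an actual lrmd API; monitor_once,
INTERVAL, MAX_TOTAL and MAX_CONSEC are all hypothetical:

    # Sketch: behaviour of the two proposed limits.
    total=0; consec=0
    while sleep "$INTERVAL"; do
        if monitor_once; then          # the real monitor action
            consec=0                   # one success resets the consecutive count
        else
            total=$((total + 1))       # the total count never resets
            consec=$((consec + 1))
        fi
        [ "$consec" -ge "$MAX_CONSEC" ] && break   # max-consecutive-failures
        [ "$total" -ge "$MAX_TOTAL" ] && break     # max-total-failures
    done

Because the total count never resets, max-total-failures would sooner
or later be exceeded on any long-running resource unless old failures
expire.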
I think that only the second one is enough.
The first one requires the time dimension, as you wrote,
and it would confuse users to add so many new and complex settings.
The second one is simpler, and it is sufficient to achieve the purpose,
which is to avoid an unnecessary suspension of services
when a sudden high load occurs.
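For example, the new attribute could perhaps be set on the monitor op
like this. The syntax and placement are hypothetical; only the
attribute name follows your earlier proposal:

    <op id="rsc1-monitor" name="monitor" interval="10s" timeout="30s">
      <instance_attributes id="rsc1-monitor-attrs">
        <attributes>
          <nvpair id="rsc1-monitor-mcf" name="max-consecutive-failures" value="3"/>
        </attributes>
      </instance_attributes>
    </op>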
I would like to hear any opinions about this.
Regards,
Satomi Taniguchi
Dejan Muhamedagic wrote:
> Hi Keisuke-san,
> On Fri, Jun 27, 2008 at 09:19:33PM +0900, Keisuke MORI wrote:
> > Hi,
> > Dejan Muhamedagic <[EMAIL PROTECTED]> writes:
> > > On Tue, Jun 24, 2008 at 04:02:06PM +0200, Lars Marowsky-Bree wrote:
> > > > On 2008-06-24T15:48:12, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> > > > > > But precisely, we have two scenarios to configure for:
> > > > > > a) monitor NG -> stop -> start on the same node
> > > > > >    -> monitor NG (Nth time) -> stop -> failover to another node
> > > > > > b) monitor NG -> monitor NG (N times) -> stop -> failover to another node
> > > > > > The current pacemaker behaves as a), I think, but b) is also
> > > > > > useful when you want to ignore a transient error.
> > > > > The b) part has already been discussed on the list and it's
> > > > > supposed to be implemented in lrmd. I still don't have the API
> > > > > defined, but thought about something like
> > > > >
> > > > > max-total-failures (how many times a monitor may fail)
> > > > > max-consecutive-failures (how many times in a row a monitor may fail)
> > I also thought that it should be implemented in lrmd at first,
> > but now I think it would be better to handle it in crm.
> > If we implemented it in lrmd, there would be two kinds of
> > fail-counts in different modules (cib and lrmd), and users would
> > have to understand and use both the cib and lrmd tools depending
> > on the kind of failure, even though they serve very similar
> > purposes. I think that would be confusing for users.
> The fail-counts in lrmd will probably be available for
> inspection. And they would probably also expire after some time.
> What I suggested in the previous messages is actually missing
> the time dimension: there should be a maximum number of failures
> within a period.
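If I understand correctly, that would mean bookkeeping roughly like
the following sketch, where failure_times, WINDOW and MAX_FAILURES are
all hypothetical. This is the extra complexity I would prefer to avoid:

    # Sketch: count only the failures that fall inside the period.
    now=$(date +%s)
    recent=0
    for t in $failure_times; do        # one epoch timestamp per past failure
        [ $((now - t)) -le "$WINDOW" ] && recent=$((recent + 1))
    done
    [ "$recent" -ge "$MAX_FAILURES" ] && echo "too many failures in ${WINDOW}s"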
> > So I think that lrmd should always report failures as it does now,
> > and crm/cib should hold all the failure status and make the decision.
> Of course, it could be done like that as well, though that could
> make processing in crm much more complex.
> > > > > These should probably be attributes defined on the monitor
> > > > > operation level.
The "ignore failure reports" clashes a bit with the "react to failures
ASAP" requirement.
It is my belief that this should be handled by the RA, not in the LRM
nor the CRM. The monitor op implementation is the place to handle this.
> > Yes, it can be implemented in RAs, and that's what we've actually done.
> > But in that case, such RAs would each have a similar retry loop in
> > their scripts, and would have their own retry parameters for each RA type.
> > I think it's worth having a common way to handle this.
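For reference, the retry loop that each such RA ends up duplicating
looks more or less like this; check_health and
OCF_RESKEY_monitor_retries are hypothetical stand-ins:

    # Sketch of the per-RA retry pattern; check_health is the RA-specific
    # probe and OCF_RESKEY_monitor_retries a made-up RA parameter.
    monitor() {
        tries=0
        while ! check_health; do
            tries=$((tries + 1))
            if [ "$tries" -ge "${OCF_RESKEY_monitor_retries:-3}" ]; then
                return $OCF_ERR_GENERIC    # give up and report the failure
            fi
            sleep 1                        # short pause before retrying
        done
        return $OCF_SUCCESS
    }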
> Yes, I also think that having this handled in one place would be
> beneficial. The resource agents, though they should know best the
> resources they manage, may not always take all environment
> peculiarities into account. Then it is up to the user to
> decide whether they want to allow a monitor for the resource to fail
> now and then.
> > > > Beyond that, I strongly feel that "transient errors" are a bad
> > > > foundation to build clusters on.
> > > Of course, all that is right. However, there are some situations
> > > where we could bend the rules. I'm not sure what Keisuke-san had
> > > in mind, but for example one could be more forgiving when
> > > monitoring certain stonith resources.
> > One situation I have in mind is when a sudden high load occurs
> > for a very short time. The application may fail to respond to the
> > RA's monitor op while the load is very high, but if such a load
> > spike ceases shortly, we don't want to rush into a failover.
> These situations are tricky to handle. Such a high load may also
> be a sign that resources should indeed move elsewhere. Or it may
> even be considered a service disruption. Though there are most
> probably shops which would prefer not to do a failover in such
> cases. At any rate, this feature, if it gets implemented, would
> have to be used with utmost care.
> > Another case we've encountered was when we wrote an RA to check
> > some hardware. The hardware status check failed only rarely, with
> > very specific timing, and retrying the check was just fine.
> That's what I often observed with some stonith devices.
> Cheers,
> Dejan
> > Thanks,
> > --
> > Keisuke MORI
> > NTT DATA Intellilink Corporation
_______________________________________________
Pacemaker mailing list
[email protected]
http://list.clusterlabs.org/mailman/listinfo/pacemaker