Re: [Pacemaker] failcount always resets at 15m mark regardless of cluster-recheck-interval

Andrew Beekhof Thu, 22 May 2014 03:32:01 -0700

On 22 May 2014, at 5:36 pm, David Nguyen <[email protected]> wrote:


> Hi all,
> 
> I'm having the following problem.  I have the following settings for testing 
> purposes:
> 
> migration-threshold=1
> failure-timeout=15s
> cluster-recheck-interval=30s
> 
> and verified those are in the running config via cibadmin --query

Can we see that output?
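
One thing worth checking in that output is which section each value ended up in. As a rough sketch (not a statement of how this particular cluster is configured): cluster-recheck-interval is a cluster property and normally sits under crm_config, while migration-threshold and failure-timeout are resource meta attributes and normally sit under rsc_defaults or under the resource's own meta_attributes. The XML below is a hypothetical excerpt, not David's real CIB; the ids and the helper name find_settings are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical CIB excerpt for illustration only; ids and layout are assumptions.
SAMPLE_CIB = """
<cib>
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="opt-recheck" name="cluster-recheck-interval" value="30s"/>
      </cluster_property_set>
    </crm_config>
    <rsc_defaults>
      <meta_attributes id="rsc-options">
        <nvpair id="rsc-mt" name="migration-threshold" value="1"/>
        <nvpair id="rsc-ft" name="failure-timeout" value="15s"/>
      </meta_attributes>
    </rsc_defaults>
  </configuration>
</cib>
"""

WANTED = ("migration-threshold", "failure-timeout", "cluster-recheck-interval")


def find_settings(cib_xml):
    """Return {name: [(top-level section, value), ...]} for each wanted nvpair."""
    root = ET.fromstring(cib_xml)
    config = root.find("configuration")
    if config is None:
        raise ValueError("no <configuration> element found")
    found = {name: [] for name in WANTED}
    # Walk each top-level section (crm_config, resources, rsc_defaults, ...)
    # and record where every matching nvpair lives.
    for section in config:
        for nvpair in section.iter("nvpair"):
            name = nvpair.get("name")
            if name in found:
                found[name].append((section.tag, nvpair.get("value")))
    return found


if __name__ == "__main__":
    for name, hits in find_settings(SAMPLE_CIB).items():
        print(name, hits or "NOT SET")
```

To run it against a real cluster, the saved output of cibadmin --query could be passed in place of SAMPLE_CIB. A setting that shows up in an unexpected section, or not at all, would be the first thing to look at.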

> 
> The issue is that even with failure-timeout and cluster-recheck-interval set, 
> I've noticed that failcount resets at the default value of 15 minutes.
> 
> The way I tested this was to force a resource failure on both nodes (2-node 
> cluster), then watch syslog and sure enough, the service rights itself after 
> the 15-minute mark.
> 
> May 22 00:09:22 sac-prod1-ops-web-09 crmd[16843]:   notice: 
> do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ 
> input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> 
> May 22 00:24:22 sac-prod1-ops-web-09 crmd[16843]:   notice: 
> do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ 
> input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
> 
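
The gap between those two log entries can be checked directly. The sketch below assumes the year is 2014 (syslog omits it; the year is taken from the date of this thread) and uses a hypothetical helper name, gap_seconds. The result is exactly 900 seconds, which matches Pacemaker's default cluster-recheck-interval of 15min and would be consistent with the 30s value not being in effect.

```python
from datetime import datetime

# Syslog timestamps carry no year; 2014 is assumed from the date of this thread.
ASSUMED_YEAR = "2014"
FMT = "%Y %b %d %H:%M:%S"


def gap_seconds(start, end):
    """Seconds between two syslog-style timestamps such as 'May 22 00:09:22'."""
    t0 = datetime.strptime(f"{ASSUMED_YEAR} {start}", FMT)
    t1 = datetime.strptime(f"{ASSUMED_YEAR} {end}", FMT)
    return int((t1 - t0).total_seconds())


if __name__ == "__main__":
    idle_at = "May 22 00:09:22"     # S_TRANSITION_ENGINE -> S_IDLE
    recheck_at = "May 22 00:24:22"  # S_IDLE -> S_POLICY_ENGINE (C_TIMER_POPPED)
    print(gap_seconds(idle_at, recheck_at))  # 900
```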
> 
> Any ideas what I'm doing wrong here?  I would like failcount to reset much 
> faster.
> 
> 
> My setup:
> 
> 2 node centos6.5
> pacemaker-1.1.10-14.el6_5.3.x86_64
> corosync-1.4.1-17.el6_5.1.x86_64
> 
> 
> _______________________________________________
> Pacemaker mailing list: [email protected]
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org


