1.1.6 is really too old in any case; rc=5 'not installed' means we can't find an init script of that name in /etc/init.d
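To expand on that: an lsb:<name> resource maps to /etc/init.d/<name>, and the monitor_0 probe that Pacemaker runs on every node returns rc=5 when that script is absent there, so each agent's init script has to exist (and be LSB-compliant) on all three nodes, not only the node that runs the resource. A minimal sketch of a presence check to run on each node (the agent name is taken from the quoted output and is purely illustrative):

```shell
# Pacemaker resolves lsb:<name> to /etc/init.d/<name>; a monitor_0
# probe on a node where this file is missing reports rc=5 (not installed).
agent="f5-lbaas-agent-10.6.143.121"
script="/etc/init.d/$agent"

if [ -x "$script" ]; then
    echo "$script present on $(hostname)"
else
    echo "$script missing on $(hostname): probes here will report rc=5"
fi
```

Running that on every node before touching the CIB shows immediately which probes are going to fail.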
On 2 Jul 2014, at 2:07 pm, Vijay B <[email protected]> wrote:

> Hi,
>
> I'm puppetizing resource deployment for pacemaker and corosync, and as part
> of it, am creating a resource on one of three nodes of a cluster. The
> problem is that I'm seeing RecurringOp errors during resource creation,
> which are probably preventing failover of a resource. The resource creation
> seems to go through fine, but these RecurringOp errors always result after
> resource creation (I'm pasting outputs of two different commands below):
>
> ***************************
> vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$ sudo crm status
> ============
> Last updated: Wed Jul 2 03:52:30 2014
> Last change: Wed Jul 2 03:38:20 2014 via cibadmin on precise64b
> Stack: cman
> Current DC: precise64b - partition with quorum
> Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
> 3 Nodes configured, unknown expected votes
> 3 Resources configured.
> ============
>
> Online: [ precise64b precise64c precise64a ]
>
> f5-lbaas-agent-10.6.143.121_resource (lsb:f5-lbaas-agent-10.6.143.121): Started precise64c
> f5-lbaas-agent-10.6.143.122_resource (lsb:f5-lbaas-agent-10.6.143.122): Started precise64b
> f5-lbaas-agent-10.6.143.123_resource (lsb:f5-lbaas-agent-10.6.143.123): Started precise64b
>
> Failed actions:
> f5-lbaas-agent-10.6.143.120_resource_monitor_0 (node=precise64b, call=2, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.121_resource_monitor_0 (node=precise64b, call=3, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.122_resource_monitor_0 (node=precise64c, call=7, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.123_resource_monitor_0 (node=precise64c, call=8, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.120_resource_monitor_0 (node=precise64a, call=2, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.121_resource_monitor_0 (node=precise64a, call=3, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.122_resource_monitor_0 (node=precise64a, call=4, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.123_resource_monitor_0 (node=precise64a, call=5, rc=5, status=complete): not installed
> vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$
>
> ***************************
>
> vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$ sudo crm_verify -L -V
> crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.121_resource-start-10 wth name: 'start'
> crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.121_resource-stop-10 wth name: 'stop'
> crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.122_resource-start-10 wth name: 'start'
> crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.122_resource-stop-10 wth name: 'stop'
> Errors found during check: config not valid
> vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$
> ***************************
>
> What do these errors signify? I found one email exchange on a pacemaker ML
> that suggested that we shouldn't be using start intervals and timeouts, and
> the same with stop, since that would mean that pacemaker would attempt to
> restart the resource every x seconds, time out after y seconds, and repeat.
> (Link: http://lists.linbit.com/pipermail/drbd-user/2011-September/016938.html)
>
> My understanding was that the start interval would apply in case of restart
> attempts upon detection of a resource as being down. Nevertheless, I removed
> these parameters and created a third resource (the first two I created with
> these parameters), and I still see the same monitor-related errors for the
> third resource (f5-lbaas-agent-10.6.143.123_resource_monitor_0) in the sudo
> crm status command output. I don't, however, understand why this resource
> doesn't show up in the crm_verify -L -V output.
>
> Here are the two CLIs I use to create the resources:
>
> sudo crm configure primitive $pmk_res_name $pmk_cont_type:$service_name op
> monitor interval="$mon_interval" timeout="$mon_timeout" op start
> interval="$start_interval" timeout="$start_timeout" op stop
> interval="$stop_interval" timeout="$stop_timeout"
>
> sudo crm configure primitive $pmk_res_name $pmk_cont_type:$service_name op
> monitor interval="$mon_interval" timeout="$mon_timeout"
>
> The bottom line is that if I halt the VM running any of these resources, the
> resource isn't failing over to another VM. I'm not sure what the exact cause
> is - any help would be greatly appreciated!
>
> Thanks,
> Regards,
> Vijay
> _______________________________________________
> Pacemaker mailing list: [email protected]
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
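As for the RecurringOp errors: Pacemaker treats any operation with a non-zero interval as a recurring action, and only monitor operations may recur, which is why crm_verify rejects the start and stop ops that carry interval="$start_interval" and interval="$stop_interval". A sketch of the corrected command, reusing the variable names from the quoted message and setting interval="0" on start and stop so only the timeouts remain in effect (an assumption about the intended fix, not a tested configuration):

```shell
# Only "op monitor" may have a non-zero interval; start/stop get
# interval="0" so they are not treated as recurring actions.
sudo crm configure primitive $pmk_res_name $pmk_cont_type:$service_name \
  op monitor interval="$mon_interval" timeout="$mon_timeout" \
  op start interval="0" timeout="$start_timeout" \
  op stop interval="0" timeout="$stop_timeout"
```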
