1.1.6 is really too old in any case; rc=5 'not installed' means we can't find an init script of that name in /etc/init.d
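To expand on that: an lsb:<name> resource maps to /etc/init.d/<name>, and the monitor_0 probe that Pacemaker runs on every node returns rc=5 when that script is absent there, so each agent's init script has to exist (and be LSB-compliant) on all three nodes, not only the node that runs the resource. A minimal sketch of a presence check to run on each node (the agent name is taken from the quoted output and is purely illustrative):

```shell
# Pacemaker resolves lsb:<name> to /etc/init.d/<name>; a monitor_0
# probe on a node where this file is missing reports rc=5 (not installed).
agent="f5-lbaas-agent-10.6.143.121"
script="/etc/init.d/$agent"

if [ -x "$script" ]; then
    echo "$script present on $(hostname)"
else
    echo "$script missing on $(hostname): probes here will report rc=5"
fi
```

Running that on every node before touching the CIB shows immediately which probes are going to fail.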
On 2 Jul 2014, at 2:07 pm, Vijay B <[email protected]> wrote:

> Hi,
>
> I'm puppetizing resource deployment for pacemaker and corosync, and as part
> of it, am creating a resource on one of three nodes of a cluster. The
> problem is that I'm seeing RecurringOp errors during resource creation,
> which are probably preventing failover of a resource. The resource creation
> seems to go through fine, but these RecurringOp errors always result after
> resource creation (I'm pasting outputs of two different commands below):
>
> ***************************
> vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$ sudo crm status
> ============
> Last updated: Wed Jul 2 03:52:30 2014
> Last change: Wed Jul 2 03:38:20 2014 via cibadmin on precise64b
> Stack: cman
> Current DC: precise64b - partition with quorum
> Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
> 3 Nodes configured, unknown expected votes
> 3 Resources configured.
> ============
>
> Online: [ precise64b precise64c precise64a ]
>
> f5-lbaas-agent-10.6.143.121_resource (lsb:f5-lbaas-agent-10.6.143.121): Started precise64c
> f5-lbaas-agent-10.6.143.122_resource (lsb:f5-lbaas-agent-10.6.143.122): Started precise64b
> f5-lbaas-agent-10.6.143.123_resource (lsb:f5-lbaas-agent-10.6.143.123): Started precise64b
>
> Failed actions:
> f5-lbaas-agent-10.6.143.120_resource_monitor_0 (node=precise64b, call=2, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.121_resource_monitor_0 (node=precise64b, call=3, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.122_resource_monitor_0 (node=precise64c, call=7, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.123_resource_monitor_0 (node=precise64c, call=8, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.120_resource_monitor_0 (node=precise64a, call=2, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.121_resource_monitor_0 (node=precise64a, call=3, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.122_resource_monitor_0 (node=precise64a, call=4, rc=5, status=complete): not installed
> f5-lbaas-agent-10.6.143.123_resource_monitor_0 (node=precise64a, call=5, rc=5, status=complete): not installed
> vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$
>
> ***************************
>
> vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$ sudo crm_verify -L -V
> crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.121_resource-start-10 wth name: 'start'
> crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.121_resource-stop-10 wth name: 'stop'
> crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.122_resource-start-10 wth name: 'start'
> crm_verify[15183]: 2014/07/02_03:39:13 ERROR: RecurringOp: Invalid recurring action f5-lbaas-agent-10.6.143.122_resource-stop-10 wth name: 'stop'
> Errors found during check: config not valid
> vagrant@precise64b:/vagrant/puppet-environments/modules/f5_lbaas/tests$
> ***************************
>
> What do these errors signify? I found one email exchange on a pacemaker ML
> that suggested that we shouldn't be using start intervals and timeouts, and
> the same with stop, since that would mean that pacemaker would attempt to
> restart the resource every x seconds, time out after y seconds, and repeat.
> (Link: http://lists.linbit.com/pipermail/drbd-user/2011-September/016938.html)
>
> My understanding was that the start interval would apply in case of restart
> attempts upon detection of a resource as being down. Nevertheless, I removed
> these parameters and created a third resource (the first two I created with
> these parameters), and I still see the same monitor-related errors for the
> third resource (f5-lbaas-agent-10.6.143.123_resource_monitor_0) in the sudo
> crm status command output. I don't, however, understand why this resource
> doesn't show up in the crm_verify -L -V output.
>
> Here are the two CLIs I use to create the resources:
>
> sudo crm configure primitive $pmk_res_name $pmk_cont_type:$service_name op
> monitor interval="$mon_interval" timeout="$mon_timeout" op start
> interval="$start_interval" timeout="$start_timeout" op stop
> interval="$stop_interval" timeout="$stop_timeout"
>
> sudo crm configure primitive $pmk_res_name $pmk_cont_type:$service_name op
> monitor interval="$mon_interval" timeout="$mon_timeout"
>
> The bottom line is that if I halt the VM running any of these resources, the
> resource isn't failing over to another VM. I'm not sure what the exact cause
> is - any help would be greatly appreciated!
>
> Thanks,
> Regards,
> Vijay
> _______________________________________________
> Pacemaker mailing list: [email protected]
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
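As for the RecurringOp errors: Pacemaker treats any operation with a non-zero interval as a recurring action, and only monitor operations may recur, which is why crm_verify rejects the start and stop ops that carry interval="$start_interval" and interval="$stop_interval". A sketch of the corrected command, reusing the variable names from the quoted message and setting interval="0" on start and stop so only the timeouts remain in effect (an assumption about the intended fix, not a tested configuration):

```shell
# Only "op monitor" may have a non-zero interval; start/stop get
# interval="0" so they are not treated as recurring actions.
sudo crm configure primitive $pmk_res_name $pmk_cont_type:$service_name \
  op monitor interval="$mon_interval" timeout="$mon_timeout" \
  op start interval="0" timeout="$start_timeout" \
  op stop interval="0" timeout="$stop_timeout"
```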
