On Sep 24, 2008, at 3:56 PM, Bruno Voigt wrote:

Hi Andrew,
thanks a lot.
I just reread the PDF and, using xsltproc, got my test resource
hbtest1a configured.
But the CRM tells me that it could not be started on either of my two
nodes (running Xen+DRBD).
(Under the Debian heartbeat packages it worked as expected.)

The RA is reporting that the start action failed - which has very little to do with Pacemaker.

On the other hand, I imagine that Xen might need more than one parameter:

       <instance_attributes id="instance_attributes.hbtest1a">
         <nvpair id="hbtest1a.ia03" name="monitor_scripts"
value="/opt/gucky/monitor/monitor_ssh"/>
       </instance_attributes>
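For comparison, the xmfile parameter from the earlier heartbeat configuration is missing here, and the Xen RA cannot start a domain without it. A sketch of what the section would presumably need (the nvpair id "hbtest1a.ia02" is made up; the values are taken from the old configuration):

```xml
<instance_attributes id="instance_attributes.hbtest1a">
  <nvpair id="hbtest1a.ia02" name="xmfile"
          value="/etc/xen/hbtest1a.cfg"/>
  <nvpair id="hbtest1a.ia03" name="monitor_scripts"
          value="/opt/gucky/monitor/monitor_ssh"/>
</instance_attributes>
```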


Try adding some debug to the Xen RA to figure out why the start action fails.
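One way to do that is to run the RA by hand with the same environment the LRM would give it, and trace it with `sh -x` to see exactly which check makes the start action fail. A sketch (the RA path and OCF_ROOT are the usual locations, adjust for your install; the parameter values are from your old configuration):

```shell
#!/bin/sh
# Invoke the Xen RA directly, outside the cluster, to debug the start action.
RA=/usr/lib/ocf/resource.d/heartbeat/Xen
export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_xmfile=/etc/xen/hbtest1a.cfg
export OCF_RESKEY_monitor_scripts=/opt/gucky/monitor/monitor_ssh

if [ -x "$RA" ]; then
  # 'sh -x' prints every command the agent runs, so the failing test
  # (missing binary, unreadable xmfile, ...) is visible in the trace.
  sh -x "$RA" start
  echo "start returned rc=$?"
else
  echo "RA not found at $RA"
fi
```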



It tells me:

Sep 24 15:20:55 xen20a crm_verify: [7184]: info: main: =#=#=#=#= Getting
XML =#=#=#=#=
Sep 24 15:20:55 xen20a crm_verify: [7184]: info: main: Reading XML from:
live cluster
Sep 24 15:20:55 xen20a crm_verify: [7184]: notice: main: Required
feature set: 2.0
Sep 24 15:20:55 xen20a crm_verify: [7184]: WARN: main: Your
configuration was internally updated to the latest version (pacemaker-1.0)
Sep 24 15:20:55 xen20a crm_verify: [7184]: WARN: unpack_resources: No
STONITH resources have been defined
Sep 24 15:20:55 xen20a crm_verify: [7184]: info:
determine_online_status: Node xen20b.test.mytld.com is online
Sep 24 15:20:55 xen20a crm_verify: [7184]: info: unpack_rsc_op:
Remapping hbtest1a_start_0 (rc=5) on xen20b.test.mytld.com to an ERROR
(expected 0)
Sep 24 15:20:55 xen20a crm_verify: [7184]: ERROR: unpack_rsc_op: Hard
error - hbtest1a_start_0 failed with rc=5: Preventing hbtest1a from
re-starting on xen20b.test.mytld.com
Sep 24 15:20:55 xen20a crm_verify: [7184]: WARN: unpack_rsc_op:
Processing failed op hbtest1a_start_0 on xen20b.test.mytld.com: Error
Sep 24 15:20:55 xen20a crm_verify: [7184]: WARN: unpack_rsc_op:
Compatability handling for failed op hbtest1a_start_0 on
xen20b.test.mytld.com
Sep 24 15:20:55 xen20a crm_verify: [7184]: info:
determine_online_status: Node xen20a.test.mytld.com is online
Sep 24 15:20:55 xen20a crm_verify: [7184]: info: unpack_rsc_op:
Remapping hbtest1a_start_0 (rc=5) on xen20a.test.mytld.com to an ERROR
(expected 0)
Sep 24 15:20:55 xen20a crm_verify: [7184]: ERROR: unpack_rsc_op: Hard
error - hbtest1a_start_0 failed with rc=5: Preventing hbtest1a from
re-starting on xen20a.test.mytld.com
Sep 24 15:20:55 xen20a crm_verify: [7184]: WARN: unpack_rsc_op:
Processing failed op hbtest1a_start_0 on xen20a.test.mytld.com: Error
Sep 24 15:20:55 xen20a crm_verify: [7184]: WARN: unpack_rsc_op:
Compatability handling for failed op hbtest1a_start_0 on
xen20a.test.mytld.com
Sep 24 15:20:55 xen20a crm_verify: [7184]: info: get_failcount: hbtest1a
has failed 1000000 times on xen20b.test.mytld.com
Sep 24 15:20:55 xen20a crm_verify: [7184]: info: get_failcount: hbtest1a
has failed 1000000 times on xen20a.test.mytld.com
Sep 24 15:20:55 xen20a crm_verify: [7184]: WARN: native_color: Resource
hbtest1a cannot run anywhere

Perhaps you could give me a hint on how to find out
why the resource could not be started.
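The rc=5 in the log is itself a hint: in the OCF return-code convention it is OCF_ERR_INSTALLED, i.e. the agent decided that something it needs is not present on the node, typically a missing binary or a missing required parameter. A small decoder for the codes seen in this log:

```shell
#!/bin/sh
# Map OCF resource-agent exit codes to their names (per the OCF RA API).
decode_ocf_rc() {
  case "$1" in
    0) echo "OCF_SUCCESS" ;;
    1) echo "OCF_ERR_GENERIC" ;;
    2) echo "OCF_ERR_ARGS" ;;
    3) echo "OCF_ERR_UNIMPLEMENTED" ;;
    4) echo "OCF_ERR_PERM" ;;
    5) echo "OCF_ERR_INSTALLED" ;;
    6) echo "OCF_ERR_CONFIGURED" ;;
    7) echo "OCF_NOT_RUNNING" ;;
    *) echo "unknown rc=$1" ;;
  esac
}

decode_ocf_rc 5   # the failed start above -> OCF_ERR_INSTALLED
decode_ocf_rc 7   # the initial probe     -> OCF_NOT_RUNNING (expected)
```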

Thanks a lot,
Bruno

cibadmin -Q yields:

<cib epoch="36" num_updates="7" admin_epoch="0"
validate-with="pacemaker-1.0" have-quorum="1" crm_feature_set="3.0"
dc-uuid="278bebc6-2a59-4fa9-be2f-f6e262ce8936">
 <configuration>
   <crm_config>
     <cluster_property_set id="cib-bootstrap-options">
       <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
value="1.0.0-rc1-node: 9b1e9d2785edf8eadd60e1e89f0ecacf67b2bc8c"/>
       <nvpair id="cib-bootstrap-options-last-lrm-refresh"
name="last-lrm-refresh" value="1222261375"/>
     </cluster_property_set>
   </crm_config>
   <nodes>
     <node id="278bebc6-2a59-4fa9-be2f-f6e262ce8936"
uname="xen20b.test.mytld.com" type="normal"/>
     <node id="d06bdc02-b705-4bf9-9cd4-6f827ffdfe2e"
uname="xen20a.test.mytld.com" type="normal"/>
   </nodes>
   <resources>
<primitive id="hbtest1a" class="ocf" type="Xen" provider="heartbeat">
       <operations>
<op name="start" interval="0" id="hbtest1a-op01" timeout="60s"
start-delay="0"/>
<op name="stop" interval="0" id="hbtest1a-op02" timeout="300s"/>
         <op name="monitor" interval="30s" id="hbtest1a-op03"
timeout="60s" start-delay="300s" requires="nothing">
<instance_attributes id="instance_attributes.hbtest1a.op03">
             <nvpair id="hbtest1a-op03-inst-01" name="check_xyz"
value="something"/>
           </instance_attributes>
         </op>
       </operations>
       <instance_attributes id="instance_attributes.hbtest1a">
         <nvpair id="hbtest1a.ia03" name="monitor_scripts"
value="/opt/gucky/monitor/monitor_ssh"/>
       </instance_attributes>
       <meta_attributes id="meta_attributes.hbtest1a">
         <nvpair id="hbtest1a.me01" name="is-managed" value="true"/>
<nvpair id="hbtest1a.me02" name="allow-migrate" value="true"/> <nvpair id="hbtest1a.me03" name="target-role" value="started"/>
       </meta_attributes>
       <instance_attributes id="hbtest1a-instance_attributes"/>
     </primitive>
   </resources>
   <constraints/>
 </configuration>
 <status>
   <node_state id="278bebc6-2a59-4fa9-be2f-f6e262ce8936"
uname="xen20b.test.mytld.com" ha="active" in_ccm="true" crmd="online"
join="member" expected="member" crm-debug-origin="do_update_resource"
shutdown="0">
     <lrm id="278bebc6-2a59-4fa9-be2f-f6e262ce8936">
       <lrm_resources>
         <lrm_resource id="hbtest1a" type="Xen" class="ocf"
provider="heartbeat">
           <lrm_rsc_op id="hbtest1a_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource" crm_feature_set="3.0"
transition-key="4:38:7:41183756-ab7d-45c5-9114-f6af4d0ab52f"
transition-magic="0:7;4:38:7:41183756-ab7d-45c5-9114-f6af4d0ab52f"
call-id="13" rc-code="7" op-status="0" interval="0"
last-run="1222261375" last-rc-change="1222261375" exec-time="400"
queue-time="0" op-digest="49a8ad7917b43502c13560299454cb0c"/>
           <lrm_rsc_op id="hbtest1a_start_0" operation="start"
crm-debug-origin="do_update_resource" crm_feature_set="3.0"
transition-key="6:38:0:41183756-ab7d-45c5-9114-f6af4d0ab52f"
transition-magic="0:5;6:38:0:41183756-ab7d-45c5-9114-f6af4d0ab52f"
call-id="14" rc-code="5" op-status="0" interval="0"
last-run="1222261375" last-rc-change="1222261375" exec-time="410"
queue-time="0" op-digest="61f0950751fea1b5f7a5c22a0923c546"/>
           <lrm_rsc_op id="hbtest1a_stop_0" operation="stop"
crm-debug-origin="do_update_resource" crm_feature_set="3.0"
transition-key="1:39:0:41183756-ab7d-45c5-9114-f6af4d0ab52f"
transition-magic="0:0;1:39:0:41183756-ab7d-45c5-9114-f6af4d0ab52f"
call-id="15" rc-code="0" op-status="0" interval="0"
last-run="1222261376" last-rc-change="1222261376" exec-time="460"
queue-time="0" op-digest="b5aefbd8964a51dea7e69fac50cd2db7"/>
         </lrm_resource>
       </lrm_resources>
     </lrm>
     <transient_attributes id="278bebc6-2a59-4fa9-be2f-f6e262ce8936">
       <instance_attributes
id="status-278bebc6-2a59-4fa9-be2f-f6e262ce8936">
         <nvpair
id="status-278bebc6-2a59-4fa9-be2f-f6e262ce8936-probe_complete"
name="probe_complete" value="true"/>
         <nvpair
id="status-278bebc6-2a59-4fa9-be2f-f6e262ce8936-last-failure-hbtest1a"
name="last-failure-hbtest1a" value="1222261376"/>
         <nvpair
id="status-278bebc6-2a59-4fa9-be2f-f6e262ce8936-fail-count-hbtest1a"
name="fail-count-hbtest1a" value="INFINITY"/>
       </instance_attributes>
     </transient_attributes>
   </node_state>
   <node_state id="d06bdc02-b705-4bf9-9cd4-6f827ffdfe2e"
uname="xen20a.test.mytld.com" ha="active" in_ccm="true" crmd="online"
join="member" expected="member" crm-debug-origin="do_update_resource"
shutdown="0">
     <lrm id="d06bdc02-b705-4bf9-9cd4-6f827ffdfe2e">
       <lrm_resources>
         <lrm_resource id="hbtest1a" type="Xen" class="ocf"
provider="heartbeat">
           <lrm_rsc_op id="hbtest1a_monitor_0" operation="monitor"
crm-debug-origin="do_update_resource" crm_feature_set="3.0"
transition-key="5:35:7:41183756-ab7d-45c5-9114-f6af4d0ab52f"
transition-magic="0:7;5:35:7:41183756-ab7d-45c5-9114-f6af4d0ab52f"
call-id="13" rc-code="7" op-status="0" interval="0"
last-run="1222261368" last-rc-change="1222261368" exec-time="410"
queue-time="0" op-digest="49a8ad7917b43502c13560299454cb0c"/>
           <lrm_rsc_op id="hbtest1a_start_0" operation="start"
crm-debug-origin="do_update_resource" crm_feature_set="3.0"
transition-key="5:36:0:41183756-ab7d-45c5-9114-f6af4d0ab52f"
transition-magic="0:5;5:36:0:41183756-ab7d-45c5-9114-f6af4d0ab52f"
call-id="14" rc-code="5" op-status="0" interval="0"
last-run="1222261369" last-rc-change="1222261369" exec-time="410"
queue-time="0" op-digest="61f0950751fea1b5f7a5c22a0923c546"/>
           <lrm_rsc_op id="hbtest1a_stop_0" operation="stop"
crm-debug-origin="do_update_resource" crm_feature_set="3.0"
transition-key="1:37:0:41183756-ab7d-45c5-9114-f6af4d0ab52f"
transition-magic="0:0;1:37:0:41183756-ab7d-45c5-9114-f6af4d0ab52f"
call-id="15" rc-code="0" op-status="0" interval="0"
last-run="1222261371" last-rc-change="1222261371" exec-time="490"
queue-time="0" op-digest="b5aefbd8964a51dea7e69fac50cd2db7"/>
         </lrm_resource>
       </lrm_resources>
     </lrm>
     <transient_attributes id="d06bdc02-b705-4bf9-9cd4-6f827ffdfe2e">
       <instance_attributes
id="status-d06bdc02-b705-4bf9-9cd4-6f827ffdfe2e">
         <nvpair
id="status-d06bdc02-b705-4bf9-9cd4-6f827ffdfe2e-probe_complete"
name="probe_complete" value="true"/>
         <nvpair
id="status-d06bdc02-b705-4bf9-9cd4-6f827ffdfe2e-last-failure-hbtest1a"
name="last-failure-hbtest1a" value="1222261371"/>
         <nvpair
id="status-d06bdc02-b705-4bf9-9cd4-6f827ffdfe2e-fail-count-hbtest1a"
name="fail-count-hbtest1a" value="INFINITY"/>
       </instance_attributes>
     </transient_attributes>
   </node_state>
 </status>
</cib>
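Note also the fail-count-hbtest1a value of INFINITY in the status section on both nodes: even after the underlying problem is fixed, the PE will not retry the resource until the failures are cleared. A sketch, assuming crm_resource's cleanup option is available in this build (run it on a cluster node):

```shell
#!/bin/sh
# Clear hbtest1a's failure history so the cluster will try it again.
if command -v crm_resource >/dev/null 2>&1; then
  crm_resource --resource hbtest1a --cleanup
else
  echo "crm_resource not found (not on a cluster node)"
fi
```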



Andrew Beekhof wrote:

On Sep 24, 2008, at 12:09 PM, Bruno Voigt wrote:

Hi,
I've just installed pacemaker unstable from
http://download.opensuse.org/repositories/server:/ha-clustering:/UNSTABLE/Debian_Etch/amd64/

replacing a previous configuration of debian heartbeat.

The CIB contains only the node configuration.

If I try to add a resource definition I get the following error,
and I have not yet been able to figure out why it is rejected.

root# cibadmin -V -C -o resources -x ./test-resource.xml
cibadmin[25909]: 2008/09/24_12:06:40 info: main: Starting mainloop
cibadmin[25909]: 2008/09/24_12:06:41 WARN: cibadmin_op_callback: Call
cib_create failed (-47): Update does not conform to the configured
schema/DTD
Call cib_create failed (-47): Update does not conform to the configured
schema/DTD
<null>


test-resource.xml contains:

<resources>
<primitive id="hbtest1a" class="ocf" type="Xen" provider="heartbeat">
<operations>
 <op id="hbtest1a-op01" name="start" timeout="60s" start_delay="0"/>

start_delay -> start-delay


 <op id="hbtest1a-op02" name="stop" timeout="300s"/>
 <op id="hbtest1a-op03" name="monitor" interval="30s" timeout="60s"
start_delay="300s" prereq="nothing"/>
</operations>


prereq -> requires



<instance_attributes id="hbtest1a">
<attributes>

the attributes scaffolding is no longer used


 <nvpair id="hbtest1a-attr01" name="xmfile"
value="/etc/xen/hbtest1a.cfg"/>
</attributes>
</instance_attributes>

<meta_attributes id="hbtest1a-meta01">
<attributes>
 <nvpair id="hbtest1a-meta-attr01" name="is_managed" value="true"/>
<nvpair id="hbtest1a-meta-attr02" name="allow_migrate" value="true"/> <nvpair id="hbtest1a-meta-attr03" name="target_role" value="stopped"/>

_ -> - for all options


</attributes>
</meta_attributes>

</primitive>
</resources>

What should a similar resource definition look like for Pacemaker?

http://clusterlabs.org/mw/Image:Configuration_Explained_1.0.pdf

specifically, you might want to read the bit on upgrading an old
configuration



TIA for any hints,
Bruno
--
[EMAIL PROTECTED]


_______________________________________________
Pacemaker mailing list
[email protected]
http://list.clusterlabs.org/mailman/listinfo/pacemaker