I set debug 1, but that does not reveal much more.
What else should I try?
Oct 14 14:16:38 n02asp7 attrd: [7855]: info: main: Starting up....
Oct 14 14:16:38 n02asp7 attrd: [7855]: ERROR: main: HA Signon failed
Oct 14 14:16:38 n02asp7 cib: [7852]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
Oct 14 14:16:38 n02asp7 attrd: [7855]: ERROR: main: Aborting startup
Oct 14 14:16:38 n02asp7 cib: [7852]: info: G_main_add_TriggerHandler:
Added signal manual handler
Oct 14 14:16:38 n02asp7 cib: [7852]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
Oct 14 14:16:38 n02asp7 cib: [7852]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.xml
(digest: /var/lib/heartbeat/crm/cib.xml.sig)
Oct 14 14:16:38 n02asp7 cib: [7852]: WARN: retrieveCib: Cluster
configuration not found: /var/lib/heartbeat/crm/cib.xml
Oct 14 14:16:38 n02asp7 cib: [7852]: WARN: readCibXmlFile: Primary
configuration corrupt or unusable, trying backup...
Oct 14 14:16:38 n02asp7 cib: [7852]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
/var/lib/heartbeat/crm/cib.xml.sig.last)
Oct 14 14:16:38 n02asp7 heartbeat: [7839]: WARN: Managed
/usr/lib64/heartbeat/attrd process 7855 exited with return code 100.
Oct 14 14:16:38 n02asp7 heartbeat: [7839]: ERROR: Client
/usr/lib64/heartbeat/attrd exited with return code 100.
Andrew Beekhof wrote:
On Oct 14, 2008, at 1:59 PM, Rainer Traut wrote:
Answering myself: this was obviously the wrong part of the log.
Here we go.
Current status: one node still has the older release.
Working node n01:
# rpm -qa|grep heartbeat
heartbeat-common-2.99.0-3.1
heartbeat-resources-2.99.0-3.1
heartbeat-2.99.0-3.1
heartbeat-ldirectord-2.99.0-3.1
pacemaker-heartbeat-0.6.6-18.1
[EMAIL PROTECTED] ~]# rpm -qa|grep pacemaker
pacemaker-heartbeat-0.6.6-18.1
pacemaker-pygui-1.4-7.2
Non-working node n02, including logs:
# rpm -qa|grep heartbeat
heartbeat-common-2.99.2-2.1
heartbeat-ldirectord-2.99.2-2.1
heartbeat-2.99.2-2.1
heartbeat-resources-2.99.2-2.1
libheartbeat2-2.99.2-2.1
[EMAIL PROTECTED] ~]# rpm -qa|grep pacema
pacemaker-1.0.0-1.2
libpacemaker3-1.0.0-1.2
pacemaker-pygui-1.4-8.1
Oct 14 13:49:23 n02asp7 attrd: [7901]: info: main: Starting up....
Oct 14 13:49:23 n02asp7 attrd: [7901]: ERROR: main: HA Signon failed
Oct 14 13:49:23 n02asp7 attrd: [7901]: ERROR: main: Aborting startup
Oct 14 13:49:23 n02asp7 heartbeat: [7885]: WARN: Managed
/usr/lib64/heartbeat/attrd process 7901 exited with return code 100.
Oct 14 13:49:23 n02asp7 cib: [7898]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
Oct 14 13:49:23 n02asp7 cib: [7898]: info: G_main_add_TriggerHandler:
Added signal manual handler
Oct 14 13:49:23 n02asp7 cib: [7898]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
Oct 14 13:49:23 n02asp7 cib: [7898]: info: retrieveCib: Reading
cluster configuration from: /var/lib/heartbeat/crm/cib.xml
(digest: /var/lib/heartbeat/crm/cib.xml.sig)
Oct 14 13:49:23 n02asp7 cib: [7898]: WARN: retrieveCib: Cluster
configuration not found: /var/lib/heartbeat/crm/cib.xml
Oct 14 13:49:23 n02asp7 cib: [7898]: WARN: readCibXmlFile: Primary
configuration corrupt or unusable, trying backup...
Oct 14 13:49:23 n02asp7 cib: [7898]: info: retrieveCib: Reading
cluster configuration from: /var/lib/heartbeat/crm/cib.xml.last
(digest: /var/lib/heartbeat/crm/cib.xml.sig.last)
Oct 14 13:49:23 n02asp7 ccm: [7897]: info: Hostname: n02asp7
Oct 14 13:49:23 n02asp7 stonithd: [7900]: info:
G_main_add_SignalHandler: Added signal handler for signal 10
Oct 14 13:49:23 n02asp7 stonithd: [7900]: info:
G_main_add_SignalHandler: Added signal handler for signal 12
Oct 14 13:49:23 n02asp7 cib: [7898]: ERROR: validate_cib_digest:
Digest comparision failed: expected dc90f2e743db61688a8cd6610c845ed2
(/var/lib/heartbeat/crm/cib.xml.sig.last), calculated
19e33575c865951da4f9cbf417207136
Oct 14 13:49:23 n02asp7 cib: [7898]: ERROR: retrieveCib: Checksum of
/var/lib/heartbeat/crm/cib.xml.last failed! Configuration contents
ignored!
Oct 14 13:49:23 n02asp7 cib: [7898]: ERROR: retrieveCib: Usually this
is caused by manual changes, please refer to
http://linux-ha.org/v2/faq/cib_changes_detected
Oct 14 13:49:23 n02asp7 cib: [7898]: WARN: retrieveCib: Continuing but
/var/lib/heartbeat/crm/cib.xml.last will NOT used.
Oct 14 13:49:23 n02asp7 cib: [7898]: WARN: readCibXmlFile: Continuing
with an empty configuration.
Oct 14 13:49:23 n02asp7 cib: [7898]: info: startCib: CIB
Initialization completed successfully
Oct 14 13:49:23 n02asp7 cib: [7898]: CRIT: cib_init: Cannot sign in to
the cluster... terminating
That's not good.
Can you turn debug on and re-post the complete log?
Oct 14 13:49:23 n02asp7 heartbeat: [7885]: WARN: Managed
/usr/lib64/heartbeat/cib process 7898 exited with return code 100.
Oct 14 13:49:23 n02asp7 heartbeat: [7885]: EMERG: Rebooting system.
Reason: /usr/lib64/heartbeat/cib
Change "crm yes" to "crm respawn" and heartbeat won't reboot the node
every time a process exits.
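A minimal sketch of that edit, run against a scratch copy rather than the live file. The sample contents are an assumption, limited to the two settings mentioned in this thread; on most installs the real file would be /etc/ha.d/ha.cf, which is also an assumption here.

```shell
#!/bin/sh
# Work on a scratch copy so nothing on the node is touched.
work=$(mktemp -d)

# Hypothetical ha.cf excerpt: only the two directives named in this thread.
cat > "$work/ha.cf" <<'EOF'
debug 1
crm yes
EOF

# Rewrite "crm yes" to "crm respawn"; every other line passes through unchanged.
sed 's/^crm[[:space:]][[:space:]]*yes[[:space:]]*$/crm respawn/' \
    "$work/ha.cf" > "$work/ha.cf.new"

cat "$work/ha.cf.new"
```

Once the result looks right, the same sed expression can be applied to the real file (after taking a backup), followed by a heartbeat restart.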
Oct 14 13:49:23 n02asp7 lrmd: [7899]: info: G_main_add_SignalHandler:
Added signal handler for signal 17
Oct 14 13:49:23 n02asp7 lrmd: [7899]: info: G_main_add_SignalHandler:
Added signal handler for signal 10
Rainer Traut wrote:
OK, after doing so and updating to the latest from the repo, this node
keeps rebooting itself.
The only error I see is:
pengine: [12564]: WARN: text2task: Unsupported action: status
...
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: group_print:
Resource Group: group_2
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: native_print:
r-httpd (lsb:httpd): Started n01asp7
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: native_print:
r-named (lsb:named): Started n01asp7
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: clone_print: Clone
Set: ntpd-clone
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: native_print:
c-ntpd:0 (lsb:ntpd): Started n01asp7
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: native_print:
c-ntpd:1 (lsb:ntpd): Stopped
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: text2task:
Unsupported action: status
Oct 14 12:01:46 n02asp7 last message repeated 2 times
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: native_color:
Resource pingd-child:0 cannot run anywhere
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: native_color:
Resource drbd0:0 cannot run anywhere
Oct 14 12:01:46 n02asp7 pengine: [12564]: info: master_color:
Promoting drbd0:1 (Master n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: info: master_color:
ms-drbd0: Promoted 1 instances of a possible 1 to master
Oct 14 12:01:46 n02asp7 last message repeated 2 times
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: native_color:
Resource c-ntpd:1 cannot run anywhere
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource pingd-child:1 (Started n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource drbd0:1 (Master n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource drbd0:1 (Master n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource r-srv (Started n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource r-oeIP (Started n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource r-email (Started n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource r-httpd (Started n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: text2task:
Unsupported action: status
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource r-named (Started n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: text2task:
Unsupported action: status
Oct 14 12:01:46 n02asp7 pengine: [12564]: notice: NoRoleChange: Leave
resource c-ntpd:0 (Started n01asp7)
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: text2task:
Unsupported action: status
Oct 14 12:01:46 n02asp7 pengine: [12564]: info: stage6: Scheduling
Node n02asp7 for shutdown
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: text2task:
Unsupported action: status
Oct 14 12:01:46 n02asp7 last message repeated 2 times
Oct 14 12:01:46 n02asp7 crmd: [7195]: info: do_state_transition:
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
Oct 14 12:01:46 n02asp7 tengine: [12563]: info: process_te_message:
Processing graph derived from
/var/lib/heartbeat/pengine/pe-warn-1146.bz2
Oct 14 12:01:46 n02asp7 tengine: [12563]: info: unpack_graph:
Unpacked transition 47: 1 actions in 1 synapses
Oct 14 12:01:46 n02asp7 tengine: [12563]: info: te_crm_command:
Executing crm-event (70): do_shutdown on n02asp7
Oct 14 12:01:46 n02asp7 crmd: [7195]: info: handle_request: Shutting
ourselves down (DC)
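With this much notice/info output, it can help to filter the log down to the WARN-and-above entries before posting. A rough sketch, assuming the severity always follows the "[pid]: " field as it does in the lines above; the sample file is built from (unwrapped) lines quoted in this thread, and on a real node the input would be wherever syslog writes heartbeat messages.

```shell
#!/bin/sh
# Sample log assembled from lines quoted in this thread (wrapped lines rejoined).
log=$(mktemp)
cat > "$log" <<'EOF'
Oct 14 13:49:23 n02asp7 attrd: [7901]: info: main: Starting up....
Oct 14 13:49:23 n02asp7 attrd: [7901]: ERROR: main: HA Signon failed
Oct 14 13:49:23 n02asp7 cib: [7898]: CRIT: cib_init: Cannot sign in to the cluster... terminating
Oct 14 13:49:23 n02asp7 heartbeat: [7885]: EMERG: Rebooting system. Reason: /usr/lib64/heartbeat/cib
Oct 14 12:01:46 n02asp7 pengine: [12564]: WARN: text2task: Unsupported action: status
EOF

# Keep only entries whose severity is WARN, ERROR, CRIT or EMERG.
grep -E '\]: (WARN|ERROR|CRIT|EMERG): ' "$log"
```

Note that continuation lines of wrapped messages carry no severity tag, so a filter like this only works on the raw, unwrapped log.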
Andrew Beekhof wrote:
I'm in the middle of a heap of package changes at the moment which
probably isn't helping.
For now, try removing pacemaker-heartbeat (in retrospect, the way I
implemented single-stack packages wasn't the most ideal) and just
installing pacemaker.
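On CentOS 5 that swap might look roughly like the following. This is only a sketch: the package names are the ones used in this thread, the run helper is hypothetical, and it defaults to a dry run that prints the commands instead of executing them. Removing pacemaker-heartbeat may pull out dependent packages, so check what yum proposes before confirming.

```shell
#!/bin/sh
# Dry run unless DRY_RUN=0 is set: print each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove the old single-stack package, then install plain pacemaker.
run yum remove pacemaker-heartbeat
run yum install pacemaker
```

Setting DRY_RUN=0 (as root) runs the two yum commands for real.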
On Oct 14, 2008, at 11:27 AM, Rainer Traut wrote:
Hi,
OS: Centos5 x86_64
When running yum update:
--> Running transaction check
---> Package heartbeat-common.x86_64 0:2.99.2-2.1 set to be updated
--> Processing Dependency: libplumbgpl.so.2()(64bit) for package:
heartbeat-common
--> Processing Dependency: libapphb.so.2()(64bit) for package:
heartbeat-common
--> Processing Dependency: libplumb.so.2()(64bit) for package:
heartbeat-common
--> Processing Dependency: libpils.so.2()(64bit) for package:
heartbeat-common
--> Processing Dependency: libapphb.so.0()(64bit) for package:
pacemaker-heartbeat
--> Processing Dependency: libpils.so.1()(64bit) for package:
pacemaker-heartbeat
--> Processing Dependency: libplumb.so.1()(64bit) for package:
pacemaker-heartbeat
---> Package heartbeat-resources.x86_64 0:2.99.2-2.1 set to be updated
---> Package heartbeat-ldirectord.x86_64 0:2.99.2-2.1 set to be
updated
---> Package pacemaker-pygui.x86_64 0:1.4-8.1 set to be updated
---> Package heartbeat.x86_64 0:2.99.2-2.1 set to be updated
--> Running transaction check
--> Processing Dependency: libapphb.so.0()(64bit) for package:
pacemaker-heartbeat
--> Processing Dependency: libplumb.so.1()(64bit) for package:
pacemaker-heartbeat
---> Package libheartbeat2.x86_64 0:2.99.2-2.1 set to be updated
---> Package heartbeat-pils.x86_64 0:2.1.3-3.el5.centos set to be
updated
--> Processing Conflict: libheartbeat2 conflicts heartbeat-pils
--> Finished Dependency Resolution
Error: Missing Dependency: libplumb.so.1()(64bit) is needed by
package pacemaker-heartbeat
Error: libheartbeat2 conflicts with heartbeat-pils
Error: Missing Dependency: libapphb.so.0()(64bit) is needed by
package pacemaker-heartbeat
_______________________________________________
Pacemaker mailing list
[email protected]
http://list.clusterlabs.org/mailman/listinfo/pacemaker