Hi! While analyzing some odd cluster problem in SLES11 SP4, I found this message repeating quite a lot (several times per second) with the same text:
[...more...]
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
Nov 10 22:10:47 h05 cmirrord[17741]: [yEa32lLX] Retry #1 of cpg_mcast_joined: SA_AIS_ERR_TRY_AGAIN
[...many more...]

I wonder: Shouldn't the retry number be incremented? Or are these different retries? If so, where is that visible?

The situation I'm analyzing is one where a node should have been fenced but somehow wasn't; instead it just stopped working (it seemed frozen). The last message, a few minutes(!) before the other nodes complained, was:

Nov 10 22:04:18 h01 crmd[16596]: notice: throttle_mode: High CIB load detected: 1.246333

(After this the node seemed dead/frozen.)

Regards,
Ulrich
