[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-20 Thread Rafael David Tinoco
I just found one upstream commit fixing this: ## commit 0326f05c9e26f39a394fa30830e31a76306f49c7 Author: Andrew Beekhof Date: Thu Aug 7 13:49:24 2014 +1000 Fix: stonith-ng: Reset mainloop source IDs after removing them diff --git a/lib/fencing/st_client.c b/lib/fencing/st_cli

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-20 Thread Rafael David Tinoco
Okay, so the cherry-pick (for version trusty_pacemaker_1.1.10+git20130802-1ubuntu2.2, based on an upstream commit) seems OK, since it makes lrmd (services, services_linux) avoid repeating a timer when the source was already removed from the glib main loop context. Example: + if (op->opaque->rep
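
The idea behind both upstream fixes (reset a stored main loop source ID once the source has been removed) can be sketched with a small toy model. This is only an illustration, written in Python so it runs without GLib; it is not the Pacemaker patch. The names `MainLoop`, `Operation`, `stop_timer_buggy`, `stop_timer_fixed` and `demo` are hypothetical, and the warning text is invented for the sketch.

```python
class MainLoop:
    """Toy stand-in for a main loop context: tracks live source IDs."""

    def __init__(self):
        self._next_id = 1
        self._sources = {}
        self.warnings = []

    def add_timeout(self, callback):
        # Register a callback and hand back a non-zero source ID.
        source_id = self._next_id
        self._next_id += 1
        self._sources[source_id] = callback
        return source_id

    def remove(self, source_id):
        # Removing an unknown ID is reported, much like the
        # "Source ID ... was not found" messages quoted in this thread.
        if source_id not in self._sources:
            self.warnings.append(f"source {source_id} not found")
            return False
        del self._sources[source_id]
        return True


class Operation:
    """Hypothetical operation that owns at most one timer source."""

    def __init__(self, loop):
        self.loop = loop
        self.timer_id = 0  # 0 means "no timer registered"

    def start_timer(self):
        self.timer_id = self.loop.add_timeout(lambda: None)

    def stop_timer_buggy(self):
        # Removes the source but keeps the stale ID around.
        if self.timer_id:
            self.loop.remove(self.timer_id)

    def stop_timer_fixed(self):
        # Removes the source and resets the ID, so a second call is a no-op.
        if self.timer_id:
            self.loop.remove(self.timer_id)
            self.timer_id = 0


def demo():
    """Stop each timer twice; return (buggy_warnings, fixed_warnings)."""
    buggy_loop = MainLoop()
    buggy = Operation(buggy_loop)
    buggy.start_timer()
    buggy.stop_timer_buggy()
    buggy.stop_timer_buggy()  # second removal hits the stale ID

    fixed_loop = MainLoop()
    fixed = Operation(fixed_loop)
    fixed.start_timer()
    fixed.stop_timer_fixed()
    fixed.stop_timer_fixed()  # ID was reset, nothing to remove

    return len(buggy_loop.warnings), len(fixed_loop.warnings)
```

Calling `demo()` stops each timer twice: the variant that keeps the stale ID triggers one "not found" warning, while the variant that resets the ID triggers none. In the real daemons the stale ID belongs to a GLib source, which is where the recurring assert messages reported later in this thread come from.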

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-20 Thread Peter Parzer
Here is the crash file from my virtual testing environment, including core. I updated the nodes today with all current packages from trusty-proposed. The core dump happened just after restarting the nodes after the updates. ** Attachment added: "_usr_lib_pacemaker_stonithd.0.crash" https://bu

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-20 Thread Peter Parzer
And this is the current cluster configuration of the testing environment. ** Attachment added: "cib" https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1368737/+attachment/4302101/+files/cib -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscri

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-20 Thread Peter Parzer
And here is the last core dump from my production system, from this morning, with the core included. ** Attachment added: "_usr_lib_pacemaker_stonithd.0.crash" https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1368737/+attachment/4302102/+files/_usr_lib_pacemaker_stonithd.0.crash

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-19 Thread Rafael David Tinoco
For now, I have tested the following scenarios: - 4 nodes - stonith-enabled=true - no-quorum-policy=stop AND - 2 nodes only - stonith-enabled=true - no-quorum-policy=ignore I ran the test case (bug description) for hours and could not get a crash, although I do get the following messages (expe

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-19 Thread Rafael David Tinoco
Peter, (1) During the test execution, does using more than 2 nodes AND/OR changing "no-quorum-policy" to something else (freeze, stop, suicide) help? (2) Your crash files do not contain the core file; could you please provide me the core file (probably changing ulimit inside /etc/security/l

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-14 Thread Rafael David Tinoco
Okay, I'm revisiting this today. Thanks for the crash; I'll also try to reproduce what you are getting. Tinoco

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-14 Thread Peter Parzer
If I configure the test environment like my production system (with the exception of the stonith agent, of course) I get additional core dumps of lrmd. Peter ** Attachment added: "_usr_lib_pacemaker_lrmd.0.crash" https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1368737/+attachment/429

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-14 Thread Peter Parzer
** Attachment added: "dpkg-versions" https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1368737/+attachment/4298137/+files/dpkg-versions

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-14 Thread Peter Parzer
I set up a testing environment with 2 VMs, trusty-proposed enabled, all updates installed and the following minimal cluster configuration: node $id="168427521" kjpnode1 \ attributes standby="off" node $id="168427522" kjpnode2 \ attributes standby="on" primitive st_kjpnode1 stonith:

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-08 Thread Peter Parzer
The cluster consists of two HP ProLiant DL120 G7 rack servers acting as file servers with DRBD and Samba. I used the same configuration with 12.04 for 2 years without any problems. The cluster configuration: node $id="167772161" kjp02 \ attributes standby="off" node $id="167772162" kjp03 \ a

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-08 Thread Rafael David Tinoco
Could you provide your cluster configuration (cib file with configured stonith resources and parameters) and all package versions (dpkg -l)? I'll try to reproduce what you are facing. Thanks

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-08 Thread Peter Parzer
I am also experiencing crashes of stonithd, twice yesterday alone, always on both nodes at the same time. Here is the stack trace: root@kjp03:/var/crash# apport-retrace -Rs _usr_lib_pacemaker_stonithd.0.crash E: Can not find version '1.1.10+git20130802-1ubuntu2.2' of package 'pacemaker' E: Quellpake

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-07 Thread Rafael David Tinoco
Per comment #10: " Those error messages from glib (not being able to remove the resource), that are still there : Oct 31 00:30:20 [2054] clustertrusty03 stonith-ng: error: crm_abort: crm_glib_handler: Forked child 2 197 to record non-fatal assert at logging.c:63 : Source ID 15 was not found when

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2015-01-06 Thread Peter Parzer
Hi Brian, this fix did not solve the bug for me. I still get the following error message every 2 minutes: Jan 7 08:28:25 kjp02 stonith-ng[1868]:error: crm_abort: crm_glib_handler: Forked child 4647 to record non-fatal assert at logging.c:63 : Source ID 28 was not found when attempting to rem

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-12-18 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/trusty-proposed/pacemaker ** Branch linked: lp:ubuntu/utopic-proposed/pacemaker

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-12-18 Thread Brian Murray
Hello Rafael, or anyone else affected, Accepted pacemaker into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/pacemaker/1.1.10+git20130802-1ubuntu2.2 in a few hours, and then in the -proposed repository. Please help us by testing this new pack

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-12-18 Thread James Page
Utopic and Trusty fixes uploaded for SRU team review.

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-12-18 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/pacemaker

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-12-18 Thread Launchpad Bug Tracker
This bug was fixed in the package pacemaker - 1.1.11-1ubuntu1 --- pacemaker (1.1.11-1ubuntu1) vivid; urgency=medium * Merge from Debian experimental, remaining changes: - d/control: Build-Depends on libcfg-dev. - Corosync's pacemaker plugin is disabled, hence not built:

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-12-18 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/vivid-proposed/pacemaker

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-12-18 Thread James Page
** Changed in: pacemaker (Ubuntu Vivid) Importance: Undecided => High ** Changed in: pacemaker (Ubuntu Utopic) Importance: Undecided => High ** Changed in: pacemaker (Ubuntu Trusty) Importance: Undecided => High

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-12-18 Thread James Page
** Also affects: pacemaker (Ubuntu Vivid) Importance: Undecided Assignee: Rafael David Tinoco (inaddy) Status: In Progress ** Also affects: pacemaker (Ubuntu Trusty) Importance: Undecided Status: New ** Also affects: pacemaker (Ubuntu Utopic) Importance: Undecided

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-11-11 Thread Rafael David Tinoco
Trusty fix. ** Patch added: "trusty_pacemaker_1.1.10+git20130802-1ubuntu2.2.debdiff" https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1368737/+attachment/4258483/+files/trusty_pacemaker_1.1.10%2Bgit20130802-1ubuntu2.2.debdiff

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-11-11 Thread Rafael David Tinoco
Because of the way this package is versioned, the "dh_makeshlibs" tool (a debhelper command) does not append the proper suffix to dependencies (for example, it uses (>= 1.1.10+git20130802) instead of (>= 1.1.10+git20130802-1ubuntu2.1)). I changed debian/rules so the proper version is considered for dependenci

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-11-11 Thread Rafael David Tinoco
I recommend that Vivid, if possible, use 1.1.12 (from upstream) and a different versioning scheme. Asking for sponsorship. Thank you, Rafael Tinoco

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-11-11 Thread Rafael David Tinoco
Utopic fix. ** Patch added: "utopic_pacemaker_1.1.10+git20130802-4ubuntu4.debdiff" https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1368737/+attachment/4258484/+files/utopic_pacemaker_1.1.10%2Bgit20130802-4ubuntu4.debdiff

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-11-11 Thread Rafael David Tinoco
It looks like the version format chosen for SRUs of this package: pacemaker (1.1.10+git20130802-1ubuntu2.1) trusty pacemaker (1.1.10+git20130802-1ubuntu2) trusty pacemaker (1.1.10+git20130802-1ubuntu1) saucy prevents the dh helpers from calculating the shlibs version properly: $ fakeroot dh_makeshlibs -a -V $ fi

[Bug 1368737] Re: Pacemaker can seg fault on crm node online/standby

2014-11-10 Thread Rafael David Tinoco
** Summary changed: - Pacemaker can seg fault on crm node online/standy + Pacemaker can seg fault on crm node online/standby