>>>>> On Fri, 18 Mar 2011 11:37:33 +0100, Christian Manal said:
>
> On 18.03.2011 10:40, Christian Manal wrote:
> > On 16.03.2011 09:14, Christian Manal wrote:
> >> On 15.03.2011 19:12, Christian Manal wrote:
> >>> On 15.03.2011 17:49, Kjetil Torgrim Homme wrote:
> >>>> Christian Manal <[email protected]> writes:
> >>>>
> >>>>> Also, after several accurate jobs running without restarting Bacula,
> >>>>> the total memory usage of the director and fd didn't go up anymore, so
> >>>>> I presume it comes down to the behavior of Solaris' free(), as
> >>>>> described in the above quoted manpage.
> >>>>
> >>>> libumem may work better -- just set LD_PRELOAD, you don't have to
> >>>> recompile. I'd appreciate it if you report back if you try it.
> >>>>
> >>>
> >>> Actually, I already did that. Modified the startup script for the
> >>> affected fd (don't want the director crashing if things go wrong) and
> >>> restarted. I will report the results tomorrow.
> >>
> >> Looks good.
> >
> > Maybe I spoke too soon. Last night my director crashed with a segfault,
> > after switching to libumem. Leading up to that was an unusually long
> > running job (the accurate one) which, going by the size, looked like it
> > was doing a full instead of an incremental for some reason.
> >
> > I have some output from mdb and pstack attached.
>
> And going by dbx, the dir went kaboom in Jmsg().
> ...
> =>[1] Jmsg(0xbefe5be0, 0x1, 0x0, 0x0, 0xfee8e25e, 0xf6caddb0), at 0xfee6a580
>   [2] j_msg(0x80c360e, 0x154, 0xbefe5be0, 0x1, 0x0, 0x0), at 0xfee6a7ad
>   [3] start_storage_daemon_message_thread(0xbefe5be0, 0x80bc7f5, 0xfdc7f960,
>       0x0, 0x80bc798, 0xfde8fe6c), at 0x80834bc
>   [4] do_backup(0xbefe5be0, 0x4, 0x0, 0xfdf91200, 0xfeea26e4, 0xfdf91200), at
>       0x80658b0
>   [5] _ZL10job_threadPv(0xbefe5be0, 0x1, 0xfe7c0dc7, 0xfe8422cc, 0xfe8422c0,
>       0xfdf91200), at 0x807a96e
>   [6] jobq_server(0x80e5080), at 0x807d127
>   [7] _thr_setup(0xfdf91200), at 0xfe7c7e66
>   [8] _lwp_start(0xfee8e708, 0x0, 0x0, 0xfde8ea00, 0x7, 0x0), at 0xfe7c8150

It looks like it ran out of memory (the segfault is deliberate, due to failure
to create a thread in start_storage_daemon_message_thread).

Did it write any info to the Bacula log? It should say "Cannot create message
thread:" followed by the error message.

__Martin

_______________________________________________
Bacula-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-users
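
For readers following the thread: the failure mode described in the reply (thread creation failing, typically because memory or thread resources are exhausted) can be illustrated with a minimal POSIX sketch. This is not Bacula's source. The helper names start_thread_checked, format_thread_error and message_thread_stub are hypothetical, and only the "Cannot create message thread:" prefix is taken from the message above. pthread_create() returns an error number rather than setting errno, so the sketch passes that return value to strerror().

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a log line with the prefix quoted in the
 * reply, followed by the system error text for the given error number. */
static void format_thread_error(int status, char *buf, size_t len)
{
    snprintf(buf, len, "Cannot create message thread: %s", strerror(status));
}

/* Hypothetical stand-in for the real message thread body. */
static void *message_thread_stub(void *arg)
{
    return arg;
}

/* Try to start a thread. Returns 0 on success; on failure returns the
 * error number from pthread_create() (for example EAGAIN when the system
 * lacks the resources to create another thread) and fills errbuf. */
static int start_thread_checked(pthread_t *tid, void *(*fn)(void *),
                                void *arg, char *errbuf, size_t errlen)
{
    int status = pthread_create(tid, NULL, fn, arg);

    if (status != 0) {
        format_thread_error(status, errbuf, errlen);
    } else if (errlen > 0) {
        errbuf[0] = '\0';   /* no error text on success */
    }
    return status;
}
```

A caller would log errbuf when the return value is non-zero. What the daemon does after logging (here, per the reply, a deliberate crash) is outside the scope of this sketch.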
