Hi.

David Marcin wrote:

Bacula exits unexpectedly; the only thing I can think of is that the
database has somehow become corrupted in a way that kills the director.

As far as I know the system has been running for about 2 months
unchanged, however I am not the only person to have administrator rights
on the machine so I cannot be certain.  I have upgraded to the latest
version of Bacula available via Debian's apt system.  Details follow.

# bacula-dir -?
Copyright (C) 2000-2004 Kern Sibbald and John Walker

Version: 1.36.2 (28 February 2005)

And the log of the error:

# sed 's/quateams/backups/g' file
# bacula-dir -f -d99
bacula-dir: dird.c:131 Debug level = 99
backups-dir: cram-md5.c:52 send: auth cram-md5
<[EMAIL PROTECTED]> ssl=0
backups-dir: cram-md5.c:70 Authenticate OK K4+L5x5dVisA4+Erjy4IeB
backups-dir: cram-md5.c:120 sending resp to challenge:
fUcX5UYxq5/44iNvf8/pMA
backups-dir: ua_run.c:481 JobType=B
backups-dir: job.c:108 Open database
backups-dir: job.c:121 DB opened
backups-dir: btimers.c:169 Start bsock timer 0x80d0488 tid=0x10005 for
600 secs at 1115840968
backups-dir: cram-md5.c:120 sending resp to challenge:
YxpC6AB+3j/VhB04dV+H+A
backups-dir: cram-md5.c:52 send: auth cram-md5
<[EMAIL PROTECTED]> ssl=0
backups-dir: cram-md5.c:70 Authenticate OK L74xUAR2vHEaNiUN3CwJuC
backups-dir: btimers.c:183 Stop bsock timer 0x80d0488 tid=0x10005 at
1115840969.
backups-dir: fd_cmds.c:87 Opened connection with File daemon
backups-dir: btimers.c:169 Start bsock timer 0x80d2508 tid=0x10005 for
600 secs at 1115840969
backups-dir: cram-md5.c:120 sending resp to challenge:
3kdFKAdWj6Ni2HRo10+9KA
backups-dir: cram-md5.c:52 send: auth cram-md5
<[EMAIL PROTECTED]> ssl=0
backups-dir: cram-md5.c:70 Authenticate OK CB/ligxuxF/1f/+EA4pYNC
backups-dir: btimers.c:183 Stop bsock timer 0x80d2508 tid=0x10005 at
1115840969.
backups-dir: ua_status.c:104 status:status:
backups-dir: ua_status.c:137 do_prompt: select daemon
backups-dir: ua_status.c:141 item=0
backups-dir: ua_status.c:104 status:status:
backups-dir: ua_status.c:137 do_prompt: select daemon
backups-dir: ua_status.c:141 item=2
backups-dir: fd_cmds.c:87 Opened connection with File daemon
backups-dir: btimers.c:169 Start bsock timer 0x80d2518 tid=0xc004 for
600 secs at 1115841057
backups-dir: cram-md5.c:120 sending resp to challenge:
Gy/J0W+QmSYGOy17X9dlXB
backups-dir: cram-md5.c:52 send: auth cram-md5
<[EMAIL PROTECTED]> ssl=0
backups-dir: cram-md5.c:70 Authenticate OK t9/PBGFe26Uark4WPxFeIB
backups-dir: btimers.c:183 Stop bsock timer 0x80d2518 tid=0xc004 at
1115841058.
backups-dir: ua_status.c:336 Connected to file daemon
backups-dir: ua_status.c:104 status:status:
backups-dir: ua_status.c:137 do_prompt: select daemon
backups-dir: ua_status.c:141 item=2
backups-dir: fd_cmds.c:87 Opened connection with File daemon
backups-dir: btimers.c:169 Start bsock timer 0x80d1a10 tid=0xc004 for
600 secs at 1115841069
backups-dir: cram-md5.c:120 sending resp to challenge:
oW+kAChWM5I3YUxuP//GYD
backups-dir: cram-md5.c:52 send: auth cram-md5
<[EMAIL PROTECTED]> ssl=0
backups-dir: cram-md5.c:70 Authenticate OK 0G+xN/RAFTpd7EhbDQQ/OA
backups-dir: btimers.c:183 Stop bsock timer 0x80d1a10 tid=0xc004 at
1115841069.
backups-dir: ua_status.c:336 Connected to file daemon
backups-dir: ua_prune.c:249 select sql=SELECT JobId from Job WHERE
JobTDate<1113249193 AND ClientId=2 AND PurgedFiles=0
backups-dir: ua_prune.c:279 Delete JobId=413
bacula-dir: src/pager.c:570: pager_playback_one_page: Assertion
`pPg->nRef==0 || pPg->pgno==1' failed.
Aborted

What you report is the director's log during user interaction, right?

What I *think* I see is that you start a job manually, and after selecting the client the director crashes, probably where it chooses a job to base a differential or incremental backup upon.

Now, I didn't read the source, especially src/pager.c around line 570, but it would help me to have some more information:
- A transcript or screenshot of your console interaction
- The relevant configuration (client, fileset, pools, storage)
- Which OS and which version of Bacula run on the client?
- Can you run other jobs on the client?
- Can you run identical jobs on the client?
- What does the catalog contain about Job 413?
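
For the last question, a minimal sketch of how the catalog entry could be looked up, assuming the catalog is a SQLite 3 file that Python's sqlite3 module can open (an older SQLite 2 catalog would need that version's own command-line tool instead). The table and column names are only the ones visible in the debug log above; build_demo_catalog() and its row values are made up so the example runs without touching a real catalog.

```python
import sqlite3


def job_row(conn, job_id):
    """Return the Job row for one JobId as a dict, or None if it is absent."""
    conn.row_factory = sqlite3.Row
    row = conn.execute("SELECT * FROM Job WHERE JobId = ?", (job_id,)).fetchone()
    return dict(row) if row is not None else None


def prune_candidates(conn, cutoff, client_id):
    """Run the same SELECT that ua_prune.c printed in the log."""
    rows = conn.execute(
        "SELECT JobId FROM Job WHERE JobTDate < ? AND ClientId = ? AND PurgedFiles = 0",
        (cutoff, client_id),
    ).fetchall()
    return [r[0] for r in rows]


def build_demo_catalog():
    """Hypothetical stand-in for the catalog: only the columns seen in the log."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Job (JobId INTEGER PRIMARY KEY, JobTDate INTEGER, "
        "ClientId INTEGER, PurgedFiles INTEGER)"
    )
    # Invented values; a real catalog has more columns and real timestamps.
    conn.execute("INSERT INTO Job VALUES (413, 1113000000, 2, 0)")
    conn.execute("INSERT INTO Job VALUES (414, 1115000000, 2, 0)")
    return conn


if __name__ == "__main__":
    conn = build_demo_catalog()
    print(job_row(conn, 413))
    print(prune_candidates(conn, 1113249193, 2))
```

Against a real catalog, sqlite3.connect() would be pointed at a copy of the database file, taken while the director is stopped.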


If something is wrong with the database, you can try to repair it.
If something is wrong with Job 413 as a reference job, you can run a new full backup.
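
Before attempting a repair, it may be worth checking whether the database file itself is damaged. The failed assertion names pager_playback_one_page in src/pager.c, which looks like SQLite's pager code rather than Bacula's own source, so the sketch below assumes a SQLite catalog. It further assumes the file is in SQLite 3 format, since Python's sqlite3 module cannot open SQLite 2 files. The path passed to the function is up to you; none is assumed here.

```python
import sqlite3


def integrity_report(db_path):
    """Run SQLite's built-in consistency check and return its result lines.

    A healthy database yields the single line "ok"; anything else is a
    list of problems that SQLite found.
    """
    conn = sqlite3.connect(db_path)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()


if __name__ == "__main__":
    # ":memory:" is an empty throwaway database, used so the sketch runs
    # anywhere; replace it with the path to a copy of the catalog file.
    print(integrity_report(":memory:"))
```

Run it on a copy of the catalog with the director stopped, so the check cannot interfere with a running job.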


Arno

It appears to be one particular backup that fails regularly.  When run
manually, others seem to complete, while this one fails.

I'd rather not dump the backups that have been made, but if it is
necessary it can be done.

David


------------------------------------------------------- This SF.Net email is sponsored by Oracle Space Sweepstakes Want to be the first software developer in space? Enter now for the Oracle Space Sweepstakes! http://ads.osdn.com/?ad_id=7393&alloc_id=16281&op=click _______________________________________________ Bacula-users mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/bacula-users

-- IT-Service Lehmann [EMAIL PROTECTED] Arno Lehmann http://www.its-lehmann.de

