Hi, I hope you are well and happy and on holiday. I am working today :)
We have a new ESX 5.5.2 host with a VM on it: a backup server running Bacula, an open source backup solution. The VM used to run on another ESX system with vCenter and so on, and it did not have this problem there.

The backup fails for only one particular host (the op5master server). Its backup file is much bigger than the others, and the scripts need a lot of time ON the remote host to collect the backup data and prepare a 3.5 GB file.

Timeline:

12:55  The backup job starts. Bacula logs into the machine to be backed up and starts collecting and tarring files. It takes a long time until the tarring is finished. In the web interface I see that the job is running, and the duration counts up to approx. 31 minutes.

13:26  The Bacula backup job finishes with state "successful" on the machine to be backed up. The backup is ready to be fetched by the Bacula server, which never happens.

13:29  In the web interface I see that the duration of the job is reset to 0 minutes and counts up again to approx. 15 minutes. The log is already written and states that the backup was successful, which means the data has been collected on the backed-up server but not yet transferred to the Bacula server and the storage. That is correct, I checked. From here on I don't know what happens: I don't see any logs progressing even with Bacula console debug turned on, and nothing happens in the log files. I think something is waiting for a timeout, which seems to be 15 minutes.

13:44  I have the following in my log files.

Bacula server / messages:

  Dec 23 13:44:22 suorva bacula-dir: 23-Dec 13:44 Message delivery ERROR: Mail program terminated in error.#012CMD=/usr/sbin/bsmtp -h localhost -f "(Bacula) <root@localhost>" -s "Bacula: Backup Fatal Error of op5master.x.x.x Incremental" root@localhost#012ERR=Child exited with code 1

Bacula server / log file:

  2014-12-23 13:28:35 op5master.br.arn.se JobId 30253: ClientRunBeforeJob: 2014-12-23 12:28:35 INFO - Backup was successfully created
  2014-12-23 13:44:42 op5master.br.arn.se JobId 30253: Fatal error: Bad response from stored to open command
  2014-12-23 13:44:22 suorvadirector JobId 30253: Error: Director's comm line to SD dropped.
  2014-12-23 13:44:22 suorvadirector JobId 30253: Error: Bacula suorvadirector 5.2.13 (19Jan13):
    Build OS:               x86_64-redhat-linux-gnu redhat Enterprise release
    (.......)
    Scheduled time:         23-Dec-2014 12:55:25
    Start time:             23-Dec-2014 13:28:14
    End time:               23-Dec-2014 13:44:22
    Elapsed time:           16 mins 8 secs
    Priority:               10
    FD Files Written:       0
    SD Files Written:       0
    FD Bytes Written:       0 (0 B)
    SD Bytes Written:       0 (0 B)
    Rate:                   0.0 KB/s
    Software Compression:   None
    VSS:                    no
    Encryption:             no
    Accurate:               no
    Volume name(s):
    Volume Session Id:      735
    Volume Session Time:    1418474844
    Last Volume Bytes:      1 (1 B)
    Non-fatal FD errors:    1
    SD Errors:              0
    FD termination status:  Error
    SD termination status:  Error
    Termination:            *** Backup Error ***

Bacula server / bacula-sd.trace:

  rsgsuorvasd: dircmd.c:220-0 <dird: cancel Job=op5master.x.x.x.2014-12-23_12.55.25_05
  rsgsuorvasd: dircmd.c:234-0 Do command: cancel
  rsgsuorvasd: pythonlib.c:225-0 No startup module.
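To double-check the "15 minutes" guess, here is a small Python sketch that just subtracts the timestamps quoted in the logs above. The variable names are my own, and I am assuming that the "Backup was successfully created" line (13:28:35) marks the point where the waiting starts. It only does the arithmetic and says nothing about which Bacula component is actually timing out.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def elapsed(start: str, end: str):
    """Return the time difference between two log timestamps."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)

# Timestamps copied from the job summary and the log file above.
job_start = "2014-12-23 13:28:14"    # "Start time" in the job summary
script_done = "2014-12-23 13:28:35"  # ClientRunBeforeJob reports success (assumed start of the wait)
job_end = "2014-12-23 13:44:22"      # "End time" / Director's comm line to SD dropped

total = elapsed(job_start, job_end)
stall = elapsed(script_done, job_end)

print("job start -> error:  ", total)  # 0:16:08, matches "Elapsed time: 16 mins 8 secs"
print("script done -> error:", stall)  # 0:15:47
```

So the silent gap after the script finishes is a bit under 16 minutes, which fits my impression that something waits roughly 15 minutes before giving up.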
