I have been running a new Bareos setup in production for about a month, and there is one annoying issue left.
For most of the servers, the connection from the Director to the FileDaemon fails during the nightly backup with a "Connection has expired" error. The job is then retried and succeeds without a problem. I have not yet automated the monitoring, so I am manually matching each failed backup with a successful one a few hours later, which is tedious. This also happens when the Director is backing up itself, so I don't really expect network issues.

Environment:
- Director + StorageDaemon on FreeBSD 10.0, Bareos version 13.2.3
- A few FDs in the same datacenter, most in another DC connected using dedicated fiber
- The backup clients have all been set up this year, all Debian Wheezy x64 servers running Bareos 13.2.3

My guess was that it has something to do with concurrent jobs. My theory is that two jobs are started, one of the 'components' cannot handle this, and as a result the connection to the FD stays idle for too long. For this reason, and for other reasons (spooling), I had changed the 'max concurrent jobs' setting everywhere to 1. This caused my VirtualFull backups to fail over the weekend (which I knew would happen, but had forgotten). My new setting is max concurrent jobs = 2 for the Director, and 1 for all backup jobs.

Log output:

esyst-bareos-dir Using Device "backup2-disk1" to write.
Start Backup JobId 939, Job=job-webnode3.2014-12-04_23.05.00_31
esyst-bareos-dir Warning: bsock.c:120 Could not connect to Client: webnode3-fd on my.ip.addr:9102. ERR=Verbinding is verlopen (translated: Connection has expired)
Retrying ...
esyst-bareos-dir Error: Bareos esyst-bareos-dir 13.2.3 (11Mar14):
  Build OS:               x86_64-pc-linux-gnu debian Debian GNU/Linux 7.0 (wheezy)
  JobId:                  939
  Job:                    job-webnode3.2014-12-04_23.05.00_31
  Backup Level:           Incremental, since=2014-12-04 03:44:12
  Client:                 "webnode3-fd" 13.2.3 (11Mar14) x86_64-pc-linux-gnu,debian,Debian GNU/Linux 7.0 (wheezy)
  FileSet:                "fs-webserver" 2014-11-07 12:18:28
  Pool:                   "File" (From Job resource)
  Catalog:                "MyCatalog" (From Client resource)
  Storage:                "backup2-store1" (From Pool resource)
  Scheduled time:         04-dec-2014 23:05:00
  Start time:             05-dec-2014 02:26:14
  End time:               05-dec-2014 02:29:24
  Elapsed time:           3 mins 10 secs
  Priority:               10
  FD Files Written:       0
  SD Files Written:       0
  FD Bytes Written:       0 (0 B)
  SD Bytes Written:       0 (0 B)
  Rate:                   0.0 KB/s
  Software Compression:   None
  VSS:                    no
  Encryption:             no
  Accurate:               yes
  Volume name(s):
  Volume Session Id:      286
  Volume Session Time:    1417004145
  Last Volume Bytes:      0 (0 B)
  Non-fatal FD errors:    1
  SD Errors:              0
  FD termination status:  Error
  SD termination status:  Waiting on FD
  Termination:            *** Backup Error ***

Fatal error: No Job status returned from FD.
Fatal error: bsock.c:126 Unable to connect to Client: webnode3-fd on my.ip.addr:9102. ERR=Onderbroken systeemaanroep (translated: interrupted system call)
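
Until proper monitoring is in place, the manual matching of failed backups against a later successful run could be scripted. The following is only a minimal sketch of that matching logic, not a Bareos integration: the JobRun record, the unrecovered_failures function, the 12-hour window, and the sample rows for JobIds 940 and 941 are all hypothetical, and the job list would have to be filled from the catalog or from a job listing by some other means.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class JobRun:
    """Hypothetical record for one job run; not a Bareos data structure."""
    job_id: int
    name: str
    start: datetime
    ok: bool


def unrecovered_failures(runs, window=timedelta(hours=12)):
    """Return failed runs that have no successful run of the same job
    starting within `window` after the failed run's start time."""
    # Collect the start times of successful runs, grouped by job name.
    successes = {}
    for run in runs:
        if run.ok:
            successes.setdefault(run.name, []).append(run.start)

    unrecovered = []
    for run in runs:
        if run.ok:
            continue
        recovered = any(
            run.start < ok_start <= run.start + window
            for ok_start in successes.get(run.name, [])
        )
        if not recovered:
            unrecovered.append(run)
    return unrecovered


runs = [
    # JobId 939 and its start time come from the report above.
    JobRun(939, "job-webnode3", datetime(2014, 12, 5, 2, 26), ok=False),
    # Hypothetical later successful retry of the same job.
    JobRun(940, "job-webnode3", datetime(2014, 12, 5, 4, 0), ok=True),
    # Hypothetical failure that was never followed by a success.
    JobRun(941, "job-example", datetime(2014, 12, 5, 2, 30), ok=False),
]

for run in unrecovered_failures(runs):
    print(run.job_id, run.name)
```

With the sample data, only the failure that was never followed by a success is reported, so a failed job that later succeeded (like 939 here) no longer needs to be checked by hand.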
