On 1/14/2014 12:04 PM, Roberts, Ben wrote:

Hi all,

 

I recently set up a new Bacula director/storage daemon in preparation for moving our existing backups to newer hardware. During testing, I’ve run into problems restoring backups taken to disk. The restores fail with these messages:

 

Error: block.c:275 Volume data error at 24:4294944994! Wanted ID: "BB02", got "". Buffer discarded.

Fatal error: fd_cmds.c:169 Command error with FD, hanging up.
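
For a disk volume, the `24:4294944994` pair in the first message can be read as a single byte offset into the volume file. This assumes the storage daemon reports file-device positions as the high 32 bits and low 32 bits of a 64-bit address, which the error text itself does not state. A minimal Python sketch under that assumption (`volume_offset` is a hypothetical helper name, not part of Bacula):

```python
def volume_offset(file_index: int, block_index: int) -> int:
    """Combine the 'file:block' pair from the error into one byte offset.

    Assumption: 'file' holds the high 32 bits and 'block' the low 32 bits
    of the position within a disk volume.
    """
    return (file_index << 32) | block_index


GIB = 1 << 30  # assumes the "G" size suffix means 2**30 bytes

offset = volume_offset(24, 4294944994)
print(offset)              # 107374160098
print(100 * GIB - offset)  # 22302
```

If the assumption holds, the failing position is roughly 22 KB short of 100 GiB, which may be worth comparing against the `Maximum Volume Bytes = 100G` setting in the pool below.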

 

Similar errors are reported for both file-level backups and block-level backups made using bpipe. I’ve seen the instructions in http://www.bacula.org/en/dev-manual/main/main/Restore_Command.html#SECTION0021100000000000000000, but these only seem to apply to tape backups rather than disk ones. Regardless, I’ve tried stripping the positional information from the bootstrap file, with no effect.

 

Some relevant notes from my testing:

-          The issue does not affect every backup made, but does affect a significant proportion tested.

-          A single job can be affected at multiple locations, i.e. skipping one affected file might see the job fail again at a subsequent file.

-          Attempting to restore the same job multiple times produces failures at the same block each time. Re-running the backup job may produce a restorable backup, or one that fails at a different location. Other jobs fail at different locations.

-          All data is stored on ZFS, which reports no checksum errors at the filesystem level.

-          The server is not reporting any hardware issues, e.g. corrected or uncorrectable memory reads, disk accesses etc.

-          The backup jobs are multiple TB in size, and restores frequently fail within the first couple hundred GB.

-          The storage daemon is configured with a disk-changer backed autochanger, writing to 100GB volumes, all residing within the same ZFS filesystem (sitting atop a large RAID-Z2 disk array).

 

The director is running “Version: 5.0.2 (28 April 2010) i386-pc-solaris2.10 solaris 5.10” (compiled on solaris 5.10, running on 5.11). Storage daemon runs on the same machine as the director.  (I’m loosely tied to this version so the director can interact with a storage daemon on another machine connected to a tape changer).

A sample client is running “Version: 5.2.13 (19 February 2013)  i386-pc-solaris2.11 solaris 5.11”.


Could you downgrade the client to 5.0.2? I know the SD and DIR are backward compatible with older clients, but I'm not sure what happens when the client is a newer version than the director.

 

From my understanding of how the Bacula components fit together, I suspect the corruption must be happening in the Storage daemon (since this is the only component that would be interested in the BB02 block header?) before the data is written to disk (otherwise ZFS would be reporting read/write errors).
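
One way to check whether the data on disk is already damaged at the reported position is to read the bytes there and look for the block header directly. The sketch below assumes the BB02 header is 24 bytes in network byte order (checksum, block size, block number, the 4-byte ID, VolSessionId, VolSessionTime). That layout is my reading of the Bacula developer documentation and should be verified against the source for the version in use. The function names are hypothetical, and the checksum is parsed but not verified.

```python
import struct

# Assumed BB02 block header: six fields, big-endian, 24 bytes in total.
HEADER = struct.Struct(">III4sII")


def inspect_block_header(buf: bytes) -> dict:
    """Parse the first 24 bytes of buf as a block header (assumed layout)."""
    if len(buf) < HEADER.size:
        raise ValueError("need at least %d bytes" % HEADER.size)
    checksum, size, number, block_id, sess_id, sess_time = HEADER.unpack_from(buf)
    return {
        "checksum": checksum,
        "block_size": size,
        "block_number": number,
        "id": block_id,
        "vol_session_id": sess_id,
        "vol_session_time": sess_time,
        "id_ok": block_id == b"BB02",
    }


def read_header_at(path: str, offset: int) -> dict:
    """Read and parse a header at a byte offset in a volume file."""
    with open(path, "rb") as volume:
        volume.seek(offset)
        return inspect_block_header(volume.read(HEADER.size))
```

Calling `read_header_at` with the path of the affected volume and the byte offset where the restore fails would show whether the ID field on disk is `BB02` or something else, such as NUL bytes. An unexpected ID there would point at the data as written rather than at the restore path, although it would not by itself identify which component wrote it.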

 

Is this an issue that’s been seen before on other disk backups? Can anyone provide any assistance in locating and fixing the cause of the corruption? Any help would be greatly appreciated.

 

Regards,

 

Ben Roberts

IT Infrastructure

 

--- Relevant config excerpts:

 

Autochanger {

  Name = backup3-autochanger

  Device = drive-restore-backup3, drive-1-backup3

  Device = drive-2-backup3, drive-3-backup3

  Device = drive-4-backup3, drive-5-backup3

  Changer Device = /data2/bacula/storage/backup3-autochanger.conf

  Changer Command = "/opt/bacula/etc/disk-changer %c %o %S %a %d"

}

 

Device {

  Name = drive-1-backup3

  Archive Device = /data2/bacula/storage/backup3-autochanger/drive1

  Device Type = File

  Media Type = File-backup3

  AutoChanger = yes

  Removable media = no

  Random access = yes

  Requires Mount = no

  Always Open = no

  Label Media = yes

  Maximum Changer Wait = 180

  Drive Index = 1

  Maximum Spool Size = 100G

}

...

 

Storage {

  Name = backup3-sd

  Address = backup3.local

  Device = backup3-autochanger

  Media Type = File-backup3

  Autochanger = yes

}

 

Pool {

    Name = Disk-45Day-backup3

    Pool Type = Backup

    Recycle = yes

    AutoPrune = yes

    Job Retention = 45 days

    Volume Retention = 45 days

    Label Format = Disk-45Day-backup3-

    Storage = backup3-sd

    Maximum Volume Bytes = 100G

}






------------------------------------------------------------------------------
CenturyLink Cloud: The Leader in Enterprise Cloud Services.
Learn Why More Businesses Are Choosing CenturyLink Cloud For
Critical Workloads, Development Environments & Everything In Between.
Get a Quote or Start a Free Trial Today. 
http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk


_______________________________________________
Bacula-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-users


