Kern Sibbald wrote:
> > I wrote a mail to the -users list about problems with verify jobs,
> > that may or may not be hardware related.
> >
> > Now I have an additional question for the developers.
> >
> > My 2 original mails:
> > >> Now the VolumeToCatalog verify job fails each time. I tried two
> > >> different drives with the same result. The job does not always
> > >> fail at the same position or tape.
> > >>
> > >>
> > >> 16-Apr 14:58 VU0EA003-sd JobId 11277: Forward spacing Volume
> > >> "A00147L4" to file:block 0:1.
> > >> 16-Apr 17:46 VU0EA003-sd JobId 11277: Error: block.c:318 Volume data
> > >> error at 406:6128! Block checksum mismatch in block=26762981
> > >> len=64512: calc=dae26793 blk=b522f8d9
> > >>
> > >> 17-Apr 13:55 VU0EA003-sd JobId 11292: Forward spacing Volume
> > >> "A00141L4" to file:block 0:1.
> > >> 17-Apr 18:35 VU0EA003-sd JobId 11292: Error: block.c:318 Volume data
> > >> error at 657:11272! Block checksum mismatch in block=56358402
> > >> len=64512: calc=5d582a92 blk=63befe58
> > >>
> > >>
> > >>
> > >> There are no SCSI errors in the Linux syslog or errors in the
> > >> changer's system log.
> > >>
> > >> Any idea what to do next? It looks like a hardware problem, but why
> > >> does it then fail on different drives and different tapes and not at
> > >> the same position?
> > >
> > >Hm, I used btape with the option scanblocks on the tape where the
> > >verify job had the block checksum mismatch.
> > >
> > >[...]
> > >15500 blocks of 64512 bytes in file 822
> > >End of File mark.
> > >8243 blocks of 64512 bytes in file 823
> > >End of File mark.
> > >Total files=823, blocks=12749243, bytes = 822,479,100,004
> > >
> > >
> > >But this does not seem to check the block checksum that is checked
> > >during a verify.
> > >
> > >Is there another Bacula tool to check the block checksum?
>
> Since I don't know what version of Bacula you are using, nor exactly what
> commands produced the errors above, I can only respond in general.

Sorry, this was in my original mail to the -users list. It's Bacula 2.4.4
on Debian etch. The above errors occurred during 2 verify jobs of the same
backup job. Not the same tape, not the same drive.

> If you are getting block checksum errors, it means that the data that was read
> is not the same as the data that was written. I'd be very surprised if there
> are not SCSI errors noted in the log. I'd also be very surprised if there
> are not errors reported by the drive itself (you should definitely enable
> alert checking and run manual alert checks on your drive).

No SCSI errors in the Bacula, syslog or changer logs. Neither during backup
nor during verify.

> The first thing to do is to do a controlled backup (i.e. known files, small
> number of files). Verify that there are checksum errors. Restore the
> backup (check for checksum errors) and compare the files on disk versus the
> files restored.

The problem is that the backup job is ~10 TB large and the checksum errors
didn't occur at the same position or tape. So where do I start to be sure
that there was/is no problem? I have to think about it...

> > How can I check the block checksum of a tape (not the whole backup
> > job) with one of the bacula tools?
>
> Btape scanning reads blocks and does not look at the block data (e.g. the
> block checksum is in the block header).
>
> Checksum verification is almost certainly enabled with bextract, bcopy and
> bscan, bls, ... (in short any program that looks at the contents of the
> blocks), but that approach seems to me not to be very useful. What counts is
> whether you get the right data when you restore using Bacula.

I did a complete bscan of the tape where the checksum error occurred the
second time. No error this time.

[...]
bscan: bscan.c:410 Record: SessId=20 SessTim=1239118594 FileIndex=443 Stream=2 len=65536
18-Apr 11:56 bscan JobId 0: End of file 823 on device "ULTRIUM-TD4-D3" (/dev/ULTRIUM-TD4-D3), Volume "A00141L4"
18-Apr 11:56 bscan JobId 0: End of Volume at file 823 on device "ULTRIUM-TD4-D3" (/dev/ULTRIUM-TD4-D3), Volume "A00141L4"
bscan: bscan.c:323-0 ========== JobId=0 ========
18-Apr 11:56 bscan JobId 0: End of all volumes.
bscan: bscan.c:410 Record: SessId=0 SessTim=0 FileIndex=-6 Stream=0 len=0
bscan: bscan.c:637 End of all Volumes. VolFiles=823 VolBlocks=0 VolBytes=822,020,123,328

> > Is there a way to verify an older backup? AFAIK a verify job only
> > verifies against the last jobid. Now I have the problem, that in the
> > meantime an incremental job finished, so I can't verify the full
> > backup that had the block checksum errors.
>
> I believe that there is a way to enter the jobid for a "manual" verify as
> opposed to automatic verify -- read the manual.

Hm, I still think this was a limitation of the verify code, and I can't find
any way to tell Bacula to verify a given jobid.

Selection aborted, nothing done.
Run Verify job
JobName:     VerifyVU0EM003-FBR
Level:       VolumeToCatalog
Client:      VU0EA003-fd
FileSet:     VU0EM003-FBR
Pool:        2-Month-Full (From Job resource)
Storage:     Neo4100-LTO4-D2 (From Pool resource)
Verify Job:  VU0EM003-FBR
Verify List:
When:        2009-04-18 12:52:14
Priority:    10
OK to run? (yes/mod/no):

Thanks for your reply Kern,
Ralf

_______________________________________________
Bacula-devel mailing list
https://lists.sourceforge.net/lists/listinfo/bacula-devel
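
To compare where the verify jobs failed, the position fields can be pulled out of the storage daemon messages. What follows is a minimal Python sketch, assuming the error lines look exactly like the two quoted in this thread. The names `MISMATCH_RE`, `find_mismatches` and `SAMPLE` are made up for illustration and are not part of Bacula.

```python
import re

# Pattern derived only from the two error lines quoted in this thread;
# other Bacula versions may word the message differently (assumption).
MISMATCH_RE = re.compile(
    r"JobId (?P<jobid>\d+): Error: block\.c:\d+ Volume data error at "
    r"(?P<file>\d+):(?P<file_block>\d+)! Block checksum mismatch in "
    r"block=(?P<block>\d+) len=(?P<length>\d+): "
    r"calc=(?P<calc>[0-9a-f]+) blk=(?P<blk>[0-9a-f]+)"
)


def find_mismatches(log_text):
    """Return one dict per checksum-mismatch message found in log_text."""
    # Job log messages are often wrapped across lines, so collapse all
    # whitespace runs into single spaces before matching.
    flat = " ".join(log_text.split())
    results = []
    for m in MISMATCH_RE.finditer(flat):
        rec = {
            key: int(value)
            for key, value in m.groupdict().items()
            if key not in ("calc", "blk")
        }
        # Keep the two checksums as hex strings, as printed in the log.
        rec["calc"] = m.group("calc")
        rec["blk"] = m.group("blk")
        results.append(rec)
    return results


# The two error messages from the verify jobs, wrapped as in the mail.
SAMPLE = """
16-Apr 17:46 VU0EA003-sd JobId 11277: Error: block.c:318 Volume data
error at 406:6128! Block checksum mismatch in block=26762981
len=64512: calc=dae26793 blk=b522f8d9
17-Apr 18:35 VU0EA003-sd JobId 11292: Error: block.c:318 Volume data
error at 657:11272! Block checksum mismatch in block=56358402
len=64512: calc=5d582a92 blk=63befe58
"""

if __name__ == "__main__":
    for rec in find_mismatches(SAMPLE):
        print(rec)
```

Collecting these records over several verify runs makes it easier to see whether any file:block position or block number repeats, which is the question raised above about hardware versus media.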
