In message <[EMAIL PROTECTED]> you wrote:
>
> Then it sounds to me more like a bacula issue rather than the SCSI tape
> driver.
I disagree. We get pretty clear SCSI error messages (unexpected
disconnect). No matter what a user application does, the SCSI driver
must never run into such a situation. This is a SCSI driver problem.
> A problem in diagnosing it is that it is not reproducible. This could
> indicate a
The problem *is* reproducible. For me it happens pretty reliably. The
problem is that it takes a loooooong time - typically hours. And I have
to admit that I didn't find (or take) the time to really start
debugging it. Probably raising debug levels for the SCSI system would
be a good start, but I'm not convinced.
BTW: I wrote before that this happens only without spooling; this was
wrong. Scanning the logs, I've seen cases of this problem when
spooling was active, too.
> timing issue as you've pointed out so if a trace is set up to catch the
> villain
> the incident may not occur at all. What can we do?
Let's summarize the observed symptoms again:
* On user level we see error messages like these:
    Error: block.c:538 Write error at 39:5706 on device "SLR100" (/dev/nst0). ERR=Input/output error.
    Error: Error writing final EOF to tape. This Volume may not be readable.
    dev.c:1536 ioctl MTWEOF error on "SLR100" (/dev/nst0). ERR=Input/output error.
* On system level we see error messages like these:
    sym0: unexpected disconnect
    st0: Error 700ff (sugg. bt 0x0, driver bt 0x0, host bt 0x7).
    sym0: unexpected disconnect
    st0: Error 700ff (sugg. bt 0x0, driver bt 0x0, host bt 0x7).
    st0: Error with sense data: <6>st0: Current: sense key: Unit Attention
    Additional sense: Power on, reset, or bus device reset occurred
* It happens with different types of tape drives; for me with a SLR60
drive and 3 x SLR100 autoloaders.
* It happens with different types of SCSI controllers; for me with:
- LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI
- Adaptec aic7899 Ultra160 SCSI adapter
- Adaptec AHA-2940UW Ultra SCSI adapter
- Dawicontrol DC-29160 Ultra160 SCSI adapter
* It happens long before the tape is actually full.
* I never had any other kinds of I/O errors, only this "Error writing
final EOF"; this boils down to a MTIOCTOP ioctl() with op=MTWEOF
and count=1 - and this is probably the major difference to all
other tape tests I've tried: none of the other tools I use to write
to a tape (like tar etc.) actually write an EOF themselves; they just
close the tape device at the end of the write operations.
Maybe I'm going to write some test code for such a scenario - write
some buffers followed by an MTWEOF op...
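
A minimal sketch of what such a test could look like, assuming Linux and
the standard <sys/mtio.h> interface. The function name write_then_weof,
the fill pattern, and the return codes are made up for illustration; this
is not what bacula does internally, only the same MTIOCTOP/MTWEOF call
with count=1 issued after a run of plain write()s:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical test helper: write `nblocks` blocks of `blksz` bytes to
 * the device `dev`, then explicitly write one filemark via MTWEOF.
 *
 * Returns 0 on success,
 *         1 if open(), malloc() or write() failed,
 *         2 if the MTWEOF ioctl failed.
 */
int write_then_weof(const char *dev, long nblocks, size_t blksz)
{
    assert(dev != NULL && blksz > 0);

    int fd = open(dev, O_WRONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    char *buf = malloc(blksz);
    if (buf == NULL) {
        perror("malloc");
        close(fd);
        return 1;
    }
    memset(buf, 0xA5, blksz);           /* arbitrary fill pattern */

    for (long i = 0; i < nblocks; i++) {
        ssize_t n = write(fd, buf, blksz);
        if (n != (ssize_t)blksz) {
            fprintf(stderr, "write failed at block %ld: %s\n", i,
                    n < 0 ? strerror(errno) : "short write");
            free(buf);
            close(fd);
            return 1;
        }
    }

    /* The operation in question: MTIOCTOP with op=MTWEOF, count=1. */
    struct mtop op = { .mt_op = MTWEOF, .mt_count = 1 };
    int rc = 0;
    if (ioctl(fd, MTIOCTOP, &op) < 0) {
        perror("ioctl MTWEOF");
        rc = 2;
    }

    free(buf);
    close(fd);
    return rc;
}
```

Called from a small main() as, say, write_then_weof("/dev/nst0", 100000, 65536),
with a scratch tape loaded, this would have to run for hours to have a
chance of hitting the problem; the block count and block size here are
guesses, not values taken from the failing jobs.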
Best regards,
Wolfgang Denk
--
Software Engineering: Embedded and Realtime Systems, Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: [EMAIL PROTECTED]
The only way to learn a new programming language is by writing
programs in it. - Brian Kernighan