This may be a hardware problem and not a Linux problem, but if so I'd
like to be able to find out.
I have a SCSI hard drive which is the only SCSI device currently in my
box. It appears to function properly in most respects [setup details
below], except that under certain circumstances it will suddenly hang my
entire system. Sort of a bummer.
The problem first manifested itself as I was trying to copy a very large
directory from /usr to the SCSI drive. Since then I've tried several
experiments to track down the nature of the problem with only limited
success.
One thing I tried was to tar -cvf my entire /usr directory onto the SCSI
drive, which produced a hang after the first few minutes. tar -czvf
went fine. Copying the resulting tgz file into a duplicate of itself
within said SCSI drive also worked okay. When I tried to untar it,
though, my machine froze.
e2fsck comes back clean every time, so I decided to try a surface scan.
badblocks doesn't find anything wrong in read-only mode, but when I try
it in read-write mode I get the dreaded system freeze again. The "good"
news is that the freeze is predictable. Here is a hand-copied
transcript of what happens when I try it:
# badblocks -o sdablox0 -vw /dev/sda 2097136
Flushing buffers
Checking for bad blocks in read-write mode
From block 0 to 2097136
Writing pattern 0x99999999: done
Flushing buffers
Reading and comparing: done
Flushing buffers
Writing pattern 0x55555555: done
Flushing buffers
Reading and comparing: done
Flushing buffers
Writing pattern 0xffffffff:
At this point the machine appears to slow way down, and finally hangs
completely. I don't get any errors in /var/log/messages, and the output
file (sdablox0 above) always comes out blank.
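Since nothing reaches /var/log/messages, one way to see how far the writes get before the hang might be a small probe that logs its own progress to a file on a different disk. The following is only a sketch: write_probe is a hypothetical helper of my own, not part of badblocks, and the 1 MiB chunk size is an arbitrary assumption. It writes 0xff bytes (the pattern in progress when the hang occurs) and calls sync around each chunk so the log survives a freeze.

```shell
#!/bin/sh
# Sketch only: write_probe is a hypothetical helper, not part of badblocks.
# Usage: write_probe TARGET CHUNKS LOGFILE
# Writes CHUNKS chunks of 1 MiB, all 0xff bytes, to TARGET. Before each
# chunk it appends a line to LOGFILE and calls sync, so the last line in
# the log names the chunk that was in progress if the machine hangs.
# WARNING: this overwrites TARGET. Keep LOGFILE on a different disk.
write_probe() {
    target=$1
    chunks=$2
    logfile=$3
    pattern=${TMPDIR:-/tmp}/write_probe_pattern.$$

    # Build one MiB of 0xff bytes (LC_ALL=C so tr works on raw bytes).
    dd if=/dev/zero bs=1024 count=1024 2>/dev/null |
        LC_ALL=C tr '\000' '\377' > "$pattern"

    i=0
    while [ "$i" -lt "$chunks" ]; do
        echo "starting chunk $i" >> "$logfile"
        sync
        # conv=notrunc keeps earlier chunks when TARGET is a regular file.
        dd if="$pattern" of="$target" bs=1048576 seek="$i" \
            conv=notrunc 2>/dev/null || break
        sync
        i=$((i + 1))
    done
    rm -f "$pattern"
}
```

For example, `write_probe /dev/sda 2048 /root/probe.log` would cover the whole 2 GB drive and destroy its contents, much as badblocks -w does; the log path here is just an illustration.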
I've seen problems similar to this one with improperly terminated
devices, but according to the docs for my hard drive, I have it
terminated properly. I also have termination enabled on the card.
One other oddity: after running badblocks in read/write mode, fdisk will
have problems looking at the drive. The 'p' command will peg my
processor load and never come back unless I issue a break. The 'n'
command complains that I already have too many partitions and I must
delete some before I can continue. If I delete partitions 4, 3, 2, and
1, fdisk then functions normally. Before this little adventure started,
I had one partition on the drive which took up all the available space.
Setup info:
Zeos P133, 64 MB memory, Red Hat Linux 5.1, kernel version 2.0.34
BusLogic SCSI controller (model number unknown; I can check if it's
important)
Conner 2 GB hard drive, model CFP2107S, seen by fdisk as follows:
Disk /dev/sda: 64 heads, 32 sectors, 2048 cylinders
Units = cylinders of 2048 * 512 bytes
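For what it's worth, the block count I gave badblocks can be checked against the geometry above. This is a minimal sketch, assuming 512-byte sectors (as the Units line suggests) and 1024-byte badblocks blocks (an assumption on my part, since I passed no -b option):

```shell
#!/bin/sh
# Geometry as reported by fdisk above.
heads=64
sectors_per_track=32
cylinders=2048
sector_bytes=512
# Assumption: badblocks counts in 1024-byte blocks (no -b was given).
block_bytes=1024

total_sectors=$((heads * sectors_per_track * cylinders))
# Divide rather than multiply out the byte count, to keep the numbers
# small enough for shells with 32-bit arithmetic.
sectors_per_block=$((block_bytes / sector_bytes))
total_blocks=$((total_sectors / sectors_per_block))
total_mib=$((total_sectors / (1048576 / sector_bytes)))

echo "sectors: $total_sectors"
echo "blocks:  $total_blocks"
echo "MiB:     $total_mib"
echo "blocks minus count passed to badblocks: $((total_blocks - 2097136))"
```

By this arithmetic the geometry works out to 2097152 blocks, so the 2097136 I passed to badblocks is 16 blocks short of that figure.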
I'm stumped on this one. Like I said, I suspect faulty hardware, but I
have no way of being sure. Any hints on where to go from here would be
great.
Thanks,
m