Hi folks, and happy new year!

We have seen people posting about the kernel error message Lost_Interupt (or something similar) reasonably regularly. People usually post back saying 'your HD is about to die, replace it', and everyone then gets on with their lives.
I too have had this error, and assumed that the HD was just going bad (which was confirmed by doing a BIOS low-level format). No-one likes it when a HD dies, but everything has a useful life-span. However, I recently installed Debian onto a machine where this error again showed up, twice, on different drives (one oldish, one brand new, both from different manufacturers).

The larger, newer drive was the first to go - the drive made weird noises and spouted the error lots of times. When I got around to running a fsck on the drive, most of the contents had ended up in lost+found, and I basically had to write the installation off. I tried to convince the owner of said computer that it might have been a faulty drive (there is an identical one running in a Mandrake 6.1 machine which has had no problems).

I then took the smaller, older drive in the same machine and reinstalled onto that (I had been using this smaller drive as the root drive, with /home and /var on the larger, newer drive). After a few hours of reinstalling everything (the machine had not yet gone into production, so there was no backup), and installing the latest kernel (2.2.13; 2.2.12 had been running when the drive died), everything seemed happy.

I get back from my new-years camping trip, and about an hour later I get a call to say that the machine has died again, this time with the other (older) drive. Same symptoms - the owner turned off the computer when the drive went nuts (those who have heard it know it's not a pretty sound to hear from a delicate piece of electronics!), and although I have yet to see it in person, I'm not going to be surprised if this drive is cactus as well (this time we have a backup).

So, what do we have? Two drives - different sizes, different ages, different manufacturers, one the primary master, the other the secondary master. Two different kernels, both with CONFIG_IDEDMA_AUTO=n (which some have suggested might help), both with all the various IDE chipset workarounds enabled.
An extremely vanilla installation of Debian 2.1 (with all the latest official add-ons, and none of the non-official ones). We have an identical drive working flawlessly in both other Linux and Windows NT machines. The same machine (with the same drives) also ran with no problems under an NT installation. In both cases, the problem manifested after the machine had been running 24/7 for a few weeks (not sure exactly how long, but at least 14 days).

It is *possible* that both drives were faulty and about to die anyway, but it looks very unlikely to me that they would both die in the same machine in the same way. Unfortunately I don't have the make and model of the motherboard with me, but it was running a Cyrix 266 chip in a fairly generic motherboard, the same combination we run in other machines with no problems that I am aware of. If I had to guess, I'd say that maybe the kernel didn't like the IDE controller (don't know the make/model of that either), but that sounds like a pretty lame excuse when other OSs didn't have any problem with it.

This machine was to be our new main server, running mail, dns, web, ppp, firewall, all the mod cons. I managed to successfully argue for running Debian, because if I was administering it, I wanted something I knew well. Of course, since we have never had a problem with any of our other RedHat or Mandrake boxes, Debian is being singled out as the culprit. I'm being told I should install RedHat and forget Debian, as it's the cause of all the woes in the world.

I'd be very surprised if anything particular to Debian was the problem. It's more likely to be a kernel issue, I would think, which means it's distribution independent. But if I don't come up with a solution soon, it's going to be back to RedHat (or worse still, NT)... Switching MBs/machines might be a solution, but the sad fact is that if I have to use a new MB, I'm going to be going back to a P100 or something, which is not an acceptable solution, as far as I'm concerned.
I *know* this issue has come up before, and I'm pretty sure no-one has suggested a plausible solution other than 'dump the hardware'. Should I just swap motherboards and go back to an underpowered machine (yes, it's all relative, I know, but I've had to fight to get good hardware for the Linux servers)? Is there a chance it's Debian related?

Any suggestions will be greatly appreciated.

cheers,
damon

-- 
Damon Muller ([EMAIL PROTECTED])  /  It's not a sense of humor.
* Criminologist                   /  It's a sense of irony
* Webmeister                      /  disguised as one.
* Linux Geek                      /        - Bruce Sterling