Michael P. Soulier said:
> Would not desirable behaviour be to log as many errors as possible, but
> recover from the hardware problem? I see no reason why any software, user
> space or kernel space, should crash due to errors in a peripheral. Bad
> RAM is one thing, but errors on a CD? I disagree. This is incorrect
> behaviour for any OS.

To the system it's not an error on the CD. It's getting flooded with I/O errors from the disk controller. The system usually tries to cope by resetting the controller and trying again, but it reaches a point where the controller is screwed and the system stops responding. This is beyond the control of the software. Linux is not alone; I've crashed at least half a dozen different Linux/Unix and non-Unix systems doing the exact same thing. In my case it's always been due to a CD-R disc. All my CDs are very clean, it's just that sometimes a CD-ROM drive freaks out when reading very large files from CD-R media.

Think of it this way: the software is at the mercy of the hardware. The software cannot prevent you from pulling the power plug, it cannot prevent a disk failure, and it cannot prevent I/O errors. There are some things it can do to try to work around the problem, but PC hardware is so limited that sometimes all the workarounds fail and the system crashes.

Want to hear something that really sucks? Recently at my former company, one of the RAID systems I built: a 6 x 80GB RAID 10 hardware array connected to a 3ware 8-port RAID card. So this is hardware RAID, transparent to the OS. Every single time a disk fails it crashes the system (kernel panic). There is no reason for it to crash. The disk failure should be handled transparently in the background by the controller; the OS doesn't care if one of the 2 disks in a RAID 1 mirror fails, the data is still there in its entirety on the 2nd disk. So why does it crash every time? At the mercy of the hardware, buggy hardware. I worked with 3ware and my vendor for 6 months last year to try to fix these things by changing HDD brands, upgrading power supplies, etc., and a year later the problem is still there. fuckin 3ware.

Now on good hardware, perhaps some high-end Sun or RS/6000 stuff where there is a lot more redundancy (e.g. multiple independent PCI buses, multiple disk controllers, multiple redundant CPUs), the software has much more flexibility in preventing a complete failure when hit with such a situation. But even then, someone blowing holes in the front of the system with a shotgun is gonna be beyond the control of the software to prevent a crash :)

nate
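
P.S. For anyone who wants the "reset the controller and try again" idea spelled out, here is a rough sketch in Python. It is only a toy simulation, not real driver code: FlakyController, read_with_reset, ControllerDead and the max_resets limit are all made-up names for illustration, and a real kernel driver is far more involved. The point is just that the retry loop has to give up at some stage, and what happens after that depends on what the hardware leaves the software to work with.

```python
class ControllerDead(Exception):
    """Raised when resets no longer bring the simulated controller back."""


class FlakyController:
    """Hypothetical stand-in for a disk controller (not a real driver API)."""

    def __init__(self, failures_before_success):
        # Number of reads that will fail before one finally succeeds.
        self.failures_left = failures_before_success
        self.resets = 0

    def read(self, block):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise OSError(f"I/O error reading block {block}")
        return f"data-{block}"

    def reset(self):
        # A real reset would reinitialise the hardware; here we only count it.
        self.resets += 1


def read_with_reset(ctrl, block, max_resets=3):
    """Try a read; on an I/O error reset the controller and retry.

    Gives up with ControllerDead once max_resets resets have not helped.
    """
    for attempt in range(max_resets + 1):
        try:
            return ctrl.read(block)
        except OSError:
            if attempt == max_resets:
                break  # out of resets, nothing more the software can do
            ctrl.reset()
    raise ControllerDead(f"gave up on block {block} after {max_resets} resets")
```

With a drive that only hiccups a couple of times, read_with_reset(FlakyController(2), 7) gets the data back after two resets. With FlakyController(10) it burns through its three resets and raises ControllerDead, which is the toy version of "the controller is screwed".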