weakly correlated with failure. However, of all the disks that failed, fewer than half (around 45%) had ANY of the "strong" signals and another 25% had some of the "weak" signals. This means that over a third of disks that failed gave no appreciable warning. Therefore even combining the variables would give no better than a 70% chance of predicting failure.

well, a factorial analysis might still show useful interactions.


number of disks. For example, among the disks that failed, many had a large number of seek errors; however, over 70% of disks in the fleet -- failed and working -- had a large number of seek errors.
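
A small Bayes' rule sketch may make the base-rate point concrete. Only the 70% fleet-wide figure comes from the text above; the 90% and 5% values and the function names are made up purely for illustration and are not from the study.

```python
def failure_probability_given_signal(p_signal_given_failed, p_signal_overall, p_failed):
    """Bayes' rule: P(failed | signal) = P(signal | failed) * P(failed) / P(signal)."""
    return p_signal_given_failed * p_failed / p_signal_overall


def lift(p_signal_given_failed, p_signal_overall):
    """How much seeing the signal multiplies the prior failure probability."""
    return p_signal_given_failed / p_signal_overall


# From the text: over 70% of the whole fleet shows many seek errors.
P_SIGNAL_OVERALL = 0.70
# ASSUMPTION (hypothetical): 90% of failed disks showed many seek errors.
P_SIGNAL_GIVEN_FAILED = 0.90
# ASSUMPTION (hypothetical): 5% of disks fail in the period of interest.
P_FAILED = 0.05

posterior = failure_probability_given_signal(
    P_SIGNAL_GIVEN_FAILED, P_SIGNAL_OVERALL, P_FAILED
)
signal_lift = lift(P_SIGNAL_GIVEN_FAILED, P_SIGNAL_OVERALL)

print(f"prior failure probability:     {P_FAILED:.3f}")
print(f"posterior given seek errors:   {posterior:.3f}")
print(f"lift from observing the signal: {signal_lift:.2f}x")
```

With these made-up inputs the lift is about 1.29x: a signal that most healthy disks also show barely moves the estimate, however common it is among the failures.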

was there any trend across time in the seek errors?


So that's our master plan.  Just don't tell anyone. :)

hah.  well, if it were me, the M.P. would involve some sort of proactive
treatment: say, a full-disk read once a day. smart self-tests _ought_ to be more valuable than that, but otoh, the vendors probably munge the measurements pretty badly.
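
A minimal sketch of what such a full-disk read could look like, assuming it runs as root against a block device path (something like /dev/sdX, a placeholder) from a daily cron job. The function name, chunk size and error limit are illustrative choices, not anything the study or a vendor prescribes.

```python
def full_read(path, chunk_size=1024 * 1024, max_errors=100):
    """Read `path` sequentially from start to end.

    Returns (bytes_read, error_offsets): the number of bytes read
    successfully and the offsets of chunks that raised an I/O error.
    """
    bytes_read = 0
    offset = 0
    error_offsets = []
    # buffering=0 avoids Python-level buffering. The OS page cache can
    # still satisfy reads, so this is not a guaranteed media scan.
    with open(path, "rb", buffering=0) as dev:
        while True:
            try:
                chunk = dev.read(chunk_size)
            except OSError:
                # Note the unreadable chunk and skip past it.
                error_offsets.append(offset)
                if len(error_offsets) >= max_errors:
                    break  # give up instead of looping on a dead device
                offset += chunk_size
                dev.seek(offset)
                continue
            if not chunk:
                break  # end of device or file
            bytes_read += len(chunk)
            offset += len(chunk)
    return bytes_read, error_offsets
```

Calling `full_read("/dev/sdX")` and logging the returned error offsets each day would give a crude trend to watch. Skipping a whole chunk after an error is a simplification; a more careful scan would retry the failed region in smaller pieces.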

regards, mark hahn.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
