weakly correlated with failure. However, of all the disks that failed, less
than half (around 45%) had ANY of the "strong" signals and another 25% had
some of the "weak" signals. This means that over a third of disks that
failed gave no appreciable warning. Therefore even combining the variables
would give no better than a 70% chance of predicting failure.
well, a factorial analysis might still show useful interactions.
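For what it's worth, the 70% ceiling quoted above is just the sum of the two disjoint groups of failed disks. A minimal sketch of that arithmetic, using the rounded figures from the thread (the variable names are mine, not from the study):

```python
# Rounded figures quoted in the thread, as fractions of all FAILED disks.
strong_signal = 0.45  # failed disks that showed any "strong" signal
weak_only = 0.25      # "another 25%": failed disks with only "weak" signals

# The groups are disjoint, so their shares add. This is the best recall
# any predictor built only on these signals could reach.
detectable = strong_signal + weak_only

# Whatever is left failed with no signal at all.
silent = 1.0 - detectable

print(f"best-case recall: {detectable:.0%}")
print(f"failed with no warning: {silent:.0%}")
```

Note this is only a bound on recall (failures caught). It says nothing about false alarms, and a factorial analysis could improve precision within the 70% but cannot raise the ceiling itself.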
number of disks. For example, among the disks that failed, many had a large
number of seek errors; however, over 70% of disks in the fleet -- failed and
working -- had a large number of seek errors.
was there any trend across time in the seek errors?
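The seek-error point above is a base-rate problem, and Bayes' rule makes it concrete. In this sketch only the 70% fleet-wide figure comes from the thread; the share of failed disks with many seek errors (0.90) and the overall failure rate (0.05) are made-up placeholders, since the thread gives neither.

```python
def posterior_failure(p_signal_given_fail, p_fail, p_signal):
    """Bayes' rule: P(fail | signal) = P(signal | fail) * P(fail) / P(signal)."""
    return p_signal_given_fail * p_fail / p_signal

p_signal = 0.70             # from the thread: share of the whole fleet with many seek errors
p_signal_given_fail = 0.90  # ASSUMPTION: the thread only says "many" failed disks had them
p_fail = 0.05               # ASSUMPTION: illustrative failure rate, not from the thread

p = posterior_failure(p_signal_given_fail, p_fail, p_signal)

# Lift: how much seeing the signal raises the failure probability over the base rate.
lift = p / p_fail

print(f"P(fail | many seek errors) = {p:.3f}")
print(f"lift over base rate = {lift:.2f}x")
```

Because the signal shows up on 70% of all disks, the lift can never exceed 1/0.70, roughly 1.43x, no matter how common the signal is among failed disks.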
So that's our master plan. Just don't tell anyone. :)
hah. well, if it were me, the M.P. would involve some sort of proactive
treatment: say, a full-disk read once a day. smart self-tests _ought_
to be more valuable than that, but otoh, the vendors probably munge the
measurements pretty badly.
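A rough sketch of what a full-disk read pass could look like, assuming a plain sequential read is all that's wanted. The function name and chunk size are hypothetical, and this does not bypass the page cache, which a real scrub would likely need to do.

```python
def full_read(path, chunk_size=1 << 20):
    """Read `path` front to back; return (final_offset, bad_offsets).

    bad_offsets lists the starting offset of every chunk whose read
    raised OSError (e.g. an unreadable sector on a block device).
    """
    offset = 0
    bad_offsets = []
    # buffering=0 gives raw, unbuffered reads from Python's side.
    with open(path, "rb", buffering=0) as dev:
        while True:
            try:
                chunk = dev.read(chunk_size)
            except OSError:
                # Record the failing chunk, skip past it, and keep going.
                bad_offsets.append(offset)
                offset += chunk_size
                dev.seek(offset)
                continue
            if not chunk:
                break  # end of device or file
            offset += len(chunk)
    return offset, bad_offsets
```

Run once a day from cron against a device path (needs read permission on the device), any non-empty bad_offsets list would be the thing to alert on.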
regards, mark hahn.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf