On Tue, Feb 3, 2009 at 9:33 PM, Marcin Cieslak <[email protected]> wrote:
> I don't like this approach. I have always preferred software that "fails
> fast". As soon as something is wrong - just abort with debugging information
> what went wrong.

I don't think failing fast is incompatible with crash-only software; the
point is rather that updating persistent storage should not be one big
monolithic operation that only gets called at regular intervals.

> I see some issues with the approach described in the paper. It assumes that
> the state saved is okay - I think that crashes occur _because_ internal
> state is inconsistent or wrong. Sure, you can dump internal state regularly
> for recovery - but it's like with backups - you never know which one is
> really clean and okay until you try to restore.
>
> I think that authors unnecessarily assume that software components are
> "black boxes" that need to be kept up at all costs. This is not the right

My reading was that the paper is more about countering the usual tendency
in software development: developers really don't like to think about things
going wrong, so they spend time on code that feels "positive", like save
routines, do as little stress testing as they can, and pay no regard to the
user's data when a programming error manifests. In contrast, if you're
focused on making things robust in the case of a crash, you are forced to
think about what can go wrong and how to ameliorate it.

I tend to see this as most appropriate for applications dealing with
transient data, e.g. editors, user-modified-website stuff, etc., where you
don't need guaranteed pristine data back to the beginning of time but where
having the last day's modifications recoverable (possibly with some risk of
corruption) is preferable to a program essentially saying "I've crashed.
Your recent data's gone. Deal with it. Here's a core dump for the developer
though."

> Sweeping problems under the carpet is not going to help much...

I agree, and I don't think it's remotely appropriate for most software, but
for niche applications it seems useful to concentrate on dealing with the
dust (problems) rather than maintaining that the next release will never
generate any more dust (that it'll be bug-free).

-- 
cheers, dave tweed__________________________
computer vision researcher: [email protected]
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot
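
To make the point above about persistent storage a bit more concrete, here is a minimal sketch of one way an editor-like program might persist each modification as it happens instead of relying on one big periodic save. It assumes an append-only journal with a CRC per record, so that recovery after a crash keeps whatever records are intact and counts the damaged ones. The names `append_edit` and `recover`, and the one-record-per-line file layout, are hypothetical illustrations, not anything taken from the paper or from this thread.

```python
import json
import os
import zlib


def append_edit(path, edit):
    """Append one edit to the journal and force it to disk.

    Each record is a single line: "<crc32 in hex> <json payload>".
    """
    payload = json.dumps(edit, sort_keys=True)
    checksum = zlib.crc32(payload.encode("utf-8"))
    with open(path, "a", encoding="utf-8") as journal:
        journal.write(f"{checksum:08x} {payload}\n")
        journal.flush()
        os.fsync(journal.fileno())


def recover(path):
    """Return (good_edits, bad_record_count) from whatever survived a crash."""
    good, bad = [], 0
    if not os.path.exists(path):
        return good, bad
    with open(path, "r", encoding="utf-8", errors="replace") as journal:
        for line in journal:
            checksum, _, payload = line.rstrip("\n").partition(" ")
            try:
                ok = int(checksum, 16) == zlib.crc32(payload.encode("utf-8"))
                edit = json.loads(payload) if ok else None
            except ValueError:
                # Unparseable checksum or payload: treat as a damaged record.
                ok, edit = False, None
            if ok:
                good.append(edit)
            else:
                bad += 1
    return good, bad
```

On startup such a program would always call `recover` and replay the good edits, so the recovery path is the normal path and gets exercised on every run. A non-zero bad count is the "some risk of corruption" case: the user can be told that some recent changes were lost, rather than losing everything since the last full save.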
