While not strictly Clojure-related, I thought I'd share this idea with you
here because (1) I came up with it while thinking about design from a
Clojure / functional point of view and (2) I respect your opinion. It's
very likely you'll have better ideas...
*I'm in a highly constrained situation*
- When an Incident occurs (possible bug, bad behavior), I'm told days
  later and I never have access to the machine where the code ran
- I only have the following
  - a log file with size constraints (no more than n KB)
  - the version of the code that was running
- I don't have the following
  - core dumps (they'd be bigger than the n KB anyway, plus no one
    knows a priori when to persist one for Incidents where the code fully
    believed it was doing fine)
  - complete info on how to re-create the environment (only partial
    info)
    - since the code can be running on any kind of machine with any kind
      of configuration
    - and since there are a lot of other applications of various
      versions running as well
    - besides, even if I had complete info, actually re-creating such
      an environment would be very time consuming and error prone
Figuring out what went wrong has been *painful*.
But if I had access to all the values that a program *obtained/received*
from its environment leading up to the Incident then I could just have my
program use these values while running in a debugger.
*The basic idea is*
1. Log *external* values used by the program over time in production.
Don't worry about internal / local values since they are all derived
functionally from these external values.
2. When an Incident occurs, load this log and a final time-stamp into
the program's "state map"
3. Any time the program needs a value from the outside, it uses a value
from the state map instead
4. Set breakpoints and debug away (I'm stuck using C++ (sadness!))
I like this because minimal time is spent re-creating the crime scene. I
just have to tweak the program to start the task / thread in question after
it's done loading the state. I won't have to ask QA "do you have a test
environment where this problem is reproducible?" And I won't be making any
mistakes in reproducing the Incident because all the values used will be
loaded in an automated fashion.
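
A minimal C++ sketch of steps 1 to 3, to make the idea concrete. Everything named here (`StateMap`, `Entry`, `get`, the string-serialized values, the logical timestamps) is a hypothetical illustration, not existing code. It assumes the log is in time order and that every external read can be funneled through one function.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// One logged observation of an external value (hypothetical layout).
struct Entry {
    std::uint64_t time;  // logical timestamp of the read
    std::string key;     // name of the external value
    std::string value;   // serialized value as the program saw it
};

// Hypothetical state map: records external reads in production and
// replays them from a loaded log when debugging.
class StateMap {
public:
    // Production mode: start with an empty log and record every read.
    StateMap() : replay_(false) {}

    // Debug mode: load entries up to and including final_time.
    // Assumes `log` is ordered by time.
    StateMap(const std::vector<Entry>& log, std::uint64_t final_time)
        : replay_(true) {
        for (const Entry& e : log)
            if (e.time <= final_time) values_[e.key].push_back(e);
    }

    // Every external access goes through here. `fetch` touches the real
    // environment and is only called in production mode.
    std::string get(const std::string& key, std::uint64_t now,
                    const std::function<std::string()>& fetch) {
        if (replay_) return replayed(key, now);
        std::string v = fetch();
        log_.push_back(Entry{now, key, v});
        return v;
    }

    const std::vector<Entry>& log() const { return log_; }

private:
    // Latest recorded value for `key` at or before `now`.
    std::string replayed(const std::string& key, std::uint64_t now) const {
        auto it = values_.find(key);
        if (it != values_.end()) {
            const std::vector<Entry>& history = it->second;
            for (auto r = history.rbegin(); r != history.rend(); ++r)
                if (r->time <= now) return r->value;
        }
        throw std::runtime_error("no recorded value for " + key);
    }

    bool replay_;
    std::vector<Entry> log_;
    std::map<std::string, std::vector<Entry>> values_;
};
```

In a debugger session the program would construct the `StateMap` from the shipped log and the Incident's final time-stamp, so `fetch` is never invoked and the code under inspection sees exactly what it saw in production. Throwing on a missing key is one possible choice; it makes gaps in the log obvious instead of silently reading the debugging machine's environment.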
*Considerations*
- Since values may be large, I may have to tweak the logging to enable
  re-using a value from earlier if it hasn't changed instead of logging it
  all over again. (current / expired / re-use)
  - The program would only check whether a value has changed once that
    value has "expired". A value expires when the task it's related to
    finishes. When that task starts up again, the program checks the map
    for the value it needs, finds that it has expired, and fetches the
    current one from the environment. Then it can decide how to log it
    in the state log.
- I have to make sure every value I need would be within the most
  recent n KB of log. I may have a separate thread that logs a snapshot of
  the entire state every n KB.
- I'm forced to change "every" external access into a conditional that
  checks the state map first.
  - sections of code can always opt out of this as long as
    - I don't think I'll need to debug it, esp. if it's been working fine
      for months
    - it's basically separate from the rest of the code (i.e. it won't
      be involved in re-creating any Incidents in other code)
- State maps make the code more test-able since I can make the
  program "see" any kind of arbitrary weirdness.
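
The first and third considerations (re-using unchanged values, and keeping every needed value inside the most recent n KB) could look roughly like the C++ sketch below. `BoundedStateLog`, its `key=value` line format, and the "snapshot after half the capacity has been written" rule are all assumptions made for illustration. It also snapshots inline rather than from a separate thread, purely to keep the example short, and it assumes keys never contain `=` and that one full snapshot fits in half the capacity.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical size-capped state log. Unchanged values are not logged
// again, and the full current state is re-logged periodically so that the
// retained tail still contains every current value.
class BoundedStateLog {
public:
    explicit BoundedStateLog(std::size_t capacity) : capacity_(capacity) {}

    // Record a value just read from the environment.
    // Returns true if a line was written, false if the value was unchanged.
    bool record(const std::string& key, const std::string& value) {
        auto it = current_.find(key);
        if (it != current_.end() && it->second == value) return false;
        current_[key] = value;
        append(key + "=" + value);
        // Snapshot once half the capacity has been written since the last one.
        if (since_snapshot_ >= capacity_ / 2) snapshot();
        return true;
    }

    // Rebuild the state map from the retained lines only, as a debugging
    // session would have to do from the shipped log file.
    std::map<std::string, std::string> restore() const {
        std::map<std::string, std::string> state;
        for (const std::string& line : lines_) {
            std::size_t eq = line.find('=');
            state[line.substr(0, eq)] = line.substr(eq + 1);
        }
        return state;
    }

    std::size_t bytes() const { return bytes_; }

private:
    // Append one line and drop the oldest lines while over capacity.
    void append(const std::string& line) {
        lines_.push_back(line);
        bytes_ += line.size();
        since_snapshot_ += line.size();
        while (bytes_ > capacity_ && !lines_.empty()) {
            bytes_ -= lines_.front().size();
            lines_.pop_front();
        }
    }

    // Re-log every current value so older lines can be discarded safely.
    void snapshot() {
        for (const auto& kv : current_) append(kv.first + "=" + kv.second);
        since_snapshot_ = 0;
    }

    std::size_t capacity_;
    std::size_t bytes_ = 0;
    std::size_t since_snapshot_ = 0;
    std::deque<std::string> lines_;
    std::map<std::string, std::string> current_;
};
```

The point of the sketch is the trade-off: snapshots cost log space, so a value that changes rarely is written once per snapshot interval instead of once per read, while a value that never changes again still survives truncation.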
I'm very interested to know what you think of this. It does smell
heavy-handed to me -- but having something like it would alleviate a ton of
pain... It could be worth it. Thanks in advance for any feedback.
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en