On 12/8/09 9:22 AM, "james bardin" <jbar...@bu.edu> wrote:
> On Tue, Dec 8, 2009 at 10:50 AM, Prentice Bisbal <prent...@ias.edu> wrote:
>
>> You'd hope that. Most of my current cluster users are scientific
>> researchers in academia, not computer scientists. While some are
>> extremely computer savvy, others have learned just enough about
>> programming to do their calculations. Expecting the latter to write code
>> with checkpointing is unrealistic, and working in academia, I can't
>> force them to. Which is why taking down 4 nodes instead of just one is
>> less than ideal.
>>
>
> I find it's still advantageous to push them to learn it. A researcher
> working with a tight deadline for a grant will often see the light
> when a hardware failure loses them a month or more of data processing.
> It really is in their own best interests to learn about their tools.
What about some form of "image checkpoint", like hibernation? It should be
application-unaware and just snapshot the process's memory.
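
For comparison, the application-level checkpointing discussed above does not
have to be much code. Here is a minimal sketch in Python, not anyone's actual
job script: the file name, the state layout, and the `run`, `load_state`, and
`save_state` helpers are all made up for illustration, and the loop body is a
stand-in for a real calculation. It saves progress every N steps and resumes
from the last saved step on restart.

```python
import os
import pickle
import tempfile

CHECKPOINT = "state.pkl"  # hypothetical checkpoint file name


def load_state(path=CHECKPOINT):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0.0}


def save_state(state, path=CHECKPOINT):
    """Write to a temp file, then rename over the old checkpoint.

    The rename is atomic on the same filesystem, so a crash mid-write
    leaves the previous checkpoint in place rather than a partial file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def run(n_steps, path=CHECKPOINT, every=100):
    """Run (or resume) the calculation, checkpointing every `every` steps."""
    state = load_state(path)
    for step in range(state["step"], n_steps):
        state["total"] += step * 0.5  # stand-in for the real work
        state["step"] = step + 1
        if state["step"] % every == 0:
            save_state(state, path)
    save_state(state, path)  # final checkpoint at the end of the run
    return state["total"]
```

If the node dies partway through, rerunning the same script picks up at the
last saved step instead of step 0. An image-style checkpoint would avoid even
this much change to the user's code, at the cost of saving the whole memory
image rather than just the state that matters.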
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf