Craig Tierney <[EMAIL PROTECTED]> writes:

> Allowing users to run for days or weeks as SOP is begging for failure.

Define failure. Our time limit is typically somewhere around 5 or 6 days. Many codes don't have checkpointing, and it's often simply not possible to add it because you don't have access to the source code. With backfill scheduling, short and narrow jobs typically don't have to wait *that* long, at least with the job mixture we see.

-- 
Leif Nixon - Systems expert
------------------------------------------------------------
National Supercomputer Centre - Linkoping University
------------------------------------------------------------
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf