On Thursday 01 March 2012 18:22:12 Rayson Ho wrote:

> Torque has job preemption. I have not used Torque for a while (I
> used to be an OpenPBS user) so I am not the best person to answer
> the question for Torque. However, if you google "job
> preemption"+torque, you should be able to find some useful info.

It also integrates with BLCR (Berkeley Lab Checkpoint/Restart) to provide kernel-level checkpointing support for Linux, including for MPI applications if you are using Open-MPI.

Useful links:

BLCR:
https://ftg.lbl.gov/projects/CheckpointRestart/

Open-MPI and BLCR:
http://www.open-mpi.org/faq/?category=ft

Torque and checkpoint/restore and BLCR:
http://www.clusterresources.com/torquedocs/2.6jobcheckpoint.shtml

Hope this helps!
Chris
--
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computation Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/