----- "Håkon Bugge" <h-bu...@online.no> wrote:

> What we did in Platform (Scali) MPI, was to drain
> the HPC interconnect, then close it down. The problem
> was then reduced to checkpoint (e.g. using BLCR)
> N processes.

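To make the ordering in the quoted text concrete, here is a toy Python sketch of "drain, close, then checkpoint each process on its own". It is only a simulation under my own assumptions: the `Rank` class, its methods and `coordinated_checkpoint` are hypothetical names, `pickle` merely stands in for a per-process checkpointer such as BLCR, and none of this is the actual Platform (Scali) MPI or Open MPI code.

```python
import pickle
from collections import deque


class Rank:
    """Toy stand-in for one MPI process (hypothetical, not a real MPI API)."""

    def __init__(self, rank_id):
        self.rank_id = rank_id
        self.inbox = deque()    # messages still "on the wire" to this rank
        self.received = []      # messages already delivered to the application
        self.link_open = True

    def send(self, dest, payload):
        if not self.link_open:
            raise RuntimeError("interconnect is closed")
        dest.inbox.append((self.rank_id, payload))

    def drain(self):
        # Deliver everything in flight so no state is left in the network.
        while self.inbox:
            self.received.append(self.inbox.popleft())

    def close_link(self):
        if self.inbox:
            raise RuntimeError("must drain before closing")
        self.link_open = False

    def checkpoint(self):
        # Stand-in for a single-process checkpointer (e.g. BLCR).
        return pickle.dumps({"rank": self.rank_id, "received": self.received})


def coordinated_checkpoint(ranks):
    """Drain all ranks, close all links, then checkpoint N independent processes."""
    for r in ranks:
        r.drain()
    for r in ranks:
        r.close_link()
    return [r.checkpoint() for r in ranks]


ranks = [Rank(i) for i in range(3)]
ranks[0].send(ranks[1], "a")
ranks[2].send(ranks[1], "b")
images = coordinated_checkpoint(ranks)
```

Once the links are closed, each image depends only on its own process state, which is why the problem reduces to N ordinary single-process checkpoints.
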
I suspect this is what Open-MPI does too, but I don't know if the
VM-based systems can migrate such jobs without this application-layer
support.

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf