On 06/12/2012 03:47 PM, Skylar Thompson wrote:
> We manage this by having users run this in the same Grid Engine
> parallel environment they run their job in. This means they're
> guaranteed to run the sync job on the same nodes their actual job runs
> on. The copied files change so slowly that even on 1GbE network is
> rarely a bottleneck, since we only transfer files that are changed.

Our problem is that we have many users and don't want 50,000 30-minute jobs to turn into one giant job that defeats the priority system while it runs. With an array job, users can get 100% of the cluster if it's idle and quickly decay to their fair share when other, higher-priority jobs run. That way we can keep the cluster 100% utilized, while new jobs (from users using less than their fair share) still get through the queue (which might well be months long) quickly.

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf