On 16/1/20 3:24 pm, Lux, Jim (US 337K) via Beowulf wrote:
> What I’m interested in is the idea of jobs that, if spread across many nodes (dozens) can complete in seconds (<1 minute) providing essentially “interactive” access, in the context of large jobs taking days to complete. It’s not clear to me that the current schedulers can actually do this – rather, they allocate M of N nodes to a particular job pulled out of a series of queues, and that job “owns” the nodes until it completes. Smaller jobs get run on (M-1) of the N nodes, and presumably complete faster, so it works down through the queue quicker, but ultimately, if you have a job that would take, say, 10 seconds on 1000 nodes, it’s going to take 20 minutes on 10 nodes.
But doesn't that depend a lot on what the user asks for, or am I misunderstanding?
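As a rough check on the numbers in the quoted example: assuming perfect linear scaling (which real jobs rarely achieve), 10 seconds on 1000 nodes is 10,000 node-seconds of work, so 10 nodes would need about 1000 seconds, roughly 17 minutes, in line with the "20 minutes" estimate. A minimal Python sketch of that arithmetic; the function name is made up for illustration and the scaling model is an assumption, not something any scheduler reports:

```python
def ideal_runtime_seconds(base_seconds: float, base_nodes: int, nodes: int) -> float:
    """Estimate runtime on `nodes` nodes, ASSUMING perfect linear scaling.

    Total work is treated as a fixed number of node-seconds, so halving
    the node count doubles the runtime. Real jobs usually scale worse.
    """
    if base_nodes <= 0 or nodes <= 0:
        raise ValueError("node counts must be positive")
    node_seconds = base_seconds * base_nodes  # total work in node-seconds
    return node_seconds / nodes


# The case from the quoted message: 10 s on 1000 nodes, rerun on 10 nodes.
runtime = ideal_runtime_seconds(10, 1000, 10)
print(f"{runtime:.0f} s = {runtime / 60:.1f} min")
```

So the wall-clock time the user sees depends directly on how many nodes the job requests and is granted.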
All the best,
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA