On Sat, May 15, 2010 at 07:33:08AM -0700, Skylar Thompson wrote:

> I'm not quite sure I understand what you're doing, but if you make all
> your execution hosts submit hosts as well you can submit jobs within
> your running jobs. You can use "-now y -sync y" in your jobs to ensure

Yes, that's what I did with SGE, that part works fine.

SGE's other behaviors often leave much to be desired. E.g.,
"reschedule_unknown". By default, SGE marks a node as down only when
the node's execd daemon comes back *up*! So if the node hits a kernel
oops, reboots, and successfully restarts its execd, everything is
fine - SGE notices that the machine crashed, and reschedules whatever
job was running on it at the time. But if the node just stays down
permanently, or worse, if it goes entirely catatonic, SGE *never*
considers the node down, and will *never* reschedule the job
elsewhere! The job remains in limbo indefinitely until some human
intervenes.

Of course there is a setting to make SGE behave in a more sane way,
it's called "reschedule_unknown". It basically defines a timeout,
where if SGE can't get a response from a node within that time, SGE
restarts that node's jobs elsewhere.

This was all exceedingly non-obvious. I only figured it out by
reading Templeton's detailed "FridayTutorial.pdf" slides discussing
many practical aspects of SGE, which unfortunately have since
vanished from the web:

  http://www.globusworld.org/documents/FridayTutorial.pdf
  http://gridengine.sunsource.net/nonav/source/browse/~checkout~/gridengine/doc/htmlman/htmlman5/sge_conf.html?pathrev=V62u2_TAG

Unfortunately, even after the reschedule_unknown fix I still see
occasional job lockups with SGE, where my master process stalls
indefinitely until I manually notice and tell SGE to kill and restart
some hung child job. I haven't yet sunk the debugging time into
figuring out just what the heck is really going on there. (And it
could well be something that's not SGE's fault at all, of course.)

That isn't the only snafu I've had with SGE, just one of the more
memorable ones. I am by no means an SGE expert, nor even a
particularly experienced user, but it has mostly struck me as klunky
and rather programmer-unfriendly.

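To make the timeout idea concrete, here is a minimal Python sketch of
the decision that "reschedule_unknown" is described as making above.
It is not SGE's code or API: the Job class, the last_heard map, and
jobs_to_reschedule() are hypothetical names invented for
illustration, and treating a zero timeout as "never reschedule" and a
never-seen host as silent are assumptions of the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Job:
    """Hypothetical record of a running job and the node it runs on."""
    job_id: int
    host: str


def jobs_to_reschedule(jobs, last_heard, now, timeout):
    """Return the jobs whose host has been silent for longer than `timeout`.

    `last_heard` maps host name -> time of the last response from that host.
    Assumption: a zero (or negative) timeout disables rescheduling entirely,
    which models the "job stays in limbo forever" behavior.
    """
    if timeout <= timedelta(0):
        return []
    stale = []
    for job in jobs:
        seen = last_heard.get(job.host)
        # Assumption: a host we have never heard from counts as silent.
        if seen is None or now - seen > timeout:
            stale.append(job)
    return stale


if __name__ == "__main__":
    now = datetime(2010, 5, 15, 12, 0, 0)
    jobs = [Job(1, "node01"), Job(2, "node02")]
    last_heard = {
        "node01": now - timedelta(minutes=1),   # still responding
        "node02": now - timedelta(minutes=10),  # catatonic
    }
    for job in jobs_to_reschedule(jobs, last_heard, now, timedelta(minutes=5)):
        print(f"job {job.job_id} on {job.host} should be restarted elsewhere")
```

The point of the sketch is that the decision depends only on how long
a node has been silent, not on the node ever coming back up.
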
Basically, I ended up using SGE due to historical accident, and my
hands-on experience with it has encouraged me to take a step back and
evaluate other toolkit options.

> > 4. I really, really want a good API for programmably interacting with
> > the cluster scheduler and ALL of its features. I don't care too much

> I haven't looked at it much, but I think DRMAA will work for that in SGE.

Not as far as I could tell from reading the SGE docs a while back,
no. It looked as if DRMAA only covers a very limited subset of SGE's
functionality, not enough to cover the features I need. I did not
(yet) check the source to see how SGE's DRMAA support is implemented,
but the docs made it sound as if they were rolling it from scratch
rather than simply building on top of some clear pre-existing SGE
API.

> > 8. Of course the scheduler must have a good way to track all the basic
> > information about my nodes: CPU sockets and cores, RAM, etc. Ideally
> > it'd also be straightforward for me to extend the database of node

> SGE does this and can make it available as XML.

Which reminds me, I need to look harder to figure out WHERE exactly
SGE stores its node configuration data, and how I can perhaps extend
it with additional information, like the network topology between my
nodes. This is probably simple but it wasn't obvious from the
(voluminous) SGE docs.

-- 
Andrew Piskorski <a...@piskorski.com>
http://www.piskorski.com/
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf