Hello List,

we're running a heterogeneous cluster (x86_64 only, but a lot of different node types with 8 to 64 HW threads and 1 to 4 GPUs). Our processing power (for our main application, at least) comes exclusively from the GPUs, so cons_tres looks quite promising: depending on the size of the job, request an appropriate number of GPUs. Of course, some CPUs have to be requested as well -- ideally distributed evenly among the GPUs (e.g. 10 per GPU on a 20-core, 2-GPU node; 16 per GPU on a 64-core, 4-GPU node). One could create separate partitions for the different node types and submit individual jobs with CPU requests tailored to one such partition, but I'd prefer a more flexible approach where a given job can run on any node that is large enough.
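To make that concrete, here is roughly what the partition-based approach would look like (the partition names gpu2/gpu4 and the script name job.sh are just placeholders, and the CPU counts have to be hard-coded per node type):

    # 20-core, 2-GPU nodes
    sbatch -p gpu2 --gres=gpu:2 --cpus-per-gpu=10 job.sh
    # 64-core, 4-GPU nodes
    sbatch -p gpu4 --gres=gpu:4 --cpus-per-gpu=16 job.sh

(One could also set DefCpuPerGPU on each partition in slurm.conf instead of passing --cpus-per-gpu explicitly.) What I'm after is a single submission along the lines of "N GPUs plus an even share of whatever CPUs the node has", independent of which node type the job ends up on.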
Is there anyone with a similar setup? Any config options I've missed, or do you have a work-around?

Thanks,
A.

--
Ansgar Esztermann
Sysadmin Dep. Theoretical and Computational Biophysics
http://www.mpibpc.mpg.de/grubmueller/esztermann