> Mind you, I'm a huge fan of small clusters under a single person's control,
> where nobody is watching to see if you are making 'effective utilization'
> and you can do whatever you want. A personal supercomputer, as it were.
> But I recognize that for much of the HPC world, clusters are managed in the
> same way as big iron mainframes were in the 70s,
I think you're being a bit disingenuous here. dedicated/personal clusters are perfectly sensible when the workload is non-bursty or otherwise high-duty-cycle, or when you're talking about resources cheap enough to hand out like pencils (that is, let's be honest: cheap enough to waste). a larger, shared resource pool is ideal for bursty/low-duty-cycle environments. as far as I can see, there are really only a handful of problems with this:

- many people and most environments have a mixture of burstiness.
- schedulers are not awesome at managing latency of either flavor when both are mixed, especially in the presence of poorly specified resource requirements (bad runtime estimates, inaccurate memory requests, etc.)
- resource granularity becomes even more of a problem: serial jobs "contaminate" nodes that parallel jobs need, high-mem nodes get tied up by low-mem work, etc.
- very short runtime limits permit more rebalancing of resources, but are incredibly harmful to most people's productivity.
- preemption (SIGSTOP/SIGCONT) seems to be a relatively little-used way to optimize for latency - enough so that it simply does not work right on major non-free schedulers.
- it's hard to get people to treat storage as ephemeral :(
- big resources are also big budget targets :(

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
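P.S. for anyone unfamiliar with the preemption mechanism mentioned above: at the OS level it is nothing more than signal delivery, which any scheduler could use without cooperation from the job. A minimal standalone sketch in Python (the "job" here is just a `sleep` standing in for a long-running compute task; no particular scheduler's code is implied):

```python
# Standalone demo of signal-based preemption (hypothetical job, not
# any real scheduler): suspend a low-priority process with SIGSTOP,
# do higher-priority work, then resume it with SIGCONT. POSIX only.
import os
import signal
import subprocess

# a stand-in "low-priority job"
low_prio = subprocess.Popen(["sleep", "60"])

# preempt: the kernel stops the process; the job needs no special support
os.kill(low_prio.pid, signal.SIGSTOP)

# waitpid with WUNTRACED blocks until the child is actually stopped
_, status = os.waitpid(low_prio.pid, os.WUNTRACED)
stopped = os.WIFSTOPPED(status)  # True once the stop took effect

# ... urgent work would run here ...

# resume the preempted job exactly where it left off
os.kill(low_prio.pid, signal.SIGCONT)

# clean up the demo job
low_prio.terminate()
low_prio.wait()
```

The appeal is that the preempted job keeps its memory image and restarts instantly; the catch (and arguably why schedulers avoid it) is that a stopped job still pins its RAM and any held licenses or network connections.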