On 10/15/12 11:07 AM, "Mark Hahn" <h...@mcmaster.ca> wrote:
>> Mind you, I'm a huge fan of small clusters under a single person's
>> control, where nobody is watching to see if you are making 'effective
>> utilization' and you can do whatever you want. A personal supercomputer,
>> as it were. But I recognize that for much of the HPC world, clusters are
>> managed in the same way as big iron mainframes were in the 70s,
>
> I think you're being a bit disingenuous here. dedicated/personal
> clusters are perfectly sensible when the workload is non-bursty
> or somehow otherwise high-duty-cycle. or perhaps when you're
> talking about resources cheap enough to hand out like pencils.
> (that is, let's be honest: cheap enough to waste.)

That's a good way to describe it. Cheap enough to waste (or, cheaper to let the computer idle waiting for the user rather than the user idling, waiting for the computer).

> a larger, shared resource pool is ideal for bursty/low-DS environments.

Especially if the bursts are "plannable".

> as far as I can see, there are really only a couple problems with this:
>
> - many people and most environments have a mixture of burstiness.

Yes.

> - schedulers are not awesome at managing latency of either flavor
>   when both are mixed, especially in the presence of poor resource
>   requirements (bad runtime estimates, poor memory requirements, etc.)

Yes (whether the scheduler is human or algorithmic).

> - resource granularity becomes even more of a problem: serial jobs
>   "contaminate" nodes for parallel use or high vs low mem, etc.
>
> - very short runtime limits permit more rebalancing of resources,
>   but are incredibly harmful to most people's productivity.
>
> - preemption (SIG_STOP/CONT) seems to be a relatively little-used
>   way to optimize for latency - enough so that it simply does not work
>   right on major non-free schedulers.
>
> - it's hard to get people to treat storage as ephemeral :(

If it's under my desk, it doesn't have to be ephemeral. I can leave all my temporary data on the node disks...
> - big resources are also big budget targets :(

Yes, but I think it's more an issue of "size of capital expenditure" vs "size of organization where expenditure is reviewed"... For instance, a 10K expenditure might be reviewed very locally, and might be "big" compared to, say, a new ergonomic desk chair at $800. But in the context of something like JPL's entire budget of $1.5B, 10K is below the noise floor. So once you've got your 10K machine, if it sits idle in your office half the time, you only really have to explain it to a few people. OTOH, if it's a million dollar piece of hardware, and it's sitting in your office idle, you're going to be explaining that to a substantially larger group of people.

I suppose it all comes down to how much reduction of *my* labor cost there is by having the computer available when I want it, as opposed to, say, tomorrow after calling to make an appointment for supercomputer time. Recognizing that the "I've got it now" machine is not going to be anywhere near the horsepower of "request it for tomorrow".

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf