Larry Stewart wrote:
> Designing the communications network for this worst-case pattern has a
> number of benefits:
>
>  * it makes the machine less sensitive to the actual communications pattern
>  * it makes performance less variable run-to-run, when the job controller
>    chooses different subsets of the system

I agree with this and pretty much all of your other comments, but wanted to
make the point that a worst-case, hardware-only solution is not required,
nor is it necessarily where all of the research and development effort
should be placed for HPC as a whole. And let's not forget that unless such
solutions are supported by some coincidental volume requirement in another,
non-HPC market, they will cost more (sometimes a lot more).

If worst-case hardware solutions were required, then clusters would not have
pushed out their HPC predecessors, and novel high-end designs would not find
it so hard to break into the market. Lower-cost hardware often stimulates
more software-intelligent use of the additional resources that come along
for the ride. With clusters you paid less for interconnects, memory
interfaces, and packaged software, and got to spend the savings on more
memory, more aggregate memory bandwidth, and more processing power. This in
turn had an effect on the problems tackled: weak scaling an application was
one way to use that memory while managing the impact of a cheaper
interconnect (see the sketch at the end of this note).

So, yes, let's try to banish latency with cool state-of-the-art
interconnects engineered for worst-case, not common-case, scenarios (we have
been hearing about the benefits of high-radix switches), but remember that
interconnect cost, data locality, and partitioning will always matter and
may make the worst-case interconnect unnecessary.

> There's a paper in the IBM Journal of Research and Development about this,
> they wound up using simulated annealing to find good placement on the most
> regular machine around, because the "obvious" assignments weren't optimal.

Can you point me at this paper? Sounds very interesting.

> Personally, I believe our thinking about interconnects has been poisoned
> by thinking that NICs are I/O devices. We would be better off if they were
> coprocessors. Threads should be able to send messages by writing to
> registers, and arriving packets should activate a hyperthread that has
> full core capabilities for acting on them, and with the ability to
> interact coherently with the memory hierarchy from the same end as other
> processors. We had started kicking this around for the SiCortex gen-3
> chip, but were overtaken by events.

Yes to all this ... now that everyone has made the memory controller an
integral part of the processor, we can move on to the NIC ... ;-)

... rbw
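P.S. For anyone who hasn't bumped into the term, here is a rough MPI sketch
of the weak-scaling pattern I mean above. It is only an illustration with
made-up sizes (LOCAL_N and HALO are arbitrary, and the 1-D slab
decomposition is the simplest case I could write down): each rank keeps a
fixed-size local slab, so adding nodes grows the global problem and the
aggregate memory, while the per-rank halo traffic asked of the interconnect
stays roughly constant.

  /* Weak-scaling sketch: fixed work and memory per rank, global problem
   * grows with the number of ranks. Sizes are illustrative only. */
  #include <mpi.h>
  #include <stdlib.h>

  #define LOCAL_N  (1 << 20)   /* cells owned by each rank -- fixed per rank */
  #define HALO     1024        /* boundary cells exchanged with each neighbor */

  int main(int argc, char **argv)
  {
      int rank, nprocs;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

      /* Aggregate memory use grows with nprocs, but each node only
       * allocates its own fixed-size slab plus two halo regions. */
      double *u = calloc(LOCAL_N + 2 * HALO, sizeof(double));

      int left  = (rank == 0)          ? MPI_PROC_NULL : rank - 1;
      int right = (rank == nprocs - 1) ? MPI_PROC_NULL : rank + 1;

      /* Halo exchange: the message size is independent of nprocs, so the
       * per-node demand on the interconnect is the same whether the job
       * runs on 8 nodes or 8000. */
      MPI_Sendrecv(u + HALO, HALO, MPI_DOUBLE, left, 0,
                   u + HALO + LOCAL_N, HALO, MPI_DOUBLE, right, 0,
                   MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      MPI_Sendrecv(u + LOCAL_N, HALO, MPI_DOUBLE, right, 1,
                   u, HALO, MPI_DOUBLE, left, 1,
                   MPI_COMM_WORLD, MPI_STATUS_IGNORE);

      /* ... local stencil updates on the owned cells would go here ... */

      free(u);
      MPI_Finalize();
      return 0;
  }

The point of the sketch is just that the savings from a cheaper interconnect
went into more memory per node, and the application was scaled to fill that
memory rather than to stress the network.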