Quoting Joe Landman <[EMAIL PROTECTED]>, on Thu 28 Feb 2008 05:20:01 AM PST:

Bill Broadley wrote:
The problem with many (cores|threads) is that memory bandwidth wall. A fixed size (B) pipe to memory, with N requesters on that pipe ...

What wall? Bandwidth is easy; it just costs money, and not much at that. Want 50GB/sec[1]? Buy a $170 video card. Want 100GB/sec... buy a

Heh... if it were that easy, we would spend extra on more bandwidth for
Harpertown and Barcelona ...

The point is that the design determines your hard, fixed per-socket limits, and no programming technique is going to get you around that limit per socket. You need to change your programming technique to go many-socket. That limit is the bandwidth wall.
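As a back-of-the-envelope sketch of that per-socket wall (the 20 GB/s socket bandwidth below is an invented number, not a measurement of any particular chip), the shared memory pipe divides across the cores even in the best case:

    # Toy illustration: a fixed per-socket memory pipe shared by N cores.
    # The 20 GB/s figure is hypothetical, not a measured value.
    SOCKET_BW_GB_S = 20.0
    for cores in (1, 2, 4, 8, 16):
        print(f"{cores:2d} cores -> {SOCKET_BW_GB_S / cores:5.2f} GB/s per core")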


And this is much the same as the earlier discussions on this list, when folks were building 8- and 16-processor clusters. There, the bandwidth wall was the 10 Mbps Ethernet interconnect, first through a hub, then a switch, etc.

This is sort of why any programming technique for speedup that relies on tight coupling (e.g., shared memory) can't scale infinitely. At some point, the speed of light and physical size conspire to do you in.
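To put a rough number on the speed-of-light part (the propagation speed and cable length below are assumptions for illustration only):

    # Physics floor on node-to-node round-trip time, independent of link speed.
    # Assumes signals propagate at roughly 2/3 of c; the 15 m run is hypothetical.
    C = 3.0e8                 # speed of light in vacuum, m/s
    v = 0.66 * C              # rough propagation speed in copper/fiber, m/s
    distance_m = 15.0         # one-way cable length between nodes
    rtt_ns = 2 * distance_m / v * 1e9
    print(f"Minimum RTT over {distance_m} m: {rtt_ns:.0f} ns")   # about 150 ns

No amount of link bandwidth or protocol cleverness gets you under that floor; only shrinking the machine does.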

If one wanted to design revolutionary distributed/parallel computing algorithms, one could probably work with floppy disks and sneakernet. If it works there, it will certainly work on any faster mechanism. See, true computer science doesn't need a 1000-processor cluster.

Another cluster-related computer science issue is to start dealing with unreliable links between the nodes of the cluster. The overwhelming majority of cluster codes assume that message passing is perfect and has no errors. Sometimes this is provided transparently by the communications mechanism (e.g., TCP promises in-order, error-free delivery). However, in the TCP case that comes at a cost: the latency isn't constant (because it achieves reliability through temporal redundancy, i.e., retries), and if your algorithm does some sort of scatter/gather and needs barrier synchronization, a late packet on one link brings the whole mass to a halt.
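A toy model of that straggler effect (the base latency, retry probability, and 200 ms penalty below are invented for illustration, not measured TCP behavior):

    # Toy model: a barrier completes only when the slowest message arrives,
    # so a single retransmission delays every node. All numbers are illustrative.
    import random

    def barrier_time_us(n_nodes, base_us=50.0, retry_prob=1e-3, retry_penalty_us=2e5):
        """Barrier completion time (us) = max over per-node message latencies."""
        return max(base_us + (retry_penalty_us if random.random() < retry_prob else 0.0)
                   for _ in range(n_nodes))

    random.seed(1)
    for n in (16, 256, 1024):
        trials = [barrier_time_us(n) for _ in range(1000)]
        print(f"{n:5d} nodes: mean barrier time {sum(trials)/len(trials):10.1f} us")

The more nodes waiting at the barrier, the more likely at least one of them is stuck behind a retry, so the mean completion time grows with the cluster size even though each individual link is "usually" fast.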

As data rates get higher, even really good bit error rates on the wire produce too many errors. Consider this: a BER of 1E-10 is quite good, but if you're pumping 10 Gb/s over the wire, that's an error every second. (A BER of 1E-10 is a typical rate for something like a 100 Mbps link.) So practical systems use some sort of FEC, but even with that, BERs of 1E-14 or 1E-15 are pretty much state of the art over shortish (meters) distances. (It's a power/signal-to-noise-ratio thing: how much energy can you put into sending one bit of information?)
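The arithmetic behind those numbers (using the line rates and BERs given above):

    # Expected bit errors per second = line rate (bits/s) * bit error rate.
    def errors_per_second(line_rate_bps, ber):
        return line_rate_bps * ber

    print(errors_per_second(10e9, 1e-10))    # 10 Gb/s at 1e-10 -> 1.0 error/s
    print(errors_per_second(100e6, 1e-10))   # 100 Mb/s at 1e-10 -> 0.01/s, ~100 s between errors
    print(errors_per_second(10e9, 1e-15))    # 10 Gb/s at 1e-15 -> 1e-5/s, ~28 hours between errors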

Jim

