Hi Gus, thanks for your nice reply, as usual. I ran my code on a single-socket Xeon node with two cores, and it scaled almost linearly, at 97%+ parallel efficiency. Then I ran it on a single-socket Xeon node with four cores (a Xeon 3220, which is really not a good quad core) and got an efficiency of around 85%. But running 4 processes across four single-socket nodes (1 process on each node), I got an efficiency of only around 62%. Yes, CFD codes are usually memory-bandwidth bound. Thank you very much.
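For reference, here is a minimal sketch of how the timings can be taken (the solver region is just a placeholder); parallel efficiency is then computed offline as E = T1 / (p * Tp) against the 1-process time:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    double t0, t1, local, tmax;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Barrier(MPI_COMM_WORLD);   /* start all ranks together */
    t0 = MPI_Wtime();
    /* ... solver iterations go here (placeholder) ... */
    t1 = MPI_Wtime();

    local = t1 - t0;
    /* the slowest rank determines the run time */
    MPI_Reduce(&local, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("np=%d  wall=%.3f s\n", nprocs, tmax);

    MPI_Finalize();
    return 0;
}

For example (illustrative numbers only): if the 1-process run takes 100 s and the 4-process run takes 40.3 s, the speedup is 100/40.3 = 2.48 and the efficiency is 2.48/4 = 62%, the kind of figure quoted above.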
On Wed, Dec 9, 2009 at 9:11 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
> Hi Amjad
>
> There is relatively inexpensive Infiniband SDR:
>
> http://www.colfaxdirect.com/store/pc/showsearchresults.asp?customfield=5&SearchValues=65
> http://www.colfaxdirect.com/store/pc/viewPrd.asp?idproduct=12
> http://www.colfaxdirect.com/store/pc/viewCategories.asp?SFID=12&SFNAME=Brand&SFVID=50&SFVALUE=Mellanox&SFCount=0&page=0&pageStyle=m&idcategory=2&VS12=0&VS9=0&VS10=0&VS4=0&VS3=0&VS11=0
>
> Not the latest and greatest, but faster than Gigabit Ethernet.
> A better Gigabit Ethernet switch may help also,
> but I wonder if the impact will be as big as expected.
>
> However, are you sure the scalability problems you see are
> due to a poor network connection?
> Could it perhaps be related to the code itself,
> or maybe to the processors' memory bandwidth?
>
> You could test whether it is the network by running the program
> inside a node (say, on 4 cores) and across 4 nodes with
> one core in use on each node, or other combinations
> (2 cores on 2 nodes).
>
> You could get an indication of the processors' scalability
> by timing program runs inside a single node using 1, 2, 3, and 4 cores.
>
> My experience with dual-socket dual-core Xeons vs.
> dual-socket dual-core Opterons,
> with the type of code we run here (ocean, atmosphere, and climate models,
> which are not totally far from your CFD), is that Opterons
> scale close to linear, but Xeons get nearly stuck in terms of scaling
> when there are more than 2 processes (3 or 4) running in a single node.
>
> My two cents.
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
> amjad ali wrote:
>> Hi all,
>>
>> I have, with my group, a small cluster of about 16 nodes (each one with
>> a single-socket Xeon 3085 or 3110), and I face a problem of poor
>> scalability. Its network is quite ordinary GigE (perhaps a D-Link
>> DGS-1024D 24-port 10/100/1000), a store-and-forward switch priced at
>> about $250 only.
>> ftp://ftp10.dlink.com/pdfs/products/DGS-1024D/DGS-1024D_ds.pdf
>>
>> How should I work on that for better scalability?
>>
>> What could be better affordable options for fast switches? (Myrinet and
>> Infiniband are quite costly.)
>>
>> When buying a switch, what should we look for in it? What latency?
>>
>> Thank you very much.
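P.S. For the placement tests Gus suggests above, something along these lines should do; the hostfile syntax shown is Open MPI's (other MPI implementations differ), and the node names (node01..node04) and binary name (./cfd_solver) are placeholders:

# all 4 processes inside one node
echo "node01 slots=4" > hosts_1x4
mpirun -np 4 --hostfile hosts_1x4 ./cfd_solver

# one process on each of 4 nodes
printf "node01 slots=1\nnode02 slots=1\nnode03 slots=1\nnode04 slots=1\n" > hosts_4x1
mpirun -np 4 --hostfile hosts_4x1 ./cfd_solver

# two processes on each of 2 nodes
printf "node01 slots=2\nnode02 slots=2\n" > hosts_2x2
mpirun -np 4 --hostfile hosts_2x2 ./cfd_solver

Roughly speaking, if the single-node run is much faster than the one-process-per-node run, the GigE network is the likely bottleneck; if they are comparable, on-node memory bandwidth is the better suspect.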
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf