A while ago Tiago Marques posted some benchmarking info in a thread ( http://www.beowulf.org/archive/2009-May/025739.html ), and some recent tests I've been doing made me interested in this snippet again:

>One of the codes, VASP, is very bandwidth limited and loves to run in a
>number of cores multiple of 3. The 5400s are also very bandwidth - memory and
>FSB - limited which causes that they sometimes don't scale well above 6
>cores. They are very fast per core, as someone mentioned, when compared to
>AMD cores.
>These are the times I get from a benchmark I usually run in VASP:
>
>VASP on Core i7:
> - 1 core = 162.453s, 162.778s (no HT)
> - 2 cores = 100s, 102s (no HT)
> - 3 cores = 77.835s, 78.195s (no HT)
> - 4 cores = 87.63s, 87.322s (no HT)
> - 6 cores = *76.56s, 76.4s*
> - 6 cores DDR3-1600 CAS9 - 69.654s, 68.816s, 67.7s
>
>HT doesn't add much but DDR3-1600 does. Still, ~78s is very fast with a
>quad-core because our dual 5400s can only do *91s* at best, even using
>tweaks like CPU affinity, which brings it down from 95s, by distributing
>only 3 threads per socket and not 4/2 or having 4 of them constantly jumping
>from socket to socket.

Apparently this shows that the Nehalems scale well only to 3 cores for VASP? Putting 4 cores on the job actually causes the runtime to increase? This seemed pretty bizarre to me at first sight, but it is close to what I am getting as well.

Has anyone else seen similar scaling? (I am trying the CPU affinity flags now to see if that makes a difference.) How would you explain this? In the past I've seen the codes scale well to core counts higher than this.

--
Rahul
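
For reference, here is a rough sketch of the scaling implied by the quoted Core i7 timings. It averages the repeated runs and computes speedup and parallel efficiency relative to the 1-core time. The function name scaling_table is just an illustrative choice, and I have left out the two 6-core rows on the assumption that they are Hyper-Threading runs on a quad-core part and so not directly comparable.

```python
# Timings in seconds, copied from the quoted Core i7 VASP benchmark (no HT).
# The 6-core rows are left out (assumption: they are Hyper-Threading runs).
timings = {
    1: [162.453, 162.778],
    2: [100.0, 102.0],
    3: [77.835, 78.195],
    4: [87.63, 87.322],
}


def scaling_table(timings):
    """Return {cores: (mean_time, speedup, efficiency)} relative to 1 core."""
    # Average the repeated runs for each core count.
    mean = {n: sum(runs) / len(runs) for n, runs in timings.items()}
    base = mean[1]
    # speedup = T(1) / T(n); efficiency = speedup / n
    return {n: (m, base / m, base / m / n) for n, m in sorted(mean.items())}


if __name__ == "__main__":
    for n, (t, speedup, eff) in scaling_table(timings).items():
        print(f"{n} cores: {t:8.3f} s  speedup {speedup:4.2f}  efficiency {eff:4.0%}")
```

With these numbers the speedup goes from roughly 2.08 on 3 cores down to roughly 1.86 on 4 cores, which is the regression in question.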
