Greg Lindahl <[EMAIL PROTECTED]> wrote:
> On Tue, Sep 11, 2007 at 03:32:29AM +0000, [EMAIL PROTECTED] wrote:
>
> > So with all 8 cores at work from 2 sockets you are seeing 70% of peak
> > assuming you are using 667 MHz DDR2
>
> I'm not sure what this word "peak" means -- it cannot be achieved by
> any test under any circumstances, whereas for a processor floating
> point peak, you can usually do it with some weird code.
>
> Much better to call the measured STREAM number the actual peak; then I
> won't have to let loose the memory bandwidth peak bot complainer once
> again ;-)

Yes, yes ... ;-) ... I like the STREAM numbers too, but I also like to
know how much of the advertised capacity of the memory bus one can get.
We know what both AMD and Intel say their systems can deliver, and we
can measure the value of each design by comparing the percentage of the
advertised capacity it delivers on a benchmark of interest. In the case
of memory bandwidth it is revealing, I think ... Oui? On the other hand,
if "peak" is a dirty word I will refrain from using it in polite
company ... ;-) ...

> > I thought first byte latencies were around 65 nanos for Opteron. Am
> > I confused?
>
> You're misremembering. Opteron latency was always a function of the
> number of active sockets, and it is usually measured with only one
> core active, while Bill is doing the more realistic thing of having
> all the cores active. Run the same code on your favorite Intel if you
> want to compare.

Granted, latency measurements depend on the nearness of the memory
referenced (a la cc-NUMA) to the location of the thread and on the
number of threads that are active, but I thought Bill's 1-thread results
were also quite a bit larger than expected. Maybe I need to look at the
Intel numbers for Bill's test again too. Perhaps I was comparing Intel's
ideal numbers to Bill's real-world AMD ones, and that is what I was
"misremembering."
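
A rough sketch of the percent-of-peak arithmetic discussed above may help
make the comparison concrete. It is only an illustration under stated
assumptions, none of which come from Bill's test: a 64-bit (8-byte) data
path per DDR2 channel, two channels per socket, two sockets, and
1 GB = 1e9 bytes. The function names are hypothetical, and the "measured"
figure is simply back-computed from the 70% quoted in the thread, not a
real STREAM result.

```python
def ddr_peak_gb_per_s(transfers_mt_per_s, bus_bytes=8,
                      channels_per_socket=2, sockets=2):
    """Advertised (theoretical) memory bandwidth in GB/s.

    All defaults are assumptions for illustration: 8-byte-wide channels,
    dual-channel memory per socket, two sockets.
    """
    bytes_per_s = (transfers_mt_per_s * 1e6 * bus_bytes
                   * channels_per_socket * sockets)
    return bytes_per_s / 1e9


def fraction_of_peak(measured_gb_per_s, peak_gb_per_s):
    """Share of the advertised bandwidth that a benchmark delivered."""
    return measured_gb_per_s / peak_gb_per_s


if __name__ == "__main__":
    peak = ddr_peak_gb_per_s(667)   # DDR2-667, both sockets together
    measured = 0.70 * peak          # illustrative only: the 70% from the thread
    print(f"advertised peak: {peak:.2f} GB/s")
    print(f"measured:        {measured:.2f} GB/s "
          f"({fraction_of_peak(measured, peak):.0%} of peak)")
```

To compare designs this way, substitute a measured STREAM number for
`measured` and adjust the channel and socket counts to match each system.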

Do you expect the best-case first byte latencies for a single-core run
referring to cc-NUMA-local memory on the Barcelona to roughly (within
5-10%) equal those of dual-core socket 1207 and/or socket 940? This is
what I was thinking initially, but perhaps Bill's result and the fact
that there is an L3 cache to consider change things.

Regards,

rbw

--

"Making predictions is hard, especially about the future."

Niels Bohr

--

Richard Walsh
Thrashing River Consulting--
5605 Alameda St.
Shoreview, MN 55126

Phone #: 612-382-4620
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf