Hello,

Sorry, I don't have the same applications as you. Did you compile them with gcc? If so, -O3 enables additional optimization, and -march=k8 should be enough, I think. Also make sure the CPU is running at its default frequency; sometimes PowerNow! is active by default.

By the way, what is your platform? Linux? Which release? x86_64?

Regards,
Li, Bo

----- Original Message -----
From: "Mikhail Kuzminsky" <[EMAIL PROTECTED]>
To: "Li, Bo" <[EMAIL PROTECTED]>
Cc: <beowulf@beowulf.org>
Sent: Sunday, June 29, 2008 12:23 AM
Subject: Re: [Beowulf] Strange Opteron 2350 performance: Gaussian-03
> In message from "Li, Bo" <[EMAIL PROTECTED]> (Sun, 29 Jun 2008 00:07:07 +0800):
>>Hello,
>>I am afraid there must be something wrong with your experiment.
>>How did you get the performance? Were your DFT codes running in
>>parallel? Any optimization involved?
>
> I was afraid of the same, but the results were reproduced twice.
>
> As I wrote in my message:
>
> - these were ONE-CORE runs (one CPU for Opteron 246)
> - the optimization was performed for the OLD Opteron 246 (because
>   Gaussian, Inc. does not offer binaries optimized specifically for
>   Barcelona)
>
> DFT test397 (like any other DFT job) parallelizes well, and on Opteron
> 246 it gives a 1.9 times speedup on 2 CPUs. But I didn't run a 2-core
> parallel job on Opteron 2350: I was struck by the results obtained
> for 1 core.
>
>>In most of my tests, K8L or K10 can beat the old Opteron at the same
>>frequency by about 20%.
>
> Sorry, do you have this for Gaussian-03, and for DFT in particular? Did
> you compile it on K10 using target=barcelona (i.e. optimized for
> Barcelona)?
>
> Yours
> Mikhail
>
>>Regards,
>>Li, Bo
>>----- Original Message -----
>>From: "Mikhail Kuzminsky" <[EMAIL PROTECTED]>
>>To: <beowulf@beowulf.org>
>>Sent: Saturday, June 28, 2008 11:48 PM
>>Subject: [Beowulf] Strange Opteron 2350 performance: Gaussian-03
>>
>>
>>> I'm running a set of quad-core Opteron 2350 benchmarks, in particular
>>> using Gaussian-03 (the binary version from Gaussian, Inc., i.e. compiled
>>> with an older pgf77 version than the current one, for the Opteron target).
>>>
>>> In particular, I compare *one core* of Opteron 2350 with Opteron 246,
>>> which has the same 2 GHz frequency and the same amount of cache per core
>>> (512K L2 + 0.25*2 MB L3 for Opteron 2350 is just the 1 MB L2 of Opteron
>>> 246). Opteron 246 even has faster DDR2-667 RAM.
>>>
>>> The Gaussian-03 performance in some cases is close for both Opterons
>>> (I keep in mind that the compilation didn't know about Barcelona!), but
>>> for the very popular DFT method the Opteron 2350 cores look slow: one job
>>> gives 33% worse performance (than Opteron 246).
>>>
>>> But on the standard Gaussian-03 test397.com DFT/B3LYP test, *one* (1)
>>> Opteron 2350 core ran 15667 sec. (both start-to-stop and CPU time) vs
>>> 8709 sec. on (one) Opteron 246 !!
>>>
>>> There is no powersaved daemon, so the frequency of Opteron 2350 is
>>> fixed at 2 GHz. I reproduced this result twice on Opteron 2350, one
>>> time with forced good numactl behaviour. I'm reproducing it on
>>> Opteron 246 again :-) but I have indirect confirmation of these
>>> timings (based on a 2-CPU Opteron 246 parallel test).
>>>
>>> Yes, AFAIK the DFT method is cache-friendly, and the slower L3 cache in
>>> Opteron 2350 may give worse performance. But by a factor of 1.8 ??
>>>
>>> Any comments are welcome.
>>>
>>> Mikhail Kuzminsky
>>> Computer Assistance to Chemical Research Center
>>> Zelinsky Institute of Organic Chemistry
>>> Moscow
>>>
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf@beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
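
As a quick sanity check on the numbers quoted in this thread, the following Python sketch recomputes the slowdown factor from the two test397 timings, and the parallel efficiency implied by the 1.9 times speedup on 2 CPUs. The function and variable names are illustrative only and do not come from the original messages.

```python
# Timings quoted in the thread for Gaussian-03 test397 (DFT/B3LYP), one core each.
T_OPTERON_2350_S = 15667.0  # seconds on one Opteron 2350 core
T_OPTERON_246_S = 8709.0    # seconds on one Opteron 246 CPU


def slowdown(t_new: float, t_old: float) -> float:
    """Return how many times longer the new run took than the old one."""
    return t_new / t_old


def parallel_efficiency(speedup: float, n_cpus: int) -> float:
    """Return speedup divided by CPU count (1.0 means ideal scaling)."""
    return speedup / n_cpus


ratio = slowdown(T_OPTERON_2350_S, T_OPTERON_246_S)
extra_time_pct = (ratio - 1.0) * 100.0
efficiency = parallel_efficiency(1.9, 2)  # 1.9x speedup on 2 CPUs, as reported

print(f"slowdown factor: {ratio:.2f}")
print(f"extra run time: {extra_time_pct:.0f}%")
print(f"2-CPU efficiency on Opteron 246: {efficiency:.0%}")
```

The ratio of the two timings is consistent with the "1.8 times" figure in the original message.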