Hi, I'm using a recent gforth revision from git (6ec9915f6277de) and noticed that running gforth with --dynamic slows down the benchmark I was running [1] by about a factor of 5. This happens on a Loongson-2f MIPS machine (Debian squeeze mipsel, 32-bit). Note that on MIPS, dynamic superinstructions aren't enabled by default, as they may violate load delay slot requirements on some very old MIPS CPUs.
The smallest example I could come up with that clearly shows the anomaly is:
time gforth-fast -r 600M \
-e '30000000 :noname 1- DUP 0> IF RECURSE THEN ; EXECUTE BYE'
user 0m1.680s
vs.
time gforth-fast --dynamic -r 600M \
-e '30000000 :noname 1- DUP 0> IF RECURSE THEN ; EXECUTE BYE'
user 0m12.529s
I.e. a slowdown by a factor of about 7.5.
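As a quick sanity check of that ratio, here is a minimal shell sketch that divides the two `user` times quoted above. The helper name `slowdown` is purely illustrative and not part of gforth or of the benchmark; it only assumes a POSIX shell with awk available.

```shell
# Hypothetical helper: prints slow/fast with two decimals.
# $1 = baseline user time in seconds, $2 = user time with --dynamic.
slowdown() {
    # Force the C locale so the decimal separator is always a dot.
    LC_ALL=C awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

# The two user times reported by time(1) in the runs above.
slowdown 1.680 12.529
```

With the timings from the two runs this gives a factor of roughly 7.5.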
Any ideas how to proceed further? This could be a side effect of the
Loongson-2f BTB erratum [2], with the CPU perhaps doing speculative
loads from invalid addresses that cause instruction stalls or cache
flushes. But then how could the micro-benchmark shown above ever cause
a BTB misprediction (the Loongson2 BTB has 16 entries)? Any ideas how
to explain the result without invoking CPU bugs?
cheers,
David
[1] http://svn.code.sf.net/p/fkt/code/trunk/benchmark.fs
[2] https://sourceware.org/ml/binutils/2009-11/msg00387.html
--
GnuPG public key: http://dvdkhlng.users.sourceforge.net/dk2.gpg
Fingerprint: B63B 6AF2 4EEB F033 46F7 7F1D 935E 6F08 E457 205F
