Hi, I did some tests on the following function

--- CUT HERE ---
int fibo(int n)
{
  if (n < 2) return 1;
  return (fibo(n-2) + fibo(n-1));
}
--- CUT HERE ---

and I discovered that it is faster at -O2 than at -O3. This is with gcc 4.9.2.

Looking at the disassembly I see it is using FP registers to hold integer values. The following is a small extract.

.L3:
        fmov    w0, s8
        sub     w25, w25, #1
        cmn     w25, #1
        add     w0, w0, w27
        fmov    s8, w0
        bne     .L19
        add     w0, w0, 1
        b       .L2

Recompiling with -mgeneral-regs-only generates a huge improvement.

The following are the times I get on various partner HW. I have normalised the -O2 times to 1 second so that I do not disclose actual partner performance data:

Partner 1: -O2 = 1sec, -O3 = 1.13sec, -O3 -mgeneral-regs-only = 0.72sec
Partner 2: -O2 = 1sec, -O3 = 0.68sec, -O3 -mgeneral-regs-only = 0.60sec
Partner 3: -O2 = 1sec, -O3 = 0.73sec, -O3 -mgeneral-regs-only = 0.68sec
Partner 4: -O2 = 1sec, -O3 = 0.83sec, -O3 -mgeneral-regs-only = 0.84sec

So, in general, -O3 does actually do better than -O2 (Partner 1 is the exception), but in every case except Partner 4, where the two -O3 times are within 0.01sec of each other, performance is better if I stop it using FP registers for int values.

I have put a tarball of the test program along with 3 binaries and 3 disassemblies here:-

http://people.linaro.org/~edward.nevill/fibo.tar

All the best,
Ed.