Hi,
I am backporting some patches from FSF mainline that may improve the
speed of Thumb-2 code generated by Linaro GCC 4.5.
The first one is Richard E.'s "Improve optimization to transform
TST into LSLS":
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02518.html
After applying it to the Linaro 4.5 tree, the EEMBC speed numbers get
worse, while code size is reduced to some extent. The code difference
looks like this:
6801 ldr r1, [r0, #0]
f831 3013 ldrh.w r3, [r1, r3, lsl #1]
-f413 6f00 tst.w r3, #2048 ; 0x800
-f43f af41 beq.w cc <t_run_test+0xcc>
+0518 lsls r0, r3, #20
+f57f af44 bpl.w cc <t_run_test+0xcc>
4610 mov r0, r2
After reading the Cortex-A8 TRM, I can't find the exact cycle timing of
lsls. With Chung-Lin's help, we suspect that lsls is slower than tst
here, but we don't have any evidence to prove it. If anyone is familiar
with the ARM microarchitecture, help is welcome. If our assumption is
correct, we could restrict this patch to an optimization that is applied
only when optimizing for size.
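
For reference, here is a minimal C sketch of why the two sequences in the listing above take the branch under the same condition. It only models the flag logic (bit 11 tested by tst versus bit 11 shifted into the sign bit by lsls); it says nothing about Cortex-A8 cycle counts. The function names are hypothetical and are not taken from GCC or from the patch.

```c
#include <stdint.h>

/* Models: tst.w r3, #0x800 ; beq.w target
   beq is taken when Z is set, i.e. when bit 11 of r3 is clear. */
int tst_beq_taken(uint32_t r3)
{
    return (r3 & UINT32_C(0x800)) == 0;
}

/* Models: lsls r0, r3, #20 ; bpl.w target
   Shifting left by 20 moves bit 11 into bit 31, so the N flag mirrors
   the tested bit. bpl is taken when N is clear. */
int lsls_bpl_taken(uint32_t r3)
{
    uint32_t shifted = r3 << 20;
    return (shifted >> 31) == 0;
}

/* Counts the inputs on which the two models disagree. The range covers
   every 16-bit value because r3 comes from ldrh.w, which zero-extends
   a halfword. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (uint32_t v = 0; v <= UINT32_C(0xFFFF); ++v) {
        if (tst_beq_taken(v) != lsls_bpl_taken(v))
            ++mismatches;
    }
    return mismatches;
}
```

Two differences that the sketch does not model: lsls writes its result to r0 while tst writes no register, and the lsls encoding in the listing is 2 bytes (0518) against 4 bytes for tst.w (f413 6f00), which accounts for the size reduction.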
The second patch is Bernd's "Fix an if statement in arm_rtx_costs_1"
http://gcc.gnu.org/ml/gcc-patches/2010-07/msg02096.html
With this patch applied, the EEMBC benchmark numbers do not change.
Shall we merge this patch into the Linaro 4.5 tree? I am inclined to
merge it, but if you have concerns about this patch, let us discuss
them here.
--
Yao Qi
CodeSourcery
y...@codesourcery.com
(650) 331-3385 x739
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain