On 15/04/13 18:19, Greta Yorsh wrote:
Generate prologue/epilogue using STRD/LDRD in ARM mode when the
prefer_ldrd_strd tuning flag is set, as it is for Cortex-A15.
The previous version of this patch was posted for review here:
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00995.html
The new version includes the following improvements:
(1) For the prologue, it generates STRD whenever possible and otherwise
generates single-word stores, instead of STM. This allows us to use
offset addressing with STRD, instead of the writeback on every store used
in the previous version of this patch. The epilogue is handled similarly.
To allow the epilogue to return by loading directly into PC, a separate
stack update instruction is emitted before the final load into PC.
(2) The previous version of this patch caused an ICE in
arm_emit_strd_push, when gcc is called with "-fno-omit-frame-pointer
-mapcs-frame" command-line options. It is fixed in the attached patch,
where arm_emit_strd_push is not called when TARGET_APCS_FRAME holds
(epilogue already has a similar condition).
(3) The previous version of the patch generated incorrect return
sequences for interrupt functions. This version fixes it by using the
original LDM epilogues for interrupt functions. There is no need to
change the tests gcc.target/arm/interrupt-*.c.
(4) Takes assert statements out of the loop, addressing a comment made
about a related patch, also relevant here.
(5) Improves DWARF info generation.
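
The pairing rule behind item (1) can be sketched in standalone C. This is
only an illustration, not code from the patch: the names plan_strd_push and
struct planned_store are hypothetical, and the sketch assumes that the stack
pointer has already been decremented, that the lowest-numbered register goes
to the lowest address, and the ARM-state STRD restriction that the first
register is even-numbered, is not r14, and is followed by the next register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one planned store; not a GCC data structure. */
struct planned_store {
    int first_reg;  /* first (or only) register stored */
    int is_pair;    /* 1: STRD of first_reg and first_reg + 1; 0: single STR */
    int offset;     /* byte offset from the already-decremented SP */
};

/* Plan the stores for the registers in saved_mask (bit n set means rN is
   saved).  A pair is used only when rN is even-numbered, is below r14, and
   rN+1 is saved as well; any other register gets a single-word store.
   Returns the number of entries written to out, which must have room for
   16 entries. */
static size_t plan_strd_push(uint16_t saved_mask, struct planned_store *out)
{
    size_t n = 0;
    int offset = 0;

    assert(out != NULL);  /* checked once, outside the loop */

    for (int reg = 0; reg < 16; reg++) {
        if (!(saved_mask & (1u << reg)))
            continue;

        int pair = (reg % 2 == 0) && reg < 14
                   && (saved_mask & (1u << (reg + 1))) != 0;

        out[n].first_reg = reg;
        out[n].is_pair = pair;
        out[n].offset = offset;
        n++;

        offset += pair ? 8 : 4;
        if (pair)
            reg++;  /* the odd register was consumed by the pair */
    }
    return n;
}
```

For a function saving r4-r8 and lr (mask 0x41F0), this plan yields two STRD
pairs (r4/r5 at offset 0, r6/r7 at offset 8) followed by single-word stores
for r8 at offset 16 and lr at offset 20.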
No regression on qemu for arm-none-eabi cortex-a15.
Bootstrap successful on A15 TC2.
Spec2k shows a slight overall performance improvement (less than 1%) on
Cortex-A15 TC2.
Out of 26 benchmarks, 4 show a regression of 2.5% or less (benchmarks
186, 254, 255, 178).
The other benchmarks show improvements or no change.
Overall size increases by 1.4%.
There is no clear correlation between performance and size increase.
Ok for trunk?
Thanks,
Greta
ChangeLog
gcc/
2013-04-15 Greta Yorsh <Greta.Yorsh at arm.com>
* config/arm/arm.c (emit_multi_reg_push): New declaration
for an existing function.
(arm_emit_strd_push): New function.
(arm_expand_prologue): Used here.
(arm_emit_ldrd_pop): New function.
(arm_expand_epilogue): Used here.
(arm_get_frame_offsets): Update condition.
(arm_emit_multi_reg_pop): Add a special case for load of a single
register with writeback.
OK.
R.