On Wed, Feb 13, 2013 at 1:35 PM, Greta Yorsh <greta.yo...@arm.com> wrote:
> This patch defines peephole2 patterns that merge two individual LDR
> instructions into an LDRD instruction (resp. STR into STRD) whenever
> possible using the following transformations:
> * reorder two memory accesses,
> * rename registers when storing two constants, and
> * reorder target registers of a load when they are used by a commutative
>   operation.
>
> In ARM mode only, the pair of registers IP and SP is allowed as operands in
> LDRD/STRD. To handle it, this patch defines a new constraint "q" to be
> CORE_REGS in ARM mode and GENERAL_REGS (i.e., equivalent to "r") otherwise.
> Note that in ARM mode "q" is not equivalent to "rk" because of the way
> constraints are matched. The new constraint "q" is used in place of "r" for
> DImode move between register and memory.
>
> This is a new version of the patch posted for review a long time ago:
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg00914.html
> All the dependencies mentioned in the previous patch have already been
> upstreamed.
> Compared to the previous version, the new patch
> * handles both ARM and Thumb modes in the same peephole pattern,
> * does not attempt to generate LDRD/STRD when optimizing for size and none of
>   the LDM/STM patterns match (but it would be easy to add),

I think it's worth doing that as a follow-up, but remember to handle
-mfix-cortex-m3-ldrd.

> * does not include scan-assembly tests specific for cortex-a15 and
>   cortex-a9, because they are not stable and highly sensitive to other
>   optimizations.
>
> No regression on qemu for arm-none-eabi with cpu cortex-a15.
>
> Bootstrap successful on Cortex-A15 TC2.
>
> Spec2k results:
> Performance: slight improvement in overall scores (less than 1%) in both
> CINT2000 and CFP2000.
> For individual benchmarks, there is a slight variation in performance,
> within less than 1%, which I consider to be just noise.
> Object size: there is a slight reduction in size in all the benchmarks -
> overall 0.2% and at most 0.5% for individual benchmarks.
> Baseline compiler is gcc r194473 from December 2012.
> Compiled in thumb mode with hardfp.
> Run on Cortex-A15 hardware.
>
> Ok for gcc4.9 stage 1?

Ok if no regressions.

regards
Ramana

>
> Thanks,
> Greta
>
> gcc/
>
> 2013-02-13  Greta Yorsh  <greta.yo...@arm.com>
>
>         * config/arm/constraints.md (q): New constraint.
>         * config/arm/ldrdstrd.md: New file.
>         * config/arm/arm.md (ldrdstrd.md): New include.
>         (arm_movdi): Use "q" instead of "r" constraint
>         for double-word memory access.
>         (movdf_soft_insn): Likewise.
>         * config/arm/vfp.md (movdi_vfp): Likewise.
>         * config/arm/t-arm (MD_INCLUDES): Add ldrdstrd.md.
>         * config/arm/arm-protos.h (gen_operands_ldrd_strd): New declaration.
>         * config/arm/arm.c (gen_operands_ldrd_strd): New function.
>         (mem_ok_for_ldrd_strd): Likewise.
>         (output_move_double): Update assertion.
>
> gcc/testsuite
>
> 2013-02-13  Greta Yorsh  <greta.yo...@arm.com>
>
>         * gcc.target/arm/peep-ldrd-1.c: New test.
>         * gcc.target/arm/peep-strd-1.c: Likewise.
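
For illustration only (this is not taken from the patch or from the new
testsuite files): a minimal C sketch of the source shapes that the three
transformations listed in the patch description are aimed at. The function
names are hypothetical, and whether LDRD/STRD is actually emitted for any of
them depends on the target CPU, the options used, and register allocation.

```c
#include <stdint.h>

/* Two adjacent word loads whose results feed a commutative operation.
   Because a + b == b + a, the two destination registers of the loads
   may be swapped if that is what it takes to form a valid LDRD pair. */
int32_t sum_adjacent(const int32_t *p)
{
    return p[2] + p[3];
}

/* Two adjacent word stores of constants. The registers holding the
   constants are only temporaries, so they can be renamed to a register
   pair that STRD accepts. */
void store_constants(int32_t *p)
{
    p[0] = 1;
    p[1] = 2;
}

/* Adjacent accesses written in descending address order. The two
   loads (and the two stores) are independent of each other, so each
   pair can be reordered into ascending order before merging. */
void copy_pair_reversed(int32_t *dst, const int32_t *src)
{
    int32_t hi = src[1];
    int32_t lo = src[0];
    dst[1] = hi;
    dst[0] = lo;
}
```

One way to check the effect would be to compile this with something like
arm-none-eabi-gcc -O2 -mcpu=cortex-a15 -S and look for ldrd/strd in the
generated assembly, bearing in mind that such output is sensitive to other
optimizations.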