On Wed, Feb 13, 2013 at 1:35 PM, Greta Yorsh <greta.yo...@arm.com> wrote:
> This patch defines peephole2 patterns that merge two individual LDR
> instructions into an LDRD instruction (resp. STR into STRD) whenever possible
> using the following transformations:
> * reorder two memory accesses,
> * rename registers when storing two constants, and
> * reorder target registers of a load when they are used by a commutative
> operation.
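
For illustration only, the kind of source these peepholes target might look
like the C sketch below. It is not taken from the patch or its tests; the
function names are hypothetical, and whether LDRD/STRD is actually emitted
depends on the target, the options, alignment, and register allocation.

```c
#include <assert.h>

/* Two adjacent 32-bit loads feeding a commutative operation (addition).
   Since a + b == b + a, the compiler is free to swap the two destination
   registers so that they form a valid pair for a single doubleword load. */
int sum_adjacent(const int *p)
{
    int a = p[0];
    int b = p[1];
    return a + b;
}

/* Two adjacent 32-bit stores of constants. The registers holding the
   constants could be renamed so that one doubleword store covers both. */
void store_pair(int *p)
{
    p[0] = 1;
    p[1] = 2;
}

/* Hypothetical self-check: the merge must not change observable results. */
void check_pair_examples(void)
{
    int v[2] = { 3, 4 };
    int w[2] = { 0, 0 };

    assert(sum_adjacent(v) == 7);
    store_pair(w);
    assert(w[0] == 1 && w[1] == 2);
}
```

Compiling something like this at -O2 for an LDRD-capable core and inspecting
the assembly is one way to see whether the pair was merged.
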
>
> In ARM mode only, the pair of registers IP and SP is allowed as operands in
> LDRD/STRD. To handle it, this patch defines a new constraint "q" to be
> CORE_REGS in ARM mode and GENERAL_REGS (i.e., equivalent to "r") otherwise.
> Note that in ARM mode "q" is not equivalent to "rk" because of the way
> constraints are matched. The new constraint "q" is used in place of "r" for
> DImode move between register and memory.
>
> This is a new version of the patch posted for review a long time ago:
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg00914.html
> All the dependencies mentioned in the previous patch have already been
> upstreamed.
> Compared to the previous version, the new patch
> * handles both ARM and Thumb modes in the same peephole pattern,
> * does not attempt to generate LDRD/STRD when optimizing for size and none of
> the LDM/STM patterns match (but it would be easy to add),

I think it's worth doing that as a follow-up, but remember to handle
-mfix-cortex-m3-ldrd.

> * does not include scan-assembler tests specific to cortex-a15 and
> cortex-a9, because they are not stable and are highly sensitive to other
> optimizations.
>
> No regression on qemu for arm-none-eabi with cpu cortex-a15.
>
> Bootstrap successful on Cortex-A15 TC2.
>
> Spec2k results:
> Performance: slight improvement in overall scores (less than 1%) in both
> CINT2000 and CFP2000.
> For individual benchmarks, there is a slight variation in performance,
> within less than 1%, which I consider to be just noise.
> Object size: there is a slight reduction in size in all the benchmarks -
> overall 0.2% and at most 0.5% for individual benchmarks.
> Baseline compiler is gcc r194473 from December 2012.
> Compiled in thumb mode with hardfp.
> Run on Cortex-A15 hardware.
>
> Ok for gcc4.9 stage 1?

Ok if no regressions.

regards
Ramana

>
> Thanks,
> Greta
>
> gcc/
>
> 2013-02-13  Greta Yorsh  <greta.yo...@arm.com>
>
>         * config/arm/constraints.md (q): New constraint.
>         * config/arm/ldrdstrd.md: New file.
>         * config/arm/arm.md (ldrdstrd.md): New include.
>         (arm_movdi): Use "q" instead of "r" constraint
>         for double-word memory access.
>         (movdf_soft_insn): Likewise.
>         * config/arm/vfp.md (movdi_vfp): Likewise.
>         * config/arm/t-arm (MD_INCLUDES): Add ldrdstrd.md.
>         * config/arm/arm-protos.h (gen_operands_ldrd_strd): New declaration.
>         * config/arm/arm.c (gen_operands_ldrd_strd): New function.
>         (mem_ok_for_ldrd_strd): Likewise.
>         (output_move_double): Update assertion.
>
> gcc/testsuite
>
> 2013-02-13  Greta Yorsh  <greta.yo...@arm.com>
>
>         * gcc.target/arm/peep-ldrd-1.c: New test.
>         * gcc.target/arm/peep-strd-1.c: Likewise.
