On 30/04/13 18:18, Greta Yorsh wrote:
This patch makes GCC's internal memcpy expansion emit LDRD/STRD whenever
possible, if the prefer_ldrd_strd field is set in tune_params.
It uses DImode moves in both ARM and Thumb modes.
The generic move_by_pieces implementation cannot be used as is
to generate the same instruction sequence.
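
As a rough illustration of the kind of copy this expansion targets, here is a
minimal sketch. The buffer and function names are hypothetical and not taken
from the patch; whether LDRD/STRD is actually emitted depends on the target,
the tuning parameters, and the optimization level.

```c
#include <string.h>

/* Hypothetical buffers, 8-byte aligned so that double-word accesses
   to them are aligned.  */
char src_buf[16] __attribute__((aligned(8))) = "0123456789abcde";
char dst_buf[16] __attribute__((aligned(8)));

/* A small copy with a compile-time constant size.  GCC may expand it
   inline instead of calling the library memcpy.  */
void copy_aligned_15(void)
{
    memcpy(dst_buf, src_buf, 15);
}
```

Compiling with something like `arm-none-eabi-gcc -O2 -mcpu=cortex-a15 -S`
and inspecting the assembly is one way to see which load/store instructions
were chosen.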
To handle cases in which either source or destination is not word-aligned,
this patch introduces new patterns for UNSPEC_UNALIGNED double-word access.
After reload, the pattern is split into two unaligned single-word accesses.
Keeping the access as a single pattern until then prevents lower_subreg from
splitting an aligned double-word access that depends on the unaligned access.
This may become unnecessary when the cost model is fixed.
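
The effect of the split can be modelled in C as follows. This is only a
conceptual sketch, not the RTL from the patch: the helper names are
hypothetical, and combining the two words into one 64-bit value this way
assumes a little-endian target.

```c
#include <stdint.h>
#include <string.h>

/* One unaligned double-word load modelled as two unaligned
   single-word loads.  */
uint64_t load_unaligned_di(const unsigned char *p)
{
    uint32_t lo, hi;
    memcpy(&lo, p, 4);       /* first word  */
    memcpy(&hi, p + 4, 4);   /* second word */
    return ((uint64_t)hi << 32) | lo;
}

/* One unaligned double-word store modelled as two unaligned
   single-word stores.  */
void store_unaligned_di(unsigned char *p, uint64_t v)
{
    uint32_t lo = (uint32_t)v;
    uint32_t hi = (uint32_t)(v >> 32);
    memcpy(p, &lo, 4);
    memcpy(p + 4, &hi, 4);
}
```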
This patch also adjusts existing tests to accept LDRD/STRD or LDM/STM
depending on effective target arm_prefer_ldrd_strd.
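
A test that accepts either instruction family might look like the sketch
below. The function name, buffer name, options, and expected counts are
illustrative assumptions, not the contents of the adjusted tests; only the
effective-target name arm_prefer_ldrd_strd comes from this patch.

```c
#include <string.h>

/* { dg-do compile } */
/* { dg-options "-O2" } */

/* Hypothetical destination buffer with double-word alignment.  */
char aligned_dest_buf[16] __attribute__((aligned(8)));

/* Copy from a source of unknown alignment into the aligned buffer.  */
void copy_into_aligned_dest(const char *src)
{
    memcpy(aligned_dest_buf, src, 15);
}

/* Illustrative counts only: expect STRD where it is preferred,
   and STM otherwise.  */
/* { dg-final { scan-assembler-times "strd" 1 { target { arm_prefer_ldrd_strd } } } } */
/* { dg-final { scan-assembler-times "stmia" 1 { target { ! { arm_prefer_ldrd_strd } } } } } */
```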
An early version of this patch was posted here:
http://gcc.gnu.org/ml/gcc-patches/2011-11/msg00921.html
The new version is simpler because
(a) it generates the same RTL for both Thumb and ARM modes, and
(b) load and store blocks are matched, i.e., there is no need for the
store_partial_word subroutine any more.
The previous version did not use DImode moves in Thumb mode.
Instead, it relied on LDRD/STRD patterns introduced by patches for
Thumb prolog/epilog using LDRD/STRD. These patterns were not approved,
because of a potential problem with reload, see here:
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01807.html
A slightly modified version of these patterns, approved and committed,
matches only after reload, whereas the RTL insns for internal memcpy are
generated early on, during expand. There might be missed optimization
opportunities in Thumb mode.
No regressions on qemu for arm-none-eabi with cpu cortex-a15 arm/thumb.
Bootstrap successful on Cortex-A15.
Ok for trunk?
Thanks,
Greta
ChangeLog
gcc/
2013-04-30 Greta Yorsh <greta.yo...@arm.com>
* config/arm/arm-protos.h (gen_movmem_ldrd_strd): New declaration.
* config/arm/arm.c (next_consecutive_mem): New function.
(gen_movmem_ldrd_strd): Likewise.
* config/arm/arm.md (movmemqi): Update condition and code.
(unaligned_loaddi, unaligned_storedi): New patterns.
gcc/testsuite
2013-04-30 Greta Yorsh <greta.yo...@arm.com>
* gcc.target/arm/unaligned-memcpy-2.c: Adjust expected output.
* gcc.target/arm/unaligned-memcpy-3.c: Likewise.
* gcc.target/arm/unaligned-memcpy-4.c: Likewise.
OK.
R.