PR 81356 points out that doing a __builtin_strcpy of an empty string on
aarch64 does a copy from memory instead of just writing out a zero byte.
In looking at this I found that it was because of
aarch64_use_by_pieces_infrastructure_p, which returns false for
STORE_BY_PIECES. The comment says:
/* STORE_BY_PIECES can be used when copying a constant string, but
in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
For now we always fail this and let the move_by_pieces code copy
the string from read-only memory. */
But this doesn't seem to be the case anymore. When I remove this function
and the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P macro that uses it the code
for __builtin_strcpy of a constant string seems to be either better or the
same. The only time I got more instructions after removing this function
was on an 8 byte __builtin_strcpy where we now generate a mov and 3 movk
instructions to create the source followed by a store instead of doing a
load/store of 8 bytes. The comment may have been applicable for
-mstrict-align at one time but it doesn't seem to be the case now. I still
get better code without this routine under that option as well.
Bootstrapped and tested without regressions, OK to checkin?
Steve Ellcey
sell...@cavium.com
2017-09-15 Steve Ellcey <sell...@cavium.com>
PR target/81356
* config/aarch64/aarch64.c (aarch64_use_by_pieces_infrastructure_p):
Remove.
(TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define.
2017-09-15 Steve Ellcey <sell...@cavium.com>
* gcc.target/aarch64/pr81356.c: New test.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1c14008..fc72236 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14118,22 +14118,6 @@ aarch64_asan_shadow_offset (void)
return (HOST_WIDE_INT_1 << 36);
}
-static bool
-aarch64_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
- unsigned int align,
- enum by_pieces_operation op,
- bool speed_p)
-{
- /* STORE_BY_PIECES can be used when copying a constant string, but
- in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
- For now we always fail this and let the move_by_pieces code copy
- the string from read-only memory. */
- if (op == STORE_BY_PIECES)
- return false;
-
- return default_use_by_pieces_infrastructure_p (size, align, op, speed_p);
-}
-
static rtx
aarch64_gen_ccmp_first (rtx_insn **prep_seq, rtx_insn **gen_seq,
int code, tree treeop0, tree treeop1)
@@ -15631,10 +15615,6 @@ aarch64_libgcc_floating_mode_supported_p
#undef TARGET_LEGITIMIZE_ADDRESS
#define TARGET_LEGITIMIZE_ADDRESS aarch64_legitimize_address
-#undef TARGET_USE_BY_PIECES_INFRASTRUCTURE_P
-#define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \
- aarch64_use_by_pieces_infrastructure_p
-
#undef TARGET_SCHED_CAN_SPECULATE_INSN
#define TARGET_SCHED_CAN_SPECULATE_INSN aarch64_sched_can_speculate_insn
diff --git a/gcc/testsuite/gcc.target/aarch64/pr81356.c b/gcc/testsuite/gcc.target/aarch64/pr81356.c
index e69de29..9fd6baa 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr81356.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr81356.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void f(char *a)
+{
+ __builtin_strcpy (a, "");
+}
+
+/* { dg-final { scan-assembler-not "ldrb" } } */