PR 81356 points out that a __builtin_strcpy of an empty string on
aarch64 is expanded as a copy from memory instead of just storing a zero
byte.  Looking into this, I found that the cause is
aarch64_use_by_pieces_infrastructure_p, which always returns false for
STORE_BY_PIECES.  Its comment says:

  /* STORE_BY_PIECES can be used when copying a constant string, but
     in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
     For now we always fail this and let the move_by_pieces code copy
     the string from read-only memory.  */

But this no longer seems to be the case.  When I remove this function
and the TARGET_USE_BY_PIECES_INFRASTRUCTURE_P macro that uses it, the code
for __builtin_strcpy of a constant string seems to be either better or the
same.  The only time I got more instructions after removing this function
was on an 8 byte __builtin_strcpy, where we now generate a mov and 3 movk
instructions to create the source, followed by a store, instead of doing a
load/store of 8 bytes.  The comment may have applied to -mstrict-align at
one time, but that no longer seems to be true either: I still get better
code without this routine under that option as well.
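
For anyone who wants to compare the generated code, here is a minimal
sketch of the two cases discussed above.  The function names and the
"1234567" literal are only illustrative and are not taken from the PR; the
second function is just one possible 8 byte copy (7 characters plus the
terminating NUL).  Compiling with -O2 -S before and after the patch should
show the difference.

```c
#include <string.h>

/* Case from PR 81356: copying the empty string only needs to store a
   single zero byte, so no load from read-only memory should be needed.  */
void
copy_empty (char *a)
{
  __builtin_strcpy (a, "");
}

/* Hypothetical 8 byte case: 7 characters plus the terminating NUL.
   This is the size where mov + 3 movk + store was seen instead of a
   load/store pair.  The literal is an assumption for illustration.  */
void
copy_8_bytes (char *a)
{
  __builtin_strcpy (a, "1234567");
}
```

Both functions have the same observable behaviour with or without the
patch; only the instruction sequence chosen by the compiler differs.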

Bootstrapped and tested without regressions.  OK to check in?

Steve Ellcey
sell...@cavium.com



2017-09-15  Steve Ellcey  <sell...@cavium.com>

        PR target/81356
        * config/aarch64/aarch64.c (aarch64_use_by_pieces_infrastructure_p):
        Remove.
        (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Remove define.


2017-09-15  Steve Ellcey  <sell...@cavium.com>

        * gcc.target/aarch64/pr81356.c: New test.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1c14008..fc72236 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14118,22 +14118,6 @@ aarch64_asan_shadow_offset (void)
   return (HOST_WIDE_INT_1 << 36);
 }
 
-static bool
-aarch64_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
-					unsigned int align,
-					enum by_pieces_operation op,
-					bool speed_p)
-{
-  /* STORE_BY_PIECES can be used when copying a constant string, but
-     in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
-     For now we always fail this and let the move_by_pieces code copy
-     the string from read-only memory.  */
-  if (op == STORE_BY_PIECES)
-    return false;
-
-  return default_use_by_pieces_infrastructure_p (size, align, op, speed_p);
-}
-
 static rtx
 aarch64_gen_ccmp_first (rtx_insn **prep_seq, rtx_insn **gen_seq,
 			int code, tree treeop0, tree treeop1)
@@ -15631,10 +15615,6 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_LEGITIMIZE_ADDRESS
 #define TARGET_LEGITIMIZE_ADDRESS aarch64_legitimize_address
 
-#undef TARGET_USE_BY_PIECES_INFRASTRUCTURE_P
-#define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \
-  aarch64_use_by_pieces_infrastructure_p
-
 #undef TARGET_SCHED_CAN_SPECULATE_INSN
 #define TARGET_SCHED_CAN_SPECULATE_INSN aarch64_sched_can_speculate_insn
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr81356.c b/gcc/testsuite/gcc.target/aarch64/pr81356.c
index e69de29..9fd6baa 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr81356.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr81356.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void f(char *a)
+{
+  __builtin_strcpy (a, "");
+}
+
+/* { dg-final { scan-assembler-not "ldrb" } } */