Hi, This patch eliminates unnecessary byte swaps for block clear on P8 LE. For block clear, all the bytes are set to zero. The byte order doesn't make sense. So the alignment of destination could be set to the store mode size in stead of 1 byte in order to eliminates unnecessary byte swap instructions on P8 LE. The test case shows the problem.
Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no regressions. Is this OK for trunk? Thanks Gui Haochen ChangeLog rs6000: Eliminate unnecessary byte swaps for block clear on P8 LE gcc/ PR target/113325 * config/rs6000/rs6000-string.cc (expand_block_clear): Set the alignment of destination to the size of mode. gcc/testsuite/ PR target/113325 * gcc.target/powerpc/pr113325.c: New. patch.diff diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc index 7f777666ba9..4c9b2cbeefc 100644 --- a/gcc/config/rs6000/rs6000-string.cc +++ b/gcc/config/rs6000/rs6000-string.cc @@ -140,7 +140,9 @@ expand_block_clear (rtx operands[]) } dest = adjust_address (orig_dest, mode, offset); - + /* Set the alignment of dest to the size of mode in order to + avoid unnecessary byte swaps on LE. */ + set_mem_align (dest, GET_MODE_SIZE (mode) * BITS_PER_UNIT); emit_move_insn (dest, CONST0_RTX (mode)); } diff --git a/gcc/testsuite/gcc.target/powerpc/pr113325.c b/gcc/testsuite/gcc.target/powerpc/pr113325.c new file mode 100644 index 00000000000..4a3cae019c2 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr113325.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-final { scan-assembler-not {\mxxpermdi\M} } } */ + +void* foo (void* s1) +{ + return __builtin_memset (s1, 0, 32); +}