Hi Haochen,

on 2024/1/11 16:28, HAO CHEN GUI wrote:
> Hi,
>   This patch eliminates unnecessary byte swaps for block clear on P8
> LE. For block clear, all the bytes are set to zero. The byte order
> doesn't make sense. So the alignment of destination could be set to
> the store mode size in stead of 1 byte in order to eliminates
> unnecessary byte swap instructions on P8 LE. The test case shows the
> problem.

I agree with Richi's concern, a bytes swap can be eliminated if the
bytes swapped result is known as before, one typical case is the vector
constant with predicate const_vector_each_byte_same, we can do some
optimization for that.

BR,
Kewen

> 
>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions. Is this OK for trunk?
> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> rs6000: Eliminate unnecessary byte swaps for block clear on P8 LE
> 
> gcc/
>       PR target/113325
>       * config/rs6000/rs6000-string.cc (expand_block_clear): Set the
>       alignment of destination to the size of mode.
> 
> gcc/testsuite/
>       PR target/113325
>       * gcc.target/powerpc/pr113325.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-string.cc 
> b/gcc/config/rs6000/rs6000-string.cc
> index 7f777666ba9..4c9b2cbeefc 100644
> --- a/gcc/config/rs6000/rs6000-string.cc
> +++ b/gcc/config/rs6000/rs6000-string.cc
> @@ -140,7 +140,9 @@ expand_block_clear (rtx operands[])
>       }
> 
>        dest = adjust_address (orig_dest, mode, offset);
> -
> +      /* Set the alignment of dest to the size of mode in order to
> +      avoid unnecessary byte swaps on LE.  */
> +      set_mem_align (dest, GET_MODE_SIZE (mode) * BITS_PER_UNIT);
>        emit_move_insn (dest, CONST0_RTX (mode));
>      }
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr113325.c 
> b/gcc/testsuite/gcc.target/powerpc/pr113325.c
> new file mode 100644
> index 00000000000..4a3cae019c2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr113325.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-final { scan-assembler-not {\mxxpermdi\M} } } */
> +
> +void* foo (void* s1)
> +{
> +  return __builtin_memset (s1, 0, 32);
> +}

Reply via email to