[PING^1][PATCH v8] rs6000: Use vector addition when left shifting by 1 [PR119702]

Avinash Jayakar Wed, 05 Nov 2025 07:57:00 -0800

Ping!
Please review.

Thanks,
Avinash Jayakar


On Fri, 2025-10-24 at 14:45 +0530, Avinash Jayakar wrote:
> Hi Segher,
> 
> Please ignore the v7 patch; it had minor capitalization errors.
> Thanks for the review of the v6 patch. I have incorporated all the
> requested changes in this patch.
> Bootstrapped and regtested on powerpc64le-linux with no regressions.
> Ok for trunk?
> 
> As mentioned in the PATCH v6 thread, the vaddudm tests fail on
> power8/9 for:
> 1. pr119702-3.c: Surya's patch for PR107757 should fix this test case
>    once it is upstream.
> 2. pr119702-4.c: Work in progress (PR122065)
> 
> Changes from v7:
>       1. Minor corrections in commit message.
> Changes from v6:
>       1. for loop formatting in predicates.md
>       2. Correct type from long to char in pr119702-2.c
>       3. Add has_arch_pwr8 for vaddudm tests.
> Changes from v5:
>       1. Corrected formatting and define_insn name.
>       2. Removed cpu specific flag for altivec tests.
>       3. Remove target lp64 for altivec tests.
>       4. Use native types instead of types from inttypes.h for altivec
>       tests.
> Changes from v4:
>       1. Added comments for the new predicate "vector_constant_1".
>       2. Added new tests for altivec vector types.
>       3. Added comments in test file.
> Changes from v3:
>       1. Add author information before changelog.
>       2. Right placement of PR target/119702.
>       3. Added new test to check multiply by 2 generates vadd insn.
> Changes from v2:
>       1. Indentation fixes in the commit message
>       2. define_insn has the name *altivec_vsl<VI_char>_const_1
>       3. Iterate starting from 0 for checking vector constant = 1 and
>       fixed source code formatting for the for loop.
>       4. Removed unused macro in pr119702-1.c test file
> Changes from v1:
>       1. Use define_insn instead of define_expand to recognize left
>       shift by constant 1 and generate add instruction.
>       2. Updated test cases to cover integer types from byte,
>       half-word, word and double word.
> 
> Thanks and regards,
> Avinash Jayakar
> 
> 
> Whenever a vector of integers is left shifted by a constant value 1,
> gcc generates the following code for powerpc64le target:
>       vspltisw 0,1
>       vsld 2,2,0
> Instead the following code can be generated which is more efficient:
>       vaddudm 2,2,2
> This patch adds a pattern in altivec.md to recognize a vector left
> shift by a constant value, and generates an add instruction if the
> constant is 1.
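The replacement is valid because, for unsigned modular lanes, shifting left by 1 and adding a value to itself give the same result, including on overflow. A minimal scalar sketch of that identity follows; the helper names are hypothetical and only model one lane of vsld/vaddudm (and of the byte variant), they are not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* One 64-bit lane shifted left by 1 (models a lane of vsld with a
   splat of 1).  Hypothetical helper, for illustration only.  */
static uint64_t lane_shl1 (uint64_t x) { return x << 1; }

/* The same lane added to itself modulo 2^64 (models a lane of
   vaddudm x,x).  */
static uint64_t lane_add_self (uint64_t x) { return x + x; }

/* Byte-lane versions; the cast truncates the promoted int result
   back to 8 bits, i.e. modulo 2^8.  */
static uint8_t byte_shl1 (uint8_t x) { return (uint8_t) (x << 1); }
static uint8_t byte_add_self (uint8_t x) { return (uint8_t) (x + x); }

/* Check the identity on a few 64-bit edge cases and on every
   possible byte value.  */
static void check_shift_equals_add (void)
{
  static const uint64_t samples[] = {
    0, 1, 0x7fffffffffffffffULL, 0x8000000000000000ULL, UINT64_MAX
  };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (lane_shl1 (samples[i]) == lane_add_self (samples[i]));
  for (unsigned x = 0; x <= 0xff; x++)
    assert (byte_shl1 ((uint8_t) x) == byte_add_self ((uint8_t) x));
}
```

The same reasoning applies per lane to the word and half-word cases.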
> 
> Added the pattern in altivec.md to recognize a vector left shift by a
> constant value, and generate add instructions if constant is 1.
> Added a predicate in predicates.md to recognize whether the RTL node
> is a uniform constant vector with value 1.
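For readers unfamiliar with RTL predicates, the check amounts to a loop over the vector elements that rejects on the first element that is not 1. A plain C analogue is sketched below; the array-based representation and the function names are assumptions made for illustration and do not use GCC's internal API (the real predicate works on a const_vector rtx via CONST_VECTOR_ELT, as in the diff further down):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a constant vector: NUNITS integer
   elements stored in a plain array.  Returns true only if every
   element equals 1.  */
static bool all_elements_are_one (const long long *elts, size_t nunits)
{
  for (size_t i = 0; i < nunits; i++)
    if (elts[i] != 1)
      return false;
  return true;
}

/* Small self-check: a uniform vector of ones is accepted, a vector
   with a single differing element is rejected.  */
static void check_all_elements_are_one (void)
{
  const long long ones[4] = { 1, 1, 1, 1 };
  const long long mixed[4] = { 1, 1, 2, 1 };
  assert (all_elements_are_one (ones, 4));
  assert (!all_elements_are_one (mixed, 4));
}
```

A shift count such as { 1, 1, 2, 1 } therefore does not match the new pattern and still goes through the existing vector shift pattern.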
> 
> 2025-10-24  Avinash Jayakar  <[email protected]>
> 
> gcc/ChangeLog:
>       PR target/119702
>       * config/rs6000/altivec.md (*altivec_vsl<VI_char>_const_1): Recognize
>       << 1 and replace with vadd insn.
>       * config/rs6000/predicates.md (vector_constant_1): Predicate to check if
>       all elements of a vector constant are 1.
> 
> gcc/testsuite/ChangeLog:
>       PR target/119702
>       * gcc.target/powerpc/pr119702-1.c: New test.
>       * gcc.target/powerpc/pr119702-2.c: New test.
>       * gcc.target/powerpc/pr119702-3.c: New test.
>       * gcc.target/powerpc/pr119702-4.c: New test.
> ---
>  gcc/config/rs6000/altivec.md                  |  8 +++
>  gcc/config/rs6000/predicates.md               | 13 ++++
>  gcc/testsuite/gcc.target/powerpc/pr119702-1.c | 60 +++++++++++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr119702-2.c | 59 ++++++++++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr119702-3.c | 36 +++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr119702-4.c | 36 +++++++++++
>  6 files changed, 212 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-2.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-3.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-4.c
> 
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index fa3368079ad..11b8501a7d0 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -2107,6 +2107,14 @@
>    "vsrv %0,%1,%2"
>    [(set_attr "type" "vecsimple")])
>  
> +(define_insn "*altivec_vsl<VI_char>_const_1"
> +  [(set (match_operand:VI2 0 "register_operand" "=v")
> +     (ashift:VI2 (match_operand:VI2 1 "register_operand" "v")
> +                   (match_operand:VI2 2 "vector_constant_1" "")))]
> +  "<VI_unit>"
> +  "vaddu<VI_char>m %0,%1,%1"
> +)
> +
>  (define_insn "*altivec_vsl<VI_char>"
>    [(set (match_operand:VI2 0 "register_operand" "=v")
>          (ashift:VI2 (match_operand:VI2 1 "register_operand" "v")
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 647e89afb6a..017ff867aea 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -924,6 +924,19 @@
>      }
>  })
>  
> +;; Return 1 if the operand is a vector constant with 1 in all of the elements.
> +(define_predicate "vector_constant_1"
> +  (match_code "const_vector")
> +{
> +  unsigned nunits = GET_MODE_NUNITS (mode);
> +  for (unsigned i = 0; i < nunits; i++)
> +    {
> +      if (INTVAL (CONST_VECTOR_ELT (op, i)) != 1)
> +     return 0;
> +    }
> +  return 1;
> +})
> +
>  ;; Return 1 if operand is 0.0.
>  (define_predicate "zero_fp_constant"
>    (and (match_code "const_double")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-1.c b/gcc/testsuite/gcc.target/powerpc/pr119702-1.c
> new file mode 100644
> index 00000000000..d12ae23be60
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr119702-1.c
> @@ -0,0 +1,60 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -maltivec" } */
> +/* { dg-require-effective-target powerpc_altivec } */
> +
> +/* PR target/119702 -- verify left shift by 1 is converted into
> +   a vaddu<x>m instruction.  */
> +
> +
> +void lshift1_64(unsigned long long *a)
> +{
> +  a[0] <<= 1;
> +  a[1] <<= 1;
> +}
> +
> +void lshift1_32(unsigned int *a)
> +{
> +  a[0] <<= 1;
> +  a[1] <<= 1;
> +  a[2] <<= 1;
> +  a[3] <<= 1;
> +}
> +
> +void lshift1_16(unsigned short *a)
> +{
> +  a[0] <<= 1;
> +  a[1] <<= 1;
> +  a[2] <<= 1;
> +  a[3] <<= 1;
> +  a[4] <<= 1;
> +  a[5] <<= 1;
> +  a[6] <<= 1;
> +  a[7] <<= 1;
> +}
> +
> +void lshift1_8(unsigned char *a)
> +{
> +  a[0] <<= 1;
> +  a[1] <<= 1;
> +  a[2] <<= 1;
> +  a[3] <<= 1;
> +  a[4] <<= 1;
> +  a[5] <<= 1;
> +  a[6] <<= 1;
> +  a[7] <<= 1;
> +  a[8] <<= 1;
> +  a[9] <<= 1;
> +  a[10] <<= 1;
> +  a[11] <<= 1;
> +  a[12] <<= 1;
> +  a[13] <<= 1;
> +  a[14] <<= 1;
> +  a[15] <<= 1;
> +}
> +
> +
> +/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 { target has_arch_pwr8 } } } */
> +/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-2.c b/gcc/testsuite/gcc.target/powerpc/pr119702-2.c
> new file mode 100644
> index 00000000000..45161f6311a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr119702-2.c
> @@ -0,0 +1,59 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -maltivec" } */
> +/* { dg-require-effective-target powerpc_altivec } */
> +
> +/* PR target/119702 -- verify multiply by 2 is converted into
> +   a vaddu<x>m instruction.  */
> +
> +void lshift1_64(unsigned long long *a)
> +{
> +  a[0] *= 2;
> +  a[1] *= 2;
> +}
> +
> +void lshift1_32(unsigned int *a)
> +{
> +  a[0] *= 2;
> +  a[1] *= 2;
> +  a[2] *= 2;
> +  a[3] *= 2;
> +}
> +
> +void lshift1_16(unsigned short *a)
> +{
> +  a[0] *= 2;
> +  a[1] *= 2;
> +  a[2] *= 2;
> +  a[3] *= 2;
> +  a[4] *= 2;
> +  a[5] *= 2;
> +  a[6] *= 2;
> +  a[7] *= 2;
> +}
> +
> +void lshift1_8(unsigned char  *a)
> +{
> +  a[0] *= 2;
> +  a[1] *= 2;
> +  a[2] *= 2;
> +  a[3] *= 2;
> +  a[4] *= 2;
> +  a[5] *= 2;
> +  a[6] *= 2;
> +  a[7] *= 2;
> +  a[8] *= 2;
> +  a[9] *= 2;
> +  a[10] *= 2;
> +  a[11] *= 2;
> +  a[12] *= 2;
> +  a[13] *= 2;
> +  a[14] *= 2;
> +  a[15] *= 2;
> +}
> +
> +
> +/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 { target has_arch_pwr8 } } } */
> +/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-3.c b/gcc/testsuite/gcc.target/powerpc/pr119702-3.c
> new file mode 100644
> index 00000000000..28a53e75dd9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr119702-3.c
> @@ -0,0 +1,36 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -maltivec" } */
> +/* { dg-require-effective-target powerpc_altivec } */
> +
> +/* PR target/119702 -- verify vector shift left by 1 is converted into
> +   a vaddu<x>m instruction.  */
> +
> +vector unsigned long long
> +lshift1_64 (vector unsigned long long a)
> +{
> +  return a << (vector unsigned long long) { 1, 1 };
> +}
> +
> +vector unsigned int
> +lshift1_32 (vector unsigned int a)
> +{
> +  return a << (vector unsigned int) { 1, 1, 1, 1 };
> +}
> +
> +vector unsigned short
> +lshift1_16 (vector unsigned short a)
> +{
> +  return a << (vector unsigned short) { 1, 1, 1, 1, 1, 1, 1, 1 };
> +}
> +
> +vector unsigned char
> +lshift1_8 (vector unsigned char a)
> +{
> +  return a << (vector unsigned char) { 1, 1, 1, 1, 1, 1, 1, 1,
> +                                       1, 1, 1, 1, 1, 1, 1, 1 };
> +}
> +
> +/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 { target has_arch_pwr8 } } } */
> +/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-4.c b/gcc/testsuite/gcc.target/powerpc/pr119702-4.c
> new file mode 100644
> index 00000000000..759866e2cd2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr119702-4.c
> @@ -0,0 +1,36 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -maltivec" } */
> +/* { dg-require-effective-target powerpc_altivec } */
> +
> +/* PR target/119702 -- verify vector multiply by 2 is converted into
> +   a vaddu<x>m instruction.  */
> +
> +vector unsigned long long
> +lshift1_64 (vector unsigned long long a)
> +{
> +  return a * (vector unsigned long long) { 2, 2 };
> +}
> +
> +vector unsigned int
> +lshift1_32 (vector unsigned int a)
> +{
> +  return a * (vector unsigned int) { 2, 2, 2, 2 };
> +}
> +
> +vector unsigned short
> +lshift1_16 (vector unsigned short a)
> +{
> +  return a * (vector unsigned short) { 2, 2, 2, 2, 2, 2, 2, 2 };
> +}
> +
> +vector unsigned char
> +lshift1_8 (vector unsigned char a)
> +{
> +  return a * (vector unsigned char) { 2, 2, 2, 2, 2, 2, 2, 2,
> +                                       2, 2, 2, 2, 2, 2, 2, 2 };
> +}
> +
> +/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 { target has_arch_pwr8 } } } */
> +/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */
