Ping.

I have incorporated Peter's review comment in this revised patch: the -mvsx option has been removed from dg-options, since it is implied by -mcpu=power8.

Ok for trunk?

Regards,
Surya

On 09/01/25 8:53 pm, Surya Kumari Jangala wrote:
> Ping
>
> On 02/12/24 2:20 pm, Surya Kumari Jangala wrote:
>> I have incorporated review comments in this patch.
>>
>> Regards,
>> Surya
>>
>>
>> rs6000: Inefficient vector splat of small V2DI constants [PR107757]
>>
>> On P8, for vector splat of double word constants, specifically -1 and 1,
>> gcc generates inefficient code. For -1, gcc generates two instructions
>> (vspltisw and vupkhsw) whereas only one instruction (vspltisw) is
>> sufficient. For constant 1, gcc generates a load of the constant from
>> .rodata instead of the instructions vspltisw and vupkhsw.
>>
>> The routine vspltisw_vupkhsw_constant_p() returns true if the constant
>> can be synthesized with instructions vspltisw and vupkhsw. However, for
>> constant 1, this routine returns false.
>>
>> For constant -1, this routine returns true. Vector splat of -1 can be
>> done with only one instruction, i.e., vspltisw. We do not need two
>> instructions. Hence this routine should return false for -1.
>>
>> With this patch, gcc generates only one instruction (vspltisw)
>> for -1. And for constant 1, this patch generates two instructions
>> (vspltisw and vupkhsw).
>>
>> 2024-11-20  Surya Kumari Jangala  <jskum...@linux.ibm.com>
>>
>> gcc/
>> 	PR target/107757
>> 	* config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p):
>> 	Return false for -1 and return true for 1.
>>
>> gcc/testsuite/
>> 	PR target/107757
>> 	* gcc.target/powerpc/pr107757-1.c: New.
>> 	* gcc.target/powerpc/pr107757-2.c: New.
>> ---
>>  gcc/config/rs6000/rs6000.cc                   |  2 +-
>>  gcc/testsuite/gcc.target/powerpc/pr107757-1.c | 14 ++++++++++++++
>>  gcc/testsuite/gcc.target/powerpc/pr107757-2.c | 13 +++++++++++++
>>  3 files changed, 28 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr107757-1.c
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr107757-2.c
>>
>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index 02a2f1152db..d0c528f4d5f 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -6652,7 +6652,7 @@ vspltisw_vupkhsw_constant_p (rtx op, machine_mode
>> mode, int *constant_ptr)
>>      return false;
>>
>>    value = INTVAL (elt);
>> -  if (value == 0 || value == 1
>> +  if (value == 0 || value == -1
>>        || !EASY_VECTOR_15 (value))
>>      return false;
>>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr107757-1.c
>> b/gcc/testsuite/gcc.target/powerpc/pr107757-1.c
>> new file mode 100644
>> index 00000000000..49076fba255
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr107757-1.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
>> +/* { dg-require-effective-target powerpc_vsx } */
>> +/* { dg-final { scan-assembler {\mvspltisw\M} } } */
>> +/* { dg-final { scan-assembler {\mvupkhsw\M} } } */
>> +/* { dg-final { scan-assembler-not {\mlvx\M} } } */
>> +
>> +#include <altivec.h>
>> +
>> +vector long long
>> +foo ()
>> +{
>> +  return vec_splats (1LL);
>> +}
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr107757-2.c
>> b/gcc/testsuite/gcc.target/powerpc/pr107757-2.c
>> new file mode 100644
>> index 00000000000..4955696f11d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr107757-2.c
>> @@ -0,0 +1,13 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */
>> +/* { dg-require-effective-target powerpc_vsx } */
>> +/* { dg-final { scan-assembler {\mvspltisw\M} } } */
>> +/* { dg-final { scan-assembler-not {\mvupkhsw\M} } } */
>> +
>> +#include <altivec.h>
>> +
>> +vector long long
>> +foo ()
>> +{
>> +  return vec_splats (~0LL);
>> +}
>
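
For reference, the reasoning in the commit message above can be checked with a small scalar model in portable C. This is only a sketch and not code from the patch: model_vspltisw, model_vupkhsw, bits_as_v2di and needs_vspltisw_vupkhsw are hypothetical helpers, not GCC internals or altivec.h functions, and the range -16..15 is assumed to be what EASY_VECTOR_15 accepts (the 5-bit signed immediate of vspltisw).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Scalar stand-ins for a 128-bit vector register, viewed as four
   words (V4SI) or as two doublewords (V2DI).  */
typedef struct { int32_t w[4]; } v4si;
typedef struct { int64_t d[2]; } v2di;

/* Model of vspltisw: splat a 5-bit signed immediate into all four words.  */
static v4si
model_vspltisw (int imm)
{
  assert (imm >= -16 && imm <= 15);
  v4si r = { { imm, imm, imm, imm } };
  return r;
}

/* Model of vupkhsw: sign-extend two words to two doublewords.  Which
   pair counts as "high" depends on endianness, but after a splat all
   four words are equal, so the choice does not matter here.  */
static v2di
model_vupkhsw (v4si a)
{
  v2di r = { { (int64_t) a.w[0], (int64_t) a.w[1] } };
  return r;
}

/* Reinterpret the same 128 bits as V2DI without any conversion.  */
static v2di
bits_as_v2di (v4si a)
{
  v2di r;
  memcpy (&r, &a, sizeof r);
  return r;
}

/* Hypothetical restatement of the patched predicate: the two-instruction
   sequence is wanted for small nonzero constants other than -1.  */
static bool
needs_vspltisw_vupkhsw (int64_t value)
{
  if (value == 0 || value == -1 || value < -16 || value > 15)
    return false;
  return true;
}
```

In this model, splatting -1 as words already sets every bit, so reinterpreting the register as V2DI gives -1 in both doublewords and no unpack is needed. Splatting 1 as words gives 0x0000000100000001 per doubleword, which is why vupkhsw is still required for that constant.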