https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93136
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> --- The releases/gcc-9 branch has been updated by Kewen Lin <li...@gcc.gnu.org>: https://gcc.gnu.org/g:8c18220564f8372f4d45ed1a4df3fc7f71928654 commit r9-8843-g8c18220564f8372f4d45ed1a4df3fc7f71928654 Author: Kewen Lin <li...@linux.ibm.com> Date: Tue Sep 1 21:12:32 2020 -0500 rs6000: Backport fixes for PR92923 and PR93136 This patch is to backport the fix for PR92923 and its sequent fix for PR93136. We found the builtin functions needlessly using VIEW_CONVERT_EXPRs on their operands can probably cause remarkable performance degradation especailly when they are used in the hotspot. One typical case is ...github.com/antonblanchard/crc32-vpmsum/blob/master/vec_crc32.c With this patch, the execution time can improve 47.81%. Apart from the original fixes, this patch also gets two cases below updated. During the regression testing I found two cases failed due to icf optimization able to be adopted with this patch, the function bodies use tail calls, the expected assembly instructions are gone. gcc.target/powerpc/fold-vec-logical-ands-longlong.c gcc.target/powerpc/fold-vec-logical-ors-longlong.c Bootstrapped/regtested on powerpc64{,le}-linux-gnu P8. 2019-12-30 Peter Bergner <berg...@linux.ibm.com> gcc/ChangeLog PR target/92923 * config/rs6000/rs6000-builtin.def (VAND, VANDC, VNOR, VOR, VXOR): Delete. (EQV_V16QI_UNS, EQV_V8HI_UNS, EQV_V4SI_UNS, EQV_V2DI_UNS, EQV_V1TI_UNS, NAND_V16QI_UNS, NAND_V8HI_UNS, NAND_V4SI_UNS, NAND_V2DI_UNS, NAND_V1TI_UNS, ORC_V16QI_UNS, ORC_V8HI_UNS, ORC_V4SI_UNS, ORC_V2DI_UNS, ORC_V1TI_UNS, VAND_V16QI_UNS, VAND_V16QI, VAND_V8HI_UNS, VAND_V8HI, VAND_V4SI_UNS, VAND_V4SI, VAND_V2DI_UNS, VAND_V2DI, VAND_V4SF, VAND_V2DF, VANDC_V16QI_UNS, VANDC_V16QI, VANDC_V8HI_UNS, VANDC_V8HI, VANDC_V4SI_UNS, VANDC_V4SI, VANDC_V2DI_UNS, VANDC_V2DI, VANDC_V4SF, VANDC_V2DF, VNOR_V16QI_UNS, VNOR_V16QI, VNOR_V8HI_UNS, VNOR_V8HI, VNOR_V4SI_UNS, VNOR_V4SI, VNOR_V2DI_UNS, VNOR_V2DI, VNOR_V4SF, VNOR_V2DF, VOR_V16QI_UNS, VOR_V16QI, VOR_V8HI_UNS, VOR_V8HI, VOR_V4SI_UNS, VOR_V4SI, VOR_V2DI_UNS, VOR_V2DI, VOR_V4SF, VOR_V2DF, VXOR_V16QI_UNS, VXOR_V16QI, VXOR_V8HI_UNS, VXOR_V8HI, VXOR_V4SI_UNS, VXOR_V4SI, VXOR_V2DI_UNS, VXOR_V2DI, VXOR_V4SF, VXOR_V2DF): Add definitions. * config/rs6000/rs6000-c.c (altivec_overloaded_builtins) <ALTIVEC_BUILTIN_VAND, ALTIVEC_BUILTIN_VANDC, ALTIVEC_BUILTIN_VNOR, ALTIVEC_BUILTIN_VOR, ALTIVEC_BUILTIN_VXOR>: Remove. <ALTIVEC_BUILTIN_VAND_V4SF, ALTIVEC_BUILTIN_VAND_V2DF, ALTIVEC_BUILTIN_VAND_V2DI, ALTIVEC_BUILTIN_VAND_V2DI_UNS, ALTIVEC_BUILTIN_VAND_V4SI_UNS, ALTIVEC_BUILTIN_VAND_V4SI, ALTIVEC_BUILTIN_VAND_V8HI_UNS, ALTIVEC_BUILTIN_VAND_V8HI, ALTIVEC_BUILTIN_VAND_V16QI, ALTIVEC_BUILTIN_VAND_V16QI_UNS, ALTIVEC_BUILTIN_VANDC_V4SF, ALTIVEC_BUILTIN_VANDC_V2DF, ALTIVEC_BUILTIN_VANDC_V2DI, ALTIVEC_BUILTIN_VANDC_V2DI_UNS, ALTIVEC_BUILTIN_VANDC_V4SI_UNS, ALTIVEC_BUILTIN_VANDC_V4SI, ALTIVEC_BUILTIN_VANDC_V8HI_UNS, ALTIVEC_BUILTIN_VANDC_V8HI, ALTIVEC_BUILTIN_VANDC_V16QI, ALTIVEC_BUILTIN_VANDC_V16QI_UNS, ALTIVEC_BUILTIN_VNOR_V4SF, ALTIVEC_BUILTIN_VNOR_V2DF, ALTIVEC_BUILTIN_VNOR_V2DI, ALTIVEC_BUILTIN_VNOR_V2DI_UNS, ALTIVEC_BUILTIN_VNOR_V4SI, ALTIVEC_BUILTIN_VNOR_V4SI_UNS, ALTIVEC_BUILTIN_VNOR_V8HI, ALTIVEC_BUILTIN_VNOR_V8HI_UNS, ALTIVEC_BUILTIN_VNOR_V16QI, ALTIVEC_BUILTIN_VNOR_V16QI_UNS, ALTIVEC_BUILTIN_VOR_V4SF, ALTIVEC_BUILTIN_VOR_V2DF, ALTIVEC_BUILTIN_VOR_V2DI, ALTIVEC_BUILTIN_VOR_V2DI_UNS, ALTIVEC_BUILTIN_VOR_V4SI_UNS, ALTIVEC_BUILTIN_VOR_V4SI, ALTIVEC_BUILTIN_VOR_V8HI_UNS, ALTIVEC_BUILTIN_VOR_V8HI, ALTIVEC_BUILTIN_VOR_V16QI, ALTIVEC_BUILTIN_VOR_V16QI_UNS, ALTIVEC_BUILTIN_VXOR_V4SF, ALTIVEC_BUILTIN_VXOR_V2DF, ALTIVEC_BUILTIN_VXOR_V2DI, ALTIVEC_BUILTIN_VXOR_V2DI_UNS, ALTIVEC_BUILTIN_VXOR_V4SI_UNS, ALTIVEC_BUILTIN_VXOR_V4SI, ALTIVEC_BUILTIN_VXOR_V8HI, ALTIVEC_BUILTIN_VXOR_V8HI_UNS, ALTIVEC_BUILTIN_VXOR_V16QI, ALTIVEC_BUILTIN_VXOR_V16QI_UNS>: Add definitions. <P8V_BUILTIN_EQV_V16QI, P8V_BUILTIN_EQV_V8HI, P8V_BUILTIN_EQV_V4SI, P8V_BUILTIN_EQV_V2DI, P8V_BUILTIN_NAND_V16QI, P8V_BUILTIN_NAND_V8HI, P8V_BUILTIN_NAND_V4SI, P8V_BUILTIN_NAND_V2DI, P8V_BUILTIN_ORC_V16QI, P8V_BUILTIN_ORC_V8HI, P8V_BUILTIN_ORC_V4SI, P8V_BUILTIN_ORC_V2DI>: Change unsigned usages to use the new *_UNS definition names. * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin) <ALTIVEC_BUILTIN_VAND_V16QI_UNS, ALTIVEC_BUILTIN_VAND_V16QI, ALTIVEC_BUILTIN_VAND_V8HI_UNS, ALTIVEC_BUILTIN_VAND_V8HI, ALTIVEC_BUILTIN_VAND_V4SI_UNS, ALTIVEC_BUILTIN_VAND_V4SI, ALTIVEC_BUILTIN_VAND_V2DI_UNS, ALTIVEC_BUILTIN_VAND_V2DI, ALTIVEC_BUILTIN_VAND_V4SF, ALTIVEC_BUILTIN_VAND_V2DF, ALTIVEC_BUILTIN_VANDC_V16QI_UNS, ALTIVEC_BUILTIN_VANDC_V16QI, ALTIVEC_BUILTIN_VANDC_V8HI_UNS, ALTIVEC_BUILTIN_VANDC_V8HI, ALTIVEC_BUILTIN_VANDC_V4SI_UNS, ALTIVEC_BUILTIN_VANDC_V4SI, ALTIVEC_BUILTIN_VANDC_V2DI_UNS, ALTIVEC_BUILTIN_VANDC_V2DI, ALTIVEC_BUILTIN_VANDC_V4SF, ALTIVEC_BUILTIN_VANDC_V2DF, P8V_BUILTIN_NAND_V16QI_UNS, P8V_BUILTIN_NAND_V8HI_UNS, P8V_BUILTIN_NAND_V4SI_UNS, P8V_BUILTIN_NAND_V2DI_UNS, P8V_BUILTIN_NAND_V2DI, ALTIVEC_BUILTIN_VOR_V16QI_UNS, ALTIVEC_BUILTIN_VOR_V16QI, ALTIVEC_BUILTIN_VOR_V8HI_UNS, ALTIVEC_BUILTIN_VOR_V8HI, ALTIVEC_BUILTIN_VOR_V4SI_UNS, ALTIVEC_BUILTIN_VOR_V4SI, ALTIVEC_BUILTIN_VOR_V2DI_UNS, ALTIVEC_BUILTIN_VOR_V2DI, ALTIVEC_BUILTIN_VOR_V4SF, ALTIVEC_BUILTIN_VOR_V2DF, P8V_BUILTIN_ORC_V16QI_UNS, P8V_BUILTIN_ORC_V8HI_UNS, P8V_BUILTIN_ORC_V4SI_UNS, P8V_BUILTIN_ORC_V2DI_UNS, P8V_BUILTIN_ORC_V2DI, ALTIVEC_BUILTIN_VXOR_V16QI_UNS, ALTIVEC_BUILTIN_VXOR_V16QI, ALTIVEC_BUILTIN_VXOR_V8HI_UNS, ALTIVEC_BUILTIN_VXOR_V8HI, ALTIVEC_BUILTIN_VXOR_V4SI_UNS, ALTIVEC_BUILTIN_VXOR_V4SI, ALTIVEC_BUILTIN_VXOR_V2DI_UNS, ALTIVEC_BUILTIN_VXOR_V2DI, ALTIVEC_BUILTIN_VXOR_V4SF, ALTIVEC_BUILTIN_VXOR_V2DF, ALTIVEC_BUILTIN_VNOR_V16QI_UNS, ALTIVEC_BUILTIN_VNOR_V16QI, ALTIVEC_BUILTIN_VNOR_V8HI_UNS, ALTIVEC_BUILTIN_VNOR_V8HI, ALTIVEC_BUILTIN_VNOR_V4SI_UNS, ALTIVEC_BUILTIN_VNOR_V4SI, ALTIVEC_BUILTIN_VNOR_V2DI_UNS, ALTIVEC_BUILTIN_VNOR_V2DI, ALTIVEC_BUILTIN_VNOR_V4SF, ALTIVEC_BUILTIN_VNOR_V2DF>: Use new definition names. (builtin_function_type) <ALTIVEC_BUILTIN_VAND_V16QI_UNS, ALTIVEC_BUILTIN_VAND_V8HI_UNS, ALTIVEC_BUILTIN_VAND_V4SI_UNS, ALTIVEC_BUILTIN_VAND_V2DI_UNS, ALTIVEC_BUILTIN_VANDC_V16QI_UNS, ALTIVEC_BUILTIN_VANDC_V8HI_UNS, ALTIVEC_BUILTIN_VANDC_V4SI_UNS, ALTIVEC_BUILTIN_VANDC_V2DI_UNS, ALTIVEC_BUILTIN_VNOR_V16QI_UNS, ALTIVEC_BUILTIN_VNOR_V8HI_UNS, ALTIVEC_BUILTIN_VNOR_V4SI_UNS, ALTIVEC_BUILTIN_VNOR_V2DI_UNS, ALTIVEC_BUILTIN_VOR_V16QI_UNS, ALTIVEC_BUILTIN_VOR_V8HI_UNS, ALTIVEC_BUILTIN_VOR_V4SI_UNS, ALTIVEC_BUILTIN_VOR_V2DI_UNS, ALTIVEC_BUILTIN_VXOR_V16QI_UNS, ALTIVEC_BUILTIN_VXOR_V8HI_UNS, ALTIVEC_BUILTIN_VXOR_V4SI_UNS, ALTIVEC_BUILTIN_VXOR_V2DI_UNS, P8V_BUILTIN_EQV_V16QI_UNS, P8V_BUILTIN_EQV_V8HI_UNS, P8V_BUILTIN_EQV_V4SI_UNS, P8V_BUILTIN_EQV_V2DI_UNS, P8V_BUILTIN_EQV_V1TI_UNS, P8V_BUILTIN_NAND_V16QI_UNS, P8V_BUILTIN_NAND_V8HI_UNS, P8V_BUILTIN_NAND_V4SI_UNS, P8V_BUILTIN_NAND_V2DI_UNS, P8V_BUILTIN_NAND_V1TI_UNS, P8V_BUILTIN_ORC_V16QI_UNS, P8V_BUILTIN_ORC_V8HI_UNS, P8V_BUILTIN_ORC_V4SI_UNS, P8V_BUILTIN_ORC_V2DI_UNS, P8V_BUILTIN_ORC_V1TI_UNS>: Handle unsigned builtins. 2019-12-30 Peter Bergner <berg...@linux.ibm.com> 2020-02-08 Peter Bergner <berg...@linux.ibm.com> gcc/testsuite/ChangeLog * gcc.target/powerpc/fold-vec-logical-ands-longlong.c: Adjust. * gcc.target/powerpc/fold-vec-logical-ors-longlong.c: Likewise. PR target/92923 * gcc.target/powerpc/pr92923-1.c: New test. * gcc.target/powerpc/pr92923-2.c: Likewise. PR target/93136 * gcc.dg/vmx/ops.c: Add -flax-vector-conversions to dg-options. * gcc.target/powerpc/vsx-vector-6.h: Split tests into smaller functions. * gcc.target/powerpc/vsx-vector-6.p7.c: Adjust scan-assembler-times regex directives. Adjust expected instruction counts. * gcc.target/powerpc/vsx-vector-6.p8.c: Likewise. * gcc.target/powerpc/vsx-vector-6.p9.c: Likewise. (cherry picked from commit 4559be2358020714ec7521c80589992716d23035) (cherry picked from commit 4b39d801b2698d0f756231f6f8fa0be5a36f0c05)