Re: [PATCH] rs6000, Add new overloaded vector shift builtin int128, varients

2024-07-25 Thread Kewen.Lin
Hi Carl, Some minor comments are inlined on top of Segher's and Peter's comments. on 2024/7/20 04:04, Carl Love wrote: > GCC developers: > > The following patch adds the int128 variants to the existing overloaded > built-ins vec_sld, vec_sldb, vec_sldw, vec_sll, vec_slo, vec_srdb, vec_srl, > v

Re: [PATCH 0/2] rs6000, remove vec and vsx set builtins

2024-07-25 Thread Kewen.Lin
Hi Carl, on 2024/7/24 01:32, Carl Love wrote: > GCC maintainers: > > The code generated by using C-code to set a vector element versus using a > built-in has been investigated.  The assembly code generated from the C-code > is as good or better than the assembly code generated for the built-ins
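
As a rough illustration of the comparison described in this message, the sketch below sets one element of a two-element double vector with plain C subscripting, using the GCC/Clang generic vector extension rather than an rs6000 built-in. The type name `v2df_t` and the helper `set_elem` are made up for this example; the thread snippet does not show the code that was actually compared.

```c
/* Hypothetical sketch: set a vector element with plain C subscripting,
   using the generic vector extension instead of a target built-in.  */
typedef double v2df_t __attribute__ ((vector_size (16)));

/* Return a copy of VEC with element IDX (0 or 1) replaced by VAL.  */
v2df_t
set_elem (v2df_t vec, double val, unsigned idx)
{
  vec[idx & 1] = val;   /* mask the index so it stays in range */
  return vec;
}
```

Because the extension is target independent, the same source can be compiled for powerpc64le and the generated assembly inspected with `-S`.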

Re: [PATCH 2/2] rs6000, remove built-ins __builtin_vsx_set_1ti, __builtin_vsx_set_2df, __builtin_vsx_set_2di

2024-07-25 Thread Kewen.Lin
Hi Carl, on 2024/7/24 01:52, Carl Love wrote: > GCC maintainers: > > This patch removes the vsx set built-ins: __builtin_vsx_set_1ti, > __builtin_vsx_set_2df, __builtin_vsx_set_2di.  With the  removal of these > built-ins, the built-in attribute "set", used in the built-in definition > file, i

Re: [PATCH 1/2] rs6000, Remove __builtin_vec_set_v1ti,, __builtin_vec_set_v2df, __builtin_vec_set_v2di

2024-07-25 Thread Kewen.Lin
Hi Carl, on 2024/7/24 01:52, Carl Love wrote: > > GCC maintainers: > > This patch was previously posted.  Per the feedback, it is now the first of > two patches to remove the set built-ins. > > This patch removes the __builtin_vec_set_v1ti, __builtin_vec_set_v2df and > __builtin_vec_set_v2di

Re: [PATCH ver 2] rs6000, remove __builtin_vsx_xvcmp* built-ins

2024-07-25 Thread Kewen.Lin
Hi Carl, on 2024/7/24 01:06, Carl Love wrote: > GCC maintainers: > > version 2, Updated patch comments, added missing ChangeLog.  Fixed unintended > line removal. > > The following patch removes the three __builtin_vsx_xvcmp[eq|ge|gt]sp  > builtins as they are similar to the overloaded vec_cmp[eq|

Re: [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, varients

2024-07-29 Thread Kewen.Lin
Hi Carl, on 2024/7/27 06:37, Carl Love wrote: > GCC developers: > > Version 2, updated rs6000-overload.def to remove adding additional internal > names and to change XXSLDWI_Q to XXSLDWI_1TI per comments from Kewen.  Move > new documentation statement for the PIVPR built-ins per comments from Ke

Re: [PATCH] rs6000, add comment to VEC_IC definition

2024-07-29 Thread Kewen.Lin
Hi Carl, on 2024/7/27 07:31, Carl Love wrote: > GCC maintainers: > > This patch adds a comment to the VEC_IC definitions to clarify the V1TI > "TARGET_POWER10" mode per the request by Segher in the feedback to patch > "https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658156.html". > > http

Re: [PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, varients

2024-07-29 Thread Kewen.Lin
on 2024/7/29 23:47, Peter Bergner wrote: > On 7/29/24 5:21 AM, Kewen.Lin wrote: >> on 2024/7/27 06:37, Carl Love wrote: >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/powerpc/vec-shift-double-runnable-int128.c >>> @@ -0,0 +1,358 @@ >>> +/* { dg-do

[PATCH] testsuite, rs6000: Make {vmx,vsx,p8vector}_hw check for altivec/vsx feature

2024-07-31 Thread Kewen.Lin
Hi, Different from p9vector_hw, vmx_hw/vsx_hw/p8vector_hw checks can still succeed without Altivec/VSX feature support. We have many runnable test cases only checking for these *_hw without extra checking for if Altivec/VSX feature enabled or not. It means they can fail if being tested by explic

[PATCH] testsuite, rs6000: Remove useless powerpc_{altivec,vsx}_ok

2024-07-31 Thread Kewen.Lin
Hi, Checking the existing powerpc_{altivec,vsx}_ok test cases, I found there are some test cases which don't require the checks powerpc_{altivec,vsx} even, some of them already have other effective target check which can cover check powerpc_{altivec,vsx}, or some of them don't actually require VSX

[PATCH] testsuite, rs6000: Replace powerpc_vsx_ok with powerpc_vsx

2024-07-31 Thread Kewen.Lin
Hi, Following up the previous r15-886, this patch to clean up the remaining powerpc_vsx_ok which actually should use powerpc_vsx instead. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this next week if no objections. BR, Kewen --

[PATCH] testsuite, rs6000: Fix some run cases with appropriate *_hw

2024-07-31 Thread Kewen.Lin
Hi, When cleaning up the remaining powerpc_{vsx,altivec}_ok test cases, I found some dg-do run test cases which should check for the appropriate {p8vector,vmx}_hw check instead. This patch is to adjust them accordingly. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linu

[PATCH] testsuite, rs6000: Replace powerpc_vsx_ok with powerpc_altivec etc.

2024-07-31 Thread Kewen.Lin
Hi, This is a follow up patch for the previous patch adjusting powerpc_vsx_ok with powerpc_vsx, focusing on those test cases which don't really require VSX feature but used powerpc_vsx_ok before, they actually require some other effective target check, like some of them just require ALTIVEC featur

[PATCH] testsuite, rs6000: Adjust pr78056-[1357].c and remove pr78056-[246].c

2024-07-31 Thread Kewen.Lin
Hi, When cleaning up the remaining powerpc_{vsx,altivec}_ok test cases, I found some issues are related to pr78056-*.c. Firstly, the test points of pr78056-[246].c are no longer available since r9-3164 drops many HAVE_AS_* and the expected warning are dropped together, so this patch is to remove t

Re: [PATCH, rs6000] Add const_vector into any_operand predicate

2024-07-31 Thread Kewen.Lin
Hi Haochen, on 2024/7/25 11:34, HAO CHEN GUI wrote: > Hi, > This patch add const_vector into any_operand predicate. From my > understanding, any_operand should include all kinds of operands. > The const_vector should be included. As emit_move_insn doesn't check > the predicate, the const_vector

Re: [PATCH] rs6000, document built-ins vec_test_lsbb_all_ones and, vec_test_lsbb_all_zeros

2024-07-31 Thread Kewen.Lin
Hi Carl, on 2024/7/27 06:56, Carl Love wrote: > GCC maintainers: > > Per a report from a user, the existing vec_test_lsbb_all_ones and, > vec_test_lsbb_all_zeros built-ins are not documented in the GCC documentation > file. > > The following patch adds missing documentation for the vec_test_ls

[PATCH] testsuite: Adjust fam-in-union-alone-in-struct-2.c to support BE [PR116148]

2024-07-31 Thread Kewen.Lin
Hi, As Andrew pointed out in PR116148, fam-in-union-alone-in-struct-2.c was designed for little-endian, the recent commit r15-2403 made it be tested with running on BE and PR116148 got exposed. This patch is to adjust the expected data for members in with_fam_2_v and with_fam_3_v by considering e
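
The endianness dependence mentioned here can be sketched with a much smaller example than the actual test case. The helpers below are hypothetical and only show why expected byte values must differ between little-endian and big-endian targets; they are not taken from fam-in-union-alone-in-struct-2.c.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 when the host stores the least significant byte first.  */
int
is_little_endian (void)
{
  uint32_t probe = 1;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 1;
}

/* Return the byte stored at the lowest address of VALUE.  */
unsigned char
lowest_address_byte (uint32_t value)
{
  unsigned char first;
  memcpy (&first, &value, 1);
  return first;
}
```

For the value 0x01020304 the lowest-address byte is 0x04 on a little-endian host and 0x01 on a big-endian one, so a test that hard-codes one of them only passes on that byte order.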

Re: [PATCH] rs6000, document built-ins vec_test_lsbb_all_ones and, vec_test_lsbb_all_zeros

2024-07-31 Thread Kewen.Lin
on 2024/8/1 01:52, Carl Love wrote: > Kewen: > > On 7/31/24 2:12 AM, Kewen.Lin wrote: >> Hi Carl, >> >> on 2024/7/27 06:56, Carl Love wrote: >>> GCC maintainers: >>> >>> Per a report from a user, the existing vec_test_lsbb_all_ones and, >

Re: [PATCH] testsuite: Adjust fam-in-union-alone-in-struct-2.c to support BE [PR116148]

2024-08-01 Thread Kewen.Lin
8. BR, Kewen > > Richard. > >> Qing >> >>> On Jul 31, 2024, at 05:22, Kewen.Lin wrote: >>> >>> Hi, >>> >>> As Andrew pointed out in PR116148, fam-in-union-alone-in-struct-2.c >>> was designed for little-endian, the

Re: [PATCH ver 3] rs6000, Add new overloaded vector shift builtin int128, variants

2024-08-04 Thread Kewen.Lin
Hi Carl, on 2024/8/2 03:35, Carl Love wrote: > GCC developers: > > Version 3, updated the testcase dg-do link to dg-do compile.  Moved the new > documentation again.  Retested on Power 10 LE and BE to verify the dg > arguments disable the test on Power10BE but enable the test for Power10LE.  >

Re: [PATCH] rs6000, document built-ins vec_test_lsbb_all_ones and, vec_test_lsbb_all_zeros

2024-08-04 Thread Kewen.Lin
on 2024/8/3 05:48, Peter Bergner wrote: > On 7/31/24 10:21 PM, Kewen.Lin wrote: >> on 2024/8/1 01:52, Carl Love wrote: >>> Yes, I noticed that the built-ins were defined as overloaded but only had >>> one definition. Did seem odd to me. >>> >>>> e

Re: [PATCH] rs6000: Add OPTION_MASK_POWER8 [PR101865]

2024-04-11 Thread Kewen.Lin
Hi, on 2024/4/12 06:15, Peter Bergner wrote: > FYI: This patch is an update to Will Schmidt's patches to fix PR101865: > > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601825.html > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601823.html > > ...taking into considerat

Re: [PATCH, rs6000] Fix test case bcd4.c

2024-04-17 Thread Kewen.Lin
Hi, on 2024/4/17 13:12, HAO CHEN GUI wrote: > Hi, > This patch fixes loss of return statement in maxbcd of bcd-4.c. Without > return statement, it returns an invalid bcd number and make the test > noneffective. The patch also enables test to run on Power9 and Big Endian, > as all bcd instruction

Re: [PATCH V3] rs6000: Don't ICE when compiling the __builtin_vsx_splat_2di built-in [PR113950]

2024-04-17 Thread Kewen.Lin
Hi, on 2024/4/17 17:05, jeevitha wrote: > Hi, > > On 18/03/24 7:00 am, Kewen.Lin wrote: > >>> The bogus vsx_splat_ code goes all the way back to GCC 8, so we >>> should backport this fix. Segher and Ke Wen, can we get an approval >>> to backport this t

[PATCH] testsuite, rs6000: Fix builtins-6-p9-runnable.c for BE [PR114744]

2024-04-17 Thread Kewen.Lin
Hi, Test case builtins-6-p9-runnable.c doesn't work well on BE due to two problems: - When applying vec_xl_len onto data_128 and data_u128 with length 8, it expects to load 128[01] from the memory, but unfortunately assigning 128[01] to a {vector} {u,}int128 type variable, th

Re: [PATCH, rs6000] Use bcdsub. instead of bcdadd. for bcd invalid number checking

2024-04-17 Thread Kewen.Lin
Hi, on 2024/4/18 10:01, HAO CHEN GUI wrote: > Hi, > This patch replace bcdadd. with bcdsub. for bcd invalid number checking. > bcdadd on two same numbers might cause overflow which also set > overflow/invalid bit so that we can't distinguish it's invalid or overflow. > The bcdsub doesn't have th
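
The reasoning in this message can be sketched with a toy software model. Everything below is an assumption made for illustration: it uses three unsigned BCD digits packed into a `uint16_t`, and a single shared flag that is set for either an invalid digit or an overflow. It does not model the real bcdadd. or bcdsub. instructions, their operand width, or their sign handling.

```c
#include <stdbool.h>
#include <stdint.h>

enum { TOY_DIGITS = 3, TOY_MAX = 999 };

/* True when every 4-bit digit of X is in the range 0..9.  */
static bool
toy_digits_valid (uint16_t x)
{
  for (int i = 0; i < TOY_DIGITS; i++)
    if (((x >> (4 * i)) & 0xF) > 9)
      return false;
  return true;
}

/* Decimal value of a valid toy BCD number.  */
static unsigned
toy_value (uint16_t x)
{
  unsigned value = 0;
  for (int i = TOY_DIGITS - 1; i >= 0; i--)
    value = value * 10 + ((x >> (4 * i)) & 0xF);
  return value;
}

/* Shared flag after computing X + X: set for invalid input or overflow,
   so the two causes cannot be told apart.  */
bool
toy_flag_after_add_self (uint16_t x)
{
  return !toy_digits_valid (x) || 2 * toy_value (x) > TOY_MAX;
}

/* Shared flag after computing X - X: the result is zero, so overflow
   cannot occur and the flag reflects validity alone.  */
bool
toy_flag_after_sub_self (uint16_t x)
{
  return !toy_digits_valid (x);
}
```

In this model the valid number 0x600 sets the flag when added to itself, because 1200 does not fit in three digits, but not when subtracted from itself.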

Re: [PATCH] [testsuite] [ppc64] expect error on vxworks too

2024-04-22 Thread Kewen.Lin
on 2024/4/22 17:23, Alexandre Oliva wrote: > > These ppc lp64 tests check for errors or warnings on -mno-powerpc64. > On powerpc64-*-vxworks* we get the same errors as on most other > covered platforms, but the tests did not mark them as expected for > this target. On powerpc-*-vxworks*, the test

Re: [PATCH] disable ldist for test, to restore vectorizing-candidate loop

2024-04-22 Thread Kewen.Lin
on 2024/4/22 17:27, Alexandre Oliva wrote: > Ping? > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566524.html > > The loop we're supposed to try to vectorize in > gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c is turned into a memset > before the vectorizer runs. > > Various other tests i

Re: [PATCH] Request check for hw support in ppc run tests with -maltivec/-mvsx

2024-04-22 Thread Kewen.Lin
on 2024/4/22 17:31, Alexandre Oliva wrote: > > From: Olivier Hainque > > Regstrapped on x86_64-linux-gnu and ppc64el-linux-gnu. Also tested with > gcc-13 on ppc64-vx7r2 and ppc-vx7r2. Ok to install? OK, thanks! BR, Kewen > > for gcc/testsuite/ChangeLog > > * gcc.target/powerpc/swap

Re: [PATCH] ppc: testsuite: vec-mul requires vsx runtime

2024-04-22 Thread Kewen.Lin
on 2024/4/22 17:35, Alexandre Oliva wrote: > Ping? > https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593947.html > > > vec-mul is an execution test, but it only requires a powerpc_vsx_ok > effective target, which is enough only for compile tests. In order to > To check for runtime and executi

Re: [PATCH v2] xfail fetestexcept test - ppc always uses fcmpu

2024-04-23 Thread Kewen.Lin
Hi, on 2024/4/22 18:00, Alexandre Oliva wrote: > On Mar 10, 2021, Joseph Myers wrote: > >> On Wed, 10 Mar 2021, Alexandre Oliva wrote: >>> operand exception for quiet NaN. I couldn't find any evidence that >>> the rs6000 backend ever outputs fcmpo. Therefore, I'm adding the same >>> execution

Re: [PATCH v2] [testsuite] require sqrt_insn effective target where needed

2024-04-23 Thread Kewen.Lin
Hi, on 2024/4/22 17:56, Alexandre Oliva wrote: > This patch takes feedback received for 3 earlier patches, and adopts a > simpler approach to skip the still-failing tests, that I believe to be > in line with ppc maintainers' expressed preferences. > https://gcc.gnu.org/pipermail/gcc-patches/2021-F

Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-24 Thread Kewen.Lin
Hi, on 2024/4/22 17:28, Alexandre Oliva wrote: > Ping? > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566525.html > > > This test expects vectorization at power8+ because strict alignment is > not required for vectors. For power7, vectorization is not to take > place because it's not de

Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-28 Thread Kewen.Lin
Hi, on 2024/4/28 16:14, Alexandre Oliva wrote: > On Apr 24, 2024, "Kewen.Lin" wrote: > >> For !has_arch_pwr7 case, it still adopts peeling but as the comment (one >> line above) >> shows the original intention of this case is to expect not profitable for >

Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-04-28 Thread Kewen.Lin
Hi, on 2024/4/28 16:20, Alexandre Oliva wrote: > On Apr 23, 2024, "Kewen.Lin" wrote: > >> This patch seemed to miss to CC gcc-patches list. :) > > Oops, sorry, thanks for catching that. > > Here it is. FTR, you've already responded suggesting an appare

Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-29 Thread Kewen.Lin
on 2024/4/29 14:28, Alexandre Oliva wrote: > On Apr 28, 2024, "Kewen.Lin" wrote: > >> Nit: Maybe add a prefix "testsuite: ". > > ACK > >>> >>> From: Kewen Lin > >> Thanks, you can just drop this. :) > > I've t

Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-04-29 Thread Kewen.Lin
on 2024/4/29 15:20, Alexandre Oliva wrote: > On Apr 28, 2024, "Kewen.Lin" wrote: > >> OK, from this perspective IMHO it seems more clear to adopt xfail >> with effective target long_double_64bit? > > That's effective target is quite broken, alas. I

[PATCH 1/4] rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993]

2024-05-07 Thread Kewen.Lin
Hi, On rs6000, there are three 128 bit scalar floating point modes TFmode, IFmode and KFmode. For some historical reasons, we define them with different mode precisions, that is KFmode 126, TFmode 127 and IFmode 128. But in fact all of them should have the same mode precision 128, this special

[PATCH 2/4] fortran: Teach get_real_kind_from_node for Power 128 fp modes [PR112993]

2024-05-07 Thread Kewen.Lin
Hi, Previously effective target fortran_real_c_float128 never passes on Power regardless of the default 128 long double is ibmlongdouble or ieeelongdouble. It's due to that TF mode is always used for kind 16 real, which has precision 127, while the node float128_type_node for c_float128 has 128 t

[PATCH 3/4] ranger: Revert the workaround introduced in PR112788 [PR112993]

2024-05-07 Thread Kewen.Lin
Hi, This reverts commit r14-6478-gfda8e2f8292a90 "range: Workaround different type precision between _Float128 and long double [PR112788]" as the fixes for PR112993 make all 128 bits scalar floating point have the same 128 bit precision, this workaround isn't needed any more. Bootstrapped and reg

[PATCH 4/4] tree: Remove KFmode workaround [PR112993]

2024-05-07 Thread Kewen.Lin
Hi, The fix for PR112993 makes KFmode have 128 bit mode precision, we don't need this workaround to fix up the type precision any more, and just go with mode precision. So this patch is to remove KFmode workaround. Bootstrapped and regress-tested on: - powerpc64-linux-gnu P8/P9 (with ibm128 by

[PATCH] rs6000: Adjust -fpatchable-function-entry* support for dual entry [PR112980]

2024-05-07 Thread Kewen.Lin
Hi, As the discussion in PR112980, although the current implementation for -fpatchable-function-entry* conforms with the documentation (making N NOPs be consecutive), it's inefficient for both kernel and userspace livepatching (see comments in PR for the details). So this patch is to change the c

[PATCH] rs6000: Fix ICE on IEEE128 long double without vsx [PR114402]

2024-05-07 Thread Kewen.Lin
Hi, As PR114402 shows, we support IEEE128 format long double even if there is no vsx support, but there is an ICE about cbranch as the test case shows. For now, we only support compare:CCFP pattern for IEEE128 fp if TARGET_FLOAT128_HW, so in function rs6000_generate_compare we have a check with

[PATCH] rs6000: Clean up TF and TD check with FLOAT128_2REG_P

2024-05-07 Thread Kewen.Lin
Hi, Commit r6-2116-g2c83faf86827bf did some clean up on TFmode and TDmode check with FLOAT128_2REG_P, but it missed to update an assertion, this patch is to make it align. btw, it's noticed when I'm making a patch to get rid of TFmode. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and

[PATCH] rs6000: Remove useless operands[3]

2024-05-07 Thread Kewen.Lin
Hi, As shown, three uses of operands[3] are totally useless, so this patch is to remove them to avoid any confusion. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR, Kewen - gcc/ChangeLog:

[PATCH] rs6000: Remove useless entries in rreg

2024-05-07 Thread Kewen.Lin
Hi, When I was working on a trial patch to get rid of TFmode, I noticed that mode attribute rreg only gets used for mode iterator SFDF, it means that only SF and DF key-value pairs are useful, the other are useless, so this patch is to clean up them. Bootstrapped and regtested on powerpc64-linux-

[PATCH] rs6000: Drop useless vector_{load,store}_ defines

2024-05-07 Thread Kewen.Lin
Hi, When I was working on a patch to get rid of TFmode, I noticed that define_expands vector_load_ and vector_store_ are useless. This patch is to clean up both. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objec

[PATCH] testsuite: Fix typo in torture/vector-{1,2}.c

2024-05-07 Thread Kewen.Lin
Hi, When making some clean up patches, I happened to find test cases vector-{1,2}.c are having typo "powerpc64--*-*" in target selector, which should be powerpc64-*-*. The reason why we didn't catch before is that all our testing machines support VMX insns, so it passes always. But it would brea
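
To see why the doubled hyphen matters, here is a small sketch that uses POSIX `fnmatch` as a stand-in for the glob matching applied to target selectors. DejaGnu does its matching in Tcl, so this is only an analogy, and the triplet `powerpc64-unknown-linux-gnu` is an example value chosen for the illustration.

```c
#include <fnmatch.h>

/* Return 1 when TARGET matches the glob PATTERN, else 0.  */
int
target_matches (const char *pattern, const char *target)
{
  return fnmatch (pattern, target, 0) == 0;
}
```

With this helper, `powerpc64-*-*` matches the example triplet while `powerpc64--*-*` does not, because the second pattern requires two consecutive hyphens after `powerpc64`.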

[PATCH] testsuite, rs6000: Remove some checks with aix[456]

2024-05-07 Thread Kewen.Lin
Hi, Since r12-75-g0745b6fa66c69c aix6 support had been dropped, so we don't need to check for aix[456].* when testing, this patch is to remove such checks. Regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR, Kewen -

[PATCH] testsuite, rs6000: Remove all linux*paired* checks and cases

2024-05-07 Thread Kewen.Lin
Hi, Since r9-115-g559289370f76bf the support of paired single had been dropped, but we still have some test checks and cases for that, this patch is to get rid of them. Regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR,

[PATCH] rs6000: Add assert !TARGET_VSX if !TARGET_ALTIVEC and strip a useless check

2024-05-07 Thread Kewen.Lin
Hi, In function rs6000_option_override_internal, we have the checks and adjustments like: if (TARGET_P8_VECTOR && !TARGET_ALTIVEC) rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR; if (TARGET_P8_VECTOR && !TARGET_VSX) rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR; But in fact some previous c
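
The snippet is cut off, so the following is only a sketch of the idea suggested by the subject line: if earlier code already guarantees that VSX is never enabled without Altivec, that invariant can be asserted and the separate Altivec check becomes redundant. The `TOY_MASK_*` values and `toy_adjust_flags` are invented for this example and do not correspond to GCC's real `OPTION_MASK_*` constants or to the actual patch.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up bit values; GCC's real OPTION_MASK_* constants differ.  */
enum
{
  TOY_MASK_ALTIVEC = 1u << 0,
  TOY_MASK_VSX = 1u << 1,
  TOY_MASK_P8_VECTOR = 1u << 2
};

/* Drop P8_VECTOR when VSX is off.  Assumes the caller has already
   made sure that VSX is never set without ALTIVEC.  */
uint32_t
toy_adjust_flags (uint32_t flags)
{
  /* Without ALTIVEC there must be no VSX at this point.  */
  assert ((flags & TOY_MASK_ALTIVEC) || !(flags & TOY_MASK_VSX));

  /* This one check also covers the "no ALTIVEC" case, because no
     ALTIVEC implies no VSX under the invariant asserted above.  */
  if ((flags & TOY_MASK_P8_VECTOR) && !(flags & TOY_MASK_VSX))
    flags &= ~(uint32_t) TOY_MASK_P8_VECTOR;
  return flags;
}
```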

[PATCH] libgcc, rs6000: Remove powerpcspe related code

2024-05-07 Thread Kewen.Lin
Hi, Since r9-4728 the powerpcspe support had been removed, this follow-up patch is to remove the remaining pieces in libgcc. Bootstrapped and regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR, Kewen - libgcc/Change

[PATCH] testsuite, rs6000: Remove effective target powerpc_405_nocache

2024-05-07 Thread Kewen.Lin
Hi, With the introduction of -mdejagnu-cpu=, when the test case is specifying -mdejagnu-cpu=405, it would override the other possibly given -mcpu=, so it would compile for PowerPC 405 for sure. This patch is to remove the effective target powerpc_405_nocache and update all its uses. Regtested on

[PATCH 1/2] testsuite, rs6000: Make powerpc_vsx consider current_compiler_flags [PR114842]

2024-05-07 Thread Kewen.Lin
Hi, As noted in PR114842, most of the test cases which require effective target check powerpc_vsx_ok actually care about if VSX feature is enabled, and they should adopt effective target powerpc_vsx instead. By considering we already have a number of test cases having explicit -mvsx in dg-options

[PATCH] testsuite, rs6000: Remove powerpc_popcntb_ok

2024-05-07 Thread Kewen.Lin
Hi, There are three uses of effective target powerpc_popcntb_ok, they are all for compiling, but powerpc_popcntb_ok checks for executable generation, which is too heavy. This patch is to remove powerpc_popcntb_ok and adjust its three uses accordingly. Regtested on powerpc64-linux-gnu P8/P9 and p

[PATCH] testsuite, rs6000: Remove powerpcspe test cases and checks

2024-05-08 Thread Kewen.Lin
Hi, Since r9-4728 the powerpcspe support had been removed, this follow-up patch is to remove the remaining pieces in testsuite. Regtested on powerpc64-linux-gnu P8/P9 and powerpc64le-linux-gnu P9 and P10. I'm going to push this soon if no objections. BR, Kewen - gcc/testsuite/ChangeLog:

Re: [PATCH] rs6000: Adjust -fpatchable-function-entry* support for dual entry [PR112980]

2024-05-08 Thread Kewen.Lin
Hi Richi, >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index c584664e168..58e48f7dc55 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -18363,11 +18363,11 @@ If @code{N=0}, no pad location is recorded. >> The NOP instructions are inserted at---and maybe befor

Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-05-08 Thread Kewen.Lin
on 2024/4/30 07:11, Alexandre Oliva wrote: > On Apr 29, 2024, "Kewen.Lin" wrote: > >> Thanks for catching this and sorry >> that I didn't check it before suggesting it, I think we can aggressively >> drop this effective target instead to avoid any po

Re: [PATCH 2/4] fortran: Teach get_real_kind_from_node for Power 128 fp modes [PR112993]g

2024-05-08 Thread Kewen.Lin
Hi, on 2024/5/9 06:01, Steve Kargl wrote: > On Wed, May 08, 2024 at 01:27:53PM +0800, Kewen.Lin wrote: >> >> Previously effective target fortran_real_c_float128 never >> passes on Power regardless of the default 128 long double >> is ibmlongdouble or ieeelongdouble.

Re: [PATCH] rs6000: Enable overlapped by-pieces operations

2024-05-08 Thread Kewen.Lin
Hi, on 2024/5/8 14:47, HAO CHEN GUI wrote: > Hi, > This patch enables overlapped by-piece operations. On rs6000, default > move/set/clear ratio is 2. So the overlap is only enabled with compare > by-pieces. Thanks for enabling this, did you evaluate if it can help some benchmark? > > Bootst

Re: [PATCHv2] rs6000: Enable overlapped by-pieces operations

2024-05-12 Thread Kewen.Lin
on 2024/5/10 17:29, HAO CHEN GUI wrote: > Hi, > This patch enables overlapped by-piece operations. On rs6000, default > move/set/clear ratio is 2. So the overlap is only enabled with compare > by-pieces. > > Compared to previous version, the change is to remove power8 > requirement from test c

Re: [PATCH 1/13] rs6000, Remove __builtin_vsx_cmple* builtins

2024-05-12 Thread Kewen.Lin
Hi, on 2024/4/20 05:16, Carl Love wrote: > > rs6000, Remove __builtin_vsx_cmple* builtins > > The built-ins __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di, > __builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi should take > unsigned arguments and return an unsigned result. The current de

Re: [PATCH] report message for operator %a on unaddressible exp

2024-05-13 Thread Kewen.Lin
Hi, on 2024/5/13 10:57, Jiufu Guo wrote: > Hi, > > For PR96866, when gcc print asm code for modifier "%a" which requires > an address operand, while the operand is with the constraint "X" which > allow non-address form. An error message would be reported to indicate > the invalid asm operands. >

Re: [PATCH] rs6000: Enable overlapped by-pieces operations

2024-05-13 Thread Kewen.Lin
Hi, on 2024/5/9 15:35, HAO CHEN GUI wrote: > Hi Kewen, > Thanks for your comments. > > On 2024/5/9 13:44, Kewen.Lin wrote: >> Hi, >> >> on 2024/5/8 14:47, HAO CHEN GUI wrote: >>> Hi, >>> This patch enables overlapped by-piece operations. On rs600

Re: [PATCH 5/13] rs6000, remove duplicated built-ins of vecmergl and vec_mergeh

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:17, Carl Love wrote: > rs6000, remove duplicated built-ins of vecmergl and vec_mergeh > > The following undocumented built-ins are same as existing documented > overloaded builtins. > > const vf __builtin_vsx_xxmrghw (vf, vf); > same as vf __builtin_vec_mergeh (vf, vf);

Re: [PATCH 6/13] rs6000, add overloaded vec_sel with int128 arguments

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:17, Carl Love wrote: > rs6000, add overloaded vec_sel with int128 arguments > > Extend the vec_sel built-in to take three signed/unsigned int128 arguments > and return a signed/unsigned int128 result. > > Extending the vec_sel built-in makes the existing built-ins > __builtin_
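
For readers unfamiliar with vec_sel, its bitwise-select behaviour can be modelled on a scalar 128-bit value. This is a hedged sketch, not the built-in itself: `u128_t` and `toy_select` are names invented here, and `unsigned __int128` is a GCC/Clang extension that is assumed to be available (typically only on 64-bit targets).

```c
/* Scalar model of a bitwise select: for each bit, take the bit from B
   where MASK is 1 and from A where MASK is 0.  */
typedef unsigned __int128 u128_t;

u128_t
toy_select (u128_t a, u128_t b, u128_t mask)
{
  return (a & ~mask) | (b & mask);
}
```

An all-zero mask returns the first operand unchanged and an all-ones mask returns the second.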

Re: [PATCH 7/13] rs6000, remove the vec_xxsel built-ins, they are duplicates

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove the vec_xxsel built-ins, they are duplicates > > The following undocumented built-ins are covered by the existing overloaded > vec_sel built-in definitions. > > const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc); > same as vsc __builtin

Re: [PATCH 8/13] rs6000, remove __builtin_vsx_vperm_* built-ins

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove __builtin_vsx_vperm_* built-ins > > The undocumented built-ins: > __builtin_vsx_vperm_16qi_uns, > __builtin_vsx_vperm_1ti, > __builtin_vsx_vperm_1ti_uns, > __builtin_vsx_vperm_2df, > __builtin_vsx_vperm_2di, > __builtin_vsx_vpe

Re: [PATCH 9/13] rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins > > The undocumented __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp are > redundant. The overloaded vec_neg built-in provides the same > functionality. The two built-ins are not d

Re: [PATCH] report message for operator %a on unaddressible exp

2024-05-13 Thread Kewen.Lin
Hi, on 2024/5/14 11:00, Jiufu Guo wrote: > Hi, > > Thanks a lot for your helpful review! > > "Kewen.Lin" writes: > >> Hi, >> >> on 2024/5/13 10:57, Jiufu Guo wrote: >>> Hi, >>> >>> For PR96866, when gcc print asm code f

Re: [PATCH 10/13] rs6000, extend vec_xxpermdi built-in for __int128 args

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, extend vec_xxpermdi built-in for __int128 args > > Add a new overloaded instance for vec_xxpermdi > >__int128 vec_xxpermdi (__int128, __int128, const int); > > Update the documentation to include a reference to the new built-in > instance.

Re: [PATCH 12/13] rs6000, remove __builtin_vsx_xvcmpeqsp built-in

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove __builtin_vsx_xvcmpeqsp built-in > > The built-in __builtin_vsx_xvcmpeqsp is a duplicate of the overloaded > vec_cmpeq built-in. The built-in is undocumented. The built-in and > the test cases are removed. > > gcc/ChangeLog: > * c

Re: [PATCH 13/13] rs6000, remove vector set and vector init built-ins.

2024-05-13 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove vector set and vector init built-ins. > > The vector init built-ins: > > __builtin_vec_init_v16qi, __builtin_vec_init_v8hi, > __builtin_vec_init_v4si, __builtin_vec_init_v4sf, > __builtin_vec_init_v2di, __builtin_vec_init_v2df, >

Re: [PATCH 3/13] rs6000, fix error in unsigned vector float to unsigned int built-in definitions

2024-05-14 Thread Kewen.Lin
Hi, on 2024/4/20 05:17, Carl Love wrote: > rs6000, fix error in unsigned vector float to unsigned int built-in > definitions > > The built-ins __builtin_vsx_vunsigned_v2df and __builtin_vsx_vunsigned_v4sf > are supposed to take a vector of floats and return a vector of unsigned > long long ints.

Re: [PATCH 4/13] rs6000, extend the current vec_{un,}signed{e,o} built-ins

2024-05-14 Thread Kewen.Lin
Hi, on 2024/4/20 05:17, Carl Love wrote: > rs6000, extend the current vec_{un,}signed{e,o} built-ins > > The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds > convert a vector of floats to signed/unsigned long long ints. Extend the > existing vec_{un,}signed{e,o} built-ins to han

Re: [PATCH 2/4 GCC11] Add target hook stride_dform_valid_p

2020-02-25 Thread Kewen.Lin
on 2020/1/20 9:14 PM, Segher Boessenkool wrote: > Hi! > > On Mon, Jan 20, 2020 at 10:42:12AM +, Richard Sandiford wrote: >> "Kewen.Lin" writes: >>> gcc/ChangeLog >>> >>> 2020-01-16 Kewen Lin >>> >>>

[PATCH 3/4 V2 GCC11] IVOPTs Consider cost_step on different forms during unrolling

2020-02-25 Thread Kewen.Lin
Hi, As the proposed hook changes, updated this with main changes: 1) Check with addr_offset_valid_p instead. 2) Check the 1st and the last use for the whole address group. 3) Scale up group costs accordingly. Bootstrapped/regtested on powerpc64le-linux-gnu (LE). BR, Kewen --- gcc/

[testsuite] Update several scev/IVOPTs cases

2020-02-25 Thread Kewen.Lin
Hi, Several scev/IVOPTs cases aim to check some array references are sceved and later marked as REFERENCE ADDRESS IV groups. With IV group type dumping improving, these check strings can be improved. Otherwise, they become fragile with dumping changes. This patch is to keep check strings concise

[testsuite] Fix PR93935 to guard case under vect_hw_misalign

2020-02-25 Thread Kewen.Lin
Hi, This patch is to apply the same fix as r267528 to another similar case bb-slp-over-widen-2.c which requires misaligned vector access. Verified it on ppc64-redhat-linux (Power7 BE). Is it ok for trunk? BR, Kewen --- gcc/testsuite/ChangeLog 2020-02-26 Kewen Lin PR testsu

Re: [PATCH 2/4 GCC11] Add target hook stride_dform_valid_p

2020-03-03 Thread Kewen.Lin
>> Hi Segher and Richard S., >> >> Sorry for late response. Thanks for your comments on legitimate_address_p >> hook >> and function addr_offset_valid_p. I updated the IVOPTs part with >> addr_offset_valid_p, although rs6000_legitimate_offset_address_p doesn't >> check >> strictly all the time

[testsuite] Fix PR94023 to guard case under vect_hw_misalign

2020-03-03 Thread Kewen.Lin
Hi, As PR94023 shows, the expected SLP requires misaligned vector access support. This patch is to guard the check under the target condition vect_hw_misalign to ensure that. Verified it on ppc64-redhat-linux (Power7 BE). Is it ok for trunk, and backport to GCC 9 after some burn-in time? BR,

[testsuite] Fix PR94019 to allow one vector char when !vect_hw_misalign

2020-03-03 Thread Kewen.Lin
Hi, As PR94019 shows, without misaligned vector access support but with realign load, the vectorized loop will end up with realign scheme. It generates mask (control vector) with return type vector signed char which breaks the not check. The fix is to differentiate powerpc vect_hw_misalign and po

Re: [PATCH] [rs6000] Rewrite the declaration of a variable

2020-03-04 Thread Kewen.Lin
on 2020/3/4 3:24 PM, binbin wrote: > Hi > > On 2020/3/4 8:33 AM, Segher Boessenkool wrote: >> Hi! >> >> On Tue, Mar 03, 2020 at 10:13:56AM -0600, Bin Bin Lv wrote: >>> Rewrite the declaration of toc_section from the source file rs6000.c to its >>> header file for standardizing the code. >> >>> diff

Re: [testsuite] Fix PR94019 to allow one vector char when !vect_hw_misalign

2020-03-04 Thread Kewen.Lin
Hi Segher, on 2020/3/5 2:44 AM, Segher Boessenkool wrote: > Hi! > > On Wed, Mar 04, 2020 at 03:13:51PM +0800, Kewen.Lin wrote: >> As PR94019 shows, without misaligned vector access support but with >> realign load, the vectorized loop will end up with realign scheme. >>

Re: [testsuite] Fix PR94019 to allow one vector char when !vect_hw_misalign

2020-03-04 Thread Kewen.Lin
Hi Richard, on 2020/3/5 3:09 AM, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi, >> >> >> --- a/gcc/testsuite/gcc.dg/vect/vect-over-widen-17.c >> +++ b/gcc/testsuite/gcc.dg/vect/vect-over-widen-17.c >> @@ -41,6 +41,10 @@ main (void

[PATCH, rs6000] Support float from/to long conversion vectorization

2019-09-26 Thread Kewen.Lin
Hi, This patch is to add the support for float from/to long conversion vectorization. ISA 2.06 supports the vector version instructions for conversion between float and long long (both signed and unsigned), but vectorizer can't exploit them since the optab check fails. So this patch is mainly to

Re: [PATCH, rs6000] Support float from/to long conversion vectorization

2019-09-27 Thread Kewen.Lin
Hi Segher, on 2019/9/27 3:27 PM, Segher Boessenkool wrote: > Hi Kewen, > >> +;; Support signed/unsigned long long to float conversion vectorization. >> +(define_expand "vec_pack_float_v2di" >> + [(match_operand:V4SF 0 "vfloat_operand") >> + (any_float:V4SF (parallel [(match_operand:V2DI 1 "vint

Re: [PATCH, rs6000] Support float from/to long conversion vectorization

2019-09-28 Thread Kewen.Lin
Hi Segher, on 2019/9/28 12:12 AM, Segher Boessenkool wrote: > On Fri, Sep 27, 2019 at 04:52:30PM +0800, Kewen.Lin wrote: >>> (Maybe one of the gen* tools complains any_fix needs a mode? :QI will do >>> if so, or :P if you like that better). >> >> I didn'

[PATCH, rs6000] Lower vec_perm vectorization cost for P8/P9

2019-09-28 Thread Kewen.Lin
Hi, Recently we revisited the vectorization cost setting in rs6000_builtin_vectorization_cost, and found that the current cost of vec_perm on VSX looks overpriced for Power8 and Power9. The high cost was set for the Power7 single VSU pipe, but Power8 and Power9 have supported more VSX units, the perform

Re: [PATCH, rs6000] Lower vec_perm vectorization cost for P8/P9

2019-09-29 Thread Kewen.Lin
Hi Segher, on 2019/9/29 3:28 PM, Segher Boessenkool wrote: > Hi! > > On Sun, Sep 29, 2019 at 01:38:31PM +0800, Kewen.Lin wrote: >> Recently we are revisiting vectorization cost setting in >> rs6000_builtin_vectorization_cost, and found the current cost of >> vec_perm

[PATCH, rs6000] Lower vec_promote_demote vectorization cost for P8/P9

2019-10-08 Thread Kewen.Lin
Hi, This patch is to lower vec_promote_demote vectorization cost in rs6000_builtin_vectorization_cost. It's similar to what we committed for vec_perm: the current cost for vec_promote_demote is also overpriced for Power8 and Power9 since Power8 and Power9 have supported more units for permute/unpa

Re: [PATCH, testsuite] Fix PR92464 by adjust test case loop bound

2019-11-13 Thread Kewen.Lin
Hi Segher, on 2019/11/13 6:42 PM, Segher Boessenkool wrote: > Hi! > > On Wed, Nov 13, 2019 at 03:31:11PM +0800, Kewen.Lin wrote: >> As PR92464 shows, the recent vectorization cost adjustment on load >> insns is responsible for this regression. It leads the profitable >

Re: [PATCH, rs6000] Refactor FP vector comparison operators

2019-11-19 Thread Kewen.Lin
Hi Segher, on 2019/11/20 1:29 AM, Segher Boessenkool wrote: > Hi! > > On Tue, Nov 12, 2019 at 06:41:07PM +0800, Kewen.Lin wrote: >> +;; code iterators and attributes for vector FP comparison operators: >> +(define_code_iterator vector_fp_comparis

Re: [PATCH, rs6000] Refactor FP vector comparison operators

2019-11-21 Thread Kewen.Lin
Hi Segher, on 2019/11/20 10:06 PM, Segher Boessenkool wrote: > Hi! > > On Wed, Nov 20, 2019 at 03:31:36PM +0800, Kewen.Lin wrote: > Yeah. Just doing can_create_pseudo in the insn condition (and in the > split condition, via &&) will work -- there just is this window of

Re: [PATCH, rs6000] Refactor FP vector comparison operators

2019-11-24 Thread Kewen.Lin
Hi Segher, on 2019/11/23 12:08 AM, Segher Boessenkool wrote: > Hi! >> 2019-11-21 Kewen Lin >> >> * config/rs6000/vector.md (vector_fp_comparison_simple): >> New code iterator. >> (vector_fp_comparison_complex): Likewise. >> (vector_ for VEC_F and >> vector_fp_comparison_s

[PATCH, rs6000] Fix PR92566 by checking VECTOR_UNIT_NONE_P

2019-11-26 Thread Kewen.Lin
Hi, As Segher pointed out in PR92566, we shouldn't offer vector modes which aren't supported under the current setting. This patch is to make it check by VECTOR_UNIT_NONE_P, which is initialized according to the current architecture masks. Bootstrapped and tested on powerpc64le-linux-gnu. Is it ok for trunk?

[PATCH] Fix PR91790 by considering different first_stmt_info for realign

2019-11-26 Thread Kewen.Lin
Hi, As PR91790 exposed, when we have one slp node whose first_stmt_info_for_drptr is different from first_stmt_info, it's possible that the first_stmt DR isn't initialized yet before stmt SLP_TREE_SCALAR_STMTS[0] of slp node. So we shouldn't use first_stmt_info for vect_setup_realignment, instead

Re: [PATCH] [rs6000] Fix PR92098

2019-11-26 Thread Kewen.Lin
Hi Lijia, on 2019/11/27 2:31 PM, Li Jia He wrote: > Hi, > > In order to fix PR92098, we need to define vec_cmp_* and vcond_mask_*. In > fact, > PR92132 already fixed the issue on the trunk. We need to backport PR92132 int > part to gcc-9-branch. This patch backport vector_{ungt,unge,unlt,unle}

[PATCH, rs6000] Fix PR92760 by checking VECTOR_MEM_NONE_P instead

2019-12-03 Thread Kewen.Lin
Hi, PR92760 exposed one issue that VECTOR_UNIT_NONE_P (V2DImode) is true on Power7, so we won't return it as preferred_simd_mode, but ISA 2.06 (Power7) does introduce partial support for vector doubleword (very limited), and more basic support originates from ISA 2.07 (Power8) though. To make vector

Re: [PATCH v3 2/3] Add predict_doloop_p target hook

2019-05-21 Thread Kewen.Lin
on 2019/5/21 6:20 PM, Richard Biener wrote: > On Tue, 21 May 2019, Kewen.Lin wrote: > >> on 2019/5/21 12:37 AM, Segher Boessenkool wrote: >>> On Mon, May 20, 2019 at 08:43:59AM -0600, Jeff Law wrote: >>>>> I think we should have two hooks: one is called with the
