Re: decremnt IV patch create fails on PowerPC

2023-05-30 Thread Kewen.Lin via Gcc-patches
on 2023/5/30 17:26, juzhe.zh...@rivai.ai wrote: > Ok. > > It seems that for this conditions: > > + /* If we're vectorizing a loop that uses length "controls" and > + can iterate more than once, we apply decrementing IV approach > + in loop control. */ > + if (LOOP_VINFO_CAN_USE_PARTIAL
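For readers skimming the thread, the "decrementing IV" loop control quoted above can be sketched in plain C. This is an illustration only, not GCC's vectorizer code: `VLMAX` and `add_with_decrementing_iv` are hypothetical names, and the inner scalar loop stands in for one length-controlled vector operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical maximum number of elements one "vector" operation handles.  */
#define VLMAX 4

/* Adds b[i] into a[i] for i in [0, n).  Loop control is a counter of
   remaining elements that is decremented by the length processed in
   each iteration, rather than an index that counts up to n.
   Returns the number of loop iterations executed.  */
static size_t
add_with_decrementing_iv (int *a, const int *b, size_t n)
{
  assert (n == 0 || (a != NULL && b != NULL));

  size_t remaining = n;
  size_t iterations = 0;

  while (remaining > 0)
    {
      /* Length "control" for this iteration: min (remaining, VLMAX).  */
      size_t len = remaining < VLMAX ? remaining : VLMAX;

      /* Stand-in for a single length-controlled vector add.  */
      for (size_t i = 0; i < len; i++)
        a[i] += b[i];

      a += len;
      b += len;
      remaining -= len;
      iterations++;
    }

  return iterations;
}
```

With ten elements and `VLMAX` of 4, the per-iteration lengths are 4, 4 and 2, and the loop exits once the remaining count reaches zero.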

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread Kewen.Lin via Gcc-patches
> Hi, Richi. > >>> Note with SELECT_VL all bets will be off since as I understand the >>> value it gives can vary from iteration to iteration (but we know >>> a lower and maybe an upper bound?) > Yes, in RVV side, the SELECT_VL output can be in range of [ceil(avl/2), > vlmax],  > can be any value

Re: [PATCH] libgcc: Use initarray section type for .init_stack

2023-05-31 Thread Kewen.Lin via Gcc-patches
Hi Andreas, on 2023/5/25 15:25, Andreas Krebbel wrote: > On 3/20/23 07:33, Kewen.Lin wrote: >> Hi, >> >> One of my workmates found there is a warning like: >> >> libgcc/config/rs6000/morestack.S:402: Warning: ignoring >> incorrect section type for .init_array.0 >> >> when compiling libgc

Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*

2023-05-31 Thread Kewen.Lin via Gcc-patches
Hi Segher, on 2023/5/25 19:22, Segher Boessenkool wrote: > Hi! > > On Thu, May 25, 2023 at 07:05:55AM -0300, Alexandre Oliva wrote: >> On May 25, 2023, "Kewen.Lin" wrote: >>> So both lp64 and ilp32 have the same count, could we merge it and >>> remove the selectors? >> >> We could, but... I tho

Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions

2023-05-31 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/5/25 23:59, Carl Love wrote: > Peter, Kewen: > > On Thu, 2023-05-25 at 13:28 +0800, Kewen.Lin wrote: >> on 2023/5/24 23:20, Carl Love wrote: >>> On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote: on 2023/5/24 06:30, Peter Bergner wrote: > On 5/23/23 12:24 AM, Kewen.Lin

Re: [PATCH V2] VECT: Change flow of decrement IV

2023-05-31 Thread Kewen.Lin via Gcc-patches
Hi Juzhe, on 2023/6/1 08:31, juzhe.zh...@rivai.ai wrote: > Bootstrapped and Regression on X86 no surprise different. > > Looking forward Kewen's test report for this patch. > This patch can be bootstrapped and regress-tested on powerpc64-linux-gnu P9 and powerpc64le-linux-gnu P9/P10. Also SPEC

Re: [PATCH V3] VECT: Change flow of decrement IV

2023-05-31 Thread Kewen.Lin via Gcc-patches
Hi, on 2023/6/1 13:00, juzhe.zh...@rivai.ai wrote: > This patch is no difference from V2. I support this patch based on the testing and SPEC2017 evaluation results on Power (see my comments on patch v2). > Just add PR tree-optimization/109971 as Kewen's suggested. Thanks for adding that, I was

Re: [PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx

2023-06-01 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/6/2 04:01, Carl Love wrote: > Kewen, Segher, Peter: > > The following patch is a redo of the previous "rs6000: Fix > __builtin_vec_xst_trunc definition" patch. > > This patch fixes the argument in the two builtin definitions > __builtin_altivec_tr_stxvrwx and __builtin_altivec

Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition

2023-06-01 Thread Kewen.Lin via Gcc-patches
on 2023/6/2 04:01, Carl Love wrote: > On Wed, 2023-05-31 at 12:59 -0500, Peter Bergner wrote: >> On 5/22/23 4:04 AM, Kewen.Lin wrote: >>> on 2023/5/11 02:06, Carl Love via Gcc-patches wrote: @@ -3161,12 +3161,15 @@ void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *

Re: [PATCH] libgcc: Use initarray section type for .init_stack

2023-06-04 Thread Kewen.Lin via Gcc-patches
on 2023/6/1 00:57, Ian Lance Taylor wrote: > On Wed, May 31, 2023 at 12:41 AM Kewen.Lin via Gcc-patches > wrote: >> >>>> libgcc/ChangeLog: >>>> >>>> * config/i386/morestack.S: Use @init_array rather than >>>> @progbits for se

Re: [PATCH V2, rs6000] Disable generation of scalar modulo instructions

2023-06-04 Thread Kewen.Lin via Gcc-patches
Hi Pat, Thanks for fixing this and sorry for the late review! on 2023/4/18 20:22, Pat Haugen wrote: > Updated from prior patch to also disable for int128. > > > Disable generation of scalar modulo instructions. > > It was recently discovered that the scalar modulo instructions can suffer > not

Re: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values

2023-06-05 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/5/2 23:52, Carl Love via Gcc-patches wrote: > GCC maintainers: > > The following patch adds three builtins for inserting and extracting the > exponent and significand for an IEEE 128-bit floating point values. > The builtins are valid for Power 9 and Power 10. We already have:

Re: [PATCH] rs6000: vec_cmpne confusing implementation

2023-06-06 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/5/3 23:30, Carl Love via Gcc-patches wrote: > GCC maintainers: > > The following patch cleans up the definition for the > __builtin_altivec_vcmpnet. The current implementation implies that the > builtin is only supported on Power 9 since it is defined under the > Power 9 stanza.

[PATCH] rs6000: Guard __builtin_{un,}pack_vector_int128 with vsx [PR109932]

2023-06-06 Thread Kewen.Lin via Gcc-patches
Hi, As PR109932 shows, builtins __builtin_{un,}pack_vector_int128 should be guarded under vsx rather than power7, as their corresponding bif patterns have the conditions TARGET_VSX and VECTOR_MEM_ALTIVEC_OR_VSX_P (V1TImode). This patch is to move __builtin_{un,}pack_vector_int128 to stanza vsx to

[PATCH] rs6000: Don't use TFmode for 128 bits fp constant in toc [PR110011]

2023-06-06 Thread Kewen.Lin via Gcc-patches
Hi, As PR110011 shows, when encoding 128 bits fp constant into toc, we adopt REAL_VALUE_TO_TARGET_LONG_DOUBLE which is to find the first float mode with LONG_DOUBLE_TYPE_SIZE bits of precision, it would be TFmode here. But the 128 bits fp constant can be with mode IFmode or KFmode, which doesn't

Re: [PATCH, rs6000] Add two peephole2 patterns for mr. insn

2023-06-06 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/5/30 14:32, HAO CHEN GUI wrote: > Hi, > By checking the object files of SPECint, I found that two kinds of > compare/move can't be combined to "mr." pattern as there is no register > link between them. The patch adds two peephole2 patterns for them. > Thanks for improving t

Re: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values

2023-06-07 Thread Kewen.Lin via Gcc-patches
Hi, on 2023/6/7 03:54, Carl Love wrote: > On Mon, 2023-06-05 at 16:45 +0800, Kewen.Lin wrote: >> Hi Carl, >> >> on 2023/5/2 23:52, Carl Love via Gcc-patches wrote: >>> GCC maintainers: >>> >>> The following patch adds three builtins for inserting and extracting >>> the >>> exponent and significand

Re: [PATCH] rs6000: Guard __builtin_{un,}pack_vector_int128 with vsx [PR109932]

2023-06-11 Thread Kewen.Lin via Gcc-patches
on 2023/6/12 02:39, David Edelsohn wrote: > On Tue, Jun 6, 2023 at 5:19 AM Kewen.Lin wrote: > Hi, As PR109932 shows, builtins __builtin_{un,}pack_vector_int128 should be > guarded under vsx rather than power7, as their corresponding bif patterns > have the conditions

Re: [PATCH] rs6000: Don't use TFmode for 128 bits fp constant in toc [PR110011]

2023-06-11 Thread Kewen.Lin via Gcc-patches
on 2023/6/11 10:04, David Edelsohn wrote: > On Tue, Jun 6, 2023 at 5:20 AM Kewen.Lin wrote: > Hi, As PR110011 shows, when encoding 128 bits fp constant into toc, we adopt > REAL_VALUE_TO_TARGET_LONG_DOUBLE which is to find the first float mode with > LONG_DOUBLE_TYPE_SIZE

Re: [PATCHv2, rs6000] Add two peephole2 patterns for mr. insn

2023-06-12 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/6/12 10:34, HAO CHEN GUI wrote: > Hi, > This patch adds two peephole2 patterns which help convert certain insn > sequences to "mr." instruction. These insn sequences can't be combined in > combine pass. > > Compared to last version, it adds a new mode iterator "Q" which sh

Re: [PATCH ver 3] rs6000: Add builtins for IEEE 128-bit floating point values

2023-06-12 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/6/8 23:21, Carl Love wrote: > Kewen, GCC maintainers: > > Version 3, was able to get the overloaded version of scalar_insert_exp > to work and the change to xsxexpqp_f128_ define instruction to > work with the suggestions from Kewen. > > Version 2, I have addressed the various

Re: [PATCH] rs6000, fix vec_replace_unaligned builtin arguments

2023-06-12 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/5/31 04:41, Carl Love wrote: > GCC maintainers: > > The following patch fixes the first argument in the builtin definition > and the corresponding test cases. Initially, the builtin specification > was wrong due to a cut and paste error. The documentation was fixed in: > > >

[PATCH, committed] testsuite: Check int128 effective target for pr109932-{1,2}.c [PR110230]

2023-06-13 Thread Kewen.Lin via Gcc-patches
Hi, This patch is to make newly added test cases pr109932-{1,2}.c check int128 effective target to avoid unsupported type error on 32-bit. I did hit this failure during testing and fixed it, but made a stupid mistake not updating the local formatted patch which was actually out of date. Pushed a

Re: [PATCH v1] rs6000: Update powerpc test fold-vec-extract-int.p8.c

2023-06-13 Thread Kewen.Lin via Gcc-patches
Hi, on 2023/5/19 15:40, Ajit Agarwal via Gcc-patches wrote: > Hello All: > > Update powerpc tests for both le and be endian with extra removal of zero > extension and sign extension. > with default ree pass for rs6000 target. Nice! > > Bootstrapped and regtested on powerpc64-linux-gnu. > >

Re: [PATCHv3, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/5/26 10:49, HAO CHEN GUI wrote: > Hi, > This patch adds a new insn for vector splat with small V2DI constants on P8. > If the value of constant is in RANGE (-16, 15) and not 0 or -1, it can be > loaded > with vspltisw and vupkhsw on P8. It should be efficient than loading ve
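The range condition quoted above can be written as a small predicate. This is a minimal sketch in plain C, assuming the range is inclusive at both ends (matching the 5-bit signed immediate that vspltisw accepts); the function name is hypothetical and this is not the predicate used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when VALUE fits the case described in the snippet:
   within [-16, 15] and neither 0 nor -1.  Illustrative only.  */
static bool
small_v2di_splat_candidate_p (int64_t value)
{
  if (value < -16 || value > 15)
    return false;

  return value != 0 && value != -1;
}
```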

Re: [PATCH ver 4] rs6000: Add builtins for IEEE 128-bit floating point values

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/6/15 04:37, Carl Love wrote: > Kewen, GCC maintainers: > > Version 4, added missing cases for new xxexpqp, xsxexpdp and xsxsigqp > cases to rs6000_expand_builtin. Merged the new define_insn definitions > with the existing definitions. Renamed the builtins by removing the > __bu

PING^3 [PATCH 0/9] rs6000: Rework rs6000_emit_vector_compare

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this series: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html BR, Kewen > >> >> on 2022/11/24 17:15, Kewen Lin wrote: >>> Hi, >>> >>> Following Segher's suggestion, this patch series is to rework >>> function rs6000_emit_vector_compare for vector float and in

PING^2 [PATCH v2] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi, I'd like to gentle ping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614818.html BR, Kewen > on 2023/3/29 15:18, Kewen.Lin via Gcc-patches wrote: >> Hi, >> >> By addressing Alexander's comments, against v1 this >> patch v2 mainly

PING^1 [PATCH v2] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-06-14 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html BR, Kewen on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote: > Hi, > > As Honza pointed out in [1], the current uses of function > optimize_function_for_speed_p in rs6000_option_override_in

Re: [PATCH ver4] rs6000, Add return value to __builtin_set_fpscr_rn

2023-07-13 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/12 02:06, Carl Love wrote: > GCC maintainers: > > Ver 4, Removed extra space in subject line. Added comment to commit > log comments about new __SET_FPSCR_RN_RETURNS_FPSCR__ define. Changed > Added to Add and Renamed to Rename in ChangeLog. Updated define_expand > "rs6000_se

Re: [PATCH ver 3] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-13 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/8 04:18, Carl Love wrote: > > GCC maintainers: > > Version 3, added code to altivec_resolve_overloaded_builtin so the > correct instruction is selected for the size of the second argument. > This restores the instruction counts to the original values where the > correct instr

[PATCH] vect: Initialize new_temp to avoid false positive warning [PR110652]

2023-07-16 Thread Kewen.Lin via Gcc-patches
Hi, As PR110652 and its duplicate PRs show, there could be one build error error: 'new_temp' may be used uninitialized for some build configurations. It's a false positive warning (or error at -Werror), but in order to make the build succeed, this patch is to initialize the reported variable
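The kind of false positive described here can be illustrated outside GCC's sources. A minimal sketch with hypothetical names: callers always pass n >= 1, so `result` is always assigned inside the loop, but a compiler that cannot prove this may still warn; initializing the variable at its declaration silences the warning without changing behaviour.

```c
#include <assert.h>

/* Returns the last element of VALS multiplied by SCALE.
   Callers are expected to pass N >= 1.  */
static int
last_scaled (const int *vals, int n, int scale)
{
  assert (n >= 1);

  /* The initializer is never observed when N >= 1; it is there only
     to avoid a "may be used uninitialized" diagnostic.  */
  int result = 0;

  for (int i = 0; i < n; i++)
    result = vals[i] * scale;

  return result;
}
```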

Re: [PATCH] vect: Initialize new_temp to avoid false positive warning [PR110652]

2023-07-17 Thread Kewen.Lin via Gcc-patches
on 2023/7/17 14:39, Richard Biener wrote: > On Mon, Jul 17, 2023 at 4:22 AM Kewen.Lin wrote: >> >> Hi, >> >> As PR110652 and its duplicate PRs show, there could be one >> build error >> >> error: 'new_temp' may be used uninitialized >> >> for some build configurations. It's a false positive war

Re: [PATCH, rs6000] Generate mfvsrwz for all platforms and remove redundant zero extend [PR106769]

2023-07-18 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/6/19 09:14, HAO CHEN GUI wrote: > Hi, > This patch modifies vsx extract expander and generates mfvsrwz/stxsiwx > for all platforms when the mode is V4SI and the index of extracted element > is 1 for BE and 2 for LE. Also this patch adds a insn pattern for mfvsrwz > which can

Re: rs6000: Fix expected counts powerpc/p9-vec-length-full

2023-07-18 Thread Kewen.Lin via Gcc-patches
Hi Carl, The issue was tracked by PR109971 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109971) and I think it had been resolved. btw, when the expected insn count changes, it does expose some issues but which can be either test or functionality issue, if it's taken as a test issue, it needs so

Re: [PATCH v7, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2023-07-18 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2022/9/26 11:35, HAO CHEN GUI wrote: > Hi, > This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. > Tests show that outputs of xs[min/max]dp are consistent with the standard > of C99 fmin/max. > > This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max

Re: [PATCH V2] rs6000: Change GPR2 to volatile & non-fixed register for function that does not use TOC [PR110320]

2023-07-18 Thread Kewen.Lin via Gcc-patches
Hi Jeevitha, on 2023/7/17 11:40, P Jeevitha wrote: > > Hi All, > > The following patch has been bootstrapped and regtested on powerpc64le-linux. Since one line touched has (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2) and powerpc64le-linux only adopts ABI_ELFv2, could you also test this

Re: PING^2 [PATCH] Adjust the symbol for SECTION_LINK_ORDER linked_to section [PR99889]

2023-07-19 Thread Kewen.Lin via Gcc-patches
Hi Fangrui, on 2023/7/19 14:33, Fangrui Song wrote: > On Thu, Nov 24, 2022 at 7:26 PM Kewen.Lin via Gcc-patches > wrote: >> >> Hi Richard, >> >> on 2022/11/23 00:08, Richard Sandiford wrote: >>> "Kewen.Lin" writes: >>>> Hi Richard,

[PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi, As PR110729 reported, there was one issue for .section __patchable_function_entries with -ffunction-sections, that is we put the same symbol as link_to section symbol for all functions wrongly. The commit r13-4294 for PR99889 has fixed this with the corresponding label LPFE* which sits in the

[PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi, Commit r14-2267-gb8806f6ffbe72e adjusts the arguments order of LEN_STORE from {len,vector,bias} to {len,bias,vector}, in order to make them consistent with LEN_MASK_STORE and MASK_STORE. But it missed to update the related handlings in tree-ssa-sccvn.cc, it caused the failure shown in PR 1107

Re: [PATCH 1/2] rs6000, add argument to function find_instance

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/18 03:19, Carl Love wrote: > > GCC maintainers: > > The rs6000 function find_instance assumes that it is called for built- > ins with only two arguments. There is no checking for the actual > number of arguments used in the built-in. This patch adds an > additional paramete

Re: [PATCH 2/2 ver 4] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/18 03:20, Carl Love wrote: > GCC maintainers: > > Version 4, changed the new RS6000_OVLD_VEC_REPLACE_UN case statement > rs6000/rs6000-c.cc. The existing REPLACE_ELT iterator name was changed > to REPLACE_ELT_V along with the associated define_mode_attr. Renamed > VEC_RU to R

Re: [PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Kewen.Lin via Gcc-patches
on 2023/7/20 20:34, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi, >> >> As PR110729 reported, there was one issue for .section >> __patchable_function_entries with -ffunction-sections, that >> is we put the same symbol as link_to section symbol for all >> functions wrongly. The commit r13

Re: [PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Kewen.Lin via Gcc-patches
on 2023/7/20 20:37, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi, >> >> Commit r14-2267-gb8806f6ffbe72e adjusts the arguments order >> of LEN_STORE from {len,vector,bias} to {len,bias,vector}, >> in order to make them consistent with LEN_MASK_STORE and >> MASK_STORE. But it missed to upda

[PATCH] vect: Don't vectorize a single scalar iteration loop [PR110740]

2023-07-20 Thread Kewen.Lin via Gcc-patches
Hi, The function vect_update_epilogue_niters which has been removed by r14-2281 has some code taking care of that if there is only one scalar iteration left for epilogue then we won't try to vectorize it any more. Although costing should be able to care about it eventually, I think we still want

Re: [PATCH 4/9] vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Iain, on 2023/7/22 23:58, Iain Sandoe wrote: > Hi Kewen, > > This patch breaks bootstrap on powerpc-darwin (which has Altivec, but not > VSX) while building libgfortran. > >> On 3 Jul 2023, at 04:19, Kewen.Lin via Gcc-patches >> wrote: > > Please

Re: [PATCH 1/2 ver 2] rs6000, add argument to function find_instance

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/22 07:38, Carl Love wrote: > GCC maintainers: > > Version 2: Updated a number of formatting and spacing issues. Added > the NARGS description to the header comment for function find_instance. > This patch was tested on Power 8 LE/BE, Power 9 LE/BE and Power 10 LE > with no r

Re: [PATCH 2/2 ver 5] rs6000, fix vec_replace_unaligned built-in arguments

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/22 07:38, Carl Love wrote: > GCC maintainers: > > Version 5, Fixed patch description, the first argument should be of > type vector. Fixed comment in vsx.md to say "Vector and scalar > extract_elt iterator/attr ". Removed a few of the changes in > version 4. Specifically

Re: [PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-23 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/7/21 09:32, HAO CHEN GUI wrote: > Hi, > This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx > for all subtargets when the mode is V4SI and the index of extracted element > is 1 for BE and 2 for LE. Also this patch adds a insn pattern for mfvsrwz > which can h

Re: [PATCH] vect: Don't vectorize a single scalar iteration loop [PR110740]

2023-07-23 Thread Kewen.Lin via Gcc-patches
on 2023/7/21 19:49, Richard Biener wrote: > On Fri, Jul 21, 2023 at 8:08 AM Kewen.Lin wrote: >> >> Hi, >> >> The function vect_update_epilogue_niters which has been >> removed by r14-2281 has some code taking care of that if >> there is only one scalar iteration left for epilogue then >> we won't

[PATCH] vect: Treat VMAT_ELEMENTWISE as scalar load in costing [PR110776]

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi, PR110776 exposes one issue that we could query unaligned load for vector type but actually no unaligned vector load is supported there. The reason is that the costed load is with single-lane vector type and its memory access type is VMAT_ELEMENTWISE, we actually take it as scalar load and set

[PATCH] rs6000: Correct vsx operands output for xxeval [PR110741]

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi, PR110741 exposes one issue that we didn't use the correct character for vsx operands in output operand substitution, consequently it can map to the wrong registers which hold some unexpected values. Bootstrapped and regress-tested on powerpc64-linux-gnu P7/P8/P9 and powerpc64le-linux-gnu P9/P

Re: [PATCH] Fix typo in insn name.

2023-07-25 Thread Kewen.Lin via Gcc-patches
Hi Mike, on 2023/7/11 03:59, Michael Meissner wrote: > In doing other work, I noticed that there was an insn: > > vsx_extract_v4sf_<mode>_load > > Which did not have an iterator. I removed the useless <mode>. It actually has a mode iterator, the "P" is used for clobber. The whole pattern of this de

Re: [PATCH] vect: Treat VMAT_ELEMENTWISE as scalar load in costing [PR110776]

2023-07-26 Thread Kewen.Lin via Gcc-patches
on 2023/7/26 18:02, Richard Biener wrote: > On Wed, Jul 26, 2023 at 4:52 AM Kewen.Lin wrote: >> >> Hi, >> >> PR110776 exposes one issue that we could query unaligned >> load for vector type but actually no unaligned vector load >> is supported there. The reason is that the costed load is >> with

Re: [PATCH] Optimize vec_splats of vec_extract for V2DI/V2DF (PR target/99293)

2023-07-28 Thread Kewen.Lin via Gcc-patches
Hi Mike, on 2023/7/11 03:50, Michael Meissner wrote: > This patch optimizes cases like: > > vector double v1, v2; > /* ... */ > v2 = vec_splats (vec_extract (v1, 0)); /* or */ > v2 = vec_splats (vec_extract (v1, 1)); > > Previously: > > vector long long > sp

Re: [PATCH, rs6000] Skip redundant vector extract if the element is first element of dword0 [PR110429]

2023-07-28 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/7/5 11:22, HAO CHEN GUI wrote: > Hi, > This patch skips redundant vector extract insn to be generated when > the extracted element is the first element of dword0 and the destination "The first element" is confusing, it's easy to be misunderstood as element 0, but in fact the

Re: [PATCHv2, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-07-30 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/7/25 10:10, HAO CHEN GUI wrote: > Hi, > This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx > for all subtargets when the mode is V4SI and the index of extracted element > is 1 for BE and 2 for LE. Also this patch adds a insn pattern for mfvsrwz > which helps

Re: [PATCH] rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

2023-07-30 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/7/28 23:00, Carl Love wrote: > GCC maintainers: > > The following patch cleans up the definition for the > __builtin_altivec_vcmpnet. The current implementation implies that the s/__builtin_altivec_vcmpnet/__builtin_altivec_vcmpne[bhw]/ > built-in is only supported on Power 9

Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Kewen.Lin via Gcc-patches
Hi Richi, Thanks for the insightful comments! on 2022/7/1 16:40, Richard Biener wrote: > On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote: >> >> Hi, >> >> Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html >> >> BR, >> Kew

[PATCH] rs6000: Preserve REG_EH_REGION when replacing load/store [PR106091]

2022-07-07 Thread Kewen.Lin via Gcc-patches
Hi, As test case in PR106091 shows, rs6000 specific pass swaps doesn't preserve the reg_note REG_EH_REGION when replacing some load insn at the end of basic block, it causes the flow info verification to fail unexpectedly. Since memory reference rtx may trap, this patch is to ensure we copy REG_

Re: [PATCH] rs6000: Preserve REG_EH_REGION when replacing load/store [PR106091]

2022-07-07 Thread Kewen.Lin via Gcc-patches
on 2022/7/7 17:03, Richard Biener wrote: > On Thu, Jul 7, 2022 at 10:55 AM Kewen.Lin wrote: >> >> Hi, >> >> As test case in PR106091 shows, rs6000 specific pass swaps >> doesn't preserve the reg_note REG_EH_REGION when replacing >> some load insn at the end of basic block, it causes the >> flow in

Re: [PATCH/RFC] combine_completed global variable.

2022-07-08 Thread Kewen.Lin via Gcc-patches
Hi Roger, on 2022/7/8 03:40, Roger Sayle wrote: > > Hi Kewen (and Segher), > Many thanks for stress testing my patch to improve multiplication > by integer constants on rs6000 by using the rldmi instruction. > Although I've not been able to reproduce your ICE (using gcc135 > on the compile farm),

Re: [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-10 Thread Kewen.Lin via Gcc-patches
on 2022/7/8 19:37, Martin Liška wrote: > On 6/6/22 08:20, Kewen.Lin wrote: >> |Hi, PR105459 exposes one issue in inline_call handling that when it decides >> to copy FP flags from callee to caller and rebuild the optimization node for >> caller fndecl, it's possible that the target option node is

Re: [PATCH] predict: Adjust optimize_function_for_size_p [PR105818]

2022-07-10 Thread Kewen.Lin via Gcc-patches
on 2022/6/15 14:20, Kewen.Lin wrote: > Hi Honza, > > Thanks for the comments! Some replies are inlined below. > > on 2022/6/14 19:37, Jan Hubicka wrote: >>> Hi, >>> >>> Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO >>> if func->decl is not null but no cgraph node is available fo

Re: [PATCH] HIGH part of symbol ref is invalid for constant pool

2022-07-13 Thread Kewen.Lin via Gcc-patches
Hi Jeff, Thanks for the patch, one question is inlined below. on 2022/7/4 14:58, Jiufu Guo wrote: > The high part of the symbol address is invalid for the constant pool. In > function rs6000_cannot_force_const_mem, we already return true for > "HIGH with UNSPEC" rtx. During debug GCC, I found tha

Re: [PATCH, rs6000] Additional cleanup of rs6000_builtin_mask

2022-07-13 Thread Kewen.Lin via Gcc-patches
Hi Will, Thanks for the cleanup! Some comments are inlined. on 2022/7/14 05:39, will schmidt wrote: > [PATCH, rs6000] Additional cleanup of rs6000_builtin_mask > > Hi, > Post the rs6000 builtins rewrite, some of the leftover builtin > code is redundant and can be removed. > This replace

Re: [PATCH, rs6000, v2] Additional cleanup of rs6000_builtin_mask

2022-07-19 Thread Kewen.Lin via Gcc-patches
Hi Will, on 2022/7/20 04:15, will schmidt wrote: > [PATCH, rs6000, v2] Additional cleanup of rs6000_builtin_mask > > Hi, > Post the rs6000 builtins rewrite, some of the leftover builtin > code is redundant and can be removed. > This replaces the usage of bu_mask in rs6000_target_modify_ma

[PATCH] rs6000: Suggest unroll factor for loop vectorization

2022-07-20 Thread Kewen.Lin via Gcc-patches
Hi, Commit r12-6679-g7ca1582ca60dc8 made vectorizer accept one unroll factor to be applied to vectorization factor when vectorizing the main loop, it would be suggested by target when doing costing. This patch introduces function determine_suggested_unroll_factor for rs6000 port, to make it be ab

[PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-20 Thread Kewen.Lin via Gcc-patches
Hi, As the failure of test case gcc.target/powerpc/pr92398.p9-.c in PR106345 shows, some test sources for some powerpc effective targets use empty translation unit wrongly. The test sources could go with options like "-ansi -pedantic-errors", then those effective target checkings will fail unexpe
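ISO C requires a translation unit to contain at least one declaration, so a probe source that is empty after preprocessing is rejected under -pedantic-errors. Replies later in this thread describe the fix as adding a dummy variable definition; a minimal sketch of that style of probe follows. The macro tested and the variable name are placeholders, not the ones used in the patch.

```c
/* Probe source: compilation fails unless the (placeholder) feature
   macro is defined.  */
#ifndef __STDC__
#error "feature not available"
#endif

/* Dummy definition so the translation unit is never empty, even when
   the preprocessor check above leaves no other declarations.  */
int probe_dummy;
```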

[PATCH] rs6000/test: Update some cases with -mdejagnu-tune

2022-07-20 Thread Kewen.Lin via Gcc-patches
Hi, As PR106345 shows, some test cases should be updated with -mdejagnu-tune, since their test points are sensitive to rs6000_tune, such as: group_ending_nop, loop align (ic), float conversion cost etc. This patch is to replace -mdejagnu-cpu with -mdejagnu-tune or append -mdejagnu-tune (keep the

Re: [PATCH] Teach VN about masked/len stores

2022-07-21 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2022/7/21 17:01, Richard Biener via Gcc-patches wrote: > The following teaches VN to handle reads from .MASK_STORE and > .LEN_STORE. For this push_partial_def is extended first for > convenience so we don't have to handle the full def case in the > caller (possibly other paths can be

Re: [PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-21 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the comments! on 2022/7/22 06:09, Segher Boessenkool wrote: > On Wed, Jul 20, 2022 at 05:32:01PM +0800, Kewen.Lin wrote: >> As the failure of test case gcc.target/powerpc/pr92398.p9-.c in >> PR106345 shows, some test sources for some powerpc effective >> targets use empty tr

Re: [PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-21 Thread Kewen.Lin via Gcc-patches
Hi! on 2022/7/22 09:02, Segher Boessenkool wrote: > Hi! > > On Fri, Jul 22, 2022 at 08:41:43AM +0800, Kewen.Lin wrote: >> Hi Segher, >> >> Thanks for the comments! > > Always. > This patch is to fix empty TUs with one dummy variable definition accordingly. >>> >>> You can also use >>>

Re: [PATCH] rs6000/test: Update some cases with -mdejagnu-tune

2022-07-21 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the comments! on 2022/7/22 02:48, Segher Boessenkool wrote: > Hi! > > On Wed, Jul 20, 2022 at 05:31:11PM +0800, Kewen.Lin wrote: >> As PR106345 shows, some test cases should be updated with >> -mdejagnu-tune, since their test points are sensitive to >> rs6000_tune, such as:

Re: [PATCH] rs6000/test: Update some cases with -mdejagnu-tune

2022-07-24 Thread Kewen.Lin via Gcc-patches
Hi Peter and Segher, on 2022/7/23 03:28, Peter Bergner wrote: > On 7/22/22 1:53 PM, Peter Bergner wrote: >> So I think the way the code above *should* work is: >> 1) Any -mdejagnu-cpu= usage should filter out all -mcpu= and -mtune= >> options. >> 2) Any -mdejagnu-tune= usage should filter all

[PATCH v2] rs6000/test: Fix empty TU in some cases of effective targets [PR106345]

2022-07-24 Thread Kewen.Lin via Gcc-patches
Hi, As the failure of test case gcc.target/powerpc/pr92398.p9-.c in PR106345 shows, some test sources for some powerpc effective targets use empty translation unit wrongly. The test sources could go with options like "-ansi -pedantic-errors", then those effective target checkings will fail unexpe

Re: [PATCH V1] HIGH part of symbol ref is invalid for constant pool

2022-07-25 Thread Kewen.Lin via Gcc-patches
Hi Jeff, on 2022/7/19 22:30, Jiufu Guo wrote: > Hi, > > In patch https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597712.html, > test case was not added. After more check, a testcase is added for it. > Good to see that you constructed one actual test case, nice! :) > The high part of the

PING^3 [PATCH] rs6000: Handle unresolved overloaded builtin [PR105485]

2022-07-28 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594699.html BR, Kewen > >> on 2022/5/13 13:29, Kewen.Lin via Gcc-patches wrote: >>> Hi, >>> >>> PR105485 exposes that new builtin function framework doesn't handle >>> unre

PING^3 [PATCH v3] rs6000: Fix the check of bif argument number [PR104482]

2022-07-28 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595208.html BR, Kewen >>> Hi, >>> >>> As PR104482 shown, it's one regression about the handlings when >>> the argument number is more than the one of built-in function >>> prototype. The new bif support only catches the case tha

PING^1 [PATCH v4] rs6000: Adjust mov optabs for opaque modes [PR103353]

2022-07-28 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597286.html BR, Kewen on 2022/6/27 10:47, Kewen.Lin via Gcc-patches wrote: > Hi Segher! > > on 2022/6/25 00:49, Segher Boessenkool wrote: >> Hi! >> >> On Fri, Jun 24, 2022 at 09:03:59AM +0800, Kewen.

Re: [PATCH, rs6000] Add multiply-add expand pattern [PR103109]

2022-07-31 Thread Kewen.Lin via Gcc-patches
Hi Haochen, Thanks for the patch, some comments are inlined. on 2022/7/25 13:11, HAO CHEN GUI wrote: > Hi, > This patch adds an expand and several insns for multiply-add with > three 64bit operands. > > Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. > Is this okay

Re: [PATCH, rs6000] TARGET_MADDLD should include TARGET_POWERPC64

2022-08-03 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2022/8/3 16:24, HAO CHEN GUI wrote: > Hi, > This patch changes the definition of TARGET_MADDLD and includes > TARGET_POWERPC64, since maddld is a 64 bit instruction. > > maddld-1.c now checks "has_arch_ppc64". It depends on a patch which fixes > empty TU problem. > https://gcc.

Re: [PATCH, rs6000] Correct return value of check_p9modulo_hw_available

2022-08-04 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2022/8/4 17:55, HAO CHEN GUI wrote: > Hi, > This patch corrects return value of check_p9modulo_hw_available. It should > return 0 when p9modulo is supported. Good catch! There is no case using p9modulo_hw for now, no coverage, sigh... > > Bootstrapped and tested on powerpc64

Re: [PATCH] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-08 Thread Kewen.Lin via Gcc-patches
Hi Xionghu, Thanks for the fix. on 2022/8/8 11:42, Xionghu Luo wrote: > The native RTL expression for vec_mrghw should be same for BE and LE as > they are register and endian-independent. So both BE and LE need > generate exactly same RTL with index [0 4 1 5] when expanding vec_mrghw > with vec_

Re: [PATCH v2, rs6000] Add multiply-add expand pattern [PR103109]

2022-08-08 Thread Kewen.Lin via Gcc-patches
Hi Haochen, Thanks for the patch. on 2022/8/8 14:04, HAO CHEN GUI wrote: > Hi, > This patch adds an expand and several insns for multiply-add with three > 64bit operands. > > Compared with last version, the main changes are: > 1 The "maddld" pattern is reused for the low-part generation. > 2

[PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-08 Thread Kewen.Lin via Gcc-patches
Hi, As PR99888 and its related show, the current support for -fpatchable-function-entry on powerpc ELFv2 doesn't work well with global entry existence. For example, with one command line option -fpatchable-function-entry=3,2, it got below w/o this patch: .LPFE1: nop nop

[PATCH] rs6000: Simplify some code with rs6000_builtin_is_supported

2022-08-08 Thread Kewen.Lin via Gcc-patches
Hi, In function rs6000_init_builtins, there is an oversight: in one target debugging hunk with TARGET_DEBUG_BUILTIN we missed handling enum bif_enable ENB_CELL. It's easy to fix by adding another if case. But considering long-term maintainability, this patch updates it with the existi

[PATCH] rs6000: Remove stale rs6000_global_entry_point_needed_p

2022-08-08 Thread Kewen.Lin via Gcc-patches
Hi, r10-631 had renamed rs6000_global_entry_point_needed_p to rs6000_global_entry_point_prologue_needed_p. This is to remove the stale function declaration. Bootstrapped and regtested on powerpc64-linux-gnu P8 and powerpc64le-linux-gnu P9 and P10. I'll push this soon. BR, Kewen - gcc/Chang

Re: [PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-09 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the review comments! on 2022/8/9 18:35, Segher Boessenkool wrote: > Hi! > >> + /* As ELFv2 ABI shows, the allowable bytes past the global entry >> + point are 0, 4, 8, 16, 32 and 64. Considering there are two >> + non-prefixed instructions for global e

Re: [PATCH v2, rs6000] Add multiply-add expand pattern [PR103109]

2022-08-10 Thread Kewen.Lin via Gcc-patches
on 2022/8/10 05:34, Segher Boessenkool wrote: > On Tue, Aug 09, 2022 at 11:14:16AM +0800, Kewen.Lin wrote: >> on 2022/8/8 14:04, HAO CHEN GUI wrote: >>> +/* { dg-do run { target { has_arch_ppc64 } } } */ >>> +/* { dg-options "-O2 -mdejagnu-cpu=power9 -save-temps" } */ >>> +/* { dg-require-effective

Re: [PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-10 Thread Kewen.Lin via Gcc-patches
on 2022/8/10 05:10, Segher Boessenkool wrote: > Hi! > > On Tue, Aug 09, 2022 at 08:51:59PM +0800, Kewen.Lin wrote: >> on 2022/8/9 18:35, Segher Boessenkool wrote: +/* As ELFv2 ABI shows, the allowable bytes past the global entry + point are 0, 4, 8, 16, 32 and 64. Considering

Re: [PATCH v2] rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi Carl, Sorry for the late review. on 2023/8/2 02:29, Carl Love wrote: > > GCC maintainers: > > Ver 2: Re-worked the test vec-cmpne.c to create a compile only test > verify the instruction generation and a runnable test to verify the > built-in functionality. Retested the patch on Power 8 LE

Re: [PATCH V2] rs6000: Don't allow AltiVec address in movoo & movxo pattern [PR110411]

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi Jeevitha, on 2023/7/20 00:46, jeevitha wrote: > Hi All, > > The following patch has been bootstrapped and regtested on powerpc64le-linux. > > There are no instructions that do traditional AltiVec addresses (i.e. > with the low four bits of the address masked off) for OOmode and XOmode > objec

PING^4 [PATCH 0/9] rs6000: Rework rs6000_emit_vector_compare

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this series: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html BR, Kewen >>> on 2022/11/24 17:15, Kewen Lin wrote: Hi, Following Segher's suggestion, this patch series is to rework function rs6000_emit_vector_compare for vector float and int

PING^3 [PATCH v2] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi, I'd like to gently ping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614818.html BR, Kewen >> on 2023/3/29 15:18, Kewen.Lin via Gcc-patches wrote: >>> Hi, >>> >>> By addressing Alexander's comments, against v1 this >>>

PING^2 [PATCH v2] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html BR, Kewen > on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote: >> Hi, >> >> As Honza pointed out in [1], the current uses of function >> optimize_function_for_speed_p in rs60

Re: [PATCH 1/3] targhooks: Extend legitimate_address_p with code_helper [PR110248]

2023-08-07 Thread Kewen.Lin via Gcc-patches
Hi Richi, on 2023/6/30 17:13, Kewen.Lin via Gcc-patches wrote: > Hi Richi, > > Thanks for your review! > > on 2023/6/30 16:56, Richard Biener wrote: >> On Fri, Jun 30, 2023 at 7:38 AM Kewen.Lin wrote: >>> >>> Hi, >>> >>> As PR110248 sh

Re: [PATCH ver 3] rs6000: Fix __builtin_altivec_vcmpne{b,h,w} implementation

2023-08-09 Thread Kewen.Lin via Gcc-patches
Hi Carl, on 2023/8/8 01:50, Carl Love wrote: > > GCC maintainers: > > Ver 3: Updated description to make it clear the patch fixes the > confusion on the availability of the builtins. Fixed the dg-require- > effective-target on the test cases and the dg-options. Change the test > case so the fo

Re: [PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2023-08-09 Thread Kewen.Lin via Gcc-patches
Hi, on 2023/7/20 12:35, jeevitha via Gcc-patches wrote: > Hi All, > > The following patch has been bootstrapped and regtested on powerpc64le-linux. > > When the user specifies PTImode as an attribute, it breaks. Created > a tree node to handle PTImode types. PTImode attribute helps in generating

Re: [PATCHv4, rs6000] Generate mfvsrwz for all subtargets and remove redundant zero extend [PR106769]

2023-08-14 Thread Kewen.Lin via Gcc-patches
Hi Haochen, on 2023/8/14 10:18, HAO CHEN GUI wrote: > Hi, > This patch modifies vsx extract expand and generates mfvsrwz/stxsiwx > for all sub targets when the mode is V4SI and the extracted element is word > 1 from BE order. Also this patch adds an insn pattern for mfvsrwz which > helps eliminat
