Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
Hi! On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote: > This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac. > These built-ins were misimplemented as always having big-endian semantics. What is different compared to the trunk version? Segher

Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
suite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target powerpc_p9vector_ok } */ > +/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */ And here you do it correctly :-) Okay with those fixes (all happen a few times). Thanks! Segher

Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
on all subtargets. (I found that quite > surprising also.) Huh. Yeah I think I encountered that before. So this is because these options are in sysv4.opt . > Apparently this doesn't work on AIX, for example. But > -mlittle works everywhere. Go figure. ... and -mlittle is exactly the same? Wtw. I only looked at the .opt files, maybe one of them is handled directly, or more likely in specs? And not symmetrically? > That's something that should be fixed, I guess, but it's orthogonal > to this patch. Fixing it later is more work :-( Please at least open a bug report for it. The other things need fixing before the patch is okay. Segher

Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
> selector. It is fully generic. I added it in commit 89453706e0032f9a9c2107631873d9dad38dc14c Author: Segher Boessenkool Date: Wed May 23 19:31:05 2018 +0200 testsuite: Introduce be/le selectors It is very useful, just like ilp32 / lp64 :-) > powerpc*-*-linux* understands "

Re: [PATCH, 11 backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-02-10 Thread Segher Boessenkool
Hi! On Thu, Feb 10, 2022 at 04:28:02PM -0600, Bill Schmidt wrote: > On 2/10/22 4:11 PM, Segher Boessenkool wrote: > >> No, trunk has this, for example: > >> > >>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc); > >>     VCLZLSBB_V16QI vctzlsbb

Re: [PATCH], PR 104253, Fix __ibm128 conversions on IEEE 128-bit system

2022-02-11 Thread Segher Boessenkool
get/powerpc/pr104253.c: New test. > +/* { require-effective-target ppc_float128_sw } */ The documentation for this selector is wrong, btw: it says it tests whether this is emulated in software! But instead it just tests if it works, soft float, emulated, and hardware are all fine. The patch is okay for trunk, and backports later. Thanks! Segher

Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]

2022-02-14 Thread Segher Boessenkool
doing it and handling the ICEs later is fine, but in stage 1. (You'll also have to show it is *correct*, you need to prove (or show it really likely :-) ) that after this change there are no TImode things generated anywhere (anywhere!) that are no longer handled now). Segher

Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]

2022-02-15 Thread Segher Boessenkool
On Tue, Feb 15, 2022 at 11:01:03AM +0800, HAO CHEN GUI wrote: Hi! > On 15/2/2022 上午 5:36, Segher Boessenkool wrote: > > On Wed, Feb 09, 2022 at 10:43:17AM +0800, HAO CHEN GUI wrote: > > All that are arguments for expanding to split form, not for removing > > TImode from t

Re: [PATCH], PR target/99708 - Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-02-15 Thread Segher Boessenkool
This should be tested directly, it should not depend on that some other code did what it does today. That would also make the code much more obvious. Segher

Re: [PATCH] rs6000: Retry tbegin. instructions that can fail intermittently

2022-02-15 Thread Segher Boessenkool
is normal for those to fail as well, and there needs to be a fallback there as well :-) ) The patch is fine. Okay for trunk and backports (after soak time ofc). Thanks! Segher > gcc/testsuite/ > * gcc.target/powerpc/htm-1.c: Retry intermittent failing tbegins.
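
The retry-then-fallback pattern this message refers to can be sketched in plain C. This is a minimal illustration only, not the code from the patch: it does not use the real HTM built-ins, `try_begin_fn` is a hypothetical callback standing in for whatever starts a transaction, and `MAX_RETRIES` is an arbitrary assumed bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callback type: returns true if the transaction started.  */
typedef bool (*try_begin_fn) (void *ctx);

/* Assumed retry bound, chosen arbitrarily for this sketch.  */
enum { MAX_RETRIES = 10 };

/* Try to start a transaction up to MAX_RETRIES times.  Returns the
   1-based number of the attempt that succeeded, or 0 if every attempt
   failed and the caller must take its non-transactional fallback.  */
int
begin_with_retry (try_begin_fn try_begin, void *ctx)
{
  assert (try_begin);
  for (int attempt = 1; attempt <= MAX_RETRIES; attempt++)
    if (try_begin (ctx))
      return attempt;
  return 0;
}

/* Test double: fails until the counter that CTX points to reaches zero.  */
bool
fail_n_times (void *ctx)
{
  int *remaining = ctx;
  if (*remaining > 0)
    {
      (*remaining)--;
      return false;
    }
  return true;
}
```

A return value of 0 is the signal to run the fallback path; a bounded retry only hides intermittent failures, it does not remove the need for that path.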

Re: [PATCH], PR target/99708 - Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-02-15 Thread Segher Boessenkool
On Tue, Feb 15, 2022 at 03:18:30PM -0500, Michael Meissner wrote: > On Tue, Feb 15, 2022 at 01:45:06PM -0600, Segher Boessenkool wrote: > > On Tue, Feb 15, 2022 at 12:49:41PM -0500, Michael Meissner wrote: > > > Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__. > > > >

Re: [PATCH] rs6000: Retry tbegin. instructions that can fail intermittently

2022-02-16 Thread Segher Boessenkool
re misled by not seeing it fail in any testing (it fails only .02% of the time you said). For that reason it helps to make testcases fail *more* often. That isn't very trivial to do with HTM of course. Since we don't do HTM anymore it will all fade away, and let's not bother, why am I typing still :-) Segher

Re: [PATCH], PR target/99708 - Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-02-16 Thread Segher Boessenkool
On Tue, Feb 15, 2022 at 06:05:06PM -0500, Michael Meissner wrote: > On Tue, Feb 15, 2022 at 04:05:11PM -0600, Segher Boessenkool wrote: > > On all older compilers these macros will not be defined, but the types > > often are. If you are willing to not support older compilers prop

Re: [PATCH] combine: Fix up -fcompare-debug issue in the combiner [PR104544]

2022-02-16 Thread Segher Boessenkool
if (link->insn == i3 && link->regno == regno) About half of the similar loops in combine.c are still broken this way, from a quick sampling :-( Okay for trunk and all backports you may want. Thanks! Segher

Re: [PATCH] combine: Fix up -fcompare-debug issue in the combiner [PR104544]

2022-02-16 Thread Segher Boessenkool
On Wed, Feb 16, 2022 at 11:55:23AM +0100, Jakub Jelinek wrote: > On Wed, Feb 16, 2022 at 04:44:58AM -0600, Segher Boessenkool wrote: > > About half of the similar loops in combine.c are still broken this way, > > from a quick sampling :-( > > Looking for just NONDEBUG_IN

Re: [PATCH] rs6000: Workaround for new ifcvt behavior [PR104335]

2022-02-16 Thread Segher Boessenkool
(XEXP (op, 0))) == MODE_CC) > +return false; Why that first test? XEXP (op, 0) is required to not be nil. The patch is okay without that (if it passes testing of course :-) ) Thanks! Segher

Re: [PATCH, V3] PR target/99708- Define __SIZEOF_FLOAT128__ and __SIZEOF_IBM128__

2022-02-16 Thread Segher Boessenkool
8_TYPE__"); > + builtin_define ("__SIZEOF_FLOAT128__=16"); > +} if (TARGET_FLOAT128_TYPE) builtin_define ("__FLOAT128_TYPE__"); if (float128_type_node) builtin_define ("__SIZEOF_FLOAT128__=16"); if (ibm128_float_type_node) builtin_define ("__SIZEOF_IBM128__=16"); Okay like that. Thanks! Segher
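
From the user's side, macros defined this way are consumed with ordinary preprocessor tests. The following is a minimal sketch, assuming only the two macro names from the message; the function names are invented for illustration. It deliberately avoids naming the types themselves, so it compiles whether or not the macros exist.

```c
#include <assert.h>
#include <stddef.h>

/* Size the compiler advertises for its IEEE 128-bit type, or 0 if
   __SIZEOF_FLOAT128__ is not defined by this compiler.  */
size_t
float128_size (void)
{
#ifdef __SIZEOF_FLOAT128__
  return __SIZEOF_FLOAT128__;
#else
  return 0;
#endif
}

/* Likewise for the IBM double-double type and __SIZEOF_IBM128__.  */
size_t
ibm128_size (void)
{
#ifdef __SIZEOF_IBM128__
  return __SIZEOF_IBM128__;
#else
  return 0;
#endif
}

/* Where either macro is defined, the message above defines it as 16.  */
void
check_sizes (void)
{
  assert (float128_size () == 0 || float128_size () == 16);
  assert (ibm128_size () == 0 || ibm128_size () == 16);
}
```

As the thread notes, older compilers may provide the types without defining these macros, so a result of 0 here means "not advertised", not "not available".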

Re: [PATCH] rs6000: __Uglify non-uglified local variables in headers

2022-02-17 Thread Segher Boessenkool
mmintrin.h: Likewise. > * /config/rs6000/pmmintrin.h: Likewise. > * /config/rs6000/smmintrin.h: Likewise. > * /config/rs6000/tmmintrin.h: Likewise. > * /config/rs6000/xmmintrin.h: Likewise. Okay for trunk. Thanks! Do you want to backport this as well? That's preapproved (if you think it is useful). Segher

Re: [PATCH, rs6000] Clean up Power10 fusion options

2022-02-17 Thread Segher Boessenkool
ocations > so that the hardware will fuse them to a single operation. */ > - if (TARGET_P10_FUSION && TARGET_P10_FUSION_2STORE > + if (TARGET_P10_FUSION >&& is_fusable_store (last_scheduled_insn, &mem1)) Please fit that on one line now :-) Okay for trunk with that triviality. Thanks! Segher

Re: [PATCH] Don't do int cmoves for IEEE comparisons, PR target/104256.

2022-02-17 Thread Segher Boessenkool
ve things, isn't it :-) > new file mode 100644 > index 000..d1bfab23482 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr104254.f90 > @@ -0,0 +1,25 @@ > +! { dg-do compile } > +! { dg-require-effective-target powerpc_p9vector_ok } > +! { dg-options "-mdejagnu-cpu=power9 -O1 -fnon-call-exceptions" } > + > +! PR target/104254. GCC would raise an assertion error if this program was PR104256. Segher

Re: [PATCH] Check if loading const from mem is faster

2022-02-22 Thread Segher Boessenkool
ust CSE, and we don't want to lose all those other things. So it will be a slow arduous affair of peeling off bits into separate passes, I think :-( Doing actual CSE without all the restrictive restrictions our pass has historically had isn't the hard part! Segher

Re: [PATCH 1/3] rs6000: Move g++.dg/ext powerpc tests to g++.target

2022-02-22 Thread Segher Boessenkool
ltivec-types-2.C: Likewise. > * g++.dg/ext/altivec-types-3.C: Likewise. > * g++.dg/ext/altivec-types-4.C: Likewise. > * g++.dg/ext/undef-bool-1.C: Likewise. Okay for trunk. Thanks! Segher

Re: [PATCH 0/3] rs6000: Move g++.dg powerpc tests to g++.target

2022-02-22 Thread Segher Boessenkool
get-specific powerpc subdirectory. Not "perhaps" :-) More specifically, powerpc.exp has # Exit immediately if this isn't a PowerPC target. if {![istarget powerpc*-*-*] } then { return } so anything run from that driver does not have to test for powerpc separately anymore. Segher

Re: [PATCH 2/3] rs6000: Move g++.dg powerpc PR tests to g++.target

2022-02-22 Thread Segher Boessenkool
4 would be needed either (there is no comment what it is needed for, for example). > --- a/gcc/testsuite/g++.dg/pr85657.C > +++ b/gcc/testsuite/g++.target/powerpc/pr85657.C > @@ -1,4 +1,4 @@ > -// { dg-do compile { target { powerpc*-*-linux* } } } > +// { dg-do compile { target { *-*-linux* } } } A comment here would help as well. All of that is pre-existing of course. Segher

Re: [PATCH] Check if loading const from mem is faster

2022-02-23 Thread Segher Boessenkool
r what, before the pass has finished that is. On all more modern architectures it is futile to think you can usefully consider the cost of an RTL expression and derive a real-world cost of the generated code from that. But there is so much more wrong with cse.c :-( Segher

Re: [PATCH] Check if loading const from mem is faster

2022-02-23 Thread Segher Boessenkool
e original insn cost. So why exactly do you need a new hook > >for this particular situation? > > Thanks for pointing out this! Segher also mentioned this before. > Currently, CSE is using rtx_cost. Using insn_cost to replace > rtx_cost would be a good idea for all necessary pl

Re: [PATCH, testsuite] Fix attr-retain-*.c testcases on 32-bit PowerPC [PR100407]

2022-02-24 Thread Segher Boessenkool
4 +1,5 @@ > /* { dg-do compile { target R_flag_in_section } } */ > +/* { dg-options "-G0" { target { powerpc*-*-* && ilp32 } } } */ This needs a comment explaining what it is for. Okay for trunk with that, thanks! Segher

Re: [pushed] LRA, rs6000, Darwin: Amend lo_sum use for forced constants [PR104117].

2022-02-26 Thread Segher Boessenkool
patch does anything if TARGET_MACHO isn't true, so it is all fine with me. It does look good to me fwiw (the empty constraints are a bit nasty, but they aren't new). Okay for trunk wrt rs6000. Thanks! Segher

Re: [PATCH] Check if loading const from mem is faster

2022-02-28 Thread Segher Boessenkool
Hi! On Thu, Feb 24, 2022 at 03:48:54PM +0800, Jiufu Guo wrote: > Segher Boessenkool writes: > > That is the problem yes. You need insns to call insn_cost on. You can > > look in combine.c:combine_validate_cost to see how this can be done; but > > you need to have some co

Re: [PATCH] Check if loading const from mem is faster

2022-02-28 Thread Segher Boessenkool
On Thu, Feb 24, 2022 at 09:50:28AM +0100, Richard Biener wrote: > On Thu, 24 Feb 2022, Jiufu Guo wrote: > > And another thing as Segher pointed out, CSE is doing too > > much work. It may be ok to separate the constant handling > > logic from CSE. > > Not sure - CSE

Re: [PATCH v2] rs6000: Test case adjustments for new builtins

2021-11-17 Thread Segher Boessenkool
1 must be a 5-bit unsigned > literal" } */ > + __builtin_mtfsb0(32); /* { dg-error "argument 1 must be a 5-bit unsigned > literal" } */ > > - __builtin_mtfsb1(-1); /* { dg-error "Argument must be a constant between > 0 and 31" } */ > - __builtin_mtfsb1(32); /* { dg-error "Argument must be a constant between > 0 and 31" } */ > + __builtin_mtfsb1(-1); /* { dg-error "argument 1 must be a 5-bit unsigned > literal" } */ > + __builtin_mtfsb1(32); /* { dg-error "argument 1 must be a 5-bit unsigned > literal" } */ > > - __builtin_set_fpscr_rn(-1); /* { dg-error "Argument must be a value > between 0 and 3" } */ > - __builtin_set_fpscr_rn(4); /* { dg-error "Argument must be a value > between 0 and 3" } */ > + __builtin_set_fpscr_rn(-1); /* { dg-error "argument 1 must be a variable > or a literal between 0 and 3, inclusive" } */ > + __builtin_set_fpscr_rn(4); /* { dg-error "argument 1 must be a variable > or a literal between 0 and 3, inclusive" } */ > } This regressed as well. > --- a/gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c > @@ -20,7 +20,7 @@ do_vec_gnb (vector unsigned __int128 source, int stride) > case 5: >return vec_gnb (source, 1);/* { dg-error "between 2 and 7" } */ > case 6: > - return vec_gnb (source, stride); /* { dg-error "unsigned > literal" } */ > + return vec_gnb (source, stride); /* { dg-error "literal" } */ > case 7: >return vec_gnb (source, 7); Terse :-) I think it will work fine though. Segher

Re: [PATCH v2] rs6000: Test case adjustments for new builtins

2021-11-17 Thread Segher Boessenkool
> > Hrm, make this say "must be a literal between 0 and 15, inclusive" like > > the other errors? > > The "n-bit unsigned literal" is the usual case. I'll provide more explanation > in the separate patch. We should use the same formulation always. I like the more verbose more exact less confusing and even *correct* formulation :-) > Again, I'm sorry this was difficult to review. Originally I thought it would > be easiest > to keep all these together, but that clearly wasn't helpful. I'll work on > breaking > this up. In general it is best to keep testcase changes together with the patch that necessitates them. You cannot use this now; this is one of the reasons why it is much better to do the changes step by step (where every change is immediately engaged!) This is more work up front, but it may well be less work in total. It certainly is less frustrating :-) Segher

Re: [PATCH] rs6000: Better error messages for power8/9-vector builtins

2021-11-17 Thread Segher Boessenkool
" and no "-mno-vsx". And no -mno-altivec. And and and. There is a huge web. > It's not a strong objection, since specifying "-mno-vsx" should be > uncommon. (Right?) And, specifying "-mcpu=power8 -mvsx" is harmless. Maybe the warning could say "requires -mcpu=power8 (and -mvsx)"? Is that clearer, to your eye? Segher

Re: [PATCH] rs6000: Better error messages for power8/9-vector builtins

2021-11-17 Thread Segher Boessenkool
On Tue, Nov 16, 2021 at 11:12:35AM -0600, Bill Schmidt wrote: > Hi! During a previous patch review, Segher asked that I provide better > messages when builtins are unavailable because they require both a minimum > CPU and the enablement of VSX instructions. This patch does

Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-11-17 Thread Segher Boessenkool
__builtin_vec_scalar_test_neg_dp > __builtin_vec_scalar_test_neg_qp > which are redundant with the "real" overload: > __builtin_vec_scalar_test_neg > The latter maps to three builtins of the appropriate type. Yes. And the new ones are undocumented and useless just as well, they just have better names. Segher

Re: [PATCH] rs6000: Builtin test changes for int_128bit-runnable.c

2021-11-18 Thread Segher Boessenkool
an-assembler-times {\mvcmpgtuq\M} 26 } } */ > /* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */ > /* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */ > /* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */ If you think it actually generates better code now, and this is expected code, then okay for trunk. Thanks! Segher

Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-18 Thread Segher Boessenkool
Hi! On Thu, Nov 18, 2021 at 01:45:30PM +0100, Martin Liška wrote: > @Segher: PING This is the first time I received this. Please resend, without line wrapping (format=flawed). Segher

Re: [PATCH 2/6] Add returns_zero_on_success/failure attributes

2021-11-18 Thread Segher Boessenkool
these things in Gimple and RTL as well, and not just on function calls: also on other expressions. Adding attributes that allow to describe this (partially, only per function) in C source code does not bring us closer to where we need to be. Segher

Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-11-18 Thread Segher Boessenkool
this same kind of error message for the old code. Yes. And it still is a regression (in *this* case). Segher

Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-11-18 Thread Segher Boessenkool
s builtin Y" with "overloaded builtin X is > implemented by builtin Y" as a better explanation? That is better (although builtin Y *does not exist* as far as the user is concerned: it is not documented, and you cannot write it in source code afaics). Segher

Re: [PATCH] rs6000: Builtins test changes for byte-in-set-2.c

2021-11-18 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 07:42:34AM -0600, Bill Schmidt wrote: > gcc/testsuite/ > * gcc.target/powerpc/byte-in-set-2.c: Adjust error message. "Adjust expected error message" maybe? Okay for trunk. Thanks! Segher

Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-11-18 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 03:30:48PM -0600, Bill Schmidt wrote: > > On 11/18/21 3:16 PM, Segher Boessenkool wrote: > > Hi! > > > > On Wed, Nov 17, 2021 at 05:06:05PM -0600, Bill Schmidt wrote: > >>> I don't like that at all. The user didn't write

Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-19 Thread Segher Boessenkool
On Fri, Nov 19, 2021 at 12:32:09PM +0100, Martin Liška wrote: > On 11/18/21 19:59, Segher Boessenkool wrote: > >Please resend, without line wrapping (format=flawed). > > Done in the original [v4] email, see here: > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/5842

Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-19 Thread Segher Boessenkool
umented Var(rs6000_optimize_swaps) Init(1) > Save > Analyze and remove doubleword swaps from VSX computations. > > munroll-only-small-loops > -Target Undocumented Var(unroll_only_small_loops) Init(0) Save > +Target Undocumented Var(unroll_only_small_loops) Init(0) Save > EnabledBy(funroll-loops) > ; Use conservative small loop unrolling. That is the opposite of the original logic. Segher

Re: [PATCH] rs6000: Add optimizations for _mm_sad_epu8

2021-11-19 Thread Segher Boessenkool
a, b); > +#endif So hrm, maybe we should have the vec_absd macro (or the builtin) always, just expanding to three insns if necessary. Okay for trunk with appropriate changelog and commit message changes. Thanks! Segher

Re: [PATCH] rs6000: Add Power10 optimization for most _mm_movemask*

2021-11-19 Thread Segher Boessenkool
ntrinsics, when `_ARCH_PWR10`. > > 2021-10-21 Paul A. Clarke > > gcc > * config/rs6000/xmmintrin.h (_mm_movemask_ps): Use vec_extractm > when _ARCH_PWR10. > * config/rs6000/emmintrin.h (_mm_movemask_pd): Likewise. > (_mm_movemask_epi8): Likewise. Okay for trunk. Thanks! Segher

Re: [PATCH 2/6] Add returns_zero_on_success/failure attributes

2021-11-19 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 06:45:42PM -0500, David Malcolm wrote: > On Thu, 2021-11-18 at 14:08 -0600, Segher Boessenkool wrote: > > We need some way to describe these things in Gimple and RTL as well, > > and not just on function calls: also on other expressions.  Adding > > att

Re: [PATCH] rs6000: Add [power6-64] stanza to new builtin support

2021-11-22 Thread Segher Boessenkool
&& rs6000_cpu == PROCESSOR_CELL) (I do realise this entry isn't correct formatting either). Okay for trunk with those things fixed. Thanks! Segher

Re: [PATCH v6] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2021-11-24 Thread Segher Boessenkool
cit support. Which is a shame, but it seems we cannot avoid this. Especially the "fesetround should be a function, not a macro" argument is a showstopper :-/ Thanks, Segher

Re: [PATCH v7] rtl: builtins: (not just) rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2021-11-25 Thread Segher Boessenkool
> --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/builtin-fegetround.c > + int i, rounding, expected; > + const int rm[] = {FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD}; > + for (i = 0; i < sizeof(rm); i++) That should be sizeof rm / sizeof rm[0] ? It accesses out of bounds as it is. Maybe test more values? At least 0, but also combinations of these FE_ bits, and maybe even FE_INVALID? With such changes the rs6000 parts are okay for trunk. Thanks! I looked at the generic changes as well, and they all look fine to me. Segher
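
The bound the review asks for can be shown in isolation. This is a minimal sketch, not the testcase itself: `ARRAY_SIZE`, `rm_count`, `rm_bytes` and `mode_at` are names invented here, and only the `rm` array is taken from the quoted patch. `sizeof rm` is the size in bytes, so using it as a loop bound walks past the four elements.

```c
#include <assert.h>
#include <fenv.h>
#include <stddef.h>

/* Number of elements in an array object (not valid for pointers).  */
#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

static const int rm[] = { FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD };

/* Element count: the correct loop bound.  */
size_t
rm_count (void)
{
  return ARRAY_SIZE (rm);
}

/* Byte count: what the quoted loop used as its bound.  */
size_t
rm_bytes (void)
{
  return sizeof rm;
}

/* Bounds-checked accessor; the assert fires on any index the
   byte-count loop would reach beyond the last element.  */
int
mode_at (size_t i)
{
  assert (i < ARRAY_SIZE (rm));
  return rm[i];
}
```

A loop written as `for (size_t i = 0; i < ARRAY_SIZE (rm); i++)` visits exactly the four rounding modes.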

Re: [PATCH] rs6000/test: Add emulated gather test case

2021-11-26 Thread Segher Boessenkool
unk. Thanks! Segher

Re: [PATCH, committed] rs6000: Fix test_mffsl.c effective target check

2021-11-26 Thread Segher Boessenkool
if we can run modulo insns anyway :-) So please change the test here. And bonus points if you can rename p9modulo_hw and _ok (in a separate patch of course). Segher

Re: [PATCH] rs6000: Clarify overloaded builtin diagnostic

2021-11-26 Thread Segher Boessenkool
ngelog lines early, especially not after a colon. It looks like something might be missing (and interrupts the flow of reading anyway). Okay for trunk. Thanks! Segher

Re: [PATCH] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

2021-11-29 Thread Segher Boessenkool
P10 fusion types here, as well as MASK_P10_FUSION? > + > + if (always_inline) { > +caller_isa &= ~always_inline_safe_mask; > +callee_isa &= ~always_inline_safe_mask; > + } "{" starts a new line, indented. Segher

Re: [PATCH v2] rs6000: Modify the way for extra penalized cost

2021-11-29 Thread Segher Boessenkool
Hi! On Tue, Sep 28, 2021 at 04:16:04PM +0800, Kewen.Lin wrote: > This patch follows the discussions here[1][2], where Segher > pointed out the existing way to guard the extra penalized > cost for strided/elementwise loads with a magic bound does > not scale. > > The way with

Re: [PATCH v2] combine: Tweak the condition of last_set invalidation

2021-11-29 Thread Segher Boessenkool
that block of the insn which > + last_set_table_tick was set for. */ > + > + int last_set_table_luid; I'm not sure what this variable is for. The comment says something else than the variable name does, and now I don't know what to believe :-) The name says it is for a SET, the explanation says it is for a USE. Segher

Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]

2021-11-29 Thread Segher Boessenkool
a good idea to backport something, especially if it isn't obviously super safe. Segher

Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]

2021-11-29 Thread Segher Boessenkool
Hi! On Tue, Sep 28, 2021 at 04:13:40PM +0800, Kewen.Lin wrote: > PR target/102347 > * config/rs6000/rs6000-call.c (rs6000_builtin_decl): Remove builtin > mask check. (Don't wrap lines early please). Okay for trunk and all backports. Thanks! Segher

Re: [PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread Segher Boessenkool
er-times {(?n)^\s+rldicl} 7790 { target lp64 } } > } */ > >  /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } > } } */ > -/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } > } */ > +/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target lp64 } } > } */ > >  /* { dg-final { scan-assembler-times {(?n)^\s+mulli} 5036 } } */ Are the new rlwimi's good to have, or can we do those with simpler or fewer insns? Segher

Re: [PATCH v2] rs6000: Modify the way for extra penalized cost

2021-11-30 Thread Segher Boessenkool
Hi! On Tue, Nov 30, 2021 at 01:05:48PM +0800, Kewen.Lin wrote: > on 2021/11/30 上午6:06, Segher Boessenkool wrote: > > On Tue, Sep 28, 2021 at 04:16:04PM +0800, Kewen.Lin wrote: > >> unsigned adjusted_cost = (nunits == 2) ? 2 : 1; > >> unsigned extra_co

Re: [PATCH] rs6000: Mirror fix for PR102347 into the new builtins support

2021-12-01 Thread Segher Boessenkool
resolved? Just in the new builtin code, don't spend time on the old stuff :-) Segher

Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems

2021-12-01 Thread Segher Boessenkool
unk. Thanks! Could you put some short blurb about the changed prototype of the HTM reg builtins in the release notes please? Thanks x2 :-) Segher

Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-12-01 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 03:59:41PM -0600, Bill Schmidt wrote: > On 11/18/21 3:32 PM, Segher Boessenkool wrote: > > On Thu, Nov 18, 2021 at 03:30:48PM -0600, Bill Schmidt wrote: > >> On 11/18/21 3:16 PM, Segher Boessenkool wrote: > >>> On Wed, Nov 17, 2021 at 05:06:

Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-12-01 Thread Segher Boessenkool
.c: Likewise. > * gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise. Segher

Re: [PATCH] rs6000: Builtins test changes for compare-bytes tests

2021-12-01 Thread Segher Boessenkool
' requires builtin '__builtin_p6_cmpb' I am still not happy with this at all, it is clearly worse than what we had. But, okay for trunk, and hopefully we can fix it before GCC 12 release. Thanks! Segher

Re: [PATCH] rs6000: Builtins test changes for pr80315-*.c, pr88100.c

2021-12-01 Thread Segher Boessenkool
ssages while still introducing uniformity. This patch adjusts error > messages for some cases where this produces changed messages. > > Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with > no regressions. Is this okay for trunk? We should have only the middle two of those messages. But, okay for trunk if you put this on some to-do list. Thanks! Segher

Re: [PATCH] rs6000: Builtins test changes for pragma_misc9.c

2021-12-01 Thread Segher Boessenkool
-gnu (-m32/-m64) > with no regressions. Is this okay for trunk? Okay. Thanks! Segher

Re: [PATCH] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c

2021-12-01 Thread Segher Boessenkool
t changes from previous > messages while still introducing uniformity. This patch adjusts error > messages for some cases where this produces changed messages. In > particular, some messages are improved because previously they did not > admit the possibility that an argument could hold a variable. Same comment as on the previous patch. But, okay for trunk. Thanks! Segher

Re: [PATCH] rs6000: Builtins test changes for pr80315-*.c, pr88100.c

2021-12-02 Thread Segher Boessenkool
On Wed, Dec 01, 2021 at 04:42:19PM -0600, Bill Schmidt wrote: > On 12/1/21 4:29 PM, Segher Boessenkool wrote: > > On Thu, Nov 18, 2021 at 10:15:21AM -0600, Bill Schmidt wrote: > >> All error messages are now one of the following: > >> "argument %d m

Re: [PATCH] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c

2021-12-02 Thread Segher Boessenkool
On Thu, Dec 02, 2021 at 10:43:24AM -0600, Bill Schmidt wrote: > The new built-in infrastructure is now enabled! Congratulations, and thanks for all the work! Segher

Re: [PATCH] rs6000: testsuite: Add rop_ok effective-target function

2021-12-02 Thread Segher Boessenkool
Hi! On Thu, Nov 11, 2021 at 04:12:08PM -0600, Peter Bergner wrote: > This patch adds a new effective-target function that tests whether > it is safe to emit the ROP-protect instructions and updates the > ROP test cases to use it. > > Segher, as we discussed offline, this uses the

Re: [PATCH] rs6000: Fix use of wrong enum for built-in function code.

2021-12-03 Thread Segher Boessenkool
ers. What an informative changelog ;-) Okay for trunk. Thanks! Segher

Re: [PATCH v2] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

2021-12-06 Thread Segher Boessenkool
t it into variables, as you found out then you get the usual naming problem. But you can just split it in the code: if (important_condition || another_important_one /* comment explaining things */ || bla1 || bla2 || bla3 || bla4 || bla5) > > Why are there OPTION_MASKs for separate P10 fusion types here, as well as > > MASK_P10_FUSION? > > Mike helped to explain the history, I've updated all of them using > OPTION_MASK_ > to avoid potential confusion. That is one thing, sure, but why are both needed? Both the "main" flag, and the "details" flags. (The latter should soon go away btw). Segher
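
The layout suggested here (one sub-condition per line, with comments inside the condition) can be combined with the flag masking quoted earlier in this thread. The sketch below is hypothetical throughout: the flag names, the contents of `always_inline_safe_mask`, the `callee_uses_defaults` parameter and the subset rule are assumptions made for illustration, not the actual `rs6000_can_inline_p` logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ISA flag bits, invented for this sketch.  */
enum
{
  ISA_VSX = 1 << 0,
  ISA_HTM = 1 << 1,
  ISA_FUSION = 1 << 2,
  ISA_ALL = ISA_VSX | ISA_HTM | ISA_FUSION
};

/* Hypothetical set of flags that are ignored for always_inline callees.  */
static const unsigned always_inline_safe_mask = ISA_HTM | ISA_FUSION;

bool
can_inline_p (unsigned caller_isa, unsigned callee_isa,
              bool callee_uses_defaults, bool always_inline)
{
  /* Reject flag bits this sketch does not know about.  */
  assert ((caller_isa & ~(unsigned) ISA_ALL) == 0);
  assert ((callee_isa & ~(unsigned) ISA_ALL) == 0);

  if (always_inline)
    {
      caller_isa &= ~always_inline_safe_mask;
      callee_isa &= ~always_inline_safe_mask;
    }

  if (callee_uses_defaults
      /* Otherwise the callee may only require features that the
         caller also has enabled.  */
      || (caller_isa & callee_isa) == callee_isa)
    return true;

  return false;
}
```

Keeping the sub-conditions in the `if` itself, each on its own line, avoids having to invent names for intermediate variables, which is the naming problem the message mentions.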

Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries

2021-12-06 Thread Segher Boessenkool
ddress spaces either! :-) > >Other attributes > > > > > >Patch 2 in the kit adds: > > __attribute__((returns_zero_on_success)) > >and > > __attribute__((returns_nonzero_on_success)) > >as hints to the analyzer that it's worth bifurcating the analysis of > >such functions (to explore failure vs success, and thus to better > >explore error-handling paths). It's also a hint to the human reader of > >the source code. > > I think being able to express something along these lines would > be useful even outside the analyzer, both for warnings and, when > done right, perhaps also for optimization. So I'm in favor of > something like this. I'll just reiterate here the comment on > this attribute I sent you privately some time ago. What is "success" though? You probably want it so some checker can make sure you do handle failure some way, but how do you see what is handling failure and what is handling the successful case? Segher

Re: [PATCH 1/6] rs6000: Remove new_builtins_are_live and dead code it was guarding

2021-12-06 Thread Segher Boessenkool
te_init_file): Don't initialize new_builtins_are_live. > * config/rs6000/rs6000.c (rs6000_builtin_vectorized_function): Remove > test for new_builtins_are_live and simplify. > (rs6000_builtin_md_vectorized_function): Likewise. > (rs6000_builtin_reciprocal): Likewise. > (add_condition_to_bb): Likewise. > (rs6000_atomic_assign_expand_fenv): Likewise. You could have said "Remove old builtins code." everywhere ;-) Okay for trunk. Thanks! Segher

Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries

2021-12-08 Thread Segher Boessenkool
Hi! On Wed, Dec 08, 2021 at 07:06:30PM -0500, David Malcolm wrote: > On Mon, 2021-12-06 at 13:40 -0600, Segher Boessenkool wrote: > > Named address spaces are completely target-specific.  Defining them > > with > > a pragma like this does not allow you to set the pointer

Re: [PATCH 0/6] RFC: adding support to GCC for detecting trust boundaries

2021-12-09 Thread Segher Boessenkool
On Thu, Dec 09, 2021 at 09:42:04AM -0700, Martin Sebor wrote: > On 12/6/21 12:40 PM, Segher Boessenkool wrote: > >Named address spaces are completely target-specific. > > My understanding of these kernel/user address spaces that David > is adding for the benefit of the an

[PATCH] Always enable LRA

2022-10-13 Thread Segher Boessenkool
, but it then crashes with /home/segher/src/kernel/drivers/tty/serial/serial_core.c:1029:1: internal compiler error: maximum number of generated reload insns per insn achieved (90) (and in three more files) which can mean anything unfortunately. c6x is more exciting: /home/segher/src/kernel/fs

Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
Ideally LRA should do > a better job; right now I believe it doesn't really do these things at all. > Targets like pdp11 and vax would like these. So what does it do now? Break every more complex addressing mode apart again? Or ICE? Or something in between? Segher

Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
On Fri, Oct 14, 2022 at 03:20:40PM +0900, Takayuki 'January June' Suwa wrote: > On 2022/10/14 8:56, Segher Boessenkool wrote: > > And finally, xtensa does > > /home/segher/src/gcc/libgcc/libgcc2.c:840:1: error: insn does not satisfy > > its constraints: > >

Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
Hi! On Thu, Oct 13, 2022 at 10:47:20PM -0600, Jeff Law wrote: > On 10/13/22 17:56, Segher Boessenkool wrote: > >h8300 fails during GCC build: > >/home/segher/src/gcc/libgcc/unwind.inc: In function > >'_Unwind_SjLj_RaiseException': > >/home/segher/src/gcc/libg

Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
It is the only way it can know if it needs to reload more. Even if it somehow can assume it doesn't have to check this in some cases, an assert (inside a CHECKING_P) would be nice? Segher
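To make the suggestion above concrete, here is a minimal sketch of a check that is skipped on the fast path but still asserted in checking builds. Everything in it is an assumption for illustration: `satisfies_constraints` and `needs_more_reloads` are hypothetical stand-ins, not LRA functions, and `CHECKING_P` is defined locally so the example compiles on its own (inside GCC the macro comes from the build configuration).

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in so this sketch compiles outside GCC.  */
#ifndef CHECKING_P
#define CHECKING_P 1
#endif

/* Hypothetical predicate: does the operand's class match what the
   insn requires?  Not an actual LRA interface.  */
static bool
satisfies_constraints (int operand_class, int required_class)
{
  return operand_class == required_class;
}

/* Return true if another reload round is needed.  When the caller
   believes the check can be skipped (assume_ok), verify that belief
   only in checking builds.  */
static bool
needs_more_reloads (int operand_class, int required_class, bool assume_ok)
{
  if (assume_ok)
    {
#if CHECKING_P
      assert (satisfies_constraints (operand_class, required_class));
#endif
      return false;
    }
  return !satisfies_constraints (operand_class, required_class);
}
```

With this shape, release builds pay nothing for the skipped check, while checking builds catch a wrong assumption at the point where it is made.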

Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
On Fri, Oct 14, 2022 at 07:58:39PM +, Koning, Paul wrote: > > On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches > > wrote: > > On 10/14/22 11:35, Segher Boessenkool wrote: > >> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote: > >>>> LRA

Re: [PATCH, rs6000] Tests of ARCH_PWR8 and -mno-vsx option. (1/2)

2022-10-17 Thread Segher Boessenkool
{ dg-do preprocess } */ > +/* Test whether the ARCH_PWR8 define remains set after disabling vsx. > + This also confirms __ALTIVEC__ remains set when VSX is disabled. */ > +/* This is the primary test at issue in GCC PR 101865 */ > +/* { dg-options "-dM -E -mdejagnu-cpu=power9 -mno-vsx" } */ > +/* {xfail *-*-*} */ An xfail always needs a comment :-) Segher

Re: [PATCH, rs6000] Split TARGET_POWER8 from TARGET_DIRECT_MOVE [PR101865] (2/2)

2022-10-17 Thread Segher Boessenkool
NT_SCALAR_64BIT) > { >if (op0_regno == op1_regno) > return ASM_COMMENT_START " vec_extract to same register"; > > - else if (INT_REGNO_P (op0_regno) && TARGET_DIRECT_MOVE > + else if (INT_REGNO_P (op0_regno) && TARGET_POWER8 > && TARGET_POWERPC64) That fits on one line now. Thanks, Segher

Re: [PATCH, rs6000] Split TARGET_POWER8 from TARGET_DIRECT_MOVE [PR101865] (2/2)

2022-10-18 Thread Segher Boessenkool
Hi! On Tue, Oct 18, 2022 at 10:17:30AM -0500, will schmidt wrote: > On Mon, 2022-10-17 at 13:08 -0500, Segher Boessenkool wrote: > > It did not happen in GCC 9 obviously. Do you want to take a > > shot? It > > doesn't have to be all at once, it's probably best if

Re: [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI

2022-10-19 Thread Segher Boessenkool
Power code, or such? Segher

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-19 Thread Segher Boessenkool
didn't change btw. Call the new one instruction_prefetch or something equally boring maybe :-) When you send an updated patch, please split it up better? Generic changes and documentation in one patch, target changes in a separate patch or patches, and testsuite is distinct as well. It isn't nice to have to scroll through thousands of lines to see if there is anything relevant to you. Thanks, Segher

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-19 Thread Segher Boessenkool
es all existing code using these builtins invalid. If you need such testcase changes, that is a red flag. Segher

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-20 Thread Segher Boessenkool
th the mistake of having unified L1 caches it seems natural, but thankfully most machines do not do that. Segher

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-20 Thread Segher Boessenkool
On Thu, Oct 20, 2022 at 11:12:01AM +0800, Hongtao Liu wrote: > On Thu, Oct 20, 2022 at 9:39 AM Hongtao Liu wrote: > > On Thu, Oct 20, 2022 at 5:08 AM Segher Boessenkool > > > Please use a separate pattern for this, and leave prefetch to mean data > > > prefetch, as doc

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-20 Thread Segher Boessenkool
parameter to indicate r/w and locality in > prefetch. I > suppose it is quite similar in this case. Since the pattern is already there, > I prefer > reusing them. You can use the data prefetch RTL code for all data loads just as well, it is more closely related than this -- but most people would call that insanity! Segher

Re: [PATCH] Rename nonzero_bits to known_zero_bits.

2022-10-21 Thread Segher Boessenkool
cc > @@ -816,7 +816,7 @@ handle_builtin_alloca (gcall *call, gimple_stmt_iterator > *iter) >tree redzone_size = build_int_cst (size_type_node, ASAN_RED_ZONE_SIZE); > >/* Extract lower bits from old_size. */ > - wide_int size_nonzero_bits = get_nonzero_bits (old_size); > + wide_int size_nonzero_bits = get_known_zero_bits (old_size); Such variables should also be renamed :-( Segher

Re: [PATCH] Rename nonzero_bits to known_zero_bits.

2022-10-21 Thread Segher Boessenkool
On Fri, Oct 21, 2022 at 06:51:17PM +0200, Jakub Jelinek wrote: > On Fri, Oct 21, 2022 at 11:45:33AM -0500, Segher Boessenkool wrote: > > On Fri, Oct 21, 2022 at 03:14:26PM +0200, Aldy Hernandez via Gcc-patches > > wrote: > > > * asan.cc (handle_builtin_alloca): Rename

Re: [PATCH] Rename nonzero_bits to known_zero_bits.

2022-10-21 Thread Segher Boessenkool
or known_zero_bits and known_one_bits vs. known_bits and known_bit_values, but the latter is a bit more costly to compute, but more importantly it is usually a lot less convenient in use. (A third option is known_bits and known_zero_bits?) Segher
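The two naming schemes compared above can be read as two encodings of the same information. The sketch below is one interpretation, not code from the patch: the struct and function names are hypothetical, and 64-bit masks stand in for GCC's `wide_int`. It shows how the pairs convert into each other and why a typical query needs one test with zero/one masks but two with a known-mask plus values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Encoding A (assumed): one mask of bits known to be 0, one mask of
   bits known to be 1.  */
struct zero_one_masks
{
  uint64_t known_zero;
  uint64_t known_one;
};

/* Encoding B (assumed): a mask of bits whose value is known, plus the
   values of those bits.  */
struct bits_and_values
{
  uint64_t known;
  uint64_t values;
};

static struct bits_and_values
to_bits_and_values (struct zero_one_masks m)
{
  struct bits_and_values r = { m.known_zero | m.known_one, m.known_one };
  return r;
}

static struct zero_one_masks
to_zero_one (struct bits_and_values b)
{
  /* A value bit may only be set where the bit is known.  */
  assert ((b.values & ~b.known) == 0);
  struct zero_one_masks r = { b.known & ~b.values, b.known & b.values };
  return r;
}

/* "Are all bits in MASK known to be zero?" takes one test in A ...  */
static bool
bits_zero_a (struct zero_one_masks m, uint64_t mask)
{
  return (m.known_zero & mask) == mask;
}

/* ... and two tests in B.  */
static bool
bits_zero_b (struct bits_and_values b, uint64_t mask)
{
  return (b.known & mask) == mask && (b.values & mask) == 0;
}
```

For example, a value whose low three bits are known zero and whose bit 4 is known one has `known_zero = 0x7`, `known_one = 0x10` in encoding A, which corresponds to `known = 0x17`, `values = 0x10` in encoding B.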

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-21 Thread Segher Boessenkool
ice; and that wouldn't > work anyway because in the end we do need distinct instructions. Right. The builtin as well as the RTL expressions. But having nasty builtin definitions hurts our users, and nasty RTL only ourselves ;-) Segher

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-24 Thread Segher Boessenkool
On Mon, Oct 24, 2022 at 11:00:26AM +0100, Richard Sandiford wrote: > Segher Boessenkool writes: > > On Thu, Oct 20, 2022 at 07:34:13AM +, Jiang, Haochen wrote: > >> > > + /* Argument 3 must be either zero or one. */ > >> > > + if

Re: [PATCH-2, rs6000] Reverse V8HI on Power8 by vector rotation [PR100866]

2022-10-24 Thread Segher Boessenkool
an be unrolled. Okay for trunk. Thanks! Segher

[PATCH] rs6000: Add CCANY; replace signed by

2022-10-25 Thread Segher Boessenkool
This is in preparation for adding CCFP, and maybe CCEQ, and whatever other CC mode we may want later. CCANY is used for CC mode consumers that actually can take any of the four CR field bits. Tested on p7 and p9; committing, Segher 2022-10-25 Segher Boessenkool * config/rs6000

Re: [PATCH] x86: Replace ne:CCC/ne:CCO with UNSPEC_CC_NE in neg patterns

2022-10-28 Thread Segher Boessenkool
for that yourself), there is nothing the generic code knows about the semantics of any unspec after all. AFAICS there is no way to express overflow in a CC, but an unspec can help, sure. You need to fix the setter side as well though. Segher
