Re: [PATCH, rs6000] Add multiply-add expand pattern [PR103109]

2022-08-01 Thread Segher Boessenkool
ot sure there is some coverage for this kind of multiply-add (promoted first > then mul and add), if no, it seems better to add one runnable test case. Good point. We won't automatically see it from the compiler build itself for example, int128 isn't used there. Segher

Re: [PATCH, rs6000] Add multiply-add expand pattern [PR103109]

2022-08-01 Thread Segher Boessenkool
" "r"))) > + 8))] > + "TARGET_POWERPC64 && TARGET_MADDLD && BYTES_BIG_ENDIAN" > + "maddld %0,%1,%2,%3" > + [(set_attr "type" "mul")]) So, hrm. This (as well as the _le version) simplifies to just the :DI ops, without subreg. Not properly simplified patterns like this will not ever match, so most optimisations on this will not work :-( Segher
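A small C sketch of why the low half of the promoted multiply-add is the same as the plain 64-bit operation, which is the reason the subreg form simplifies to just the :DI ops. The function names are made up for illustration, and the code assumes a compiler that provides unsigned __int128 (a GCC/Clang extension on 64-bit targets). Unsigned types are used so that wraparound is well defined.

```c
#include <stdint.h>

/* Promote first, then multiply and add in 128 bits, then keep only
   the low 64 bits.  Hypothetical helper, not code from the patch.  */
uint64_t
madd_low_via_int128 (uint64_t a, uint64_t b, uint64_t c)
{
  unsigned __int128 wide = (unsigned __int128) a * b + c;
  return (uint64_t) wide;
}

/* The same value computed directly in 64 bits: arithmetic modulo
   2^64 gives the low half of the wide result.  */
uint64_t
madd_low_direct (uint64_t a, uint64_t b, uint64_t c)
{
  return a * b + c;
}
```

Both functions agree for every input, so a runnable test can simply compare them on a few values that overflow 64 bits.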

Re: [PATCH] Some additional zero-extension related optimizations in simplify-rtx.

2022-08-02 Thread Segher Boessenkool
ign extending > unmodified (if it > > is 0x for an extend from SI to DI for example). > > Fortunately, C[LT]Z_DEFINED_VALUE_AT_ZERO being defined to return a negative > result, such as -1 is already handled (accounted for) in nonzero_bits. The > relevant > code in rtlanal.cc's nonzero_bits1 is: A negative result, yes. But that was not my example. Segher

Re: [PATCH, rs6000] TARGET_MADDLD should include TARGET_POWERPC64

2022-08-03 Thread Segher Boessenkool
t is a bit confusing then. Sorry for confusing things :-( Add a test for SImode maddld as well? Please fix things up once again and resend? Sorry again! Segher

Re: [PATCH, V2] Do not enable -mblock-ops-vector-pair.

2022-08-03 Thread Segher Boessenkool
code setting -mblock-ops-vector-pair. Okay for trunk (and any backports you may need). Thanks! Segher

Re: [PATCH, rs6000] Correct return value of check_p9modulo_hw_available

2022-08-04 Thread Segher Boessenkool
explicit as well). Terse is good. Explicit is good as well :-) (You don't have to make this change here of course, but keep it in mind for the future :-) ) Segher

Re: [PATCH, rs6000] TARGET_MADDLD should include TARGET_POWERPC64

2022-08-04 Thread Segher Boessenkool
Hi! On Thu, Aug 04, 2022 at 11:17:48AM +0800, HAO CHEN GUI wrote: > On 4/8/2022 12:54 AM, Segher Boessenkool wrote: > > Hrm. But the maddld insn is useful for SImode as well, in 32-bit mode, > > it is just its name that is a bit confusing then. Sorry for confusing > > thin

Re: [PATCH 0/5] IEEE 128-bit built-in overload support.

2022-08-05 Thread Segher Boessenkool
e for the > explicit _Float128/__float128 types, to always use TFmode for the long double > type, no matter which 128-bit floating point type is used, and IFmode for the > explicit __ibm128 type. Making TFmode different from KFmode and IFmode is not an improvement. NAK. Segher

Re: [PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-09 Thread Segher Boessenkool
target powerpc_elfv2 } */ > +/* Specify -mcpu=power9 to ensure global entry is needed. */ > +/* { dg-options "-mdejagnu-cpu=power9" } */ Why would it be needed for p9, and not older, or newer? Every function always has a GEP, so I'm not sure what you are trying to say here anyway :-) Rest looks good to me. Segher

Re: [PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-09 Thread Segher Boessenkool
Hi! On Tue, Aug 09, 2022 at 08:51:59PM +0800, Kewen.Lin wrote: > on 2022/8/9 18:35, Segher Boessenkool wrote: > >> +/* As ELFv2 ABI shows, the allowable bytes past the global entry > >> + point are 0, 4, 8, 16, 32 and 64. Considering there are two > >> +

Re: [PATCH v2, rs6000] Add multiply-add expand pattern [PR103109]

2022-08-09 Thread Segher Boessenkool
to run resp. compile testing makes things even more clear :-) > Nit: better to add one explicit "return 0;" to avoid possible warning. This is in main(), the C standard requires this to work without return (and it is common). But, before C99 the implicit return value from main() was undefined, so yes, it could warn then. Does it? Segher

Re: [PATCH v2, rs6000] Add multiply-add expand pattern [PR103109]

2022-08-09 Thread Segher Boessenkool
;)]) I suppose attr "size" isn't relevant for any of the cpus that implement these instructions? Okay for trunk. Thanks! (The testcase improvements can be done later). Segher

Re: [PATCH] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-09 Thread Segher Boessenkool
can be mapped into vmrghb or vmrglb, this looks > misleading. Maybe we can add the corresponding _direct_le and _direct_be > versions, both are mapped into the same insn but have different RTL > patterns. If that is the best we can do, that is the best we can do. It would be lovely if there was something nicer we can do though :-) Segher

Re: [PATCH] rs6000: Enable generate const through pli+pli+rldimi

2022-08-10 Thread Segher Boessenkool
I am not against it, but some more rationale would be good :-) Btw, this splitter uses operands[2] and [3] in the replacement, and neither of those exists. The replacement never is used of course. Instead, rs6000_emit_set_const is called always. It would be less misleading if the replacement text was just "(pc)" or such. Segher

Re: [PATCH 0/5] IEEE 128-bit built-in overload support.

2022-08-10 Thread Segher Boessenkool
On Wed, Aug 10, 2022 at 02:23:27AM -0400, Michael Meissner wrote: > On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote: > > On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote: > > > These patches lay the foundation for a set of follow-on patches that

Re: [PATCH v2] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-10 Thread Segher Boessenkool
>patterns. Looking forward to Segher's and David's suggestions. > > Thanks! Do you mean same RTL patterns with different hw insn? A pattern called altivec_vmrghb_direct_le should always emit a vmrghb instruction, never a vmrglb instead. Misleading names are an expensive problem. Segher

Re: [PATCH v3] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]

2022-08-10 Thread Segher Boessenkool
't be recoged later. */ Can this not be done with a splitter in the machine description? Segher

Re: [PATCH] rs6000: Enable generate const through pli+pli+rldimi

2022-08-11 Thread Segher Boessenkool
Hi! On Thu, Aug 11, 2022 at 08:52:49PM +0800, Jiufu Guo wrote: > Segher Boessenkool writes: > > On Wed, Aug 10, 2022 at 03:11:23PM +0800, Jiufu Guo wrote: > >> @@ -9659,7 +9659,7 @@ (define_split > >> ;; When non-easy constants can go in the TOC, this should us

Re: [PATCH V3 1/4] rs6000: build constant via li;rotldi

2023-06-16 Thread Segher Boessenkool
arget/powerpc/const-build.c > @@ -0,0 +1,54 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -save-temps" } */ > +/* { dg-require-effective-target has_arch_ppc64 } */ Please put a tiny comment here saying what this test is *for*? The file name is a bit of hint already, but you can indicate much more in one or two lines :-) With those adjustments, okay for trunk. Thanks! (If -c doesn't work, it needs more explanation). Segher

Re: [PATCH, V6] Fix power10 fusion and -fstack-protector, PR target/105325

2023-06-20 Thread Segher Boessenkool
patterns did not handle the possibility that the > load > + might be prefixed. The -fstack-protector option is needed to show the > + bug. */ Mention the PR number somewhere in the text as well? For grep etc. Okay for trunk, with some more reasonable commit message. Thank you! Also okay for all backports. Segher

Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-06 Thread Segher Boessenkool
,21 @@ > +/* PR target/110411 */ > +/* { dg-options "-O2 -mdejagnu-cpu=power10 -S -mblock-ops-vector-pair" } */ -S in testcases is wrong. Why do you want this? It is *good* if this is hauled through the assembler as well! If you *really* want this you use "dg-do assemble", but you shouldn't. Segher

Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-06 Thread Segher Boessenkool
On Thu, Jul 06, 2023 at 02:48:19PM -0500, Peter Bergner wrote: > On 7/6/23 12:33 PM, Segher Boessenkool wrote: > > On Wed, Jul 05, 2023 at 05:21:18PM +0530, P Jeevitha wrote: > >> --- a/gcc/config/rs6000/rs6000.cc > >> +++ b/gcc/config/rs6000/rs6000

Re: [PATCH] Fix typo in insn name.

2023-07-10 Thread Segher Boessenkool
have some example that makes better machine code after this change? Or would a better change perhaps be to just remove this pattern completely, if it doesn't do anything useful? I.e., please include a new testcase. Segher

Re: [PATCH v2] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-10-28 Thread Segher Boessenkool
commit message. Ideally the commit message will tell everything needed to understand the patch (so also to review the patch). Maybe add examples where needed. So reviewing the code in the patch should be an easy thing to do, after reading the commit message :-) Segher

Re: [PATCH] rs6000, Add missing overloaded bcd builtin tests

2023-10-31 Thread Segher Boessenkool
arget/powerpc/bcd-4.c tests all > > these > > OK, my simple scripts are not going to pickup the stuff in altivec.h. > They were just grepping for the built-in name in the test file > directory. You could use gcov to see which rs6000 builtins are not exercised by anything in the testsuite, maybe. This probably can be automated pretty nicely. Segher

Re: [committed] powerpc: Fix up expansion for WIDEN_MULT_PLUS_EXPR [PR108787]

2023-02-15 Thread Segher Boessenkool
x (power7, tested -m32/-m64), > powerpc64le-linux (power8 and another on power9 with > --with-cpu-64=power9 --with-tune-64=power9), preapproved by Segher in the > PR, committed to trunk. Thanks again :-) Segher

Re: [PATCH] rs6000: Fix vector parity support [PR108699]

2023-02-16 Thread Segher Boessenkool
R" > +{ > + rtx op1 = gen_lowpart (V16QImode, operands[1]); > + rtx res = gen_reg_rtx (V16QImode); > + emit_insn (gen_popcountv16qi2 (res, op1)); > + emit_insn (gen_p9v_parityb2 (operands[0], > + gen_lowpart (mode, res))); > + > + DONE; > +}) So first do a patch that is essentially just this? Later patches can do all other things (also, not do this expand for TImode at all, ho hum). Segher
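A rough C model of the two-step expansion quoted above, for one 64-bit element: first a per-byte population count, then a parity step that XORs together the least significant bit of each byte. That reading of the parity instruction is an assumption stated here for illustration, and the function names are hypothetical; this is not the GCC expander itself.

```c
#include <stdint.h>

/* Population count of one byte (models one lane of the byte-wise
   popcount step).  */
static unsigned
byte_popcount (uint8_t b)
{
  unsigned n = 0;
  while (b)
    {
      n += b & 1u;
      b >>= 1;
    }
  return n;
}

/* Parity of a 64-bit value: XOR of the low bit of each byte's
   popcount (models the parity-of-bytes step).  */
unsigned
parity64_via_byte_popcount (uint64_t x)
{
  unsigned parity = 0;
  for (int i = 0; i < 8; i++)
    {
      uint8_t byte = (uint8_t) (x >> (8 * i));
      parity ^= byte_popcount (byte) & 1u;
    }
  return parity;
}
```

The low bit of each byte's popcount is that byte's parity, and XORing the eight byte parities gives the parity of the whole element.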

Re: [PATCH] rs6000: Fix vector parity support [PR108699]

2023-02-16 Thread Segher Boessenkool
Hi! On Thu, Feb 16, 2023 at 08:06:02PM +0800, Kewen.Lin wrote: > on 2023/2/16 19:14, Segher Boessenkool wrote: > > On Thu, Feb 16, 2023 at 05:23:40PM +0800, Kewen.Lin wrote: > >> This patch is to fix the handling with one more pre-insn > >> vpopcntb. It also fixes

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-17 Thread Segher Boessenkool
used on any CPU that has floating point registers. The eighth alternative (the existing xxlor one) has "wa" constraints (via ) so it implicitly requires VSX to be enabled. You need to do something similar for what you want, but you also need to still allow fmr. Segher

Re: [PATCH] rs6000: Fix vector parity support [PR108699]

2023-02-20 Thread Segher Boessenkool
Hi! On Fri, Feb 17, 2023 at 11:33:16AM +0800, Kewen.Lin wrote: > on 2023/2/16 23:10, Segher Boessenkool wrote: > > No, you are right that the semantics are pretty much the same. Please > > just keep UNSPEC_PARITY everywhere. > > OK, since it has UNSPEC, I would hope the re

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-21 Thread Segher Boessenkool
ould write this differently (and %xN is harmless then). > + return "unreachable"; No, never do that. There is "gcc_unreachable ()" if you need it. So, let's first do actual timings, and see if it is better on p9 and p10 as well (or at least not worse). Segher

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-21 Thread Segher Boessenkool
On Tue, Feb 21, 2023 at 06:00:52PM +0530, Ajit Agarwal wrote: > On 21/02/23 4:34 pm, Segher Boessenkool wrote: > > Please don't use a switch, it isn't needed. Instead use the "isa" > > attribute (with p7v here), and put the preferred alternative first. >

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-24 Thread Segher Boessenkool
p7v,*, *, *, > - *, *, *, *, *, > - *, p8v,p8v,p10") > +"*, *, p7p8,p9v,p9v, > + p7v, p7v,*, *, *, > + *, *, *, *, *, > + *, p8v,p8v, *, *, > + p10") So, you swapped the xxlor and fmr entries, and added two extra fmr entries at the end?! Segher

Re: [PATCH] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-02-27 Thread Segher Boessenkool
e? > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c > @@ -0,0 +1,20 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target powerpc_p8vector_ok } */ > +/* { dg-options "-mdejagnu-cpu=power8 -O3 " } */ Is -O3 required? Use -O2 if you can. And no trailing spaces please. > +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */ Those two remaining are superfluous, so comment that please. Segher

Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.

2023-02-27 Thread Segher Boessenkool
_operation_1) : > Avoid generating FFS with mismatched operand and result modes, Please have at least one word after a colon, so that it doesn't look like something is missing. Changelog lines are 80 positions long :-) The patch is okay for trunk. Thank you! Segher

Re: [PATCH, rs6000] Tweak modulo define_insns to eliminate register copy

2023-02-27 Thread Segher Boessenkool
dg-options "-mdejagnu-cpu=power9 -O2" } */ ... the -mcpu= forces it to true always. > +/* Verify r3 is used as source and target, no copy inserted. */ > +/* { dg-final { scan-assembler-not {\mmr\M} } } */ That is probably good enough, yeah, since the test results in only a handful of insns. Segher

Re: [PATCH, rs6000] Tweak modulo define_insns to eliminate register copy

2023-02-27 Thread Segher Boessenkool
Hi! On Mon, Feb 27, 2023 at 02:12:23PM -0600, Pat Haugen wrote: > On 2/27/23 11:08 AM, Segher Boessenkool wrote: > >On Mon, Feb 27, 2023 at 09:11:37AM -0600, Pat Haugen wrote: > >>The define_insns for the modulo operation currently force the target > >>regist

Re: [PATCH] optabs: Fix up expand_doubleword_shift_condmove for shift_mask == 0 [PR108803]

2023-02-27 Thread Segher Boessenkool
h allow to make more inputs result in known (to the compiler) outputs. Segher

Re: [PATCH] optabs: Fix up expand_doubleword_shift_condmove for shift_mask == 0 [PR108803]

2023-02-27 Thread Segher Boessenkool
D note from there > and just adds it on insn 78 (note, besides this REG_DEAD issue the > IL is otherwise still sane, the previous cc setter 71 and its previous > uses 72 and 76 in between the move have been optimized away already in > an earlier successful combination). > And things go wild with the next successful combination: Yup. Segher

Re: [PATCH, rs6000] Tweak modulo define_insns to eliminate register copy

2023-02-27 Thread Segher Boessenkool
On Mon, Feb 27, 2023 at 04:03:56PM -0600, Pat Haugen wrote: > On 2/27/23 2:53 PM, Segher Boessenkool wrote: > >"Slightly". It takes 12 cycles for the two in parallel (64-bit, p9), > >but 17 cycles for the "cheaper" sequence (divd+mulld+subf, 12+5+2). It >
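For reference, a minimal C sketch of the "cheaper" sequence named in the quoted text: the remainder is rebuilt from the quotient with a divide, a multiply and a subtract. The function name is hypothetical, the divisor is assumed to be nonzero, and the INT64_MIN / -1 case is assumed not to occur. The sketch says nothing about cycle counts.

```c
#include <stdint.h>

/* Remainder computed as a - (a / b) * b, one C statement per step
   of a divide + multiply + subtract sequence.  */
int64_t
mod_via_div_mul_sub (int64_t a, int64_t b)
{
  int64_t quotient = a / b;        /* divide step */
  int64_t product = quotient * b;  /* multiply step */
  return a - product;              /* subtract step */
}
```

Because C division truncates toward zero, the result matches the % operator, including for negative operands.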

Re: [PATCH, V3] PR 107299, GCC does not build on PowerPC when long double is IEEE 128-bit

2023-03-02 Thread Segher Boessenkool
ail. This is PR 98645, and it is > assigned to Nathan Sidwell. And this one too. Any new failures need analysis. Always. This is why we have regression tests at all! Segher

Re: [PATCH] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-03-03 Thread Segher Boessenkool
Hi! On Fri, Mar 03, 2023 at 04:29:57PM +0530, Surya Kumari Jangala wrote: > On 27/02/23 9:58 pm, Segher Boessenkool wrote: > > On Wed, Jan 04, 2023 at 01:58:19PM +0530, Surya Kumari Jangala wrote: > >> + register swaps of permuting loads/stores have been removed. */ >

Re: [PATCH 1/2] PR target/107299: Fix build issue when long double is IEEE 128-bit

2023-03-03 Thread Segher Boessenkool
r to me that this is good to have at all -- it causes new non-trivial problems after all -- but you say it allows people to at least bootstrap, in more cases than before. So with comments like I said above: okay for trunk. And not okay for any backports. Thanks, Segher

Re: [PATCH 2/2] Rework 128-bit complex multiply and divide.

2023-03-03 Thread Segher Boessenkool
s -Wno-psabi needed here? What is the error you get without it / on which configurations? Cargo-culting hiding the warnings makes you see fewer warnings, but that is the opposite of a good idea. > +/* { dg-final { scan-assembler "bl __divtc3" } } */ This name depends on what object format and ABI is in use (some have an extra leading underscore, or a dot, or whatever). Segher

Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-04 Thread Segher Boessenkool
st cutoffs are not okay anywhere in combine, either. If expand_compound_operation and friends misbehave (not really an "if", unfortunately), then please fix that, instead of randomly disabling parts of combine? Segher

Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-05 Thread Segher Boessenkool
> > > > The regression for AArch64 needs to be fixed in GCC 13. The hit is too > > > big just > > to "take". > > > > > > So we need a way forward, even if it's stage-4. > > Then it needs to be in a way that works within the design cons

Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-06 Thread Segher Boessenkool
Hi! On Sun, Mar 05, 2023 at 03:33:40PM -0600, Segher Boessenkool wrote: > On Sun, Mar 05, 2023 at 08:43:20PM +, Tamar Christina wrote: > Yes, *look* better: I have seen no proof or indication that this would ("looks", I cannot type, sorry) > actually generate better cod

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
On Mon, Mar 06, 2023 at 12:47:06PM +, Richard Sandiford wrote: > How about the patch below? What about it? What would make it any better than the previous? Oh, and please do not send new patches in old threads :-( Segher

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
s instead of *_EXTEND, no? > So, at least we'd need something like Segher ran to test it on various > targets on Linux kernel (but would be really nice to get also i?86/x86_64). It is running. Still without x86 though, but I'll add that later hopefully, also for the previous runs. >

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
u think this is a problem for aarch64 only? If it actually is, you can fix it in the aarch64 config! Either with or without new hooks, whatever works best. Segher

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
On Mon, Mar 06, 2023 at 04:34:59PM +, Richard Sandiford wrote: > Jakub Jelinek writes: > > On Mon, Mar 06, 2023 at 03:08:00PM +, Richard Sandiford via Gcc-patches > > wrote: > >> Segher Boessenkool writes: > >> > On Mon, Mar 06, 2023 at 12:47

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
Hi! On Mon, Mar 06, 2023 at 07:13:08PM +, Richard Sandiford wrote: > Segher Boessenkool writes: > > Most importantly, what makes you think this is a problem for aarch64 > > only? If it actually is, you can fix it in the aarch64 config! Either > > with or without new

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-08 Thread Segher Boessenkool
On Wed, Mar 08, 2023 at 11:58:51AM +, Richard Sandiford wrote: > Segher Boessenkool writes: > > An #ifdef is a way of making a change that is not finished yet not hurt > > the other targets. It still hurts generic development, which indirectly > > hurts all targets.

Re: [PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-09 Thread Segher Boessenkool
to tell that this bif doesn't modify the memory pointed by > the given pointer. That looks like a bug. Well it is one even. Is it fixed on trunk? Since the patch is a strict improvement already, it is okay for 11 and 10. But you (Peter) may want to flesh it out a bit first? Or first commit only this if that works better for you. Segher

Re: [PATCH 2/2] Rework 128-bit complex multiply and divide.

2023-03-09 Thread Segher Boessenkool
On Thu, Mar 09, 2023 at 11:11:34AM -0500, Michael Meissner wrote: > On Fri, Mar 03, 2023 at 03:35:44PM -0600, Segher Boessenkool wrote: > > > +/* { dg-final { scan-assembler "bl __divtc3" } } */ > > > > This name depends on what object format and ABI is in u

Re: [PATCH] [rs6000] adjust return_pc debug attrs

2023-03-13 Thread Segher Boessenkool
Calls that output insns after bl need DW_AT_call_return_pc to be > +;; adjusted. rs6000_call_offset_return_label uses this attribute to > +;; conservatively recognize the relevant patterns. > +(define_attr "call_needs_return_offset" "none,direct,indirect" > + (const_string "none")) Like I said above, this is all just because of a misdesign here: we should calculate the return address from first principles, not try to undo adding all the stuff we wrongly did. This attribute should not exist. Segher

Re: [PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-13 Thread Segher Boessenkool
Hi! On Thu, Mar 09, 2023 at 07:24:58PM -0600, Peter Bergner wrote: > On 3/9/23 8:55 AM, Segher Boessenkool wrote: > >> Nit: Maybe we can build them out of the loop once and then just use the > >> built one in the loop. > > > > Or as globals even. Currently w

Re: [PATCH] [powerpc] Add a peephole2 to eliminate redundant move from VSX_REGS to GENERAL_REGS when it's from memory.

2023-05-15 Thread Segher Boessenkool
RGET_POWERPC64 && VECTOR_MEM_VSX_P (mode) > + && peep2_reg_dead_p (2, operands[0])" > + [(set (match_dup 2) (match_dup 1))]) The condition does not make sense, even assuming the peephole does (it does not). Why would you care if the compiler is allowed to generate 64-bit insns here? The formatting is messed up as well. Segher

Re: [PATCH v5 1/4] rs6000: Enable REE pass by default

2023-05-16 Thread Segher Boessenkool
e say PowerPC here. With that the patch is okay for trunk. Thank you! Segher

Re: [PATCH v1] tree-ssa-sink: Improve code sinking pass.

2023-05-18 Thread Segher Boessenkool
> + && !is_gimple_call (last_stmt) > + && (gimple_code (last_stmt) != GIMPLE_SWITCH) > + && (gimple_code (last_stmt) != GIMPLE_COND) > + && (gimple_code (last_stmt) != GIMPLE_GOTO) > + && (!gimple_vdef (use) || !def_use_same_block (def_stmt))) Please no unnecessary parens. At first I didn't notice the last line here *does* need it! Segher

Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*

2023-05-25 Thread Segher Boessenkool
at it properly, and instead people will just do blind "update counts" patches like this :-/ *Good* insn count tests are quite valuable, but harder to write. But maintenance costs noticeably bigger than zero for a testcase are not good, how many testcases do we run in the testsuite? So, can we fix the underlying problem here please? Thanks, Segher

Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*

2023-05-25 Thread Segher Boessenkool
Hi Alex, On Thu, May 25, 2023 at 10:55:37AM -0300, Alexandre Oliva wrote: > On May 25, 2023, Segher Boessenkool wrote: > > Fwiw, updating the insn counts blindly like this > > ... is a claim that carries a wildly incorrect and insulting underlying > assumption: Sorry you f

Re: [PATCH] Only use NO_REGS in cost calculation when !hard_regno_mode_ok for GENERAL_REGS and mode.

2023-05-25 Thread Segher Boessenkool
_one_insn): Only use NO_REGS in cost > > calculation when !hard_regno_mode_ok for GENERAL_REGS and > > mode, otherwise still use GENERAL_REGS. > > Thank you for the patch.  It looks good for me.  It is ok to commit it > into the trunk. Thanks everyone involved for fixing this nasty regression! Much appreciated. Segher

Re: [BACKPORT] Apply fix for PR libgcc/97643 to gcc 10 branch

2021-01-21 Thread Segher Boessenkool
On Wed, Jan 20, 2021 at 08:28:57PM -0500, Michael Meissner wrote: > On Wed, Jan 20, 2021 at 06:46:14PM -0600, Segher Boessenkool wrote: > > Is there a reason we do not have that testcase in the testsuite, btw? > > In order to test it you need to build a compiler + toolchain wh

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-21 Thread Segher Boessenkool
Hi! What is holding up this patch still? Ke Wen has pinged it every month since May, and there has still not been a review. Segher On Thu, May 28, 2020 at 08:19:59PM +0800, Kewen.Lin wrote: > > gcc/ChangeLog > > 2020-MM-DD Kewen Lin > > * cfgloop.h (struc

Re: [PATCH 3/4] rs6000: Enable vec_insert for P8 with rs6000_expand_vector_set_var_p8

2021-01-21 Thread Segher Boessenkool
New function. > > gcc/testsuite/ChangeLog: > > 2020-10-10 Xionghu Luo > > * gcc.target/powerpc/pr79251.p8.c: New test. If testing on P9 LE and P7 BE (32-bit and 64-bit) worked, this is okay for trunk. Thanks! (Let me know if you need help testing.) Segher

Re: [PATCH 4/4] rs6000: Update testcases' instruction count

2021-01-21 Thread Segher Boessenkool
assume you tested all those changed counts are actual wanted code? Okay for trunk if so. Thanks! Segher

Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation

2021-01-21 Thread Segher Boessenkool
Hi Ke Wen, On Fri, Jan 15, 2021 at 04:06:17PM +0800, Kewen.Lin wrote: > on 2021/1/15 8:22 AM, Segher Boessenkool wrote: > > On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote: > >>... op regX // this regX could find wrong last_set below > >>regX = ...

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-22 Thread Segher Boessenkool
On Fri, Jan 22, 2021 at 02:47:06PM +0100, Richard Biener wrote: > On Thu, 21 Jan 2021, Segher Boessenkool wrote: > > What is holding up this patch still? Ke Wen has pinged it every month > > since May, and there has still not been a review. Richard Sandiford wrote: > FAOD (si

Re: [PATCH] rs6000: Fix up __m64 typedef in mmintrin.h [PR97301]

2021-01-22 Thread Segher Boessenkool
achieved), but I would rather not make it worse than needed ;-) Segher > 2021-01-22 Jakub Jelinek > > PR testsuite/97301 > * config/rs6000/mmintrin.h (__m64): Add __may_alias__ attribute. > > --- gcc/config/rs6000/mmintrin.h.jj 2021-01-04 10:25:46.794143679

Re: [PATCH] testsuite: Fix sse2-andnpd-1.c and sse-andnps-1.c testscases on powerpc

2021-01-22 Thread Segher Boessenkool
check_##UINON_TYPE (UINON_TYPE u, const VALUE_TYPE *v) \ > > > {\ > > On powerpc64le the tests suffer from the exact same issue. > > Tested on powerpc64le-linux, ok for trunk? So what is the actual error here? Th

Re: [PATCH 4/4] rs6000: Update testcases' instruction count

2021-01-22 Thread Segher Boessenkool
based on it being addi, nothing else; this is a shot in the dark). It could of course be something different just as well :-) Segher > > > * gcc.target/powerpc/fold-vec-insert-char-p8.c: Adjust > > > instruction counts. > > > * gcc.target/powerpc/fo

Re: [PATCH] rs6000: Fix up __m64 typedef in mmintrin.h [PR97301]

2021-01-22 Thread Segher Boessenkool
Hi! On Sat, Jan 23, 2021 at 01:03:31AM +0100, Jakub Jelinek wrote: > On Fri, Jan 22, 2021 at 05:45:54PM -0600, Segher Boessenkool wrote: > > On Fri, Jan 22, 2021 at 07:02:04PM +0100, Jakub Jelinek wrote: > > > The x86 __m64 type is defined as: > > > /* The Intel API

Re: [PATCH] testsuite: Fix sse2-andnpd-1.c and sse-andnps-1.c testscases on powerpc

2021-01-23 Thread Segher Boessenkool
Hi! On Sat, Jan 23, 2021 at 09:41:23AM +0100, Jakub Jelinek wrote: > On Fri, Jan 22, 2021 at 06:56:37PM -0600, Segher Boessenkool wrote: > > So what is the actual error here? This whole union stuff is because we > > *do* want proper aliasing, afaics. > > The reading throu

Re: [PATCH, rs6000] Deprecate unnecessary __builtin_dfp_dtstsfi_*_dd and td overloads

2021-01-25 Thread Segher Boessenkool
> * gcc.target/powerpc/dfp/dtstsfi-77.c: Same. > * gcc.target/powerpc/dfp/dtstsfi-78.c: Same. > * gcc.target/powerpc/dfp/dtstsfi-79.c: Same. > * gcc.target/powerpc/pr92661.c: Same. This is okay for trunk if Bill thinks it is the right direction. Thanks! Segher

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-25 Thread Segher Boessenkool
Hi! On Mon, Jan 25, 2021 at 05:59:23PM +, Richard Sandiford wrote: > Richard Biener writes: > > On Fri, 22 Jan 2021, Segher Boessenkool wrote: > >> But what could have been done differently that would have helped? Of > >> course Ke Wen could have written a better

Re: [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND

2021-01-25 Thread Segher Boessenkool
return true; > } > } > - > - *total = COSTS_N_INSNS (1); >return false; I still do not see what this improves, I only see possible obvious regressions :-( Segher

Re: [PATCH 4/8] [RS6000] rs6000_rtx_costs tidy break/return

2021-01-25 Thread Segher Boessenkool
don't do this part. The rest is okay. Thanks! Segher

Re: [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR

2021-01-25 Thread Segher Boessenkool
rtx left = XEXP (x, 0); > + left = XEXP (x, 0); > rtx_code left_code = GET_CODE (left); > > /* rotate-and-mask: 1 insn. */ > @@ -21452,9 +21537,16 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int > outer_code, >return false; > > case IOR: > - /* FIXME */ >*total = COSTS_N_INSNS (1); > - return true; > + left = XEXP (x, 0); > + if (GET_CODE (left) == AND > + && CONST_INT_P (XEXP (left, 1))) > + { > + right = XEXP (x, 1); > + if (rotate_insert_cost (left, right, mode, speed, total)) > + return true; > + } > + return false; > > case CLZ: > case XOR: Please wait this until stage 1. Sorry. Segher

Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion

2021-01-25 Thread Segher Boessenkool
8\")\n"; > + print " (set_attr \"length\" \"8\")])\n"; > + print "\n"; You can also use a here-document (<<) for long prints (you can interpolate variables in that just fine if you use <<"HERE", i.e. double-quote the terminator string). Anyway, the only thing you really need to improve in the Perl code now is "use strict;". The rest you can do later :-) > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p) >if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0) > rs6000_isa_flags |= OPTION_MASK_MMA; > > + if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) > == 0) > +rs6000_isa_flags |= OPTION_MASK_P10_FUSION; > + > + if (TARGET_POWER10 && (rs6000_isa_flags_explicit & > OPTION_MASK_P10_FUSION_LD_CMPI) == 0) if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0) > +bool > +address_is_non_pfx_d_or_x (rtx addr, machine_mode mode, > +enum non_prefixed_form non_prefixed_format) > +{ > + enum insn_form result_form; > + > + result_form = address_to_insn_form (addr, mode, non_prefixed_format); > + > + switch (non_prefixed_format) > +{ > +case NON_PREFIXED_D: > + switch (result_form) > + { > + case INSN_FORM_X: > + case INSN_FORM_D: > + case INSN_FORM_DS: > + case INSN_FORM_BASE_REG: > + return true; > + default: > + break; > + } "default: break;" always is superfluous. Also, please "return false;" everywhere you do "break" to just get there. > --- a/gcc/config/rs6000/t-rs6000 > +++ b/gcc/config/rs6000/t-rs6000 > @@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c > $(COMPILE) $< > $(POSTCOMPILE) > > +$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl > + $(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md Ah, so you *do* generate it always. Hrm, I'm not sure I like that, certainly not now. 
Comment out this line, and maybe enable it again in stage 1? Okay for trunk with those things taken into account. Thank you! Segher
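A hedged sketch of the switch shape asked for in the review above: no "default: break;", and a direct "return false;" wherever the code would otherwise break just to reach a shared exit. The enums and the function are hypothetical stand-ins invented for this illustration; they are not the real GCC definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the real GCC enums.  */
enum demo_format { DEMO_FORMAT_D, DEMO_FORMAT_OTHER };
enum demo_form { DEMO_FORM_X, DEMO_FORM_D, DEMO_FORM_PREFIXED };

bool
demo_is_non_prefixed (enum demo_format format, enum demo_form form)
{
  switch (format)
    {
    case DEMO_FORMAT_D:
      switch (form)
        {
        case DEMO_FORM_X:
        case DEMO_FORM_D:
          return true;
        case DEMO_FORM_PREFIXED:
          /* Return directly rather than "break" to a shared exit.  */
          return false;
        }
      return false;

    case DEMO_FORMAT_OTHER:
      return false;
    }

  /* Reached only for values outside the enumerators.  */
  return false;
}
```

Listing every enumerator explicitly keeps the control flow visible at each case and lets the compiler's switch warnings flag a newly added enumerator.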

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-26 Thread Segher Boessenkool
On Tue, Jan 26, 2021 at 04:53:25PM +0800, Kewen.Lin wrote: > on 2021/1/26 4:37 AM, Segher Boessenkool wrote: > > On Mon, Jan 25, 2021 at 05:59:23PM +, Richard Sandiford wrote: > >> Richard Biener writes: > >>> On Fri, 22 Jan 2021, Segher Boessenkool wrote: > &

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-26 Thread Segher Boessenkool
machine latencies, makes "too sharp" decisions, the code will run lousy on a slightly different (say, newer) machine. But certainly there needs to be *some* idea of how parallel some code can run, yes. Segher

Re: [PATCH] testsuite: Fix sse2-andnpd-1.c and sse-andnps-1.c testscases on x86 and powerpc

2021-01-26 Thread Segher Boessenkool
On Tue, Jan 26, 2021 at 06:29:47PM +0100, Jakub Jelinek wrote: > On Sat, Jan 23, 2021 at 03:10:10PM -0600, Segher Boessenkool wrote: > > > The reason I chose the "no-strict-aliasing" attribute (and already > > > committed based on Richi's ack) was consistency

Re: [PATCH, rs6000] improve vec_ctf invalid parameter handling. (pr91903)

2021-01-27 Thread Segher Boessenkool
{ dg-skip-if "" { powerpc*-*-darwin* } } */ Please skip this line. If the test does not work for Darwin Iain can easily disable it, but if you do, no one will find out if it does work. Okay for trunk with those things fixed, and the -2 thing looked at. Thanks! Segher

Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
On Tue, Jan 19, 2021 at 12:24:51PM -0500, Michael Meissner wrote: > On Fri, Jan 15, 2021 at 03:43:13PM -0600, Segher Boessenkool wrote: > > Hi! > > > > On Thu, Jan 14, 2021 at 11:59:19AM -0500, Michael Meissner wrote: > > > >From 78435dee177447080434cdc08fc76b10

Re: [Ping] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
r changes. Send those separately, don't make me do much more work than needed). Segher

Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
h ideally is not the case. Stronger than that: I need to know what changed! So please just explain what changed, in just a short sentence or two, or more if that is needed (but not if it is not needed). Segher

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-27 Thread Segher Boessenkool
tch? The whole thread is at https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/ . I approved *that* version of the patch. Segher

Re: [PATCH,rs6000] Fusion patterns for logical-logical

2021-01-27 Thread Segher Boessenkool
the same template as for alt 0 for them, which probably is easier to read, like: "@ and %3,%1,%0\;and %3,%3,%2 and %3,%1,%0\;and %3,%3,%2 and %3,%1,%0\;and %3,%3,%2 and %4,%1,%0\;and %3,%4,%2" Do you agree? Or is that nasty for other patterns maybe :-) Have you checked that all these pattern combinations canonicalise to the RTL you use here? Okay for trunk with those things considered / fixed. Thanks! Segher

Re: [PATCH] testsuite: Run vec_insert case on P8 and P9 with option specified

2021-01-28 Thread Segher Boessenkool
e the changelog a bit). But the patch is fine as far as I can see. Okay for trunk. Thanks! Segher

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
That is > what this particular question is about. I see no second patch? > So, I rewrote the whole patch so that it will work with older GLIBC's. Did you check in what I approved or not? It is a very simple question, it's just about facts, with a simple yes/no answer. Segher

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote: > On Thu, Jan 28, 2021 at 12:59:18PM -0600, Segher Boessenkool wrote: > > On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote: > > > > The whole thread is at > > > > https://patch

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
On Thu, Jan 28, 2021 at 01:58:26PM -0600, Peter Bergner wrote: > On 1/28/21 1:47 PM, Segher Boessenkool wrote: > > On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote: > >> The second patch I want you to review is: > > > > "This patch r

Re: [PATCH] PR target/98870: Fix IEEE 128-bit fortran test

2021-01-29 Thread Segher Boessenkool
ichael Meissner > > PR testsuite/98870 > * gcc.target/powerpc/ppc-fortran/ieee128-math.f90: Fix mapping of > the built-in function. It changes the expected result, it doesn't change the mapping. Okay for trunk with the changelog fixed. Thanks! Segher

Re: [PATCH] Make asm not contain prefixed addresses.

2021-02-02 Thread Segher Boessenkool
t {\m[@]pcrel\M} } } */ You can just write @ instead of [@]. Putting \m immediately in front of a non-letter (or \M immediately after) does not do anything, either. Segher

Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-04 Thread Segher Boessenkool
s matters, because if you disable the compatibility builtins the vsx_ one should still be there, but not the old name. (It also makes more sense of course). Okay for trunk with that fixed. Also okay for gcc-10 after watching a week or so for fallout. Thanks! Segher

Re: [PATCH] testsuite: Fix up pr25376.c on powerpc64-linux and array-quals-1.c on powerpc-linux [PR98325]

2021-02-04 Thread Segher Boessenkool
1.c test fails because on powerpc-linux the symbols > are emitted into .sdata section rather than one of the expected ones. > > Tested on powerpc64-linux (-m32/-m64) and x86_64-linux (-m32/-m64), ok for > trunk? The Tcl looks okay; whether it actually does the right thing, I do not know :-) > Whether this fixes also AIX, I have no idea. Yeah me neither. So okay for trunk, if David thinks it is okay as well. Thanks! Segher

Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-05 Thread Segher Boessenkool
On Thu, Feb 04, 2021 at 10:05:19PM -0600, Peter Bergner wrote: > On 2/4/21 3:16 PM, Segher Boessenkool wrote: > > On Thu, Feb 04, 2021 at 02:40:20PM -0600, Peter Bergner wrote: > >> The LLVM and GCC teams agreed to rename the __builtin_mma_assemble_pair and > >> __b

Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-05 Thread Segher Boessenkool
has_builtin should work for any builtin function whatsoever, and there is nothing special about these compatibility builtins (it is just a name, it is defined as any other). Segher
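
As a rough illustration of that point, here is a minimal sketch of a feature test on a builtin, assuming the message refers to the preprocessor operator __has_builtin (the snippet is truncated, so that is a guess). The MMA builtins under discussion need a Power10 target, so the generic __builtin_popcount is used as a stand-in; count_set_bits is a hypothetical helper name.

```c
#include <assert.h>

/* Compilers without __has_builtin get a fallback that answers "no" to
   every query, so the portable path is used there.  */
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

/* Count the set bits in X, using the builtin only when the compiler
   reports that it exists.  */
int
count_set_bits (unsigned int x)
{
#if __has_builtin (__builtin_popcount)
  return __builtin_popcount (x);
#else
  int n = 0;
  for (; x != 0; x >>= 1)
    n += (int) (x & 1u);
  return n;
#endif
}

/* Small usage example.  */
void
example_usage (void)
{
  assert (count_set_bits (0xF0u) == 4);
  assert (count_set_bits (0u) == 0);
}
```

The same test shape would apply to any other builtin name, including a compatibility name, since the check does not care how the builtin came to be defined.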

Re: [PATCH, rs6000, expand, hooks]: Fix PR98872, handle uninitialized opaque mode variables

2021-02-08 Thread Segher Boessenkool
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr98872.c > > @@ -0,0 +1,20 @@ > > +/* PR target/98872 */ > > +/* { dg-do compile } */ > > +/* { dg-require-effective-target power10_ok } */ > > +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */ > > + > > +/* Verify we do not ICE on the tests below. */ Do the existing tests already check the expected code for this? I would expect the code that initialises uninitialised values to handle this, instead (possibly call the same hook, but :-) ) Segher
