Re: [PATCH 09/34] rs6000: Add more type nodes to support builtin processing

2021-08-23 Thread Segher Boessenkool
; (rs6000_builtin_types[RS6000_BTI_ptr_V16QI]) Not new of course, but those outer parens are pointless. In macros write extra parens around uses of parameters, and nowhere else. Okay for trunk with the formatting fixed. Thanks! Segher
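A minimal sketch of the parenthesisation rule stated above. The macro and variable names (`SQUARE_BAD`, `SQUARE`, `LAST_ENTRY`, `table`) are hypothetical and do not come from the patch; they only illustrate where parens change the result and where they add nothing.

```c
/* Hypothetical table, standing in for something like a type-node array.  */
static int table[3] = { 10, 20, 30 };

/* Parameter used without parens: SQUARE_BAD (1 + 2) expands to
   (1 + 2 * 1 + 2), so operator precedence changes the result.  */
#define SQUARE_BAD(x) (x * x)

/* Each use of the parameter is parenthesised.  */
#define SQUARE(x) ((x) * (x))

/* The body is already a postfix expression, which binds tighter than
   any operator a caller can put around it, so wrapping it in another
   pair of parens would add nothing.  */
#define LAST_ENTRY table[2]
```

With these definitions `SQUARE_BAD (1 + 2)` evaluates to 5 while `SQUARE (1 + 2)` evaluates to 9, and `-LAST_ENTRY` behaves the same with or without outer parens.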

Re: [PATCH 10/34] rs6000: Add Power10 builtins

2021-08-23 Thread Segher Boessenkool
On Thu, Jul 29, 2021 at 08:30:57AM -0500, Bill Schmidt wrote: > * config/rs6000/rs6000-builtin-new.def: Add power10 and power10-64 > stanzas. > + void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *); > +TR_STXVRBX vsx_stxvrbx {stvec} > + > + void __builtin_altivec_

Re: [PATCH v2] rs6000: Add vec_unpacku_{hi,lo}_v4si

2021-08-24 Thread Segher Boessenkool
: it is possible it makes more opportunities to use unpack etc. insns invisible than that it helps over unspec. This needs to be tested, and the usual idioms need testcases, is that what you add here? (/me reads on...) > + if (BYTES_BIG_ENDIAN) > +emit_insn (gen_altivec_vmrgh (res, vzero, op1)); > + else > +emit_insn (gen_altivec_vmrgl (res, op1, vzero)); Ah, so it is *not* using unspecs? Excellent. Okay for trunk. Thank you! Segher
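The idea of an unsigned unpack done as a merge with a zero vector can be sketched with a scalar model. This is only an illustration of the big-endian case under stated assumptions: the word width, the helper names (`merge_high_words`, `doubleword_be`, `unpacku_hi_model`) and the element layout are assumptions of this sketch, not the patch's code.

```c
#include <stdint.h>

/* Model of a word merge-high in big-endian element order:
   result = { a[0], b[0], a[1], b[1] }.  */
static void
merge_high_words (uint32_t res[4], const uint32_t a[4], const uint32_t b[4])
{
  res[0] = a[0];
  res[1] = b[0];
  res[2] = a[1];
  res[3] = b[1];
}

/* Read words 2*i and 2*i+1 of V as one big-endian doubleword.  */
static uint64_t
doubleword_be (const uint32_t v[4], int i)
{
  return ((uint64_t) v[2 * i] << 32) | v[2 * i + 1];
}

/* Zero-extend the two high words of OP to doublewords by merging
   them with an all-zero vector, zero first.  */
void
unpacku_hi_model (uint64_t out[2], const uint32_t op[4])
{
  const uint32_t zero[4] = { 0, 0, 0, 0 };
  uint32_t merged[4];

  merge_high_words (merged, zero, op);
  out[0] = doubleword_be (merged, 0);
  out[1] = doubleword_be (merged, 1);
}
```

Because the zero word always lands in the high half of each doubleword, the result is a plain zero extension, which is why the operation can be expressed without an unspec.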

Re: [PATCH 08/34] rs6000: Add Power9 builtins

2021-08-24 Thread Segher Boessenkool
Hi! On Tue, Aug 24, 2021 at 09:20:09AM -0500, Bill Schmidt wrote: > On 8/23/21 4:40 PM, Segher Boessenkool wrote: > >On Thu, Jul 29, 2021 at 08:30:55AM -0500, Bill Schmidt wrote: > >>+; These things need some review to see whether they really require > >>+; MASK_POW

Re: [PATCH v2] x86: Allow CONST_VECTOR for vector load in combine

2021-08-24 Thread Segher Boessenkool
u should do this like change_zero_ext is done, and perhaps make sure you do not introduce new is_just_move insns that can make 2->2 combinations do the wrong thing. Also somehow make this not take exponential time? It looks like this should only be done in cases where change_zero_ext is not, and the reverse, so this will work fine with a little attention to detail. gl;hf, Segher

Re: [PATCH] Change illegitimate constant into memref of constant pool in change_zero_ext.

2021-08-24 Thread Segher Boessenkool
xt* thing is there because combine *itself* creates a lot of extra zext*, whether those exist for the target or not. So this isn't obvious precedent (and that wouldn't mean it is a good idea anyway ;-) ) Segher

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-24 Thread Segher Boessenkool
On Fri, Aug 13, 2021 at 11:18:46AM +0800, Kewen.Lin wrote: > on 2021/8/12 下午11:51, Segher Boessenkool wrote: > > It is a bad idea to initialise things unnecessary: it hinders many > > optimisations, but much more importantly, it silences warnings without > > fixing the proble

Re: [PATCH] rs6000: Make some BIFs vectorized on P10

2021-08-24 Thread Segher Boessenkool
Hi! On Fri, Aug 13, 2021 at 10:34:46AM +0800, Kewen.Lin wrote: > on 2021/8/12 下午11:10, Segher Boessenkool wrote: > >> + && VECTOR_UNIT_ALTIVEC_OR_VSX_P (in_vmode)) > >> +{ > >> + machine_mode exp_mode = DImode; > >> + mach

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-08-24 Thread Segher Boessenkool
ig/rs6000/rs6000-call.c +++ > >b/gcc/config/rs6000/rs6000-call.c @@ -12159,6 +12159,11 @@ > >rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) return true; /* > >flavors of vec_min. */ case VSX_BUILTIN_XVMINDP: + case format=flawed :-( Segher

Re: [PATCH] Make xxsplti*, xpermx, xxeval be vecperm type.

2021-08-25 Thread Segher Boessenkool
ribute. Use > UUNSPEC_XXSPLTIDP instead of UNSPEC_XXSPLTID. Typo ("UU"). Okay for trunk with those trivial fixes. Also okay for backport to 11, it is trivial enough. Thanks! Out of interest, did you notice any scheduling differences with this? Segher

Re: [PATCH] Make xxsplti*, xpermx, xxeval be vecperm type.

2021-08-25 Thread Segher Boessenkool
On Wed, Aug 25, 2021 at 02:22:06PM -0400, Michael Meissner wrote: > On Wed, Aug 25, 2021 at 12:44:16PM -0500, Segher Boessenkool wrote: > > Out of interest, did you notice any scheduling differences with this? > > I don't use the built-ins so I wouldn't notice a differen

Re: [PATCH 11/34] rs6000: Add MMA builtins

2021-08-25 Thread Segher Boessenkool
Hi! On Thu, Jul 29, 2021 at 08:30:58AM -0500, Bill Schmidt wrote: > * config/rs6000/rs6000-builtin-new.def: Add mma stanza. Okay for trunk. Thanks! Segher

Re: [PATCH 12/34] rs6000: Add miscellaneous builtins

2021-08-25 Thread Segher Boessenkool
On Thu, Jul 29, 2021 at 08:30:59AM -0500, Bill Schmidt wrote: > * config/rs6000/rs6000-builtin-new.def: Add ieee128-hw, dfp, > crypto, and htm stanzas. Okay for trunk. Thanks! Segher

Re: [PATCH 13/34] rs6000: Add Cell builtins

2021-08-25 Thread Segher Boessenkool
On Thu, Jul 29, 2021 at 08:31:00AM -0500, Bill Schmidt wrote: > * config/rs6000/rs6000-builtin-new.def: Add cell stanza. This one is fine, too. Thanks! Segher

Re: [PATCH 14/34] rs6000: Add remaining overloads

2021-08-25 Thread Segher Boessenkool
; ** > +; Deprecated overloads that should never have existed at all > +; ** > +; ** The coding conventions say not to use showy block comments like that, but it seems appropriate here :-) Okay for trunk with the looked at. Please don't repost this one. Thanks! Segher

Re: [PATCH] Inline IBM long double __gcc_qsub

2021-08-26 Thread Segher Boessenkool
to do serious engineering on. If we want any serious optimisation on it we should do that at tree level (why does that not happen yet anyway?), and inline all of this. This patch is really just to make benchmark results saner ;-) Thanks David! Segher

Re: [PATCH 14/34] rs6000: Add remaining overloads

2021-08-26 Thread Segher Boessenkool
Hi! On Thu, Aug 26, 2021 at 07:59:04AM -0500, Bill Schmidt wrote: > On 8/25/21 6:27 PM, Segher Boessenkool wrote: > >On Thu, Jul 29, 2021 at 08:31:01AM -0500, Bill Schmidt wrote: > >>* config/rs6000/rs6000-overload.def: Add remaining overloads. > >>+; TODO: Note

Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-26 Thread Segher Boessenkool
c. My use of $HOME/usr for PATH and > LD_LIBRARY_PATH may be inadequate for bypassing the older environment. > I don't have access to a current (el8 or later) IBM environment. You need a newer glibc to get new features provided by that. But, everything should work with the older versions as well. Segher

Re: [PATCH v3] Fix for powerpc64 long double complex divide failure

2021-08-26 Thread Segher Boessenkool
, always refer to That is historical as well. Long ago the only 128-bit format was double-double, and those mode names became part of the symbol names. Changing this to __divic3 etc. would not really help. Since the internal GCC symbol names are not really user-visible it does not matter so much. Segher

Re: [PATCH, rs6000 V2] Add store fusion support for Power10

2021-08-26 Thread Segher Boessenkool
g-final { scan-assembler-not {stfd 1,8\(3\)\n\tstfd 3,16\(3\)} } } */ Heh. A little fragile, the compiler could reorder the stores for other reasons, but the best we can do here I guess. Okay for trunk with the trivial cleanups. Thanks! Segher

Re: [PATCH v2] Inline IBM long double __gcc_qsub

2021-08-26 Thread Segher Boessenkool
cc_qsub use? This is fine for complexity, it is just a simple tail-call jump, just wondering what the compiler thinks is best here (it matters in other cases, if the inline function has conditional branches for example). Segher

Re: [PATCH 15/34] rs6000: Execute the automatic built-in initialization code

2021-08-26 Thread Segher Boessenkool
ave an #ifdef but an empty macro (or a "do {} while (0)"), etc. Okay for trunk, if this is revisited later. Thanks! Segher
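A minimal sketch of the empty-macro idiom mentioned above, assuming a hypothetical feature macro `HAVE_TRACE` and a hypothetical `TRACE` helper (neither is from the patch). The point is that the conditional lives in one place and call sites stay free of `#ifdef`.

```c
#include <stdio.h>

/* HAVE_TRACE and TRACE are hypothetical names for this sketch.  */
#ifdef HAVE_TRACE
#define TRACE(msg) do { fputs ((msg), stderr); } while (0)
#else
/* Expands to a single empty statement, so call sites need no #ifdef
   and the macro is still safe in an unbraced if/else.  */
#define TRACE(msg) do { } while (0)
#endif

/* Count the strictly positive elements of V[0..N-1].  */
int
count_positive (const int *v, int n)
{
  int count = 0;

  for (int i = 0; i < n; i++)
    if (v[i] > 0)
      count++;
    else
      TRACE ("non-positive element\n");
  return count;
}
```

The `do { } while (0)` form keeps `TRACE (...);` a single statement in both configurations, so the unbraced `else` above parses the same way either way.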

Re: [PATCH 15/34] rs6000: Execute the automatic built-in initialization code

2021-08-27 Thread Segher Boessenkool
On Fri, Aug 27, 2021 at 07:35:05AM -0500, Bill Schmidt wrote: > On 8/26/21 6:15 PM, Segher Boessenkool wrote: > >On Thu, Jul 29, 2021 at 08:31:02AM -0500, Bill Schmidt wrote: > >>+ /* Execute the autogenerated initialization code for builtins. */ > >>+ rs6000_autoin

Re: [PATCH] Fix float128-call.c test for power8 IEEE 128 and power10.

2021-08-27 Thread Segher Boessenkool
"p?" as well, or would it be bad if that ever is used, is this test testing it is not done? Segher

Re: [PATCH 16/34] rs6000: Darwin builtin support

2021-08-27 Thread Segher Boessenkool
unk. Thanks! Segher

Re: [PATCH 17/34] rs6000: Add sanity to V2DI_type_node definitions

2021-08-27 Thread Segher Boessenkool
e should be optimised for the human reader, the compiler does not care at all, you almost never can use that as excuse :-) ) Anyway, you know what is needed :-) Okay for trunk. Thanks! Segher

Re: [PATCH 18/34] rs6000: Always initialize vector_pair and vector_quad nodes

2021-08-27 Thread Segher Boessenkool
though those types are called "vector", they are not, so this does work correctly even if not TARGET_EXTRA_BUILTINS. Ideally we will have that macro always enabled eventually, but that is later work. Okay for trunk. Thanks! Segher

Re: [PATCH 19/34] rs6000: Handle overloads during program parsing

2021-08-27 Thread Segher Boessenkool
ror ("invalid parameter combination for AltiVec intrinsic %qs", name); > +return error_mark_node; > + } A huge function with a lot of "goto bad;" just *screams* "this needs to be factored". > +case ENB_P5: > + if (!TARGET_POPCNTB) > + return false; > + break; case ENB_P5: return TARGET_POPCNTB; and similar for all further cases. It is shorter and does not have negations, win-win! > + break; > +}; Stray semicolon. Did this not warn? Could you please try to factor this better? Segher
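The suggested simplification of the `switch` can be sketched as follows. The enum, the flag variables and the function names are hypothetical stand-ins for the back end's definitions, and the sketch assumes that nothing follows the `switch` except `return true;`, which is what makes the two forms equivalent.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the enable enum and the target flags;
   the real definitions live in the rs6000 back end.  */
enum bif_enable { ENB_ALWAYS, ENB_P5, ENB_P6 };

static bool target_popcntb = true;
static bool target_cmpb = false;

/* Verbose form: test the negation, return false, otherwise break and
   fall out to the final return.  */
bool
enabled_verbose (enum bif_enable e)
{
  switch (e)
    {
    case ENB_ALWAYS:
      break;
    case ENB_P5:
      if (!target_popcntb)
        return false;
      break;
    case ENB_P6:
      if (!target_cmpb)
        return false;
      break;
    }
  return true;
}

/* Compact form: each case returns its condition directly.  */
bool
enabled_compact (enum bif_enable e)
{
  switch (e)
    {
    case ENB_ALWAYS:
      return true;
    case ENB_P5:
      return target_popcntb;
    case ENB_P6:
      return target_cmpb;
    }
  return true;
}
```

Both functions return the same value for every enumerator; the compact one has no negations and no trailing `break`.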

Re: [PATCH] Only simplify TRUNCATE to SUBREG on TRULY_NOOP_TRUNCATION targets

2021-08-28 Thread Segher Boessenkool
d the motivating > >example above is derived from the behaviour of backend patches not yet > >in the tree [nvptx is currently a STORE_FLAG_VALUE=-1 target]. Should the TARGET_TRULY_NOOP_TRUNCATION documentation be updated now? In particular the part that talks about TARGET_MODES_TIEABLE_P. Segher

rs6000: Backports of mode promote patches to GCC 11

2021-08-30 Thread Segher Boessenkool
Hi! I have backported 9080a3bf2329, a3f6bd789149, and f0529d96f567 to the GCC 11 branch. This solves PR102062, but is a >2% performance win for such a trivial patch, too, which is enough reason on its own :-) Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-11 Thread Segher Boessenkool
On Fri, Sep 11, 2020 at 04:29:16PM -0500, Qing Zhao wrote: > > On Sep 11, 2020, at 4:03 PM, Segher Boessenkool > > wrote: > >> The parameters that are passed to sys call will be destroyed, therefore, > >> the attack will likely failed. > > > >

Re: [PATCH] rs6000: Rename mffgpr/mftgpr instruction types

2020-09-11 Thread Segher Boessenkool
instructions. > * config/rs6000/power10.md (power10-mffgpr, power10-mftgpr): Rename to > mtvsr/mfvsr. Please spell out the new names in full, so that it can be searched for. Okay for trunk. Thank you! Segher

Re: [PATCH] [PATCH] PR rtl-optimization/96791 Check precision of partial modes

2020-09-14 Thread Segher Boessenkool
On Mon, Sep 14, 2020 at 09:46:11AM +0200, Richard Biener wrote: > On Fri, Sep 11, 2020 at 4:18 PM Segher Boessenkool > wrote: > > Until 2014 (and documented just days ago ;-) ) all bits of a partial > > integer mode were considered unknown. > > All bits or all bits outs

Re: [PATCH]rs6000: Remove useless insns fed into lvx/stvx [PR97019]

2020-09-14 Thread Segher Boessenkool
is is not a predicate. Do not name it _p please. > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr97019.c > @@ -0,0 +1,79 @@ > +/* { dg-do compile { target { powerpc_p8vector_ok && le } } } */ Why only on LE? (If there is a reason, the testcase should say; if there is not, well, it shouldn't say it does then :-) ) Please resend with those things fixed. Segher

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread Segher Boessenkool
On Thu, Sep 10, 2020 at 12:08:44PM +0200, Richard Biener wrote: > On Wed, Sep 9, 2020 at 6:03 PM Segher Boessenkool > wrote: > > There often are problems over function calls (where the compiler cannot > > usually *see* how something is used). > > Yep. The best way wou

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread Segher Boessenkool
GPR[RA]. The contents of bits 32:63 of GPR[RB] are placed into byte elements index:index+3 of VSR[VRT+32]. All other byte elements of VSR[VRT+32] are not modified. If index is greater than 12, the result is undefined. Segher
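The quoted instruction description can be modelled in plain C, as a rough sketch. Several things here are assumptions of the sketch rather than statements from the message: byte elements are numbered left to right, the word is stored most significant byte first, and the function name `insert_word_model` is made up. Where the description says the result is undefined, the model simply refuses to modify the vector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Place the low 32 bits of RB (bits 32:63 in IBM bit numbering) into
   byte elements INDEX .. INDEX+3 of VEC, leaving the others alone.
   Returns false, without touching VEC, when INDEX is greater than 12,
   the case the description leaves undefined.  */
bool
insert_word_model (uint8_t vec[16], uint64_t rb, unsigned int index)
{
  uint32_t word = (uint32_t) rb;

  if (index > 12)
    return false;

  /* Assumption: most significant byte goes to the lowest index.  */
  for (unsigned int i = 0; i < 4; i++)
    vec[index + i] = (uint8_t) (word >> (8 * (3 - i)));
  return true;
}
```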

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-14 Thread Segher Boessenkool
requires this > treatment and other optabs don't. Yeah. The register allocator is normally very good in using the same reg in both places, if that is useful. And it also handles the case where your machine insns require the two to be the same pretty well. Not restricting this stuff before RA should be a win. Segher

[PATCH] bb-reorder: Fix for ICEs caused by 69ca5f3a9882

2020-09-14 Thread Segher Boessenkool
After the previous patch we are left with an unreachable BB. This will ICE if either we have -fschedule-fusion, or we do not have peephole2. This fixes it. Okay for trunk? Segher 2020-09-14 Segher Boessenkool PR rtl-optimization/96475 * bb-reorder.c

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-14 Thread Segher Boessenkool
On Fri, Sep 11, 2020 at 05:41:47PM -0500, Qing Zhao wrote: > > On Sep 11, 2020, at 4:51 PM, Segher Boessenkool > > wrote: > > It is definitely *not* effective if there are gadgets that set rax to > > a value the attacker wants and then do a syscall. > >

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-14 Thread Segher Boessenkool
one > major issue with this as > Segher mentioned, The middle end does not know some details on the registers, > lacking such > detailed information might result incorrect code generation at middle end. > > For example, on x86_64 target, when “return” with pop, the scratch regis

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-14 Thread Segher Boessenkool
On Mon, Sep 14, 2020 at 05:33:33PM +0100, Richard Sandiford wrote: > > However, for the cases on Power as Segher mentioned, there are also some > > scratch registers used for > > Other purpose, not sure whether we can correctly generate zeroing in > > middle-end for Pow

Re: [PATCH] bb-reorder: Fix for ICEs caused by 69ca5f3a9882

2020-09-15 Thread Segher Boessenkool
On Tue, Sep 15, 2020 at 08:32:54AM +0200, Richard Biener wrote: > On Tue, Sep 15, 2020 at 12:06 AM Segher Boessenkool > wrote: > > > > After the previous patch we are left with an unreachable BB. This will > > ICE if either we have -fschedule-fusion, or we do not have

Re: [PATCH] rs6000: inefficient 64-bit constant generation for consecutive 1-bits

2020-09-15 Thread Segher Boessenkool
ns do). > +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,8,8" } } */ > +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,24,8" } } */ > +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,8" } } */ > +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,48" } } */ > +/* { dg-final { scan-assembler "rldic r?\[0-9\]+,r?\[0-9\]+,40,23" } } */ Please use {} quotes, and \m\M. \d can be helpful, too. Segher
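As a sketch of what that advice could look like, here are the first two quoted directives rewritten with `{}` quoting, the `\m` and `\M` word anchors and `\d`. The operands are copied from the quoted patch; whether these exact patterns are what the final testcase should use is an assumption.

```c
/* Inside {} Tcl does no backslash or bracket substitution, so the
   regex escapes need no extra quoting.  \m and \M anchor at the start
   and end of a word, which keeps "8,8" from also matching "8,80".  */
/* { dg-final { scan-assembler {\mrldic r?\d+,r?\d+,8,8\M} } } */
/* { dg-final { scan-assembler {\mrldic r?\d+,r?\d+,24,8\M} } } */
```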

Re: [PATCH] rs6000: inefficient 64-bit constant generation for consecutive 1-bits

2020-09-15 Thread Segher Boessenkool
rotldi 3,3,26 > blr > > would be two valid possibilities for test3 and test5 that don't use > rldic. Ideally the test would verify the actual values generated by > the test functions and count instructions. Well, the point of the test is to verify we get the expected code for this? Maybe we should just count insns here? But that would be a different test. I'm a bit worried about how often the one-bit thing will do something unexpected, but the rest should be fine and not cause churn. Segher

Re: [PATCH v2] rs6000: Remove useless insns fed into lvx/stvx [PR97019]

2020-09-15 Thread Segher Boessenkool
ejagnu-cpu=power8" } */ Do you need to test for LE? If not, just always run it? If it works, it works, it doesn't matter that you do not expect it to ever fail (we do not really expect *any* test we have to ever fail *anywhere*, heh). > +/* { dg-final { scan-assembler-not "rldicr\[ \t\]+\[0-9\]+,\[0-9\]+,0,59" } > } */ Please use {} quotes, and \s and \d. You can also use {(?n)rldicr.*,0,59} since (?n) makes . not match newlines anymore. Okay for trunk with or without those suggestions. Thanks! Segher

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-15 Thread Segher Boessenkool
ote that the builtin is not the same as the machine instruction -- here there shouldn't be a difference if compiling for a new enough ISA, but the builtin is available on anything with at least AltiVec. Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-15 Thread Segher Boessenkool
On Mon, Sep 14, 2020 at 10:07:31PM -0500, Qing Zhao wrote: > > On Sep 14, 2020, at 6:09 PM, Segher Boessenkool > > wrote: > >> Gadget 1: > >> > >> mov rax, value > >> syscall > >> ret > > > > No, just > > > >

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-15 Thread Segher Boessenkool
On Tue, Sep 15, 2020 at 12:46:00PM +0100, Richard Sandiford wrote: > Segher Boessenkool writes: > > On Mon, Sep 14, 2020 at 05:33:33PM +0100, Richard Sandiford wrote: > >> > However, for the cases on Power as Segher mentioned, there are also some > >> > scratc

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-15 Thread Segher Boessenkool
u need to prevent the inserted (zeroing) insns from moving -- if you don't, the code after some zeroing can be used as gadget! You want to always have all zeroing insns after *any* computational insn, or it becomes a gadget. Segher

Re: [PATCH] rs6000: inefficient 64-bit constant generation for consecutive 1-bits

2020-09-15 Thread Segher Boessenkool
at describes what regex patterns are allowed. "man re_syntax", and "man tcl" for the one-page tcl intro (it describes the whole language: the substitutions, the quotes, etc.) https://www.tcl.tk/man/tcl8.6/TclCmd/re_syntax.htm https://www.tcl.tk/man/tcl8.6/TclCmd/Tcl.htm > This all said, Alan's rtx_costs patch touches this same area and he talked > about removing a similar splitter, so I think I will wait for his code to > be committed and then rework this on top of his changes. Yes, good plan. Thanks! Segher

Re: [PATCH 2/4, revised patch applied] PowerPC: Rename functions for min, max, cmove

2020-09-15 Thread Segher Boessenkool
fs on these expressions, because of some constexpr issue I > haven't really looked into. Yeah, the system compiler is 4.8.5 (this is centos7). > I'm testing this patch. I'll check it in when I'm done. It is pre-approved, just check it in already please! Segher > ---

Re: [PATCH 4/4] PowerPC: Add power10 xscmp{eq,gt,ge}qp support

2020-09-15 Thread Segher Boessenkool
xample where it is used: it can much easier say something much more generic! (And then send a patch first doing FP just as SFDF and replacing it where we want it; and then a later patch adding KF. That way, your patch might be readable!) Thanks, Segher

Re: [RS6000] Count rldimi constant insns

2020-09-15 Thread Segher Boessenkool
Wow, did I miss that? Whoops. (That was PR93012, 72b2f3317b44.) Okay for trunk. Thanks! Segher

Re: [RS6000] rs6000_rtx_costs for PLUS/MINUS constant

2020-09-15 Thread Segher Boessenkool
Okay for trunk. Thanks! (Btw, please use [patch 2/6] etc. markers? It helps refer to them :-) ) Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-15 Thread Segher Boessenkool
other passes here are harmful (maybe the shorten stuff)? But. Targets can also insert more passes here. If you want the zeroing insns to stay with the return, you have to express that in RTL. Anything else is extremely fragile. Segher

Re: [PATCH] rs6000: Fix misnamed built-in

2020-09-16 Thread Segher Boessenkool
"might break users' code" thing ;-) Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-16 Thread Segher Boessenkool
On Tue, Sep 15, 2020 at 08:51:57PM -0500, Qing Zhao wrote: > > On Sep 15, 2020, at 6:09 PM, Segher Boessenkool > > wrote: > > If you want the zeroing insns to stay with the return, you have to > > express that in RTL. > > What do you mean by “express that in RT

Re: [PATCH v2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-16 Thread Segher Boessenkool
On Wed, Sep 16, 2020 at 10:31:54AM +0200, Richard Biener wrote: > On Tue, Sep 15, 2020 at 6:18 PM Segher Boessenkool > wrote: > > > > On Tue, Sep 15, 2020 at 08:51:09AM +0200, Richard Biener wrote: > > > On Tue, Sep 15, 2020 at 5:56 AM luoxhu wrote:

Re: [PATCH] rs6000: Add rs6000_cfun_pcrel_p

2020-09-16 Thread Segher Boessenkool
On Wed, Sep 16, 2020 at 08:36:35AM -0500, Bill Schmidt wrote: > This is a cleanup requested by Segher in a previous review. Most > uses of rs6000_pcrel_p are called for the current function. A > specialized version for cfun is more efficient for these uses. Then rename th

Commit 052204fac580 was never sent to gcc-patches

2020-09-16 Thread Segher Boessenkool
... and it causes testsuite regressions on Power. We haven't determined yet if it actually is worse code, but there are testcases that trip on it. Either way, all patches should be sent to gcc-patches@, whether pre-approved or not. Please correct this. Thanks! Segher

Re: Commit 052204fac580 was never sent to gcc-patches

2020-09-16 Thread Segher Boessenkool
Hi! On Wed, Sep 16, 2020 at 08:37:34PM +0200, Andrea Corallo wrote: > Segher Boessenkool writes: > > > ... and it causes testsuite regressions on Power. We haven't determined > > yet if it actually is worse code, but there are testcases that trip on > > it. Eith

Re: [PATCH] aarch64: Add extend-as-extract-with-shift pattern [PR96998]

2020-09-16 Thread Segher Boessenkool
de, too. > If we do that, we should be able to remove the handling of > extract-based addresses in aarch64_classify_index & co. If we do what? I don't follow, sorry. (Patch to combine sounds fine fwiw; patches welcome, as always.) Segher

Re: [RS6000] rs6000_rtx_costs comment

2020-09-16 Thread Segher Boessenkool
ore > + complex" is determined by having a higher set_src_cost. So for > + example, if we want a plain (reg) address to be replaced with > + (plus (reg) (const)) when possible then PLUS needs to cost more > + than zero here. */ Maybe it helps if you more prominently mention set_rtx_cost and set_src_cost? Either way, okay for trunk. Thanks! Segher

Re: [RS6000] rs6000_rtx_costs multi-insn constants

2020-09-16 Thread Segher Boessenkool
(rs6000_rtx_costs): Cost multi-insn > constants. Okay for trunk. Note that some p10 insns take a floating point immediate, but those need to be handled specially anyway. Thanks! Segher

Re: [RS6000] rs6000_rtx_costs cost IOR

2020-09-16 Thread Segher Boessenkool
ction? Just the insert insns part, not all IOR. Okay for trunk with the comments changed to the correct syntax, and factoring masked insert out to a separate function pre-approved if you want to do that. Thanks! Segher

Re: [RS6000] rs6000_rtx_costs reduce cost for SETs

2020-09-17 Thread Segher Boessenkool
t, don't we have a problem already? Generic things. Please split this patch up when sending it again, it does too many different things, and many of those are not obvious. All such changes that aren't completely obvious (like the previous ones were) should have some measurement. We are in stage1, and we will notice (non-trivial) degradations, but if we can expect degradations (like for this patch), it needs benchmarking. Since you add !speed all over the place, maybe we should just have a separate function that does !speed? It looks like quite a few things will simplify. Segher

Re: [PATCH] aarch64: Add extend-as-extract-with-shift pattern [PR96998]

2020-09-17 Thread Segher Boessenkool
Hi! On Thu, Sep 17, 2020 at 08:10:22AM +0100, Richard Sandiford wrote: > Alex Coplan writes: > The combine parts LGTM otherwise, but Segher should have the > final say. I am doubtful this does not regress on many targets. I'll test it, we'll see :-) Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-17 Thread Segher Boessenkool
ay, to resources it does not know about), but you can use it for anything you want executed approximately as written. > UNSPEC_VOLATILEs can't be deleted. (If they are executed at all, anyway ;-) ) Segher

Re: [PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

2020-09-17 Thread Segher Boessenkool
Hi Jeff, On Thu, Sep 17, 2020 at 05:12:17PM -0600, Jeff Law wrote: > On 9/3/20 4:37 PM, Segher Boessenkool wrote: > >> Apart from that, one P9 specific point is that the update form load isn't > >> preferred, the reason is that the instruction can not retire until bo

Re: [PATCH] vect/test: Don't check for epilogue loop [PR97075]

2020-09-18 Thread Segher Boessenkool
-vec-length-full-6.c: Adjust. The testcase part of course is okay for trunk, if this is the expected (and good :-) ) code. Thanks, Segher

Re: [RS6000] rs6000_rtx_costs reduce cost for SETs

2020-09-18 Thread Segher Boessenkool
On Fri, Sep 18, 2020 at 01:08:42PM +0930, Alan Modra wrote: > On Thu, Sep 17, 2020 at 12:51:25PM -0500, Segher Boessenkool wrote: > > > - if (CONST_INT_P (XEXP (x, 1)) > > > - && satisfies_constraint_I (XEXP (x, 1))) > > > + if (!speed) > >

Re: [PATCH v2 1/2] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

2020-09-18 Thread Segher Boessenkool
ee_code code = TREE_CODE (TREE_TYPE(view_op0)); ^ Missing space here ---/ Thanks, Segher

Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-18 Thread Segher Boessenkool
> + operands[2]); > + rtx sub_target > + = simplify_gen_subreg (GET_MODE (operands[0]), target, V16QImode, 0); > + emit_insn (gen_rtx_SET (operands[0], sub_target)); > + DONE; > +} Please handle this in rs6000_expand_vector_set, instead. It is fine to call rs6000_vector_set_var early in that, for example. > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md > index dd750210758..7e82690d12d 100644 > --- a/gcc/config/rs6000/vsx.md > +++ b/gcc/config/rs6000/vsx.md > @@ -5349,7 +5349,7 @@ (define_expand "xl_len_r" >rtx rtx_vtmp = gen_reg_rtx (V16QImode); >rtx tmp = gen_reg_rtx (DImode); > > - emit_insn (gen_altivec_lvsl_reg (shift_mask, operands[2])); > + emit_insn (gen_altivec_lvsl_reg_di2 (shift_mask, operands[2])); So this becomes emit_insn (gen_altivec_lvsl_reg (DImode, shift_mask, operands[2])); if you use parameterized names. Tests... You don't need lp64 anywhere, but then you probably need to disallow the 64-bit tests. That may be hard? When you resend, please split it into separate patches for separate things. Small patches are fine, no, *good* even! Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-18 Thread Segher Boessenkool
this just to make the xor thing work? i386 has a peephole to transform the mov to a xor for this (and the backend could just handle it in its mov patterns, maybe a peephole was easier for i386, no idea). Segher

Re: [PATCH] bpf: use xBPF signed div, mod insns when available

2020-09-18 Thread Segher Boessenkool
the libgcc routines are just too slow? Is it because (the generic) libgcc does not trap for MIN_INT / -1 ? Some other reason? (I'm just curious; I cannot figure it out :-) ) Segher

Re: [PATCH] CSE negated multiplications and divisions

2020-09-18 Thread Segher Boessenkool
at (it depends on the insn costs if that can succeed). On gimple it is always cheaper, of course. Segher

Re: [RS6000] rs6000_rtx_costs cost IOR

2020-09-21 Thread Segher Boessenkool
Hi! On Thu, Sep 17, 2020 at 01:12:19PM +0930, Alan Modra wrote: > On Wed, Sep 16, 2020 at 07:02:06PM -0500, Segher Boessenkool wrote: > > > + /* Test both regs even though the one in the mask is > > > + constrained to be equal to th

Re: [RS6000] rotate and mask constants

2020-09-21 Thread Segher Boessenkool
Please write (in comments) how many of each insn are expected, and possibly for what function? Also, bonus points if you make this work for 32 bit as well (it is almost required even). Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Segher Boessenkool
On Mon, Sep 21, 2020 at 09:13:58AM -0500, Qing Zhao wrote: > > On Sep 18, 2020, at 5:51 PM, Segher Boessenkool > > wrote: > >> B. Will provide a default definition in middle end to generate the > >> zeroing insn for selected registers. Then

Re: [PATCH] aarch64: Add extend-as-extract-with-shift pattern [PR96998]

2020-09-21 Thread Segher Boessenkool
"): the unpatched compiler ICEs! (At least three times, even). during RTL pass: reload /home/segher/src/kernel/kernel/cgroup/cgroup.c: In function 'rebind_subsystems': /home/segher/src/kernel/kernel/cgroup/cgroup.c:1777:1: internal compiler error: in lra_set_insn_recog_data,

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-21 Thread Segher Boessenkool
On Mon, Sep 21, 2020 at 03:58:25PM -0500, Qing Zhao wrote: > > On Sep 21, 2020, at 3:34 PM, Segher Boessenkool > > wrote: > > But you cannot *add* anything with this interface, and it cannot return > > different results depending on which return insn this is. It is no

Re: [PATCH] aarch64: Add extend-as-extract-with-shift pattern [PR96998]

2020-09-22 Thread Segher Boessenkool
Hi Alex, On Tue, Sep 22, 2020 at 08:40:07AM +0100, Alex Coplan wrote: > On 21/09/2020 18:35, Segher Boessenkool wrote: > Thanks for doing this testing. The results look good, then: no code size > changes and no build regressions. No *code* changes. I cannot test aarch64 like this

Re: [RFC] update COUNTs of BB in loop.

2020-09-22 Thread Segher Boessenkool
e, and it isn't (only functions can be predicates, by definition). "new_exit_prob" maybe? > + /* Update BB counts in loop body. > + COUNT = COUNT > + COUNT = COUNT * exit_edge_probility > + The COUNT=COUNT * old_exit_p / new_exit_p. */ Spaces around the "=" please? Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-22 Thread Segher Boessenkool
volatile references, any intervening > volatile insn might affect machine state. */ Confusingly stated, but essentially correct (it is possible we place the volatile at I2, and everything would still be sequenced correctly, but combine does not guarantee that). > is_volatile_p = volatile_refs_p (PATTERN (insn)) > ? volatile_refs_p > : volatile_insn_p; Too much subtlety in there, heh. Segher

Re: PR97107, libgo fails to build for power10

2020-09-22 Thread Segher Boessenkool
+ tail calls. Tail calls don't count against crtl->is_leaf. */ > + for (insn = get_topmost_sequence ()->first; insn; insn = NEXT_INSN > (insn)) > + if (CALL_P (insn)) > + break; > + if (!insn) > + return; > +} I don't think that get_topmost_sequence is correct. Other than that this is fine for trunk (and backports). Thanks! Segher

Re: [RS6000] Power10 libffi fixes

2020-09-22 Thread Segher Boessenkool
(ffi_call_LINUX64): Don't emit global > entry when __PCREL__. Call using @notoc. > (ffi_closure_LINUX64, ffi_go_closure_linux64): Likewise. This is okay for trunk, and for backports (possibly expedited, talk with Peter for what is wanted/needed for AT). Thanks! Segher

Re: PR97107, libgo fails to build for power10

2020-09-22 Thread Segher Boessenkool
On Wed, Sep 23, 2020 at 10:00:01AM +0930, Alan Modra wrote: > Hi Segher, > > On Tue, Sep 22, 2020 at 06:59:42PM -0500, Segher Boessenkool wrote: > > On Tue, Sep 22, 2020 at 09:55:12AM +0930, Alan Modra wrote: > > >if (!info->push_p) > > > -return; >

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-23 Thread Segher Boessenkool
On Wed, Sep 23, 2020 at 09:28:33AM -0500, Qing Zhao wrote: > > On Sep 22, 2020, at 5:37 PM, Segher Boessenkool > > wrote: > >> which is very similar to the unspec_volatile case we're talking about. > > > > So just like volatile memory accesses, they ha

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-23 Thread Segher Boessenkool
ot use unspecs unless you have to: they hinder optimisation much, and if that was your actual *goal*, you will often find that they do not prevent every optimisation you wanted them to. Segher

Re: [PATCH] rs6000: Add 'd' for doubleword variant of vector insert

2020-09-23 Thread Segher Boessenkool
d.texi: Add 'd' for doubleword variant of > vector insert instruction. Pushed to trunk as trivial and obvious. Thanks Paul! Segher

Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-24 Thread Segher Boessenkool
pc does _not_ have a VSX instruction > like xxinsertw r34, r8, r12 where r8 denotes > the vector element (or byte position or whatever). vins[bhwd][v][lr]x does this. Those are Power10 instructions. Segher

Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-24 Thread Segher Boessenkool
9,2,.LC0@toc@ha mtvsrwz 32,5 mtvsrwz 33,6 addi 9,9,.LC0@toc@l lxvw4x 45,0,9 xxspltw 32,32,1 xxspltw 33,33,1 vcmpequw 0,0,13 xxsel 34,34,33,32 blr Segher

Re: [PATCH 1/2, rs6000] int128 sign extention instructions (partial prereq)

2020-09-24 Thread Segher Boessenkool
TI 0 "gpc_reg_operand" "=v") > +(unspec:TI [(match_operand:TI 1 "gpc_reg_operand" "v")] > + UNSPEC_EXTENDDITI2))] > + "TARGET_POWER10" > + "vextsd2q %0,%1" > + [(set_attr "type" "exts")]) This should use something with sign_extend. Okay for trunk. Thanks! But the unspecs really need to go sooner rather than later (these are by far not the only ones, so :-( ). Segher
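For readers unfamiliar with the operation under review: going by the pattern name (extendditi2), the instruction widens a signed 64-bit value to 128 bits, which is an ordinary sign extension and so something RTL can describe with sign_extend rather than an unspec. A minimal C sketch of that arithmetic follows; the type ti_sketch and the function name are hypothetical illustrations, not GCC or rs6000 code.

```c
#include <stdint.h>

/* Hypothetical 128-bit value stored as two 64-bit halves.  */
typedef struct
{
  uint64_t lo;  /* low doubleword */
  uint64_t hi;  /* high doubleword */
} ti_sketch;

/* Sign-extend a 64-bit value to 128 bits: the low half is the value
   itself, the high half is filled with copies of the sign bit.  */
ti_sketch
sign_extend_di_to_ti (int64_t x)
{
  ti_sketch r;
  r.lo = (uint64_t) x;
  r.hi = x < 0 ? UINT64_MAX : 0;
  return r;
}
```

Because the result depends only on the input in this simple way, an optimiser that sees sign_extend can fold or combine it, whereas an unspec is opaque to it.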

Re: [PATCH 1/2] rs6000: Support _mm_insert_epi{8,32,64}

2020-09-24 Thread Segher Boessenkool
ong const __D, int const __N) > +{ > + __v2di result = (__v2di)__A; > + > + result [(__N & 0b1)] = __D; Especially single-digit numbers look really goofy (like 0x0, but even worse for binary somehow). Anyway, okay for trunk, with or without those things improved. Thanks! Segher
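As an illustration of the lane selection in the quoted code, here is a small sketch of the same idea in portable C, with the mask written as a plain 1 instead of 0b1 in line with the remark above. The struct and function names are hypothetical stand-ins and are not the names used in the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a vector of two 64-bit lanes.  */
typedef struct
{
  int64_t lane[2];
} v2di_sketch;

/* Return a copy of A with one lane replaced by D.  Only the low bit
   of N selects the lane, so any N maps to lane 0 or lane 1.  */
v2di_sketch
insert_epi64_sketch (v2di_sketch a, int64_t d, int n)
{
  v2di_sketch result = a;
  result.lane[n & 1] = d;
  return result;
}
```

Masking the index keeps the access in bounds for every value of N, which is the point of the `& 0b1` in the original.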

Re: [PATCH 2/2] rs6000: Add tests for _mm_insert_epi{8,32,64}

2020-09-24 Thread Segher Boessenkool
On Wed, Sep 23, 2020 at 05:12:45PM -0500, Paul A. Clarke wrote: > Copied from gcc.target/i386. Okay for trunk then. Thanks! (I peeked, it is just fine ;-) ) Segher

Re: [PATCH 2/2, rs6000] VSX load/store rightmost element operations

2020-09-24 Thread Segher Boessenkool
t and zero/sign extend). */ > + > +/* { dg-do compile {target power10_ok} } */ > +/* { dg-do run {target power10_hw} } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -O0" } */ Please comment here what that -O0 is for? So that we still know when we read it decades from now ;-) > +/* { dg-final { scan-assembler-times {\mlxvrwx\M} 2 } } */ > +/* { dg-final { scan-assembler-times {\mlwax\M} 0 } } */ Maybe all of {\mlwa} here? Segher
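The difference between {\mlwax\M} and {\mlwa} is that \m anchors at the start of a word and \M at the end of one, so dropping \M makes the pattern match every mnemonic that begins with "lwa". The following C sketch imitates that behaviour with a hand-written matcher; it is only an illustration (the function names are made up, and the testsuite itself uses Tcl regular expressions, not this code).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A "word" character here is a letter, digit, or underscore.  */
static int
is_word_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Count occurrences of NEEDLE in TEXT that begin at the start of a
   word (like \m).  If WHOLE_WORD is nonzero, the occurrence must also
   end at the end of a word (like a trailing \M).  */
int
count_word_matches (const char *text, const char *needle, int whole_word)
{
  size_t len = strlen (needle);
  int count = 0;

  if (len == 0)
    return 0;

  for (const char *p = strstr (text, needle); p; p = strstr (p + 1, needle))
    {
      int starts_word = (p == text) || !is_word_char (p[-1]);
      int ends_word = !is_word_char (p[len]);
      if (starts_word && (!whole_word || ends_word))
        count++;
    }
  return count;
}
```

On assembly text containing both an lwax and an lwa instruction, the whole-word search for "lwax" finds one match, while the prefix search for "lwa" finds both, so a scan-assembler-times check of 0 on {\mlwa} would rule out the whole family at once.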

Re: [PATCH, rs6000] correct an erroneous BTM value in the BU_P10_MISC define

2020-09-25 Thread Segher Boessenkool
ly removed. No, it cannot. This is used for pdepd/pextd/cntlzdm/cnttzdm/cfuged, all of which do need 64-bit registers to do anything sane. This should really have defined some new builtin class, and I thought we could just be tricky and take a massive shortcut. Bill has been hit by this already as well, sigh :-( Segher

Re: [PATCH 0/2] Rework adding Power10 IEEE 128-bit min, max, and conditional move

2020-09-25 Thread Segher Boessenkool
ecause the RTL would be undefined! > In ISA 3.1 (power10) the decision was made to only provide the "C" form on > maximum and minimum. ... for quad precision. Segher

Re: [PATCH] powerpc, libcpp: Fix gcc build with clang on power8 [PR97163]

2020-09-25 Thread Segher Boessenkool
ersion > for GCC >= 4.5. Okay for trunk (and whatever backports you want of course, if any). Thanks! Segher

Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-25 Thread Segher Boessenkool
On Fri, Sep 25, 2020 at 08:58:35AM +0200, Richard Biener wrote: > On Thu, Sep 24, 2020 at 9:38 PM Segher Boessenkool > wrote: > > after which I get (-march=znver2) > > > > setg: > > vmovd %edi, %xmm1 > > vmovd %esi, %xmm2 >
