[PATCH] simplify-rtx: Simplify sign_extend of lshiftrt to zero_extend (PR68330)

2015-11-14 Thread Segher Boessenkool
everything works as expected. Bootstrapped and tested on powerpc64-linux. Is this okay for trunk? Segher 2015-11-15 Segher Boessenkool PR rtl-optimization/68330 * simplify-rtx.c (simplify_unary_operation_1): Simplify SIGN_EXTEND of LSHIFTRT by a non-zero constant
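
The idea behind the changelog entry can be checked in plain C. This is only a sketch of the arithmetic, not the simplify-rtx.c code; the helper names and the 32-to-64-bit widths are assumptions chosen for illustration. After a logical right shift by a non-zero amount the top bit is zero, so sign extension and zero extension give the same result.

```c
#include <stddef.h>
#include <stdint.h>

/* Sign-extend the 32-bit result of a logical right shift to 64 bits.
   For n >= 1 the shifted value fits in 31 bits, so the conversion to
   int32_t preserves the value.  (Hypothetical helper.)  */
static uint64_t sext_of_lshiftrt(uint32_t x, unsigned n)
{
    return (uint64_t)(int64_t)(int32_t)(x >> n);
}

/* Zero-extend the same shift result.  (Hypothetical helper.)  */
static uint64_t zext_of_lshiftrt(uint32_t x, unsigned n)
{
    return (uint64_t)(x >> n);
}

/* Return 1 if both extensions agree for every shift count 1..31
   on a handful of sample values, 0 otherwise.  */
static int extensions_agree(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x7fffffffu, 0x80000000u, 0xdeadbeefu, 0xffffffffu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned n = 1; n < 32; n++)
            if (sext_of_lshiftrt(samples[i], n)
                != zext_of_lshiftrt(samples[i], n))
                return 0;
    return 1;
}
```

A shift count of zero is deliberately excluded: the top bit can then still be set, and the two extensions differ.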

Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-19 Thread Segher Boessenkool
act, with this patch I'm seeing improved codegen for aarch64 with some > widening multiplications > combined with constant operands and ending up in bitfield move/insert > instructions. > > Bootstrapped and tested on aarch64, arm, x86_64. > > Ok for trunk? I'll have a look what it does for code quality on other targets. Segher

Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-19 Thread Segher Boessenkool
n. > Thanks. In light of the above I think this patch happens to avoid > the issue highlighted above but we should fix the above code separately? Yes, if your patch creates better code we want it (and fix the regression), but you exposed a separate bug as well :-) Segher

Re: [PATCH v2] Add uaddv_optab, usubv4_optab

2015-11-22 Thread Segher Boessenkool
> NE in the second comparison, and then converted a CCCmode compare to a > CCZmode compare. It sees the *first* comparison, and its use, and has simplified that. As far as I see, anyway. (It will never look outside a basic block, combine isn't *that* scary!) 0xff + x < 0xff (everything as unsigned char) is the same as x != 0. Segher
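
The closing identity is small enough to verify exhaustively. A minimal sketch, with hypothetical helper names, that does the addition in 8-bit unsigned arithmetic and compares against x != 0 for all 256 inputs:

```c
#include <stdint.h>

/* Evaluate 0xff + x < 0xff with everything as unsigned char:
   the sum wraps modulo 256 before the comparison.  */
static int wrapped_compare(uint8_t x)
{
    uint8_t sum = (uint8_t)(0xff + x);
    return sum < 0xff;
}

/* Count the inputs for which the wrapped compare disagrees with x != 0.  */
static int count_uchar_mismatches(void)
{
    int bad = 0;
    for (unsigned x = 0; x < 256; x++)
        if (wrapped_compare((uint8_t)x) != (x != 0))
            bad++;
    return bad;
}
```

Only x == 0 leaves the sum at 0xff; every other x wraps to x - 1, which is below 0xff.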

Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-23 Thread Segher Boessenkool
recog_for_combine on them > individually, ignoring the clobber. Before I made this use is_parallel_of_n_reg_sets the code used to test if it is a parallel of two sets, and no clobbers allowed. So it would never allow a clobber of zero. But now it does. I'll fix this in is_parallel_of_n_reg_sets. Thanks for finding the problem! Segher

[PATCH] combine: Handle aborts in is_parallel_of_n_reg_sets (PR68381)

2015-11-23 Thread Segher Boessenkool
de should be done for the -O3 problem. Thank you for tracking down this nastiness! Segher 2015-11-24 Segher Boessenkool PR rtl-optimization/68381 * combine.c (is_parallel_of_n_reg_sets): Return false if the pattern is poisoned. --- gcc/combine.c | 3 ++- 1 file ch

[PATCH] rs6000: Fix for and_operand oversight (PR68332, PR67677)

2015-11-23 Thread Segher Boessenkool
Calling rs6000_is_valid_and_mask on a reg instead of on a const_int is not a good idea, as PR68332 and PR67677 as well as testing with --enable-checking=yes,rtl show. Fix this. Bootstrapped and tested on powerpc64-linux. Is this okay for trunk? Segher 2015-11-24 Segher Boessenkool

[PATCH] shrink-wrap: Fix thinko (PR68520)

2015-11-24 Thread Segher Boessenkool
in the PR, with an x86_64 cross-compiler; bootstrap+regcheck in progress (on powerpc64-linux). Is this okay for trunk if that succeeds? Segher 2015-11-24 Segher Boessenkool PR rtl-optimization/68520 * shrink-wrap.c (try_shrink_wrapping): Don't push a block to VEC if

Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-26 Thread Segher Boessenkool
hat I can see we don't lose any of the multiply-extend-accumulate > opportunities that we gained from the original combine patch. > > So can we take this patch in as well? See the patch mail... Segher

Re: [PATCH][combine] PR rtl-optimization/68381: Only restrict pure simplification in mult-extend subst case, allow other substitutions

2015-11-26 Thread Segher Boessenkool
n x unchanged if it is a no-op substitution. > > 2015-11-19 Kyrylo Tkachov > > PR rtl-optimization/68381 > * gcc.c-torture/execute/pr68381.c: New test. This is fine for trunk. Thanks. Segher

Re: basic asm and memory clobbers

2015-11-27 Thread Segher Boessenkool
asm("lolz"); b = 31; } } === does the asms in a loop, followed by the two stores. Making it asm("lolz" ::: "memory"); works as you seem to expect. It has behaved like this since at least 4.0 (the oldest compiler I have around currently). [ Yes I'm a broken record. ] Segher
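
For reference, the extended-asm form with the clobber looks like this. It is a sketch assuming a GCC-compatible compiler; the empty template stands in for "lolz" so that the code assembles, and the variable a, the value 42 and the function name are made up (only b = 31 is visible in the quoted fragment).

```c
static int a, b;

/* Stores separated by an asm that clobbers "memory".  The clobber tells
   the compiler the asm may read or write any memory, so it should not
   move the stores to a and b across it.  */
static void stores_with_memory_clobber(int n)
{
    for (int i = 0; i < n; i++) {
        a = 42;
        __asm__ volatile ("" ::: "memory");
        b = 31;
    }
}
```

Without the clobber list the compiler is free to treat the asm as not touching a or b, which matches the behaviour described in the mail.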

[PATCH] rs6000: Optimise SImode cstore on 64-bit

2015-12-01 Thread Segher Boessenkool
we previously generated in half of the cases (and the same cost in the other cases). After this, the only sequence left that is using the mfcr insn is the one doing signed comparison of Pmode registers. Testing in progress. Okay for trunk if that succeeds? Segher 2015-12-01 Segher

Re: [PATCH] rs6000: Optimise SImode cstore on 64-bit

2015-12-01 Thread Segher Boessenkool
On Wed, Dec 02, 2015 at 01:50:46PM +1030, Alan Modra wrote: > On Wed, Dec 02, 2015 at 01:55:17AM +0000, Segher Boessenkool wrote: > > + emit_insn (gen_subdi3 (tmp, op1, op2)); > > + emit_insn (gen_lshrdi3 (tmp2, tmp, GEN_INT (63))); > > + emit_insn (gen_anddi3 (t

Re: [PATCH] rs6000: Optimise SImode cstore on 64-bit

2015-12-02 Thread Segher Boessenkool
On Tue, Dec 01, 2015 at 09:39:30PM -0600, Segher Boessenkool wrote: > On Wed, Dec 02, 2015 at 01:50:46PM +1030, Alan Modra wrote: > > On Wed, Dec 02, 2015 at 01:55:17AM +0000, Segher Boessenkool wrote: > > > + emit_insn (gen_subdi3 (tmp, op1, op2)); > > > + emit_i

[PATCH] Fix shrink-wrap bug with anticipating into loops (PR67778, PR68634)

2015-12-02 Thread Segher Boessenkool
s is a bit hard / expensive to compute, so instead this patch allows a block PRE only if PRE does not post-dominate any of its successors (other than itself). Tested on the two testcases from the PRs. Also regression checked on powerpc64-linux. Is this okay for trunk? Segher 2015-12-

Re: [PATCH] Fix shrink-wrap bug with anticipating into loops (PR67778, PR68634)

2015-12-02 Thread Segher Boessenkool
On Wed, Dec 02, 2015 at 08:19:05PM +0100, Jakub Jelinek wrote: > On Wed, Dec 02, 2015 at 06:21:47PM +0000, Segher Boessenkool wrote: > > --- a/gcc/shrink-wrap.c > > +++ b/gcc/shrink-wrap.c > > @@ -752,7 +752,11 @@ try_shrink_wrapping (edge *entry_edge, bitmap_head > > *

Re: [PATCH] Fix shrink-wrap bug with anticipating into loops (PR67778, PR68634)

2015-12-03 Thread Segher Boessenkool
On Thu, Dec 03, 2015 at 12:35:51PM +0100, Bernd Schmidt wrote: > On 12/02/2015 07:21 PM, Segher Boessenkool wrote: > >After shrink-wrapping has found the "tightest fit" for where to place > >the prologue, it tries to move it earlier (so that frame saves are run > >earl

Re: [PATCH] Fix shrink-wrap bug with anticipating into loops (PR67778, PR68634)

2015-12-03 Thread Segher Boessenkool
On Thu, Dec 03, 2015 at 12:31:53PM +0100, Bernd Schmidt wrote: > On 12/02/2015 07:21 PM, Segher Boessenkool wrote: > >After shrink-wrapping has found the "tightest fit" for where to place > >the prologue, it tries to move it earlier (so that frame saves are run > >earl

[PATCH 2/2] rs6000: Clean up the cstore code a bit

2015-12-04 Thread Segher Boessenkool
"register_operand" was a bit confusing. Also some other minor cleanups. Tested on powerpc64-linux; okay for mainline? Segher 2015-12-04 Segher Boessenkool * (cstore4_unsigned): Use gpc_reg_operand instead of register_operand. Remove empty constraints. Use

[PATCH 1/2] rs6000: Implement cstore for signed Pmode register compares

2015-12-04 Thread Segher Boessenkool
powerpc64-linux; okay for mainline? Segher 2015-12-04 Segher Boessenkool * (cstore4_signed): New expander. (cstore4): Call it. FAIL instead of calling rs6000_emit_sCOND. --- gcc/config/rs6000/rs6000.md | 50 +++-- 1 file changed, 48

[PATCH v2] Fix shrink-wrapping bug (PR67778, PR68634)

2015-12-07 Thread Segher Boessenkool
se then we will never need to duplicate that block (it will always be executed with prologue). Tested on the two testcases from the PRs. Also regression checked on powerpc64-linux (actually, that is still running). Is this okay for trunk? Segher 2015-12-07 Segher Boessenkool

Re: [PATCH v2] Fix shrink-wrapping bug (PR67778, PR68634)

2015-12-07 Thread Segher Boessenkool
On Tue, Dec 08, 2015 at 12:59:54AM +0100, Jakub Jelinek wrote: > Isn't this missing vec.release () for the pro == entry case? Yes it is. Fixed in v3, thanks! Segher

[PATCH v3] Fix shrink-wrapping bug (PR67778, PR68634)

2015-12-07 Thread Segher Boessenkool
se then we will never need to duplicate that block (it will always be executed with prologue). Tested on the two testcases from the PRs. Also regression checked on powerpc64-linux (actually, that is still running). Is this okay for trunk? Segher 2015-12-07 Segher Boessenkool

[PATCH] rtlanal: Fix bits/bytes confusion in set_noop_p (PR68814)

2015-12-09 Thread Segher Boessenkool
The meaning of ZERO_EXTRACT depends on BITS_BIG_ENDIAN, not on BYTES_BIG_ENDIAN. This caused PR68814. Testing in progress on powerpc64le-linux; if it passes, is this okay for trunk? Segher 2015-12-09 Segher Boessenkool PR rtl-optimization/68814 * rtlanal.c (set_noop_p
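
The difference between the two conventions can be sketched in C. This is an illustration only, not code from rtlanal.c: zero_extract32 and its bits_big_endian parameter are hypothetical, and the numbering rule is assumed to be the documented one for ZERO_EXTRACT (with BITS_BIG_ENDIAN the position counts from the most significant bit, otherwise from the least significant bit).

```c
#include <stdint.h>

/* Extract SIZE bits starting at bit position POS from a 32-bit word.
   Requires 1 <= size and size + pos <= 32.
   bits_big_endian != 0: POS counts from the most significant bit.
   bits_big_endian == 0: POS counts from the least significant bit.  */
static uint32_t zero_extract32(uint32_t word, unsigned size, unsigned pos,
                               int bits_big_endian)
{
    unsigned shift = bits_big_endian ? 32 - size - pos : pos;
    uint32_t mask = size >= 32 ? 0xffffffffu : ((1u << size) - 1u);
    return (word >> shift) & mask;
}
```

With word 0x12345678, size 8 and pos 0, the little-endian bit numbering selects 0x78 and the big-endian bit numbering selects 0x12, so a test keyed on byte order instead of bit order can pick the wrong field.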

[PATCH v4] Fix shrink-wrapping bug (PR67778, PR68634)

2015-12-09 Thread Segher Boessenkool
est of the algorithm, potentially useful, and doesn't really cost more. Tested on the two testcases from the PRs. Also regression checked on powerpc64-linux. Is this okay for trunk? Segher 2015-12-09 Segher Boessenkool PR rtl-optimization/67778 PR rtl-optimization/6863

Re: [PATCH] rtlanal: Fix bits/bytes confusion in set_noop_p (PR68814)

2015-12-09 Thread Segher Boessenkool
3 [ c ])) ]) and if it has a parallel of two which doesn't match, it sees if it just needs one arm because the other is a noop set, and that ends up with deleting noop move 21 because of the wrong test, making the testcase fail. (powerpc64le has BITS_BIG_ENDIAN set, a bit unusual). Segher

Re: [PATCH] rtlanal: Fix bits/bytes confusion in set_noop_p (PR68814)

2015-12-10 Thread Segher Boessenkool
re copied. This really does not belong here I'd say (whatever creates this RTL should already simplify it), but I'm just fixing a bug ;-) Segher

Re: [PATCH][combine] Don't create LSHIFTRT of zero bits in change_zero_ext

2015-12-10 Thread Segher Boessenkool
in the 80's. I wouldn't use simplify_shift_const here, but simply simplify_gen_binary. The patch is okay with or without that change. Segher

[PATCH] bb-reorder: Remove a misfiring micro-optimization (PR96475)

2020-08-07 Thread Segher Boessenkool
deletes the other use of single_pred_p, which has the same problem in principle, I just never have triggered it so far. Tested on powerpc64-linux {-m32,-m64} like before. Is this okay for trunk? Segher 2020-08-07 Segher Boessenkool PR rtl-optimization/96475 * bb-reorder.c

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-07 Thread Segher Boessenkool
> some volatile registers as well, including to hold sensitive info. And > > of course everything is different if you use separate shrink-wrapping, > > but that work is done already when you get here (so it is too late?) > > Could you please explain this part a little bit more? For example, on PowerPC, to restore the return address we first have to load it into a general purpose register (and then move it to LR). Usually r0 is used, and r0 is call-clobbered (but not used for parameter passing or return value passing). The return address of course is very sensitive information (exposing any return address makes ASLR useless immediately). But this isn't in the scope of this protection, I see. Thanks for the explanations, much appreciated, Segher

Re: [PATCH] rs6000: MMA built-ins reject typedefs of MMA types

2020-08-07 Thread Segher Boessenkool
(fromtype) == PXImode) > + if (frommode == PXImode) The element_mode vs. TYPE_MODE here does not matter, because we never deal with vector modes here, and they will error elsewhere anyway? Okay for trunk if that is true (or with the necessary adjustments), and okay for 10 after letting it soak for a bit. Thanks! Segher

Re: RFC: make combine do as advertised (cheaper-than)?

2020-08-10 Thread Segher Boessenkool
just fixing perceived inconsistencies. All the other cases reduce the number of RTL insns, which is a clear cost metric (and originally the only one!) If you have RTL insns that correspond to more than just single machine insns, this can hurt you; otherwise, there are a few cases where you want to trick combine, but not many (doing that tends to backfire anyway, better keep it to a minimum). Segher

Re: [PATCH] emit-rtl.c: Allow splitting of RTX_FRAME_RELATED_P insns?

2020-08-10 Thread Segher Boessenkool
ous difference is that the splitters run many times, while peep2 runs only once, very late. If you make this only do stuff for reload_completed splitters, that difference is gone as well. > But could you split the code out of peep2_attempt into a subroutine > (probably still in recog.c) and reuse it in try_split? Yes please :-) Segher

Re: [RS6000] PR96493, powerpc local call linkage failure

2020-08-11 Thread Segher Boessenkool
is > testcase. The testcases said it wanted power8, so why did it fail? GCC shouldn't use anything that requires p10 support in binutils then, or what do I miss here? Segher > --- a/gcc/testsuite/gcc.target/powerpc/pr96493.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr96493.c >

Re: [PATCH 1/2] PowerPC: Rename min/max/cmove functions.

2020-08-11 Thread Segher Boessenkool
_COND if it is zero/false. > + > + Return 0 if the operation cannot be generated, and 1 if we could generate > + the instruction. */ ... and 1 if we *did* generate it. Change this to a bool, and rename to "maybe_emit" etc.? "generate_" is a horrible name... We already have "gen_" things, with different semantics, and this emits the insn, doesn't just generate it. Thanks, Segher

Re: [RS6000] PR96493, powerpc local call linkage failure

2020-08-11 Thread Segher Boessenkool
On Tue, Aug 11, 2020 at 12:36:28PM -0500, Peter Bergner wrote: > On 8/11/20 11:35 AM, Segher Boessenkool wrote: > > Hi Alan, > > > > On Tue, Aug 11, 2020 at 06:38:53PM +0930, Alan Modra wrote: > >> This fixes a fail when power10 isn't supported by binutils,

Re: [PATCH] emit-rtl.c: Allow splitting of RTX_FRAME_RELATED_P insns?

2020-08-11 Thread Segher Boessenkool
sns, anyway? I can think of many things that could go wrong, but all of those can go wrong with 1-1 splits as well. Maybe this all just works because not very many 1-1 splits are used in practice? So many questions, feel free to ignore all :-) Segher

Re: [PATCH] rs6000: Update powerpc test cases to use -mdejagnu-cpu=.

2020-08-11 Thread Segher Boessenkool
/testsuite/gcc.target/powerpc/ -type f) You might find more false positives than I did with that in other dirs, but if not, might be useful to keep in a script ;-) (And I know it doesn't find everything, yes :-/ ) So other than the SPE ones it is fine, but perhaps you missed some? Segher

Re: [PATCH 2/2] PowerPC: Add power10 IEEE 128-bit min/max/cmove.

2020-08-11 Thread Segher Boessenkool
*/ > +/* { dg-final { scan-assembler "xsmaxcqp" } } */ > +/* { dg-final { scan-assembler "xsmincqp" } } */ > +/* { dg-final { scan-assembler "xxsel" } } */ Use \m \M please. Don't ask for powerpc* in gcc.target/powerpc/ (it is implied there). Why does it need lp64? There should be some test for __float128, instead. All in all, this is very hard to review :-( Segher
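
For readers unfamiliar with the request: \m and \M are the Tcl regular-expression anchors for the start and the end of a word, so a mnemonic does not match inside a longer one. A hypothetical rewrite of the three quoted directives, not taken from the patch, could look like this (braces avoid having to double the backslashes):

```c
/* Hypothetical form of the quoted directives with word anchors.  */
/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */
/* { dg-final { scan-assembler {\mxsmincqp\M} } } */
/* { dg-final { scan-assembler {\mxxsel\M} } } */
```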

Re: [PATCH] rs6000: ICE when using an MMA type as a function param

2020-08-11 Thread Segher Boessenkool
uite/gcc.target/powerpc/pr96506.c > @@ -0,0 +1,61 @@ > +/* PR target/96506 */ > +/* { dg-do compile } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -O2 -w" } */ Do you need -w or could a less heavy hammer work as well? Okay for trunk (and backports after some simmering) with those things looked at. Thanks! Segher

Re: [PATCH] rs6000: ICE when using an MMA type as a function param

2020-08-11 Thread Segher Boessenkool
what you set here is unused. > > It's a static local variable, so how is it always zero and unused? Oh, trickiness with it being called a second time. Ouch! This needs a H U G E comment then... Or better, get rid of that? Segher

Re: [PATCH] Fix up flag_cunroll_grow_size handling in presence of optimize attr [PR96535]

2020-08-12 Thread Segher Boessenkool
documentation. The second half isn't relevant (you did that already :-) ) Segher

Re: [PATCH 2/2] PowerPC: Add power10 IEEE 128-bit min/max/cmove.

2020-08-12 Thread Segher Boessenkool
smaxcqp instructions. Oh wow, I hadn't noticed that before (or I had pushed it all the way back, like any other bad dream). Eww. So we are limited to only generating this insn with -ffast-math. Bah. (smin/smax on float cannot be used without fast math). Segher
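
One way to see why a plain minimum is not interchangeable with smin under IEEE semantics is the NaN case. A minimal sketch with a hypothetical helper: the usual ternary form returns a different operand depending on argument order when one input is NaN, whereas RTL smin leaves the result unspecified for NaN operands (stated here as an assumption about the RTL documentation, not something the mail spells out).

```c
#include <math.h>

/* The usual C idiom for a minimum.  Any comparison with NaN is false,
   so the second operand is returned whenever either input is NaN.  */
static double min_ternary(double a, double b)
{
    return a < b ? a : b;
}
```

min_ternary(NAN, 1.0) yields 1.0 while min_ternary(1.0, NAN) yields NaN, so operand order matters unless NaNs are assumed away, as -ffast-math does.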

Re: [PATCH] ipa-inline: Improve growth accumulation for recursive calls

2020-08-12 Thread Segher Boessenkool
nsider leaf functions and x86 xmm reg > ABI across calls). Even with large loop depth abstraction penalty removal can > make inlining worth it. For the testcase the recursiveness is what looks > special (recursion from a deeper loop nest level). Yes, the loop stuff / register pressure issues might help for the exchange result, but what about the other five above? Segher

Re: [PATCH] rs6000: ICE when using an MMA type as a function param

2020-08-12 Thread Segher Boessenkool
On Wed, Aug 12, 2020 at 02:24:33PM -0500, Peter Bergner wrote: > On 8/11/20 9:00 PM, Segher Boessenkool wrote: > > On Sun, Aug 09, 2020 at 10:03:35PM -0500, Peter Bergner wrote: > >> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -w" } */ > > > > Do you need

Re: [PATCH v2] rs6000: ICE when using an MMA type as a function param or return value [PR96506]

2020-08-12 Thread Segher Boessenkool
So I am worried about that; other than that, this is just fine (if you tune the comment a bit). Thanks, Segher

Re: [Patch 1/5] rs6000, Add 128-bit sign extension support

2020-08-13 Thread Segher Boessenkool
"-O2 -mdejagnu-cpu=power9 -save-temps" } */ Is -save-temps needed? Not for the scan-assembler at least. Okay for trunk with those details take care of. Thanks! Segher

Re: [Patch 1/5] rs6000, Add 128-bit sign extension support

2020-08-13 Thread Segher Boessenkool
mment to read something like "ISA 3.0 sign extend builtins". Sounds good. > My thought for calling it out is that they could be back ported to an > earlier GCC version since they use Power 9 instructions but it is > probably not worth the effort unless there is an explicit request for > them. Yeah. Thanks for the explanation! Segher

Re: [PATCH v2] rs6000: ICE when using an MMA type as a function param or return value [PR96506]

2020-08-13 Thread Segher Boessenkool
On Thu, Aug 13, 2020 at 01:58:31PM -0500, Peter Bergner wrote: > On 8/12/20 8:59 PM, Peter Bergner wrote: > > On 8/12/20 8:00 PM, Segher Boessenkool wrote: > >> On Wed, Aug 12, 2020 at 03:32:18PM -0500, Peter Bergner wrote: > > Ok, how about this comment then? >

Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.

2020-08-13 Thread Segher Boessenkool
return -1; > +return 0; > + default: > +gcc_unreachable (); > +} > +} Please don't use switch statements where a simple "if" would do. static int rs6000_simd_clone_usable (struct cgraph_node *node) { gcc_assert (node->simdclone->vecsize_mangle == 'b'); if (TARGET_VSX) return 0; return -1; } (If it looks too complicated, it probably is; don't remove whitelines to make it shorter, that only makes things worse). Anyway, please give some context in the proposed commit message: like pointing to *what* it implements, and where that is described, and where the binding is described. And no dead links please ;-) Segher

Re: [Patch 1/5] rs6000, Add 128-bit sign extension support

2020-08-13 Thread Segher Boessenkool
builtins defined in the ISA! The insns are just ISA 3.0 instructions. Segher

Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.

2020-08-14 Thread Segher Boessenkool
"__builtin_vsx_" NAME, /* NAME */ \ > /* Builtins for vector instructions added in ISA 3.1 (power10). */ > -BU_P10V_2 (VCLRLB, "vclrlb", CONST, vclrlb) > +BU_P10V_AV_2 (VCLRLB, "vclrlb", CONST, vclrlb) Maybe you should just keep "V" for insns using only the VRs (which you call V_AV now), and do "VS" for those working on all VSRs (which you call V_VSX here)? Segher

Re: [PATCH] options: Make --help= to emit values post-overrided

2020-08-14 Thread Segher Boessenkool
+ > + FOR_EACH_VEC_ELT (help_option_arguments, i, arg) > + print_help (opts, lang_mask, arg); > +} > } The patch looks just fine to me. But, not my call :-) Segher

Re: [PATCH] Add support for putting jump table into relocation read-only section

2020-08-14 Thread Segher Boessenkool
= select_jump_table_section (current_function_decl)); switch_to_section (section); (but it would be better if we didn't indent so deeply here). I think this should be split into a function just selecting the relro section (either directly, or from the rodata selection function), and then separately the jumptable section thing. There is a lot of stuff here that seems confused, and a lot of that already was there :-( Segher

Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.

2020-08-14 Thread Segher Boessenkool
Hi! On Fri, Aug 14, 2020 at 03:32:47PM -0700, Carl Love wrote: > On Fri, 2020-08-14 at 16:33 -0500, Segher Boessenkool wrote: > > So _vsx if it is for all VSRs, but _altivec for just the VRs? > > Yes, I worked off the rule that Altivec registers are designated with > 5-bits and

Re: [PATCH] middle-end: Fix PR middle-end/85811: Introduce tree_expr_maybe_nan_p et al.

2020-08-15 Thread Segher Boessenkool
|| tree_expr_maybe_signaling_nan_p (TREE_OPERAND (x, 1)); Can those ever return a SNaN? What does GCC do for FP_SNANS_ALWAYS_SIGNAL? All looks good to me except the SNaN stuff (which may be just me not understanding it). I find "maybe_" stuff very hard to read and understand btw, but there may be no escaping that :-/ Thanks, Segher

Re: [PATCH] middle-end: Simplify (sign_extend:HI (truncate:QI (ashiftrt:HI X 8)))

2020-08-15 Thread Segher Boessenkool
ks like it :-) You could say combine should be smarter about this, but this is a valid simplification in itself. So, okay for trunk. Thank you! Segher
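
The pattern in the subject can be checked exhaustively in C. The quoted text is truncated, so the target of the simplification is an assumption here: the sketch takes it to be the arithmetic shift itself, since the shifted value already fits in 8 signed bits. All helper names are hypothetical.

```c
#include <stdint.h>

/* Arithmetic shift right by 8 of a 16-bit value, written as a floor
   division so it does not rely on how >> treats negative numbers.  */
static int16_t asr8(int16_t x)
{
    int v = x;
    return (int16_t)(v >= 0 ? v / 256 : -((-v + 255) / 256));
}

/* Truncate to 8 bits, then sign-extend back to 16 bits.  */
static int16_t sext_of_trunc_qi(int16_t x)
{
    unsigned low = (uint16_t)x & 0xffu;
    return (int16_t)(low >= 0x80u ? (int)low - 256 : (int)low);
}

/* Count 16-bit inputs where truncate plus sign_extend changes the
   result of the arithmetic shift.  */
static int count_hi_mismatches(void)
{
    int bad = 0;
    for (int x = -32768; x <= 32767; x++) {
        int16_t shifted = asr8((int16_t)x);
        if (sext_of_trunc_qi(shifted) != shifted)
            bad++;
    }
    return bad;
}
```

The shift leaves a value in the range -128..127, which is why the truncate and sign_extend pair is a no-op in this sketch.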

Re: [PATCH] New test for PR rtl-optimization/96298.

2020-08-15 Thread Segher Boessenkool
ons "-O2 -fno-tree-forwprop" } */ > +/* { dg-additional-options "-mno-sse" { target x86_64-*-* i?86-*-* } } */ I would do a test with these target flags in gcc.target/i386/, and one without any in gcc.dg? It looks fine to me either way, I think you can use vector_size like

Re: [PATCH] Add support for putting jump table into relocation read-only section

2020-08-17 Thread Segher Boessenkool
ata section associated with > +@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_RODATA_SECTION > (tree @var{decl}, bool @var{section_reloc}) > +Return the readonly or reloc readonly data section associated with Should this take the 2-bit int "reloc" field like other functions, instead of this bool? Segher

Re: [PATCH] x86_64: PR rtl-optimization/92180: class_likely_spilled vs. cant_combine_insn.

2020-08-17 Thread Segher Boessenkool
won't propagate move insns from a hard non-fixed register to a pseudo into other insns, yeah. But that does not apply here? > So, following that > change, there is no point for fwprop to create instructions that > combine won't be able to process. Alternatively, perhaps fwprop

Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.

2020-08-17 Thread Segher Boessenkool
multaneously. > > x86_64 has already done something very similar so I thought I would adapt as > much of their > documentation and implementation as I could for PPC64. > > Let's start with that. Comments so far? That sounds like libmvec? I still don't know what this is. Segher

Re: [PATCH] rs6000: unaligned VSX in memcpy/memmove expansion

2020-08-17 Thread Segher Boessenkool
rlap && (num_reg+1) >= MAX_MOVE_REG > + && bytes > move_bytes) > + return 0; The "num_reg+1" isn't obvious, and the comment doesn't say (we usually write it as "num_reg + 1" fwiw, and the parens are superfluous). Looks good, thanks! Okay for trunk with or without such changes. Segher

Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.

2020-08-17 Thread Segher Boessenkool
On Mon, Aug 17, 2020 at 06:05:09PM -0400, David Edelsohn wrote: > The Power Vector ABI is available at > > https://github.com/power8-abi-doc/vector-function-abi > > It apparently did not attach correctly to the sourceware wiki or the > filename is different. Thanks! Segher

Re: [PATCH] middle-end: Fix PR middle-end/85811: Introduce tree_expr_maybe_nan_p et al.

2020-08-17 Thread Segher Boessenkool
On Mon, Aug 17, 2020 at 10:31:08PM +, Joseph Myers wrote: > On Sat, 15 Aug 2020, Segher Boessenkool wrote: > > On Sat, Aug 15, 2020 at 12:10:42PM +0100, Roger Sayle wrote: > > > I'll quote Joseph Myers (many thanks) who describes things clearly as: > > > > (a

Re: [PATCH] bb-reorder: Remove a misfiring micro-optimization (PR96475)

2020-08-17 Thread Segher Boessenkool
Ping (added some Cc:s). Thanks in advance, Segher On Fri, Aug 07, 2020 at 09:51:04PM +, Segher Boessenkool wrote: > When the compgotos pass copies the tail of blocks ending in an indirect > jump, there is a micro-optimization to not copy the last one, since the > original block

Re: [PATCH] rs6000: Rename instruction xvcvbf16sp to xvcvbf16spn

2020-08-18 Thread Segher Boessenkool
s instruction is in > an MMA conversion built-in function, so there is little to no compatibility > issue. > > I just pushed the patch that does the rename to binutils today. > > Ok for trunk and the GCC 10 branch after testing is clean? Yes, okay everywhere. Thanks! Segher

Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.

2020-08-18 Thread Segher Boessenkool
I already pointed out fixed, that explanation added, and working links to the documentation? Thanks in advance, Segher

Re: [EXTERNAL] Re: [Patch 1/5] rs6000, Add 128-bit sign extension support

2020-08-18 Thread Segher Boessenkool
On Thu, Aug 13, 2020 at 06:53:56PM -0500, will schmidt wrote: > On Thu, 2020-08-13 at 17:55 -0500, Segher Boessenkool wrote: > > > As long as there are no issues defining the builtins for 3.0 here. > > > AFAIK they are not documented in ISA 3.0. This is a happy accident >

Re: [PATCH] rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-08-18 Thread Segher Boessenkool
et/powerpc/ are run for powerpc already; just /* { dg-do run } */ please. > +/* { dg-options "-lm -fno-builtin" } */ Does that work everywhere? AIX, Darwin, other non-Linux systems, systems without OS, etc. > +#include That header does not exist everywhere. You can just declare the things you need (the FE_ constants?) Or perhaps you want to include /* { dg-require-effective-target fenv_exceptions } */ (probably easiest and nicest, I think). The rs6000 parts (including the testcases) look fine other than those things. Please fix and resend? For the generic parts someone else will have to review it (it looks fine to me, if that helps). Segher

Re: [PATCH 1/2] Add new RTX instruction class FILLER_INSN

2020-08-19 Thread Segher Boessenkool
g after leading filler insns, btw; you could get rid of the "fail" variable altogether, just call fatal_insn as soon as you see some unexpected RTX code. > + rtx_insn* i = make_insn_raw (pattern); rtx_insn *i = ... Segher

Re: [PATCH 1/2] Add new RTX instruction class FILLER_INSN

2020-08-19 Thread Segher Boessenkool
On Wed, Aug 19, 2020 at 11:13:40AM +0200, Andrea Corallo wrote: > Segher Boessenkool writes: > > So I wonder if this cannot be done with some kind of NOTE, instead? > > I was having a look into reworking this using an insn note as (IIUC) > suggested. The idea is appealing but

Re: [PATCH] bb-reorder: Remove a misfiring micro-optimization (PR96475)

2020-08-19 Thread Segher Boessenkool
On Wed, Aug 19, 2020 at 01:10:36PM +0100, Richard Sandiford wrote: > Segher Boessenkool writes: > > When the compgotos pass copies the tail of blocks ending in an indirect > > jump, there is a micro-optimization to not copy the last one, since the > > original block will

Re: [PATCH] rs6000: Enable more sibcalls when TOC is not preserved

2020-08-19 Thread Segher Boessenkool
restriction based on TOC preservation rules. However, a > caller that does preserve r2 cannot make a sibcall to a callee that > does not. This looks fine. _Is_ fine even, afaics :-) Okay for trunk. Thanks! Segher

Re: [EXTERNAL] Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.

2020-08-19 Thread Segher Boessenkool
On Wed, Aug 19, 2020 at 02:19:12PM -0500, Peter Bergner wrote: > On 8/14/20 7:42 PM, Segher Boessenkool wrote: > > I think your current code is fine; I hadn't considered Bill's upcoming > > rewrite. It is more important to make that go smoother than to fix some

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-19 Thread Segher Boessenkool
rom a security perspective, this isn't clear though. But that is a lot of extra research ;-) Segher

Re: [PATCH] bb-reorder: Remove a misfiring micro-optimization (PR96475)

2020-08-19 Thread Segher Boessenkool
ant one, then I think it'd > be better to make the max_size stuff conditional on !single_pred_p > rather than drop the test entirely. Okay, I'll look at that. Thanks, Segher

Re: [PATCH, rs6000] Add non-relative jump table support on Power Linux

2020-08-19 Thread Segher Boessenkool
, those are flush left like in outer braces in a C function are. > +(define_expand "nonrelative_tablejumpsi_nospec" > + [(set (match_dup 3) > +(match_operand:SI 0 "gpc_reg_operand" "r")) > + (parallel [(set (pc) > + (match_dup 3)) > + (use (label_ref (match_operand 1))) > + (clobber (match_operand 2))])] Please use a leading tab instead of every 8 leading spaces. I'll try to be quicker at reviewing iterations of this -- there is quite some way to go, without me slowing things down! Segher

Re: [Patch 2/5] rs6000, 128-bit multiply, divide, modulo, shift, compare

2020-08-19 Thread Segher Boessenkool
i" > > + [(set (match_operand:V1TI 0 "vlogical_operand") > > + (eq:V1TI (match_operand:V1TI 1 "vlogical_operand") > > +(match_operand:V1TI 2 "vlogical_operand")))] > > + "TARGET_TI_VECTOR_OPS" > > + "") All the rest of this is in rs6000.md, won't "eqvv1ti3" work already? > Since it's on all of the clauses, Maybe adjust the dg-require to > include ppc_native_128bit for the whole test, unless there is more to > follow. Good plan :-) Thanks for all the comments Will! Carl, could you fix things and resend please? It's a rather big patch, we'll have to do it in stages :-/ Segher

Re: [Patch 3/5] rs6000, Add TI to TD (128-bit DFP) and TD to TI support

2020-08-19 Thread Segher Boessenkool
ECTOR_OPS" > + "dcffixqq %0,%1" > + [(set_attr "type" "dfp")]) I wonder if this should just be TARGET_POWER10 now? That goes for the whole series of course. > + ;; carll I don't think we need this comment on trunk ;-) Looks fine otherwise. Okay for trunk, modulo whatever we do with TARGET_TI_VECTOR_OPS. Thanks! Segher

Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.

2020-08-20 Thread Segher Boessenkool
he message (which is what a mail subject is *for*!) Segher

Re: [RFC PATCH v1 1/1] PPC64: Implement POWER Architecture Vector Function ABI.

2020-08-20 Thread Segher Boessenkool
On Thu, Aug 20, 2020 at 07:31:50PM +, GT wrote: > I'm still trying to understand why we need attribute((target("vsx"))). You need Power8, even! "vsx" alone is not enough (that only guarantees Power7). Your minimum version ("b") requires Power8. Segher

Re: [Patch 4/5] rs6000, Test 128-bit shifts for just the int128 type.

2020-08-20 Thread Segher Boessenkool
/rs6000/vsx.md > +++ b/gcc/config/rs6000/vsx.md > @@ -367,7 +367,7 @@ > UNSPEC_INSERTR > UNSPEC_REPLACE_ELT > UNSPEC_REPLACE_UN > - UNSPEC_XXSWAPD_V1TI > + UNSPEC_XXSWAPD_VEC_I128 Why not just UNSPEC_XXSWAPD? And, why an unspec at all? Segher

Re: [Patch 5/5] rs6000, Conversions between 128-bit integer and floating point values.

2020-08-20 Thread Segher Boessenkool
Hi! On Tue, Aug 11, 2020 at 12:23:13PM -0700, Carl Love wrote: [ Perfect stuff, or I don't see anything anyway! ] Okay for trunk. Thank you! Segher

Re: [PATCH 0/3] Power10 PCREL_OPT support

2020-08-20 Thread Segher Boessenkool
hat is not what it does anyway? /confused > In order to do this, the pass that converts the load address and load/store > must occur late in the compilation cycle. That does not follow afaics. > In particular, the second scheduler > pass will duplicate and optimize some of the referenc

Re: [PATCH 1/3] Power10: Add PCREL_OPT load support

2020-08-20 Thread Segher Boessenkool
crel%u-8)\n\t", > +label_num, label_num); > + } > + return; Don't eat output modifiers please. We have only so few left, and we cannot recycle any more without pain. I don't see why we cannot just do this in the normal output (C) code of the one or few insns that want this? > ;; The ISA we implement. > -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10" > +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10,pcrel_opt" >(const_string "any")) No. Please read the heading. Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-24 Thread Segher Boessenkool
On Wed, Aug 19, 2020 at 06:27:45PM -0500, Qing Zhao wrote: > > On Aug 19, 2020, at 5:57 PM, Segher Boessenkool > > wrote: > > Numbers on how expensive this is (for what arch, in code size and in > > execution time) would be useful. If it is so expensive that no one w

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-24 Thread Segher Boessenkool
gisters for example; there are more cases, in general; only the backend code can know what is safe to do). Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-24 Thread Segher Boessenkool
Hi! On Mon, Aug 24, 2020 at 01:02:03PM -0500, Qing Zhao wrote: > > On Aug 24, 2020, at 12:49 PM, Segher Boessenkool > > wrote: > > On Wed, Aug 19, 2020 at 06:27:45PM -0500, Qing Zhao wrote: > >>> On Aug 19, 2020, at 5:57 PM, Segher Boessenkool > >>>

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-08-24 Thread Segher Boessenkool
On Mon, Aug 24, 2020 at 01:48:02PM -0500, Qing Zhao wrote: > > > > On Aug 24, 2020, at 12:59 PM, Segher Boessenkool > > wrote: > > > > [ Please quote correctly. I fixed this up a bit. ] > > > > On Mon, Aug 24, 2020 at 02:47:22PM +, Rodr

Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.

2020-08-26 Thread Segher Boessenkool
for the GCC 10 tree. Thanks. The patch looks fine. It now is impossible to write a correct changelog for a backport like this, so I won't review that part. Please make it clear that this is a partial backport in the commit message (and commit of what ofc). Okay for trunk with that. Thanks! Segher

Re: [PATCH] rs6000: Disable -fcaller-saves by default

2020-08-26 Thread Segher Boessenkool
And then only for all together. > I'd probably start by making sure the cost computation is sane though. Yeah. Segher

Re: [PATCH] rs6000, restrict bfloat convert intrinsic to Power 10. Fix BU_P10V macro definitions.

2020-08-27 Thread Segher Boessenkool
-case VSX_BUILTIN_XVCVBF16SPN: > +case P10V_BUILTIN_XVCVSPBF16: > +case P10V_BUILTIN_XVCVBF16SPN: Having "P10V" in the name here doesn't really help anything; in fact, it could just be BUILTIN_XVCBCVXBCXBNVC? Simpler names like that will improve readability as well. But that is all for the future :-) Segher

Re: [PATCH] rs6000: Support ELFv2 sibcall for indirect calls [PR96787]

2020-08-27 Thread Segher Boessenkool
gen_rtx_MEM (SImode, func_desc), tlsarg); > + call[0] = gen_rtx_CALL (VOIDmode, gen_rtx_MEM (SImode, func_addr), tlsarg); I don't understand this change? (Maybe I'm not looking well enough.) Looks fine otherwise, yes :-) Segher

Re: [PATCH] rs6000: Support ELFv2 sibcall for indirect calls [PR96787]

2020-08-27 Thread Segher Boessenkool
'm not looking well enough.) > > Prior to this change, func_desc comes in as a parameter and is never > changed.  Now it's either that case, or it's the new case, so this just > is the join point of that decision. So I'm not looking well enough, okay :-) Segher

Re: [PATCH v5] genemit.c (main): split insn-emit.c for compiling parallelly

2020-08-27 Thread Segher Boessenkool
ease split this (at least the source line, but probably the target line is too long as well). All that are details. This does look like it fixes the problems in the previous versions. Thanks! Segher

Re: [PATCH] rs6000: Support ELFv2 sibcall for indirect calls [PR96787]

2020-08-27 Thread Segher Boessenkool
On Fri, Aug 28, 2020 at 10:48:43AM +0930, Alan Modra wrote: > On Thu, Aug 27, 2020 at 03:17:45PM -0500, Segher Boessenkool wrote: > > On Thu, Aug 27, 2020 at 01:51:25PM -0500, Bill Schmidt wrote: > > It is not the copy that is unnecessary: the preventing it *here*, manually,

Re: [PATCH] rs6000: r12 copy cleanup

2020-08-28 Thread Segher Boessenkool
On Fri, Aug 28, 2020 at 11:48:41AM -0500, Bill Schmidt wrote: > Remove unnecessary tests before copying function address to r12, as > requested by Segher. > > Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no > regressions, committed as obvious. Thanks! Segher

Re: [PATCH] rs6000, vec_popcntd is improperly defined in altivec.h

2020-08-31 Thread Segher Boessenkool
e unsupported defines for the builtin > functions. Okay for trunk, and okay for any backports you think are good. Thanks! Segher

Re: [PATCH] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-08-31 Thread Segher Boessenkool
t; xxperm 32,32,45 > xxsel 34,34,33,32 For v a V4SI, x a SI, j some int, what do we generate for v[j&3] = x; ? This should be exactly the same as we generate for vec_insert(x, v, j); (the builtin does a modulo 4 automatically). Segher
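The equivalence asked about above can be sketched in portable C with the GCC/Clang generic vector extension. This is a minimal illustration under stated assumptions, not the rs6000 implementation: `v4si`, `insert_subscript`, `insert_builtin_like` and `check_equivalence` are names made up here, and `insert_builtin_like` only models the "modulo 4 automatically" behaviour described in the mail; it does not call the real `vec_insert` from `altivec.h`.

```c
#include <assert.h>

/* Generic vector extension: four 32-bit ints, i.e. a V4SI value.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* The plain C spelling from the mail: v[j&3] = x.  */
static v4si
insert_subscript (v4si v, int x, int j)
{
  v[j & 3] = x;
  return v;
}

/* Hypothetical model of vec_insert (x, v, j): the index is reduced
   modulo the number of elements (4) before the store.  */
static v4si
insert_builtin_like (int x, v4si v, int j)
{
  unsigned int idx = (unsigned int) j % 4u;
  v[idx] = x;
  return v;
}

/* Check that both spellings give the same vector for a range of
   indices, including negative and out-of-range ones.  */
static void
check_equivalence (void)
{
  v4si v = { 10, 20, 30, 40 };

  for (int j = -8; j < 8; j++)
    {
      v4si a = insert_subscript (v, 99, j);
      v4si b = insert_builtin_like (99, v, j);
      for (int i = 0; i < 4; i++)
	assert (a[i] == b[i]);
    }
}
```

Since the two functions are semantically identical, the point of the question is that the compiler should emit the same instruction sequence for both; comparing the assembly for the two spellings on a PowerPC target is one way to check that.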

Re: [PATCH] test/rs6000: Add Power9 and up as vect_len target

2020-08-31 Thread Segher Boessenkool
ange if that works. Thanks! Segher
