Re: Fold some equal to and not equal to patterns in match.pd

2015-07-23 Thread Segher Boessenkool
timisers. Or do we have an example where that does not work or is inconvenient? Segher

Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread Segher Boessenkool
least, it seems to be this patch, I haven't actually checked). Some representative backtraces: /home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr54713-1.c: In function 'f1': /home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr54713-1.c:13:1: internal compiler error

Re: [PATCH][RFC][match.pd] optimize (X & C) == N when C is power of 2

2015-07-24 Thread Segher Boessenkool
nt, well, do you really care? :-) (That second case might eventually fold to your original expression). Segher

Re: [PATCH] warn for unsafe calls to __builtin_return_address

2015-07-25 Thread Segher Boessenkool
> So, my suggestion would be to warn for any call with a nonzero value. The current documentation says that you should only use nonzero values for debug purposes. A warning would help yes, how many people read the manual after all :-) Segher

Re: [PATCH 1/4] convert ASM_OUTPUT_ASCII to a hook

2015-07-25 Thread Segher Boessenkool
I, as defined in config/elfos.h will not emit NUL > - characters - instead it treats them as sub-string separators. Since > - we want to emit NUL strings terminators into the object file we have to > use > - ASM_OUTPUT_SKIP. */ > + FIXME: target.asm_out.output_ascii, as defined in config/elfos.h will not targetm? Segher

Re: [PATCH 4/4] define ASM_OUTPUT_LABEL to the name of a function

2015-07-25 Thread Segher Boessenkool
(FILE *f, const char *label) > +{ > + assemble_name (f, label); > + if (TARGET_GAS) > +fputs (":\n", f); > + else > +fputc ('\n', (f)); You forgot to remove the extra parens here :-) If so many targets use default_assemble_label, can you make it an actual default instead of defining it everywhere? Segher

Re: [PATCH 4/4] define ASM_OUTPUT_LABEL to the name of a function

2015-07-25 Thread Segher Boessenkool
this for target maintainers to do? It > seems like an improvement over what we have, and I don't think they are > getting in anyones way. If you're not confident removing it yourself, then don't. It isn't getting in anyone's way, simply because there is so *much* clutter... Segher

Re: [PATCH][RFC][match.pd] optimize (X & C) == N when C is power of 2

2015-07-27 Thread Segher Boessenkool
On Mon, Jul 27, 2015 at 09:11:12AM +0100, Kyrill Tkachov wrote: > On 25/07/15 03:19, Segher Boessenkool wrote: > >On Fri, Jul 24, 2015 at 09:09:39AM +0100, Kyrill Tkachov wrote: > >>This transformation folds (X % C) == N into > >>X & ((1 << (size - 1)) | (C
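The quoted formula above is cut off, so the following is only a sketch of the kind of fold under discussion, not the patch itself. It assumes a 32-bit two's-complement int, C a positive power of two, and 0 < N < C; the mask "sign bit | (C - 1)" and all function names are assumptions made for illustration.

```c
#include <stdint.h>

/* Original form: C's truncating remainder compared against n.  */
static int mod_eq (int32_t x, int32_t c, int32_t n)
{
  return (x % c) == n;
}

/* Folded form (assumed): keep the sign bit and the low log2(c) bits.
   A negative x keeps its sign bit set, so it can never equal a
   positive n, which matches x % c being <= 0 for negative x.  */
static int mask_eq (int32_t x, int32_t c, int32_t n)
{
  uint32_t mask = UINT32_C (0x80000000) | (uint32_t) (c - 1);
  return ((uint32_t) x & mask) == (uint32_t) n;
}

/* Count disagreements between the two forms for x in [-limit, limit],
   c = 2^1 .. 2^max_log2, and every n with 0 < n < c.  */
static long count_mismatches (int32_t limit, int max_log2)
{
  long bad = 0;
  for (int k = 1; k <= max_log2; k++)
    {
      int32_t c = (int32_t) 1 << k;
      for (int32_t n = 1; n < c; n++)
        for (int32_t x = -limit; x <= limit; x++)
          if (mod_eq (x, c, n) != mask_eq (x, c, n))
            bad++;
    }
  return bad;
}
```

Under these assumptions the two forms agree because a non-negative x has x % c equal to x & (c - 1), while a negative x fails both tests. The case N == 0 is deliberately left out, since negative multiples of C would then differ.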

Re: [PATCH] warn for unsafe calls to __builtin_return_address

2015-07-28 Thread Segher Boessenkool
o argument can have unpredictable > +effects, including crashing the calling program. As a result, calls > +that are considered unsafe are diagnosed when the @option{-Wbuiltin-address} > +option is in effect. Such calls are typically only useful in debugging > +situations. I like the original "should only be used" better than that last line. Elsewhere there was a "non-zero" btw, but we should use "nonzero" according to the coding conventions. Huh. > +void* __attribute__ ((weak)) Not all targets support weak. Segher

Re: [PATCH], PowerPC IEEE 128-bit patch #4

2015-07-29 Thread Segher Boessenkool
16QImode); > +} > + else > +operands[1] = operands[2] = operands[3] = operands[0]; This won't work (in the pattern you write to op 3 before reading from op 2). Do you ever call this expander late, anyway? Segher

Re: [PATCH], PowerPC IEEE 128-bit patch #4

2015-07-29 Thread Segher Boessenkool
On Wed, Jul 29, 2015 at 06:38:45PM -0400, Michael Meissner wrote: > On Wed, Jul 29, 2015 at 04:59:23PM -0500, Segher Boessenkool wrote: > > On Wed, Jul 29, 2015 at 04:04:28PM -0400, Michael Meissner wrote: > > > +;; Return constant 0x8000 in an Altiv

Re: [PATCH 0/9] start converting POINTER_SIZE to a hook

2015-07-29 Thread Segher Boessenkool
er could > function correctly if this ever changed in the middle of a function. It is also very ugly and much harder to read: it is longer, with more useless interpunction, and there is nothing that makes clear it is a constant. Segher

[PATCH] rs6000: Fix PR67045

2015-07-29 Thread Segher Boessenkool
Paper bag time. Committing as obvious fix. Bootstrapped and regression checked on powerpc64-linux and powerpc64le-linux; also bootstrapped the latter with --enable-checking=release and -O3 (the PR67045 case). Will do an --enable-checking=yes,rtl as well. Segher 2015-07-29 Segher

[PATCH] Build *-match.o as early as possible

2015-08-03 Thread Segher Boessenkool
The two *-match.o files always finish building last, so if we start building them as soon as possible (instead of pretty late) the total build time will be less on a parallel build. Bootstrapped and tested on powerpc64-linux. Is this okay for trunk? Segher 2015-08-03 Segher

Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread Segher Boessenkool
ve a reason why you want the entry stack address instead of the frame address, but you didn't really explain I think? Or I missed it. Segher

Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread Segher Boessenkool
for a count of 0; but making it target-specific is certainly more conservative. You say i386 doesn't have that target macro defined currently. Yes I know; so change that? Or change the generic code, but that is much more testing. Segher

Re: [PATCH] Add __builtin_stack_top

2015-08-04 Thread Segher Boessenkool
>> > the > >> > frame address, but you didn't really explain I think? Or I missed it. What would a C program do with this, that it cannot do with the frame address, that would be useful and cannot be much better done in straight assembler? Do you actually want to expose the argument pointer, maybe? Segher

Re: [PATCH 4/4] define ASM_OUTPUT_LABEL to the name of a function

2015-08-05 Thread Segher Boessenkool
> > Heh, indeed. Maybe instead do > > .insert_from_file > > and do that only when we are using -pipe or so. That's ".incbin". Do we really want to go through the headaches of using extra files though? Is this really a bottleneck? Will it even help? Segher

[PATCH] shrink-wrap: Fix up partitions (PR67587)

2015-09-15 Thread Segher Boessenkool
With the new shrink-wrap algorithm, blocks reachable both with and without prologue are duplicated, and their incoming edges are then distributed accordingly. So we need to call fixup_partitions. Is this okay for trunk? Segher 2015-09-16 Segher Boessenkool PR bootstrap/67587

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Segher Boessenkool
ders call DONE. You can instead write those patterns as just USEs of their operands? Segher

Re: dejagnu version update?

2015-09-16 Thread Segher Boessenkool
> list): We also need it to avoid the "-jN check loses most results in the summary" problem; or it seems we need to avoid 1.5.2 only for that, if I read the log correctly. Segher

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-17 Thread Segher Boessenkool
er expand, so relying on combine to combine all that plus the following splat (as Richard suggests below) is not really going to work. If there also are targets where the _scal version is cheaper, maybe we should keep both, and have expand expand to whatever the target supports? Segher

[PATCH] shrink-wrap: Handle multiple predecessors of prologue

2015-09-18 Thread Segher Boessenkool
is the same test as was there before; I haven't measured what the impact of this suboptimality is. Bootstrapped and tested on powerpc64-linux. Is this okay for mainline? Segher 2015-09-18 Segher Boessenkool * function.c (thread_prologue_and_epilogue_insns): Delete or

Re: [PATCH] shrink-wrap: Handle multiple predecessors of prologue

2015-09-21 Thread Segher Boessenkool
ump over the new block (it was not necessarily a fallthrough edge, could be EDGE_COMPLEX). In most cases we could do it of course. I would prefer not to add special case code (that is not as well tested). Segher

Re: [PATCH] shrink-wrap: Handle multiple predecessors of prologue

2015-09-22 Thread Segher Boessenkool
On Tue, Sep 22, 2015 at 05:47:25PM +0200, Bernd Schmidt wrote: > On 09/21/2015 04:01 PM, Segher Boessenkool wrote: > >On Mon, Sep 21, 2015 at 01:56:28PM +0200, Bernd Schmidt wrote: > >>>+ basic_block new_bb = create_empty_bb (EXIT_BLOCK_PTR_FOR_FN > >>>(cfun)->

Re: [PATCH] shrink-wrap: Handle multiple predecessors of prologue

2015-09-22 Thread Segher Boessenkool
On Tue, Sep 22, 2015 at 11:11:47AM -0500, Segher Boessenkool wrote: > > So, given that your solution seems to work, the patch is ok. > > Thanks! Here is what I will commit after bootstrap+test (the superfluous > assert removed, and the comment for the code quoted above tweaked a

[PATCH 0/4] bb-reorder: Add the "simple" algorithm

2015-09-23 Thread Segher Boessenkool
step works. Bootstrapped and tested on powerpc64-linux. There are two new fails in guality testresults for -Os. Is this okay for mainline? Segher Segher Boessenkool (4): bb-reorder: Split out STC bb-reorder: Add the "simple" algorithm bb-reorder: Add -freorder-blocks-algorithm= and wir

[PATCH 1/4] bb-reorder: Split out STC

2015-09-23 Thread Segher Boessenkool
This first patch simply factors code a little bit. 2015-09-23 Segher Boessenkool * bb-reorder.c (reorder_basic_blocks_software_trace_cache): New function, factored out from ... (reorder_basic_blocks): ... here. --- gcc/bb-reorder.c | 29

[PATCH 2/4] bb-reorder: Add the "simple" algorithm

2015-09-23 Thread Segher Boessenkool
disregard loops (which we cannot allow) and the complications of block partitioning. 2015-09-23 Segher Boessenkool * bb-reorder.c (reorder_basic_blocks_software_trace_cache): Print a header to the dump file. (edge_order): New function. (reorder_basic_blocks_simple

[PATCH 4/4] bb-reorder: Documentation updates

2015-09-23 Thread Segher Boessenkool
This updates the documentation for the new option and new defaults. 2015-09-23 Segher Boessenkool * doc/invoke.texi (Optimization Options): Add -freorder-blocks-algorithm=. (Optimize Options) <-O>: Add -freorder-blocks. <-O2>: Remove -freorder-

[PATCH 3/4] bb-reorder: Add -freorder-blocks-algorithm= and wire it up

2015-09-23 Thread Segher Boessenkool
the changes are for -O1 (which now gets "simple" instead of nothing), -Os (which now gets "simple" instead of "stc", since STC results in much bigger code), and for targets that wish to never use STC (not in this patch though). 2015-09-23 Segher Boessenkool

[PATCH] rs6000: Fix -mdebug=stack code for spe_gp_offset

2015-09-23 Thread Segher Boessenkool
This seems like an obvious typo. I cannot test SPE, but I noticed this offset shows up in the debug output for normal configurations. The condition is inverted, compared to all similar ones. Is this okay for trunk? Segher 2015-09-23 Segher Boessenkool * config/rs6000/rs6000.c

Re: [PATCH 0/4] bb-reorder: Add the "simple" algorithm

2015-09-24 Thread Segher Boessenkool
On Thu, Sep 24, 2015 at 11:56:22AM +0200, Bernd Schmidt wrote: > On 09/24/2015 12:06 AM, Segher Boessenkool wrote: > >The current basic block reordering always uses the "software trace cache" > >algorithm. That has a few problems: > > > >1) It increases code

Re: [PATCH 2/4] bb-reorder: Add the "simple" algorithm

2015-09-24 Thread Segher Boessenkool
On Thu, Sep 24, 2015 at 12:32:59PM +0200, Bernd Schmidt wrote: > On 09/24/2015 12:06 AM, Segher Boessenkool wrote: > >This is the meat of this series: a new algorithm to do basic block > >reordering. It uses the simple greedy approach to maximum weighted > >matching, wher

Re: [PATCH 3/4] bb-reorder: Add -freorder-blocks-algorithm= and wire it up

2015-09-24 Thread Segher Boessenkool
On Thu, Sep 24, 2015 at 12:28:08PM +0200, Bernd Schmidt wrote: > On 09/24/2015 12:06 AM, Segher Boessenkool wrote: > >This adds an -freorder-blocks-algorithm=[simple|stc] flag, with "simple" > >as default. For -O2 and up (except -Os) it is switched to "stc" inst

Re: [PATCH 2/4] bb-reorder: Add the "simple" algorithm

2015-09-24 Thread Segher Boessenkool
On Thu, Sep 24, 2015 at 08:39:30AM -0500, Segher Boessenkool wrote: > > Any thoughts on this vs qsort? Do you need a stable sort? > > We always need stable sorts in GCC; things are not reproducible across > targets with qsort (not every qsort is the same). s/targets/hosts/
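One common way to get host-independent results out of qsort is to make the comparison a total order, so that no two elements ever compare equal. The following is a minimal sketch of that idea, not code from the patch; the struct item type and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: a sort key plus the element's original position.  */
struct item
{
  int key;
  size_t index;
};

/* Compare by key, then by original position.  Distinct elements never
   compare equal, so any conforming qsort produces the same order.  */
static int item_cmp (const void *pa, const void *pb)
{
  const struct item *a = (const struct item *) pa;
  const struct item *b = (const struct item *) pb;
  if (a->key != b->key)
    return a->key < b->key ? -1 : 1;
  if (a->index != b->index)
    return a->index < b->index ? -1 : 1;
  return 0;
}

/* Record the original positions, then sort.  The result is the order a
   stable sort on key alone would give.  */
static void sort_items_reproducibly (struct item *v, size_t n)
{
  for (size_t i = 0; i < n; i++)
    v[i].index = i;
  qsort (v, n, sizeof v[0], item_cmp);
}
```

The tie-break costs one extra field per element, but it removes the dependence on how a particular host's qsort orders equal keys.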

Re: [PATCH 3/4] bb-reorder: Add -freorder-blocks-algorithm= and wire it up

2015-09-24 Thread Segher Boessenkool
On Thu, Sep 24, 2015 at 08:12:55AM -0700, Andi Kleen wrote: > Segher Boessenkool writes: > > > > In effect, the changes are for -O1 (which now gets "simple" instead of > > nothing), -Os (which now gets "simple" instead of "stc", since STC res

Re: [PATCH 2/4] bb-reorder: Add the "simple" algorithm

2015-09-24 Thread Segher Boessenkool
On Thu, Sep 24, 2015 at 06:03:33PM +0200, Steven Bosscher wrote: > On Thu, Sep 24, 2015 at 12:06 AM, Segher Boessenkool wrote: > > + /* First, collect all edges that can be optimized by reordering blocks: > > + simple jumps and conditional jumps, as well as the function

Re: [PATCH 2/4 v2] bb-reorder: Add the "simple" algorithm

2015-09-25 Thread Segher Boessenkool
v2 changes: - Add a file header comment; - Use "for" loop initial declarations; - Handle asm goto. Testing this on x86_64-linux; okay if it succeeds? Segher 2015-09-99 Segher Boessenkool * bb-reorder.c: Add intro comment. (reorder_basic_blocks_software_trace_cac

Re: [PATCH 2/4 v2] bb-reorder: Add the "simple" algorithm

2015-09-25 Thread Segher Boessenkool
On Fri, Sep 25, 2015 at 10:59:37AM -0500, Peter Bergner wrote: > On Fri, 2015-09-25 at 09:16 -0500, Segher Boessenkool wrote: > > (reorder_basic_blocks): Choose between the STC and the simple > > algorithms (always choose the former). > [snip] > @@ -2274,7 +2444,10 @@

Re: using scratchpads to enhance RTL-level if-conversion: the new patch is almost ready to be prepared for merging to trunk, but not 100% ready yet

2015-09-25 Thread Segher Boessenkool
track down, but it's usually worth the time spent in the end. Compile with -dap (directly with cc1 or with -S), find some difference in the generated asm, see what the RTL insn number for that was (that's what -dp is for), find where the difference came from (from the dumpfiles, that's -da; you probably already know what pass is the culprit ;-) ) Segher

Re: [Patch,optimization]: Optimized changes in the estimate register pressure cost.

2015-09-27 Thread Segher Boessenkool
> > ratio with the optimization vs ratio without optimization for FP benchmarks > ( 4668.743 vs 4778.741) Did you swap these? You're saying FP got significantly worse? Segher

Re: [PATCH] Convert SPARC to LRA

2015-09-28 Thread Segher Boessenkool
ong time. We can at least change the default to LRA, so new ports get it unless they like to hurt themselves. I don't think it makes sense to keep reload around *just* for the ports that are in "maintenance mode": by the time we are down to *just* those ports, it makes more sense to relabel them as "unmaintained". Segher

[PATCH] Fix gcc.dg/asm-4.c

2015-09-28 Thread Segher Boessenkool
Double-quoted words in Tcl have substitutions performed on them, including backslash substitutions. That isn't terribly nice for regular expressions, so use braced words instead. Tested on powerpc64-linux. Okay for mainline? Segher 2015-09-28 Segher Boessenkool gcc/test

Re: [PATCH] Convert SPARC to LRA

2015-09-30 Thread Segher Boessenkool
ort a significant priority. I'd be > more likely to push for deprecating cc0 targets first. It looks like most cc0 targets would be pretty easy to convert, if anyone can do testing anyway ;-) Segher

[PATCH] rs6000: Add "cannot_copy" attribute, use it (PR67788, PR67789)

2015-09-30 Thread Segher Boessenkool
-m64/-mlra); new testcase fails before, works after (on 32-bit). Is this okay for mainline? Segher 2015-09-30 Segher Boessenkool PR target/67788 PR target/67789 * config/rs6000/rs6000.c (TARGET_CANNOT_COPY_INSN_P): New. (rs6000_cannot_copy_insn_p): New function

Re: [PATCH] rs6000: Add "cannot_copy" attribute, use it (PR67788, PR67789)

2015-10-01 Thread Segher Boessenkool
On Thu, Oct 01, 2015 at 12:14:44PM +0200, Richard Biener wrote: > On Thu, Oct 1, 2015 at 8:08 AM, Segher Boessenkool > wrote: > > After the shrink-wrapping patches the prologue will often be pushed > > "deeper" into the function, which in turn means the software tr

Re: [PATCH] rs6000: Add "cannot_copy" attribute, use it (PR67788, PR67789)

2015-10-01 Thread Segher Boessenkool
't save the call to the target hook, and that is a big part of the overhead already. Segher

Re: [PATCH] rs6000: Add "cannot_copy" attribute, use it (PR67788, PR67789)

2015-10-01 Thread Segher Boessenkool
On Fri, Oct 02, 2015 at 10:24:07AM +0930, Alan Modra wrote: > On Thu, Oct 01, 2015 at 12:18:08PM -0500, Segher Boessenkool wrote: > > On Thu, Oct 01, 2015 at 12:14:44PM +0200, Richard Biener wrote: > > > So even if not "easy", can you try? > > > > I did,

Re: Fold acc_on_device

2015-10-05 Thread Segher Boessenkool
7861. > +case BUILT_IN_ACC_ON_DEVICE: > + return gimple_fold_builtin_acc_on_device (gsi, > + gimple_call_arg (stmt, 0)); > default:; > } Segher

[PATCH] bb-reorder: Improve the simple algorithm for -Os (PR67864)

2015-10-08 Thread Segher Boessenkool
4 stc x86_64 1905733 1905733 1905733 1905733 1905733 1905733 - xtensa Is this okay for trunk? Segher 2015-10-08 Segher Boessenkool PR rtl-optimization/67864 * gcc/bb-reorder (reorder_basic_blocks_simple): Prefer existing fallthrough edges for condit

Re: [PATCH] bb-reorder: Improve the simple algorithm for -Os (PR67864)

2015-10-09 Thread Segher Boessenkool
icantly improve on unconditional branches. I'm sure there are better heuristics possible but I don't know them (and actually *solving* the problem isn't even polynomial of course). Segher

Re: [PATCH] bb-reorder: Improve the simple algorithm for -Os (PR67864)

2015-10-09 Thread Segher Boessenkool
On Fri, Oct 09, 2015 at 12:35:46PM +0200, Bernd Schmidt wrote: > On 10/08/2015 06:57 PM, Segher Boessenkool wrote: > >As the PR points out, the "simple" reorder algorithm makes bigger code > >than the STC algorithm did, for -Os, for x86. I now tested it for many > >

[PATCH] i386: Use the STC bb-reorder algorithm at -Os (PR67864)

2015-10-16 Thread Segher Boessenkool
For x86, STC still gives better results for optimise-for-size than "simple" does. So use STC at -Os as well. Is this okay for trunk? Segher 2015-10-16 Segher Boessenkool PR rtl-optimization/67864 * common/config/i386/i386-common.c (ix86_option_optimiza

[PATCH] mn10300: Use the STC bb-reorder algorithm at -Os

2015-10-16 Thread Segher Boessenkool
For mn10300, STC still gives better results for optimise-for-size than "simple" does. So use STC at -Os as well. Is this okay for trunk? Segher 2015-10-16 Segher Boessenkool * common/config/mn10300/mn10300-common.c (mn10300_option_optimization_table) :

Re: [PATCH] i386: Use the STC bb-reorder algorithm at -Os (PR67864)

2015-10-16 Thread Segher Boessenkool
On Fri, Oct 16, 2015 at 02:55:54PM +0200, Bernd Schmidt wrote: > On 10/16/2015 02:53 PM, Segher Boessenkool wrote: > >For x86, STC still gives better results for optimise-for-size than > >"simple" does. So use STC at -Os as well. > > For how many targets is this tr

Re: [PATCH, rs6000][v3] powerpc musl libc support

2015-10-16 Thread Segher Boessenkool
; G "}}" > +#elif DEFAULT_LIBC == LIBC_MUSL > +#define CHOOSE_DYNAMIC_LINKER(G, U, M) \ > + "%{mglibc:" G ";:%{muclibc:" U ";:" M "}}" > #else > #error "Unsupported DEFAULT_LIBC" > #endif This doesn't really scale, I wonder if some more elegant non-quadratic way is possible? Not that I expect terribly many other libcs to show up in the near future ;-) Segher

Re: [PATCH][simplify-rtx][2/2] Use constants from pool when simplifying binops

2015-10-19 Thread Segher Boessenkool
try to use the const_double directly > but rather its constant pool reference. What happens if the constant pool reference is actually the better code, do we still generate that? Segher

Re: [Patch, MIPS] Patch to fix MIPS optimization bug in combine

2015-10-21 Thread Segher Boessenkool
c))) x.c:21 472 {*branch_equalitysi} > (expr_list:REG_DEAD (reg:DI 207) > (int_list:REG_BR_PROB 8010 (nil))) > -> 35) Why does *branch_equalitysi allow these subregs if it cannot handle them? Not just combine can generate code like this (in theory at least). Segher

Re: [PATCH] Fix PR rtl-optimization/67736 in combine.c

2015-10-22 Thread Segher Boessenkool
e-optimizations is default at -O2. > + > +#include Does every target have that header? Shouldn't it be ? > --- a/gcc/testsuite/gcc.dg/torture/pr67736.c > +++ b/gcc/testsuite/gcc.dg/torture/pr67736.c > @@ -0,0 +1,32 @@ > +/* { dg-do run } */ > + > +#include > + And here you don't need inttypes at all? Confused. Segher

Re: [PATCH], PowerPC IEEE 128-bit patch #7 (revised #2)

2015-10-23 Thread Segher Boessenkool
e additional patch in this > selection, to restrict __float128 and IBM extended double from being combined > in an expression. #04, #05 are missing the patch. Segher

[PATCH] rs6000: p8vector-builtin-8.c test requires int128

2015-10-25 Thread Segher Boessenkool
For 32-bit targets p8vector_ok does not imply we have int128. Tested with -m32,-m32/-mpowerpc64,-m64; okay for trunk? Segher 2015-10-26 Segher Boessenkool gcc/testsuite/ * gcc.target/powerpc/p8vector-builtin-8.c: Add "target int128". --- gcc/testsuite/gcc.target/powerp

[PATCH] rs6000: Fix tests for xvmadd and xvnmsub

2015-10-25 Thread Segher Boessenkool
The patterns involved can create vmadd resp. vnmsub instructions instead. This patch changes the testcases to allow those. Tested with -m32,-m32/-mpowerpc64,-m64; okay for trunk? Segher 2015-10-26 Segher Boessenkool gcc/testsuite/ * gcc.target/powerpc/vsx-builtin-2.c: Allow vmadd

Re: [PATCH], PowerPC IEEE 128-bit patch #9 (enable __float128 by default on VSX systems)

2015-10-29 Thread Segher Boessenkool
" } { "" } } */ Some of this is redundant. Should this really be linux-only? The second line doesn't do anything then. { "*" } { "" } isn't needed either. In general, please only add dg-skip-if if you know it is needed. There probably should be an effective-target for float128. Segher

[PATCH] lra: Don't remove the scratch in (mem:BLK (scratch))

2015-10-29 Thread Segher Boessenkool
solve the rs6000 bootstrap problems with LRA enabled by default (it ICEd building libitm), with no apparent ill effects. It seems other targets can do without this. rs6000 uses this construct inside of a PARALLEL, maybe that is the difference? Any hints appreciated! Segher 2015-10-29 Segher B

[PATCH] rs6000: Save the PIC reg when needed

2015-10-29 Thread Segher Boessenkool
and the previous LRA patch there now are no differences between -mlra and -mno-lra testsuite runs (on big-endian power7, gcc110). Is this okay for trunk? Segher 2015-10-29 Segher Boessenkool * config/rs6000/rs6000.c (rs6000_reg_live_or_pic_offset_p): Move this function

[PATCH 1/2] rs6000: Another PIC LRA fix

2015-10-31 Thread Segher Boessenkool
variants {-m32/-mno-lra,-m32/-mlra,-m32/-mpowerpc64,-m64/-mno-lra,-m64/-mlra}; and on powerpc64le-linux, everything default. Both also with bootstrapping with LRA defaulted on. Okay for trunk? Segher 2015-10-31 Segher Boessenkool * config/rs6000/rs6000.c (rs6000_reg_live_or_pic_

[PATCH 2/2] rs6000: Rewrite rs6000_reg_live_or_pic_offset_p

2015-10-31 Thread Segher Boessenkool
This function is quite a puzzle; untangle it. No functional change. Tested etc.; okay for trunk? Segher 2015-10-31 Segher Boessenkool * config/rs6000/rs6000.c (rs6000_reg_live_or_pic_offset_p): Rewrite. --- gcc/config/rs6000/rs6000.c | 35 --- 1

Re: [PATCH 13/16] Add test-rtl.c to unittests

2015-10-31 Thread Segher Boessenkool
ETs (one of PC). Segher

Re: [PATCH 3/4] bb-reorder: Add -freorder-blocks-algorithm= and wire it up

2015-11-02 Thread Segher Boessenkool
On Mon, Nov 02, 2015 at 02:38:33PM +, Alan Lawrence wrote: > On 23/09/15 23:06, Segher Boessenkool wrote: > >This adds an -freorder-blocks-algorithm=[simple|stc] flag, with "simple" > >as default. For -O2 and up (except -Os) it is switched to "stc" instead.

Re: [PATCH], Add power9 support to GCC, patch #1

2015-11-04 Thread Segher Boessenkool
ar > +instructions that were added in version 2.07 of the PowerPC ISA. Also > +enable the use of built-in functions that allow more direct access to > +the vector instructions. 3.0 here as well? Segher

Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults

2015-11-04 Thread Segher Boessenkool
> The testcase in the patch is the most minimal one I could get that > demonstrates the issue I'm trying to solve. > > Does this approach look ok? In broad lines it does. Could you try this patch instead? Not tested etc. (other than building an aarch64 cross and your test case

Re: [PATCH] clarify documentation of -Q --help=optimizers

2015-11-05 Thread Segher Boessenkool
meaning of "optimization"; anything else is arguably a bug. But people have many different understandings of what a "compiler optimization" is, all the way to "anything the compiler does". Segher

Re: [PATCH] clarify documentation of -Q --help=optimizers

2015-11-05 Thread Segher Boessenkool
On Thu, Nov 05, 2015 at 02:04:47PM -0700, Martin Sebor wrote: > On 11/05/2015 10:09 AM, Segher Boessenkool wrote: > >On Thu, Nov 05, 2015 at 08:58:25AM -0700, Martin Sebor wrote: > >>I don't think that reiterating in a condensed form what the manual > >>doesn'

Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults

2015-11-05 Thread Segher Boessenkool
ortex-a53, or you just get a divide insn). > Is there a way that subst can signal some kind of "failed to substitute" > result? Yep, see new patch. The "from == to" condition is for when subst is called just to simplify some code (normally with pc_rtx, pc_rtx). > If n

Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults

2015-11-06 Thread Segher Boessenkool
g else but registers (immediates, memory, ...). This probably is a reasonable tradeoff for all targets, even those (if any) that have such insns. > >I'll let you put it through its paces on your setup :) > I'll let Segher give the final yes/no on this, but it generally lo

Re: [PATCH] i386: Use the STC bb-reorder algorithm at -Os (PR67864)

2015-11-06 Thread Segher Boessenkool
Adding x86 maintainer, ping? On Fri, Oct 16, 2015 at 05:53:41AM -0700, Segher Boessenkool wrote: > For x86, STC still gives better results for optimise-for-size than > "simple" does. So use STC at -Os as well. > > Is this okay for trunk? > > > Segher >

Re: [PATCH] c/67882 - improve -Warray-bounds for invalid offsetof

2015-11-07 Thread Segher Boessenkool
rator is applied to this pointer explicitly, or implicitly as a result of subscripting, the result is the referenced (n - 1)-dimensional array, which itself is converted into a pointer if used as other than an lvalue. It follows from this that arrays are stored in row-major order (last subscript varies fastest). As far as I see, a5_7[5] here is never treated as an array, just as a pointer, and &a5_7[5][0] is valid. Segher
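The row-major layout described in the quoted passage can be checked directly. This is a small sketch assuming a 5 by 7 array of int, chosen to mirror the a5_7 name in the message; the function name and the enum constants are hypothetical.

```c
#include <stddef.h>

enum { ROWS = 5, COLS = 7 };

/* Byte offset of a[i][j] from the start of the array, for a valid
   element (i < ROWS and j < COLS).  With row-major storage this is
   (i * COLS + j) * sizeof (int): the last subscript varies fastest.  */
static ptrdiff_t elem_offset (size_t i, size_t j)
{
  static int a[ROWS][COLS];
  return (char *) &a[i][j] - (char *) &a[0][0];
}
```

For example, moving one step in the last subscript advances by one int, while moving one step in the first subscript advances by a whole row of COLS ints.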

Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults

2015-11-08 Thread Segher Boessenkool
On Fri, Nov 06, 2015 at 04:00:08PM -0600, Segher Boessenkool wrote: > This patch stops combine from generating widening muls of anything else > but registers (immediates, memory, ...). This probably is a reasonable > tradeoff for all targets, even those (if any) that have such insns. >

[PATCH] Fix bb-reorder problem with degenerate cond_jump (PR68182)

2015-11-08 Thread Segher Boessenkool
. Is this okay for trunk? Segher 2015-11-09 Segher Boessenkool * gcc/bb-reorder.c (reorder_basic_blocks_simple): Treat a conditional branch with only one successor just like unconditional branches. --- gcc/bb-reorder.c | 6 +++--- 1 file changed, 3 insertions(+), 3

Re: [PATCH] Fix bb-reorder problem with degenerate cond_jump (PR68182)

2015-11-08 Thread Segher Boessenkool
On Sun, Nov 08, 2015 at 08:21:47PM -0700, Jeff Law wrote: > On 11/08/2015 08:09 PM, Segher Boessenkool wrote: > >The code mistakenly thinks any cond_jump has two successors. This is > >not true if both destinations are the same, as can happen with weird > >
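The degenerate case can be illustrated with a toy model. This is only a sketch: struct block, enum jump_kind and both functions are hypothetical simplifications and are not GCC's basic_block or edge types.

```c
#include <stddef.h>

/* Hypothetical, simplified block description; not GCC's data types.  */
enum jump_kind { JUMP_NONE, JUMP_UNCOND, JUMP_COND };

struct block
{
  enum jump_kind kind;
  int taken_target;     /* Index of the branch target block.  */
  int fallthru_target;  /* Index of the fallthrough block.  */
};

/* Number of distinct successors.  A conditional jump whose two
   destinations coincide has one successor, not two.  */
static int distinct_successors (const struct block *b)
{
  switch (b->kind)
    {
    case JUMP_NONE:
    case JUMP_UNCOND:
      return 1;
    case JUMP_COND:
      return b->taken_target == b->fallthru_target ? 1 : 2;
    }
  return 0;
}

/* Treat a conditional branch with only one successor just like an
   unconditional branch.  */
static int treat_as_unconditional (const struct block *b)
{
  return b->kind == JUMP_UNCOND
         || (b->kind == JUMP_COND && distinct_successors (b) == 1);
}
```

Code that indexes a second successor after seeing JUMP_COND would read a successor that does not exist in the degenerate case, which is the kind of mistake the message describes.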

[PATCH 2/2] rs6000: Extend 20050603-3.c testcase to 64-bit

2015-11-08 Thread Segher Boessenkool
The testcase used to fail on 64-bit, but it was disabled there. This patch makes it run there, and beefs up the checking of the generated code a bit. Tested on powerpc64-linux (-m32,-m32/-mpowerpc64,-m64). Is this okay for trunk? Segher 2015-11-09 Segher Boessenkool gcc/testsuite

[PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt

2015-11-08 Thread Segher Boessenkool
for trunk? Segher 2015-11-09 Segher Boessenkool * gcc/simplify-rtx.c (simplify_truncation): Simplify TRUNCATE of AND of [LA]SHIFTRT. --- gcc/simplify-rtx.c | 25 + 1 file changed, 25 insertions(+) diff --git a/gcc/simplify-rtx.c b/gcc/simplify-r

Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults

2015-11-09 Thread Segher Boessenkool
On Mon, Nov 09, 2015 at 08:52:13AM +0100, Uros Bizjak wrote: > On Sun, Nov 8, 2015 at 9:58 PM, Segher Boessenkool > wrote: > > On Fri, Nov 06, 2015 at 04:00:08PM -0600, Segher Boessenkool wrote: > >> This patch stops combine from generating widening muls of anything el

Re: [PATCH], Add power9 support to GCC, patch #2 (add modulus instructions)

2015-11-09 Thread Segher Boessenkool
operand" "") > + (match_operand:GPR 2 "reg_or_cint_operand" "")))] You could delete the empty constraint strings while you're at it. > +;; On machines with modulo support, do a combined div/mod the old fashioned > +;; method, since the multiply/subtract is faster than doing the mod > instruction > +;; after a divide. You can instead have a "divmod" insn that is split to either of div, mod, or div+mul+sub depending on which of the outputs is unused. Peepholes do not get all cases. This can be a later improvement of course. Segher
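The "old fashioned" combined div/mod mentioned in the quoted comment computes the remainder from the quotient with a multiply and a subtract. A minimal C sketch of that arithmetic follows; the struct divmod type and the function name are hypothetical, and this is not the machine description pattern from the patch.

```c
/* Hypothetical result pair for illustration.  */
struct divmod
{
  long quot;
  long rem;
};

/* One divide, then remainder = a - quot * b.  The caller must ensure
   b != 0 and must avoid the LONG_MIN / -1 overflow case, exactly as
   for the plain / and % operators.  */
static struct divmod divmod_via_mul_sub (long a, long b)
{
  struct divmod r;
  r.quot = a / b;
  r.rem = a - r.quot * b;
  return r;
}
```

Because C division truncates toward zero, the computed remainder has the sign of the dividend, matching the % operator.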

Re: [PATCH], Add power9 support to GCC, patch #3 (scalar count trailing zeros)

2015-11-09 Thread Segher Boessenkool
t (match_operand:GPR 0 "gpc_reg_operand" "=r") > + (ctz:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")))] > + "TARGET_CTZ" > + "cnttz %0,%1" > + [(set_attr "type" "cntlz")]) We should probably rename this attr value now. "cntz" maybe? Could be later of course. Segher

Re: [PATCH], Add power9 support to GCC, patch #4

2015-11-09 Thread Segher Boessenkool
("always"). You can write this as {\mlwa\M} for more sanity. > +/* { dg-final { scan-assembler-not "sldi "} } */ > +/* { dg-final { scan-assembler-not "sldi\\. " } } */ Similarly {\msldi\M} catches both. Segher

Re: [PATCH], Add power9 support to GCC, patch #5 (ISA 3.0 fusion)

2015-11-09 Thread Segher Boessenkool
> + (unspec [(const_int 0)] UNSPEC_FUSION_ADDIS) > + (use (match_operand:DI 2 "base_reg_operand" "r,r")) > + (clobber (match_scratch:DI 3 "=X,&b"))] > + "TARGET_TOC_FUSION_INT" Do you need that "??r" alternative? Same for the next define_insn. Big patch, most looks good :-) Segher

Re: [PATCH], Add power9 support to GCC, patch #6 (IEEE 128-bit hardware support)

2015-11-09 Thread Segher Boessenkool
:SI 1 "nonimmediate_operand" "r,Z,r,Z") > + (match_operand:SI 2 "const_0_to_1_operand" "O,O,n,n")] > + UNSPEC_IEEE128_MOVE))] > + "TARGET_FLOAT128_HW" > + "@ > + mtvsrwa %x0,%1 > + lxsiwax %x0,%y1 > + mtvsrwz %x0,%1 > + lxsiwzx %x0,%y1" > + [(set_attr "type" "mffgpr,fpload,mffgpr,fpload")]) Tricky, is there no cleaner way to do this? Segher

Re: [PATCH], Add power9 support to GCC, patch #3 (scalar count trailing zeros)

2015-11-09 Thread Segher Boessenkool
z" maybe? Could be > > later of course. > > I don't see a need to add another type attribute for count trailing zeros > unless count leading zeros has a different timing than count trailing zeros. I didn't suggest adding a "cnttz"; I suggested renaming "cntlz". Maybe "ctz" is better, that's what the target flag is as well. Cheers, Segher

Re: [PATCH], Add power9 support to GCC, patch #4

2015-11-09 Thread Segher Boessenkool
On Mon, Nov 09, 2015 at 12:27:34PM -0500, Michael Meissner wrote: > On Mon, Nov 09, 2015 at 10:29:10AM -0600, Segher Boessenkool wrote: > > On Sun, Nov 08, 2015 at 07:39:14PM -0500, Michael Meissner wrote: > > > +;; Pretend we have a memory form of extswsli until register allocat

Re: [PATCH], Add power9 support to GCC, patch #5 (ISA 3.0 fusion)

2015-11-09 Thread Segher Boessenkool
ed for the ADDIS instruction), but it can be used for power9 > fusion (where the ADDIS must be adjacent, but it no longer has to be the > register being loaded). If you have only "b", r0 will not be chosen. Does that help? Or are you generating this pattern from somewhere else where you put in r0? Segher

Re: [PATCH], Add power9 support to GCC, patch #7 (direct move enhancements)

2015-11-09 Thread Segher Boessenkool
On Sun, Nov 08, 2015 at 07:48:56PM -0500, Michael Meissner wrote: > This patch adds support for the new direct move instructions (MFVSRLD and > MTVSRDD) that simplify moving 128-bit data between GPRs and vector registers. You forgot to attach the patch :-) Segher

Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt

2015-11-10 Thread Segher Boessenkool
On Tue, Nov 10, 2015 at 12:16:09PM +0100, Bernd Schmidt wrote: > On 11/09/2015 08:33 AM, Segher Boessenkool wrote: > >If we have > > > > (truncate:M1 (and:M2 (lshiftrt:M2 (x:M2) C) C2)) > > > >we can write it instead as > > > > (a
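The rewritten form is cut off in the snippet above, so the following only sketches the general idea for the logical-shift case, with M2 taken as 64 bits and M1 as 32 bits. The function names and the exact safety condition are assumptions for illustration and are not taken from simplify-rtx.c.

```c
#include <stdint.h>

/* Wide form: shift and mask in 64 bits, then truncate to 32 bits.  */
static uint32_t wide_form (uint64_t x, unsigned c, uint64_t c2)
{
  return (uint32_t) ((x >> c) & c2);
}

/* Narrow form: truncate first, then shift and mask in 32 bits.  */
static uint32_t narrow_form (uint64_t x, unsigned c, uint64_t c2)
{
  return ((uint32_t) x >> c) & (uint32_t) c2;
}

/* Assumed safety condition: c < 32, and every bit the mask selects,
   moved back to its position in x, lies within the low 32 bits.  */
static int narrowing_is_safe (unsigned c, uint64_t c2)
{
  return c < 32 && c2 <= (UINT64_C (0xffffffff) >> c);
}
```

When the condition holds, the mask never looks at a bit that the narrow shift would have filled with zero instead of an upper bit of x, so both forms give the same value. When it does not hold, the two forms can differ, for instance with a shift of 24 and a 16-bit mask.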

Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults

2015-11-10 Thread Segher Boessenkool
On Mon, Nov 09, 2015 at 03:51:32AM -0600, Segher Boessenkool wrote: > > >From the original patch submission, it looks that this patch would > > also benefit x86_32. > > Yes, that is what I thought too. > > > Regarding the above code size increase - do you perha

Re: [PATCH], Add power9 support to GCC, patch #10 (SFmode/DFmode d-form addressing)

2015-11-10 Thread Segher Boessenkool
, > including d-form addressing). > > Are these patches ok to check in? You forgot the patch again, it must be a curse ;-) Segher

Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt

2015-11-11 Thread Segher Boessenkool
On Tue, Nov 10, 2015 at 10:04:30PM +0100, Bernd Schmidt wrote: > On 11/10/2015 06:44 PM, Segher Boessenkool wrote: > > >Yes I know. All the rest of the code around is it like this though. > >Do you want this written in a saner way? > > I won't object to leaving

Re: [PATCH 1/2] simplify-rtx: Simplify trunc of and of shiftrt

2015-11-13 Thread Segher Boessenkool
ed to match this instruction: > (set (reg:DI 70 [ _2 ]) > (sign_extend:DI (lshiftrt:SI (subreg:SI (reg/v:DI 80 [ x ]) 0) > (const_int 16 [0x10] Somehow, before the patch, it decided to do a zero-extension (where the combined insns had a sign extension). Was that even correct? Maybe many bits of reg 80 (or, hrm, 81 in the orig?!) are known zero? Segher

Re: [PATCH][combine][RFC] Don't transform sign and zero extends inside mults

2015-11-13 Thread Segher Boessenkool
ase, this is some systematic oversight), but we can live > with it. After the patch it will no longer combine an imul reg,reg (+ mov) into an imul mem,reg. _Most_ cases that end up as mem,reg are already expanded as such, but not all. It's hard to make a smallish testcase. Segher

Re: [PATCH], Add power9 support to GCC, patch #5 (ISA 3.0 fusion)

2015-11-14 Thread Segher Boessenkool
d. It seems that without -flto TOC fusion doesn't do much at all, btw? Segher
