[Bug target/94121] ICE on aarch64-linux-gnu: in abs_hwi, at hwint.h:324

2020-03-10 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94121 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #3 from

[Bug middle-end/94172] [arm-none-eabi] ICE in expand_debug_locations, at cfgexpand.c:5403

2020-03-16 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94172 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #2 from

[Bug middle-end/94172] [arm-none-eabi] ICE in expand_debug_locations, at cfgexpand.c:5403

2020-03-16 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94172 --- Comment #6 from Wilco --- (In reply to Jakub Jelinek from comment #3) > Can't reproduce on the trunk, neither on x86_64-linux with -Os -g3 > -fshort-enums, nor on arm-linux-gnueabi with -Os -g3 -fshort-enums > -mcpu=cortex-m0 -mthumb I tried

[Bug debug/94502] [aarch64] Missing LR register location in FDE

2020-04-08 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94502 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #2 from

[Bug debug/94502] [aarch64] Missing LR register location in FDE

2020-04-08 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94502 --- Comment #4 from Wilco --- (In reply to Luis Machado from comment #3) > The lack of a rule for LR means GDB will assume the register is UNSPECIFIED. > Is GCC assuming this register is considered to have the same value as an > inner frame? Ri

[Bug tree-optimization/91322] [10 regression] g++.dg/lto/alias-4_0.C test failure

2020-04-09 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91322 --- Comment #15 from Wilco --- (In reply to Richard Biener from comment #14) > So I'm quite sure the missed optimization isn't a regression? (can somebody > quickly check GCC 9 whether the testcase is optimized there on ARM?) It fails on both

[Bug target/94538] [9/10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-09 Thread wilco at gcc dot gnu.org
||wilco at gcc dot gnu.org Last reconfirmed||2020-04-09 Status|UNCONFIRMED |ASSIGNED Assignee|unassigned at gcc dot gnu.org |wilco at gcc dot gnu.org --- Comment #1 from Wilco --- Thanks for the concise testcase. -mslow

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-09 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538 Wilco changed: What|Removed |Added Summary|[9/10 Regression] ICE: in |[10 Regression] ICE: in |extra

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-09 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538 --- Comment #4 from Wilco --- (In reply to Zdenek Sojka from comment #3) > (In reply to Wilco from comment #2) > > This was introduced by commit e24f6408d so only in GCC10. > > Thank you for checking this! > > I am quite sure this fails in gcc-

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-14 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538 --- Comment #7 from Wilco --- (In reply to Wilco from comment #4) > (In reply to Zdenek Sojka from comment #3) > > (In reply to Wilco from comment #2) > > > This was introduced by commit e24f6408d so only in GCC10. > > > > Thank you for checking

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-14 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538 --- Comment #10 from Wilco --- (In reply to Christophe Lyon from comment #8) > > Adding Christophe. I'm thinking the best approach right now is to revert > > given -mpure-code doesn't work at all on Thumb-1 targets - it still emits > > literal po

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-16 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538 --- Comment #13 from Wilco --- (In reply to Christophe Lyon from comment #12) > I've posted a patch to fix the regression for your f3() examples: > https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543993.html Yes that improves some of the ex

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-16 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538 --- Comment #14 from Wilco --- (In reply to Christophe Lyon from comment #11) > (In reply to Wilco from comment #10) > Right, but the code is functional. It doesn't avoid the literal load from flash which is exactly what pure-code and slow-flas

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-17 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538 --- Comment #16 from Wilco --- (In reply to Christophe Lyon from comment #15) > (In reply to Wilco from comment #14) > > (In reply to Christophe Lyon from comment #11) > > > (In reply to Wilco from comment #10) > > > > > Right, but the code is f

[Bug middle-end/94715] New: Squared multiplies are incorrectly signextended

2020-04-22 Thread wilco at gcc dot gnu.org
: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: wilco at gcc dot gnu.org Target Milestone: --- The following example generates incorrect code with -O2: unsigned long long f (int x) { unsigned int t = x * x; return t; } On AArch64 I get: mul w0, w0, w0
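For reference, a minimal sketch of the widening pattern this report is about. It is not the reporter's testcase: `square_zext` is a hypothetical helper, and it assumes a 32-bit `int`. The multiply is done in unsigned arithmetic so that wraparound is well defined in C, which makes the expected zero extension to 64 bits unambiguous.

```c
#include <stdint.h>

/* Hypothetical helper (not from the bug report): square x with
   well-defined 32-bit wraparound, then zero-extend to 64 bits.
   Assumes int is 32 bits wide. */
uint64_t square_zext(int x)
{
    uint32_t u = (uint32_t)x;  /* conversion to unsigned is modulo 2^32 */
    uint32_t t = u * u;        /* unsigned multiply wraps instead of overflowing */
    return t;                  /* widening an unsigned value zero-extends */
}
```

With this form the upper 32 bits of the result must always be zero, whatever the input.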

[Bug middle-end/94715] Squared multiplies are incorrectly signextended

2020-04-23 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94715 Wilco changed: What|Removed |Added Last reconfirmed||2020-04-23 Ever confirmed|0

[Bug tree-optimization/94787] Failure to detect single bit popcount pattern

2020-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94787 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #3 from

[Bug target/94789] Failure to take advantage of shift operand semantics to turn subtraction into negate

2020-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94789 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #4 from

[Bug target/94538] [9/10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-30 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538 --- Comment #19 from Wilco --- Yes I have a GCC9.3 build now, this fails too.

[Bug target/95285] AArch64:aarch64 medium code model proposal

2020-05-26 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #2 from

[Bug target/95285] AArch64:aarch64 medium code model proposal

2020-05-26 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285 --- Comment #4 from Wilco --- (In reply to Bu Le from comment #3) > (In reply to Wilco from comment #2) > > > Is the main usage scenario huge arrays? If so, these could easily be > > allocated via malloc at startup rather than using bss. It mean

[Bug target/95285] AArch64:aarch64 medium code model proposal

2020-05-26 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285 --- Comment #5 from Wilco --- (In reply to Bu Le from comment #0) Also it would be much more efficient to have a relocation like this if you wanted a 48-bit PC-relative offset: adrp x0, bar1.2782 add x0, x0, :lo12:bar1.2782 movk x0, :

[Bug target/95285] AArch64:aarch64 medium code model proposal

2020-05-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285 --- Comment #8 from Wilco --- (In reply to Bu Le from comment #6) > (In reply to Wilco from comment #4) > > (In reply to Bu Le from comment #3) > > > (In reply to Wilco from comment #2) > > > Well the question is whether we're talking about more

[Bug target/95285] AArch64:aarch64 medium code model proposal

2020-05-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285 --- Comment #9 from Wilco --- (In reply to Bu Le from comment #7) > (In reply to Wilco from comment #5) > > (In reply to Bu Le from comment #0) > > > > Also it would be much more efficient to have a relocation like this if you > > wanted a 48-bi

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2020-05-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #29 from

[Bug target/95285] AArch64:aarch64 medium code model proposal

2020-05-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285 --- Comment #12 from Wilco --- (In reply to Bu Le from comment #10) > > Fortran already has -fstack-arrays to decide between allocating arrays on > > the heap or on the stack. > > I tried the flag with my example. The fstack-array seems cannot m

[Bug target/95285] AArch64:aarch64 medium code model proposal

2020-05-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285 --- Comment #13 from Wilco --- (In reply to Bu Le from comment #11) > > > You're right, we need an extra add, so it's like this: > > > > adrp x0, bar1.2782 > > movk x1, :high32_47:bar1.2782 > > add x0, x0, x1 > > add x0, x0,

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2020-05-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #31 from Wilco --- (In reply to Jiu Fu Guo from comment #30) > (In reply to Wilco from comment #29) > > The key question remains whether it is legal to assume the limit implies the > > memory is valid and use wider accesses. > If un

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2020-05-28 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #34 from Wilco --- (In reply to Jiu Fu Guo from comment #33) > It would be relatively easy if the target supports unaligned access. like > read64ne in > https://git.tukaani.org/?p=xz.git;a=blob;f=src/liblzma/common/memcmplen.h > Then
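The wider-access idea discussed in this thread can be sketched as follows. This is an illustration only, not the code from xz or from any proposed patch: `match_len` is a hypothetical name, and the sketch assumes both buffers are readable for the full `limit` bytes, which is exactly the validity question raised earlier in the thread. `memcpy` is used for the 8-byte loads so the code stays valid on targets without unaligned access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: return the length of the common prefix of a and b,
   starting at offset len and never looking past limit.
   Assumes a[0..limit) and b[0..limit) are both valid memory. */
size_t match_len(const unsigned char *a, const unsigned char *b,
                 size_t len, size_t limit)
{
    /* Compare 8 bytes per iteration while a full chunk fits below limit. */
    while (len + 8 <= limit) {
        uint64_t x, y;
        memcpy(&x, a + len, sizeof x);  /* alignment-safe 64-bit load */
        memcpy(&y, b + len, sizeof y);
        if (x != y)
            break;                      /* mismatch is somewhere in this chunk */
        len += 8;
    }
    /* Finish byte by byte: locates the mismatch and handles the tail. */
    while (len < limit && a[len] == b[len])
        len++;
    return len;
}
```

Falling back to the byte loop on a mismatch avoids any dependence on endianness, at the cost of up to eight extra byte compares.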

[Bug target/95285] AArch64:aarch64 medium code model proposal

2020-05-28 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95285 --- Comment #15 from Wilco --- (In reply to Bu Le from comment #14) > > > Anyway, my point is that the size of single data doesn't affect the fact > > > that > > > medium code model is missing in aarch64 and aarch64 is lack of PIC large > > > cod

[Bug target/94986] missing diagnostic on ARM thumb2 compilation with -pg when using r7 in inline asm

2020-06-03 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94986 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #1 from

[Bug target/94986] missing diagnostic on ARM thumb2 compilation with -pg when using r7 in inline asm

2020-06-03 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94986 Wilco changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2020-06-08 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #40 from Wilco --- (In reply to Jiu Fu Guo from comment #39) > I’m thinking to draft a patch for this optimization. If any suggestions, > please point out, thanks. Which optimization to be precise? Besides unrolling I haven't seen a

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2020-06-10 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #44 from Wilco --- (In reply to Jiu Fu Guo from comment #43) > To handle vectorization for this kind of code, it needs to overcome the hard > issue mentioned in comment #5: the loop has 2 exits. Yes and that also implies vector loads

[Bug target/95650] aarch64: Missed optimization storing addition of two shorts

2020-06-12 Thread wilco at gcc dot gnu.org
|1 CC||wilco at gcc dot gnu.org Status|UNCONFIRMED |NEW --- Comment #4 from Wilco --- (In reply to Alex Coplan from comment #3) > I think clang's optimisation is sound here. > > C says that we add

[Bug target/96191] aarch64 stack_protect_test canary leak

2020-07-14 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96191 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #2 from

[Bug tree-optimization/95731] Failure to optimize a >= 0 && b >= 0 to (a | b) >= 0

2020-08-04 Thread wilco at gcc dot gnu.org
||2020-08-04 CC||wilco at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #3 from Wilco --- (In reply to Gabriel Ravier from comment #0) > bool f(int a, int b) > { > return a >= 0 && b >=
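The transformation named in the bug title can be written out as a pair of functions. This is a sketch for illustration, with hypothetical function names, and it assumes two's complement integers: the sign bit of `a | b` is set exactly when at least one operand is negative, so both forms return the same result.

```c
#include <stdbool.h>

/* Form from the report: two comparisons joined by a logical AND. */
bool both_nonneg_logical(int a, int b)
{
    return a >= 0 && b >= 0;
}

/* Equivalent form from the bug title: one OR and one comparison.
   Relies on two's complement: a | b is negative iff a or b is negative. */
bool both_nonneg_bitwise(int a, int b)
{
    return (a | b) >= 0;
}
```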

[Bug target/96768] -mpure-code produces switch tables for thumb-1

2020-08-28 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96768 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #5 from

[Bug c/92172] ARM Thumb2 frame pointers inconsistent with clang

2019-10-21 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92172 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #1 from

[Bug c/92172] ARM Thumb2 frame pointers inconsistent with clang

2019-10-23 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92172 --- Comment #4 from Wilco --- (In reply to Seth LaForge from comment #2) > Good point on frame pointers vs a frame chain for unwinding. I'm looking for > the unwindable frame chain. > > Wilco: > > Why does this matter? Well as your examples show

[Bug c/92172] ARM Thumb2 frame pointers inconsistent with clang

2019-10-23 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92172 --- Comment #6 from Wilco --- (In reply to Seth LaForge from comment #5) > GCC 8: > push {r7, lr} > sub sp, sp, #8 > add r7, sp, #0 > str r0, [r7, #4] > ... > > Clang 9: > push {

[Bug target/91766] -fvisibility=hidden during -fpic still uses GOT indirection on arm64

2019-10-24 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91766 --- Comment #12 from Wilco --- (In reply to Andrew Pinski from comment #10) > This should be a global change and not just an aarch64 change. The reason > is because then aarch64 is the odd man out when it comes to this. Agreed, see https://gcc

[Bug c/85678] -fno-common should be default

2019-10-25 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #6 from

[Bug target/91766] -fvisibility=hidden during -fpic still uses GOT indirection on arm64

2019-10-25 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91766 --- Comment #13 from Wilco --- (In reply to Wilco from comment #12) > (In reply to Andrew Pinski from comment #10) > > > This should be a global change and not just an aarch64 change. The reason > > is because then aarch64 is the odd man out wh

[Bug target/91927] -mstrict-align doesn't prevent unaligned accesses at -O2 and -O3 on AARCH64 targets

2019-10-30 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91927 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #8 from

[Bug rtl-optimization/92294] New: alias attribute generates incorrect code

2019-10-30 Thread wilco at gcc dot gnu.org
-optimization Assignee: unassigned at gcc dot gnu.org Reporter: wilco at gcc dot gnu.org Target Milestone: --- The following example (from gcc.c-torture/execute/alias-2.c) always calls abort on any AArch64 compiler with -O1 or -O2: static int a[10]; extern int b[10] __attribute__

[Bug rtl-optimization/92294] alias attribute generates incorrect code

2019-10-30 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92294 Wilco changed: What|Removed |Added Target||aarch64 Target Milestone|---

[Bug rtl-optimization/92294] alias attribute generates incorrect code

2019-10-31 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92294 Wilco changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed|

[Bug c++/92425] Incorrect logical AND on 64bit variable using 32bit register

2019-11-08 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92425 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #1 from

[Bug target/92462] [arm32] -ftree-pre makes a variable to be wrongly hoisted out

2019-11-11 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92462 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #1 from

[Bug target/92462] [arm32] -ftree-pre makes a variable to be wrongly hoisted out

2019-11-12 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92462 Wilco changed: What|Removed |Added Resolution|INVALID |FIXED --- Comment #7 from Wilco --- (In reply t

[Bug target/92462] [arm32] -ftree-pre makes a variable to be wrongly hoisted out

2019-11-14 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92462 --- Comment #16 from Wilco --- (In reply to Richard Biener from comment #15) > I can't find PRE doing anything wrong and on 32bit x86_64 the testcase > executes > correctly with GCC 7.3 and GCC 9 (when I add the missing return to > Bar::cmpxchg).

[Bug target/92462] [arm32] -ftree-pre makes a variable to be wrongly hoisted out

2019-11-15 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92462 --- Comment #19 from Wilco --- (In reply to Richard Biener from comment #18) > So I see before DSE1: > > (insn 16 15 17 2 (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp) > (const_int -8 [0xfff8])) [1 cur+0 S4 A64]) >

[Bug target/92462] [arm32] -ftree-pre makes a variable to be wrongly hoisted out

2019-11-18 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92462 --- Comment #23 from Wilco --- (In reply to Richard Biener from comment #22) > Fixed on trunk. Can arm people verify? I checked the DSE dump only. Bonus > if you manage to create a testcase for the testsuite failing before, passing > now. > >

[Bug target/79262] [8/9/10 Regression] load gap with store gap causing performance regression in 462.libquantum

2019-11-19 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79262 --- Comment #8 from Wilco --- Author: wilco Date: Tue Nov 19 15:57:54 2019 New Revision: 278452 URL: https://gcc.gnu.org/viewcvs?rev=278452&root=gcc&view=rev Log: [AArch64] PR79262: Adjust vector cost PR79262 has been fixed for almost all AArch

[Bug target/79262] [8/9/10 Regression] load gap with store gap causing performance regression in 462.libquantum

2019-11-19 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79262 Wilco changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2019-11-19 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 79262, which changed state. Bug 79262 Summary: [8/9/10 Regression] load gap with store gap causing performance regression in 462.libquantum https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79262 What|Removed

[Bug c/92612] [10 Regression] Linker error in 525.x264_r after r278509

2019-11-21 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92612 --- Comment #2 from Wilco --- (In reply to Martin Liška from comment #1) > Following patch fixes that: > > diff --git a/benchspec/CPU/525.x264_r/src/ldecod_src/inc/configfile.h > b/benchspec/CPU/525.x264_r/src/ldecod_src/inc/configfile.h > index

[Bug target/91927] -mstrict-align doesn't prevent unaligned accesses at -O2 and -O3 on AARCH64 targets

2019-11-21 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91927 --- Comment #10 from Wilco --- (In reply to Andrew Pinski from comment #9) > I think the following patch is the correct fix: > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index ad4676bc167..787323255cb

[Bug rtl-optimization/92637] runtime issue with -ftree-coalesce-vars

2019-11-23 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92637 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #2 from

[Bug c/85678] -fno-common should be default

2019-11-25 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678 --- Comment #8 from Wilco --- (In reply to David Binderman from comment #7) > (In reply to David Brown from comment #0) > > Surely it is time to make "-fno-common" the default, at least when a modern > > C standard is specified indicating that th

[Bug c/85678] -fno-common should be default

2019-11-25 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678 --- Comment #12 from Wilco --- (In reply to David Brown from comment #11) > Changing the default to "-fno-common" (and ideally > "-Werror=strict-prototypes -Werror=old-style-declaration > -Werror=missing-parameter-type") would have a lot smaller

[Bug target/92665] [AArch64] low lanes select not optimized out for vmlal intrinsics

2019-11-25 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92665 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #3 from

[Bug target/92692] [9/10 Regression] Saving off the callee saved register between ldxr/stxr (caused by shrink wrapping improvements)

2019-11-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92692 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #3 from

[Bug target/92692] [9/10 Regression] Saving off the callee saved register between ldxr/stxr (caused by shrink wrapping improvements)

2019-11-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92692 --- Comment #5 from Wilco --- (In reply to Andrew Pinski from comment #4) > (In reply to Wilco from comment #3) > > (In reply to Andrew Pinski from comment #2) > > > I think this has been a latent bug since revision 243200: > > > [AArch64] Se

[Bug driver/89014] Use-after-free in aarch64 -march=native

2019-11-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89014 --- Comment #9 from Wilco --- Author: wilco Date: Fri Nov 29 17:22:30 2019 New Revision: 278854 URL: https://gcc.gnu.org/viewcvs?rev=278854&root=gcc&view=rev Log: aarch64: fix use-after-free in -march=native (PR driver/89014) Running: $ valgr

[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29

2019-12-02 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #7 from

[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29

2019-12-02 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738 --- Comment #9 from Wilco --- (In reply to Martin Liška from comment #8) > (In reply to Wilco from comment #7) > > (In reply to Martin Liška from comment #6) > > > So wrf grew starting with r271377, size (w/o debug info) goes from > > > 20164464

[Bug tree-optimization/92738] [10 regression] Large code size growth for -O2 binaries between 2019-05-19...2019-05-29

2019-12-02 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92738 --- Comment #11 from Wilco --- (In reply to Thomas Koenig from comment #10) > (In reply to Martin Liška from comment #6) > > So wrf grew starting with r271377, size (w/o debug info) goes from 20164464B > > to 23674792. > > I think we've had this

[Bug tree-optimization/92822] [10 Regression] vfma_laneq_f32 and vmul_laneq_f32 are broken on aarch64 after r278938

2019-12-12 Thread wilco at gcc dot gnu.org
, ||wilco at gcc dot gnu.org Component|target |tree-optimization --- Comment #4 from Wilco --- (In reply to nsz from comment #2) > e.g. > > #include > > float32x2_t > foo (float32x2_t v0, float32x4_t v1) > { > re

[Bug rtl-optimization/93007] New: [10 regression] pr77698.c testcase fails due to block commoning

2019-12-19 Thread wilco at gcc dot gnu.org
Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: wilco at gcc dot gnu.org Target Milestone: --- Since r276960 we see this failure on Arm: FAIL: gcc.dg/tree-prof/pr77698.c scan-rtl-dump-times alignments "internal

[Bug tree-optimization/93023] give preference to address iv without offset in ivopts

2019-12-23 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93023 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #1 from

[Bug tree-optimization/90838] Detect table-based ctz implementation

2020-01-10 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838 --- Comment #8 from Wilco --- Author: wilco Date: Fri Jan 10 19:32:53 2020 New Revision: 280132 URL: https://gcc.gnu.org/viewcvs?rev=280132&root=gcc&view=rev Log: PR90838: Support ctz idioms Support common idioms for count trailing zeroes using
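For readers unfamiliar with the idiom, here is a sketch of one widely used table-based count-trailing-zeroes routine. It is an assumption that this is representative of the idioms the commit targets: the multiplier `0x077CB531` and the table are the well-known de Bruijn sequence variant, and `ctz32_table` is a hypothetical name. The sketch assumes a 32-bit `unsigned int`.

```c
#include <stdint.h>

/* De Bruijn lookup table indexed by the top 5 bits of the product below. */
static const unsigned char ctz_table[32] = {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Hypothetical helper: count trailing zero bits of a nonzero x.
   For x == 0 this returns 0, which is not a meaningful bit count. */
unsigned ctz32_table(uint32_t x)
{
    uint32_t lowest = x & (0u - x);  /* isolate the lowest set bit */
    /* Multiplying by a power of two shifts the de Bruijn constant left;
       the top 5 bits of the result then identify the shift amount. */
    return ctz_table[(uint32_t)(lowest * 0x077CB531u) >> 27];
}
```

A compiler that recognises this shape can replace the multiply and table load with a single count-trailing-zeroes instruction where the target has one.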

[Bug bootstrap/93229] simplify_count_trailing_zeroes doesn't compile on x86_64-pc-linux-gnu

2020-01-10 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93229 --- Comment #2 from Wilco --- (In reply to David Malcolm from comment #0) > A pristine checkout of r280132 doesn't build for me on x86_64-pc-linux-gnu: > > ../../src/gcc/tree-ssa-forwprop.c: In function ‘bool > simplify_count_trailing_zeroes(gim

[Bug bootstrap/93229] simplify_count_trailing_zeroes doesn't compile on x86_64-pc-linux-gnu

2020-01-10 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93229 --- Comment #4 from Wilco --- (In reply to David Malcolm from comment #3) > Apparently broken on other archs too, and for other people; from #gcc: > > nathan: I assume it's not just broken for me; I'm somewhat > sleep-deprived here > dmalcolm:

[Bug tree-optimization/93231] [10 Regression] ICEs since r280132

2020-01-13 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93231 --- Comment #4 from Wilco --- (In reply to Jakub Jelinek from comment #0) > int ctz2 (int x) > { > static const char table[32] = > { > 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, > 31, 27, 13, 23, 21, 19, 16, 7, 26

[Bug tree-optimization/93231] [10 Regression] ICEs since r280132

2020-01-13 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93231 Wilco changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |wilco at gcc dot gnu.org --- Comment #6

[Bug target/92692] [9/10 Regression] Saving off the callee saved register between ldxr/stxr (caused by shrink wrapping improvements)

2020-01-15 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92692 --- Comment #10 from Wilco --- (In reply to Jakub Jelinek from comment #9) > Any -march= or similar? Can't reproduce with current trunk, nor > with even Oct 10 GCC snapshot (crosses in both cases). > grep -B1 stxr pr92692.s > doesn't show any st

[Bug target/92692] [9/10 Regression] Saving off the callee saved register between ldxr/stxr (caused by shrink wrapping improvements)

2020-01-15 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92692 Wilco changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |wilco at gcc dot gnu.org --- Comment #13

[Bug tree-optimization/93231] [10 Regression] ICEs since r280132

2020-01-16 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93231 Wilco changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/71727] -O3 -mstrict-align produces code which assumes unaligned vector accesses work

2020-01-19 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71727 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #8 from

[Bug target/92692] Saving off the callee saved register between ldxr/stxr (caused by shrink wrapping improvements)

2020-01-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92692 Wilco changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug middle-end/64242] Longjmp expansion incorrect

2020-01-27 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242 --- Comment #37 from Wilco --- (In reply to Andrew Pinski from comment #36) > MIPS is still broken. I might look into MIPS brokenness next week. Yes it seems builtin_longjmp has the exact same fp corruption issue: move $fp,$17

[Bug rtl-optimization/93565] New: Combine duplicates count trailing zero instructions

2020-02-04 Thread wilco at gcc dot gnu.org
Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: wilco at gcc dot gnu.org Target Milestone: --- Created attachment 4 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=4&action=edit ctz_duplication The attached example causes Com

[Bug rtl-optimization/93565] Combine duplicates count trailing zero instructions

2020-02-05 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565 --- Comment #3 from Wilco --- (In reply to Segher Boessenkool from comment #2) > Of course it first tried to do > > Failed to match this instruction: > (parallel [ > (set (reg:DI 101 [ _9 ]) > (ctz:DI (reg/v:DI 98 [ x ]))) >

[Bug rtl-optimization/93565] [9/10 regression] Combine duplicates instructions

2020-02-11 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565 Wilco changed: What|Removed |Added CC||segher at kernel dot crashing.org Su

[Bug target/92692] Saving off the callee saved register between ldxr/stxr (caused by shrink wrapping improvements)

2020-02-28 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92692 --- Comment #22 from Wilco --- (In reply to Sebastian Pop from comment #21) > It looks like this hunk from the trunk version of the patch is missing on > gcc-9 branch: > > diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.m

[Bug target/91598] [8/9/10 regression] 60% speed drop on neon intrinsic loop

2020-03-03 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91598 --- Comment #4 from Wilco --- Fixing vmull_lane_s16 and vmlal_lane_s16 to avoid inline assembler gives this schedule which runs 63% faster on Cortex-A53: ldr d2, [x6, x0] ldr d4, [x6, x3] ldr d3, [x6, x2]

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2018-12-07 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org --- Comment #8 from

[Bug middle-end/64242] Longjmp expansion incorrect

2018-12-07 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242 --- Comment #21 from Wilco --- (In reply to Rainer Orth from comment #20) > The new testcase also FAILs on sparc-sun-solaris2.11 (both 32 and 64-bit): > > +FAIL: gcc.c-torture/execute/pr64242.c -O2 execution test > +FAIL: gcc.c-torture/execut

[Bug middle-end/88560] [9 Regression] armv8_2-fp16-move-1.c and related regressions after r260385

2018-12-20 Thread wilco at gcc dot gnu.org
||2018-12-20 CC||wilco at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Wilco --- Eg. before test_load_store_1: ldrh r3, [r2, r1, lsl #1] @ __fp16 strh r3, [r0, r1, lsl #1

[Bug target/86891] [9 Regression] wrong code with -O -frerun-cse-after-loop -fno-tree-dominator-opts -fno-tree-fre

2018-12-20 Thread wilco at gcc dot gnu.org
||2018-12-20 CC||wilco at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |wilco at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #3 from Wilco --- (In reply to Jakub Jelinek from comment

[Bug target/86891] [9 Regression] __builtin_sub_overflow incorrect for unsigned types

2019-01-03 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891 --- Comment #5 from Wilco --- (In reply to Richard Earnshaw from comment #4) > Yes, the extension should be zero-extend, not sign extend. The plus > operation is correct, however, since decrementing the first operand could > lead to underflow if

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2019-01-04 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #11 from Wilco --- (In reply to Jakub Jelinek from comment #10) > If the compiler knew say from PGO that pos is usually a multiple of certain > power of two and that the loop usually iterates many times (I guess the > latter can be de

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2019-01-07 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #13 from Wilco --- So to add some real numbers to the discussion, the average number of iterations is 4.31. Frequency stats (16 includes all iterations > 16 too): 1: 29.0 2: 4.2 3: 1.0 4: 36.7 5: 8.7 6: 3.4 7: 3.0 8: 2.6 9: 2.1 10: 1

[Bug target/86891] [9 Regression] __builtin_sub_overflow incorrect for unsigned types

2019-01-07 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891 --- Comment #10 from Wilco --- (In reply to Richard Earnshaw from comment #9) > Fixed. Yes looking at git blame all of the addv/subv support was added for GCC9, so no backporting is needed.

[Bug middle-end/88739] Big-endian union bug

2019-01-07 Thread wilco at gcc dot gnu.org
||2019-01-07 CC||wilco at gcc dot gnu.org Component|target |middle-end Summary|union bug on ARM64 |Big-endian union bug Ever confirmed|0 |1 --- Comment #1 from Wilco

[Bug middle-end/88739] Big-endian union bug

2019-01-07 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739 --- Comment #3 from Wilco --- (In reply to Richard Earnshaw from comment #2) > > _23 = BIT_FIELD_REF <_2, 16, 0>; // WRONG: should be _2, 14, 0 > > _2 is declared as a 30-bit integer, so perhaps the statement is right, but > expand

[Bug tree-optimization/88739] [7/8/9 Regression] Big-endian union bug

2019-01-08 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739 --- Comment #16 from Wilco --- (In reply to Richard Biener from comment #8) > So I think part of a fix would be the following. Not sure if > REG_WORDS_BIG_ENDIAN or FLOAT_WORDS_BIG_ENDIAN come into play. > With the fix we no longer simplify this

[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

2019-01-08 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 --- Comment #15 from Wilco --- (In reply to rguent...@suse.de from comment #14) > On Mon, 7 Jan 2019, wilco at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398 > > > > --- Comment #13 f
