[Bug target/96827] [10/11 Regression] __m128i from _mm_set_epi32 is backwards with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96827 --- Comment #8 from Joel Hutton --- I'm working on this. I believe this may have been introduced by my earlier SLP vector constructor patch (commit 10d1592). What I believe to be the relevant section:

+  else if (constructor)
+    {
+      tree rhs = gimple_assign_rhs1 (stmt_info->stmt);
+      tree val;
+      FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (rhs), i, val)
+        {
+          if (TREE_CODE (val) == SSA_NAME)
+            {
+              gimple* def = SSA_NAME_DEF_STMT (val);
+              stmt_vec_info def_info = vinfo->lookup_stmt (def);
+              /* Value is defined in another basic block. */
+              if (!def_info)
+                return false;
+              scalar_stmts.safe_push (def_info);
+            }
+          else
+            return false;
+        }
+    }

I'm investigating, but I suspect that pushing to a stack which is later popped from has reversed the element order.
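The suspicion in the comment above can be illustrated in isolation. The following is a minimal sketch, not GCC code: `std::vector` stands in for GCC's `vec`, and the function names `collect_in_order` and `drain_by_pop` are hypothetical. It only shows that elements pushed in source order come back reversed if a later consumer pops them from the end; whether that is what happens in the vectorizer is exactly what the comment says is still under investigation.

```cpp
#include <vector>

// Hypothetical stand-in for the loop in the patch: each constructor
// element is pushed in source order (compare scalar_stmts.safe_push).
std::vector<int> collect_in_order(const std::vector<int>& elts) {
    std::vector<int> stack;
    for (int e : elts) {
        stack.push_back(e);
    }
    return stack;
}

// Hypothetical consumer that drains the container by popping from the
// end. Popping is last-in, first-out, so the output order is reversed
// relative to the order in which the elements were pushed.
std::vector<int> drain_by_pop(std::vector<int> stack) {
    std::vector<int> out;
    while (!stack.empty()) {
        out.push_back(stack.back());
        stack.pop_back();
    }
    return out;
}
```

Calling `drain_by_pop(collect_in_order({1, 2, 3, 4}))` yields the elements in the order 4, 3, 2, 1, which is the kind of lane reversal reported for `_mm_set_epi32` in this bug.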
[Bug other/92366] new test case gcc.dg/vect/bb-slp-41.c fails with its introduction in r277784
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92366 --- Comment #2 from Joel Hutton --- I'm looking into this. The testcase triggered a case with a constructor with a large number of elements (at least on aarch64).
[Bug testsuite/92391] gcc.dg/vect/bb-slp-40.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92391 --- Comment #1 from Joel Hutton --- I'm looking into this.
[Bug testsuite/92391] gcc.dg/vect/bb-slp-40.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92391 --- Comment #2 from Joel Hutton --- As this test has failed since it was introduced, and I don't have a SPARC machine to test on, I suggest making it XFAIL on SPARC.
[Bug testsuite/92391] gcc.dg/vect/bb-slp-40.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92391 --- Comment #4 from Joel Hutton --- Hi Rainer,

I set up an account with cfarm and tested on gcc202. The test fails because on SPARC no constructor is generated, for whatever reason (see below), which makes the test not really applicable. I suggest making the test an XFAIL, so that if at some point in the future SPARC generates a constructor here, the test will apply. The other option is to skip it for SPARC.

Tree output on SPARC at the slp pass:

. . .
MEM[(char *)d_202 + 25B] = _136;
# .MEM_233 = VDEF <.MEM_232>
MEM[(char *)d_202 + 26B] = _140;
# .MEM_234 = VDEF <.MEM_233>
MEM[(char *)d_202 + 27B] = _144;
# .MEM_235 = VDEF <.MEM_234>
MEM[(char *)d_202 + 28B] = _148;
# .MEM_236 = VDEF <.MEM_235>
MEM[(char *)d_202 + 29B] = _152;
# .MEM_237 = VDEF <.MEM_236>
MEM[(char *)d_202 + 30B] = _156;
# .MEM_238 = VDEF <.MEM_237>
MEM[(char *)d_202 + 31B] = _160;
# PT = { D.1522 } (nonlocal, interposable)
d_239 = d_202 + 32;
[Bug tree-optimization/86504] vectorization failure for a nest loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86504 --- Comment #10 from Joel Hutton --- Should be fixed on trunk
[Bug tree-optimization/86504] vectorization failure for a nest loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86504 --- Comment #11 from Joel Hutton --- Should be fixed on trunk by r277784
[Bug testsuite/92391] gcc.dg/vect/bb-slp-40.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92391 --- Comment #6 from Joel Hutton --- This should be fixed by Richard Sandiford's changes.
[Bug testsuite/92391] gcc.dg/vect/bb-slp-40.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92391 --- Comment #9 from Joel Hutton --- Weird, I tested on gcc202.

% uname -a
Linux gcc202 4.19.0-5-sparc64-smp #1 SMP Debian 4.19.37-6 (2019-07-18) sparc64 GNU/Linux
% cat gcc/testsuite/gcc/gcc.sum
Test run by joelh on Tue Nov 26 17:22:27 2019
Native configuration is sparc64-unknown-linux-gnu

                === gcc tests ===

Schedule of variations:
    unix

Running target unix
Running /home/joelh/gcc/src/gcc/testsuite/gcc.dg/vect/vect.exp ...
UNSUPPORTED: gcc.dg/vect/bb-slp-40.c
UNSUPPORTED: gcc.dg/vect/bb-slp-40.c -flto -ffat-lto-objects

                === gcc Summary ===

# of unsupported tests          2
/home/joelh/gcc/objdir/gcc/xgcc version 10.0.0 20191126 (experimental) (GCC)
[Bug testsuite/92391] gcc.dg/vect/bb-slp-40.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92391 --- Comment #11 from Joel Hutton --- I see, I think you're right. I was able to replicate the failure when running the whole 'vect' testsuite. I tried the following change:

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 5fe1e83492c..a4418a31516 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5753,7 +5753,7 @@ proc check_effective_target_vect_bswap { } {
 # one vector length.

 proc check_effective_target_vect_char_add { } {
-    return [check_cached_effective_target_indexed vect_int {
+    return [check_cached_effective_target_indexed vect_char_add {
       expr {
          [istarget i?86-*-*] || [istarget x86_64-*-*]
          || ([istarget powerpc*-*-*]

This appeared to work; however, I'm not familiar with how check_cached_effective_target_indexed works, so I'm not sure whether this is sufficient.
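One way to read the change above is as a fix for a cache-key collision. The sketch below is an illustration only, under the assumption that check_cached_effective_target_indexed memoizes its result under the name passed as its first argument; the comment itself says this behaviour was not verified. `TargetCache` and the three check functions are hypothetical names, and the hard-coded `true`/`false` probe results merely model a target where `vect_int` holds but `vect_char_add` does not.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical memoizing helper: runs `probe` at most once per `key`
// and returns the stored result on every later call with that key.
class TargetCache {
public:
    bool check(const std::string& key, const std::function<bool()>& probe) {
        auto it = cache_.find(key);
        if (it != cache_.end()) {
            return it->second;  // cached: probe is not evaluated again
        }
        bool result = probe();
        cache_.emplace(key, result);
        return result;
    }

private:
    std::map<std::string, bool> cache_;
};

// Assumed target properties for illustration: vect_int is supported.
bool vect_int(TargetCache& c) {
    return c.check("vect_int", [] { return true; });
}

// Before the change: the probe differs, but the key is shared with vect_int.
bool vect_char_add_buggy(TargetCache& c) {
    return c.check("vect_int", [] { return false; });
}

// After the change: the check is cached under its own key.
bool vect_char_add_fixed(TargetCache& c) {
    return c.check("vect_char_add", [] { return false; });
}
```

With a fresh cache, `vect_char_add_buggy` reports false, matching the UNSUPPORTED result seen when the test was run on its own. If `vect_int` has already been evaluated, as it would be when the whole 'vect' testsuite runs, the shared key returns the stale `vect_int` answer instead. Under the stated assumption, that would explain why the failure only reproduced in the full run.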
[Bug testsuite/92391] gcc.dg/vect/bb-slp-40.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92391 --- Comment #13 from Joel Hutton --- This appears to no longer be failing in the latest 'gcc-testresults'. Can this be closed?
[Bug target/93221] [10 Regression] ICE maximum number of generated reload insns per insn achieved (90) on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221

Joel Hutton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joel.hutton at arm dot com

--- Comment #1 from Joel Hutton --- I'm taking a look at this.
[Bug target/93221] [10 Regression] ICE maximum number of generated reload insns per insn achieved (90) on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221 --- Comment #5 from Joel Hutton --- There seems to be a problem where an OI reload is inserted before an OI insn, which in turn requires another OI reload before it, and so on:

   18: r98:OI=r99:OI
      REG_DEAD r97:V4SI
    Inserting insn reload before:
   19: r99:OI=r97:V4SI#0

            0 Non input pseudo reload: reject++
          alt=0,overall=13,losers=2,rld_nregs=4
            0 Non pseudo reload: reject++
          alt=1,overall=7,losers=1,rld_nregs=2
            0 Non input pseudo reload: reject++
            1 Spill pseudo into memory: reject+=3
            Using memory insn operand 1: reject+=3
          alt=2,overall=19,losers=2 -- refuse
      Choosing alt 1 in insn 19: (0) Utv (1) w {*aarch64_movoi}
      Creating newreg=100, assigning class FP_REGS to r100
   19: r99:OI=r100:OI
    Inserting insn reload before:
   20: r100:OI=r97:V4SI#0
[Bug target/93221] [10 Regression] ICE maximum number of generated reload insns per insn achieved (90) on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221 --- Comment #6 from Joel Hutton --- The regression seems to be introduced by this commit:

commit 11b8091fb33c894cea20702d3e85389723987910
Author: Eric Botcazou
Date: Wed Dec 18 23:03:23 2019 +

    * ira.c (ira): Use simple LRA algorithm when not optimizing.

    From-SVN: r279550
[Bug target/93221] [10 Regression] ICE maximum number of generated reload insns per insn achieved (90) on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221

Joel Hutton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Joel Hutton --- Fixed on trunk.
[Bug rtl-optimization/93303] [10 Regression] ICE in lra_constraints.c4948 on aarch64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93303

Bug 93303 depends on bug 93221, which changed state.

Bug 93221 Summary: [10 Regression] ICE maximum number of generated reload insns per insn achieved (90) on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
[Bug target/93135] [10 Regression] g++.dg/cpp0x/initlist118.C fails on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93135

Bug 93135 depends on bug 93221, which changed state.

Bug 93221 Summary: [10 Regression] ICE maximum number of generated reload insns per insn achieved (90) on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93221

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
[Bug target/92922] [10 regression] [ilp32] FAIL: gcc.target/aarch64/sve/acle/asm/ldnt1_u32.c -std=c90 -O1 -g -DTEST_FULL (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92922

Joel Hutton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joel.hutton at arm dot com

--- Comment #1 from Joel Hutton --- This appears to be fixed on trunk.
[Bug target/92922] [10 regression] [ilp32] FAIL: gcc.target/aarch64/sve/acle/asm/ldnt1_u32.c -std=c90 -O1 -g -DTEST_FULL (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92922 --- Comment #2 from Joel Hutton --- This was fixed by Richard Sandiford's patch.

commit fb15e2bab5267213b8706fa6a29eeef94f62a524
Author: Richard Sandiford
Date: Mon Jan 20 19:29:25 2020 +

    aarch64: Fix SVE ACLE handling of SImode pointers
[Bug tree-optimization/85804] [8/9/10 Regression][AArch64] Mis-compilation of loop with strided array access and xor reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85804

Joel Hutton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joel.hutton at arm dot com

--- Comment #9 from Joel Hutton --- This was fixed on trunk by 69f8c1ae (From SVN: r276700)
[Bug tree-optimization/86504] vectorization failure for a nest loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86504

Joel Hutton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joel.hutton at arm dot com

--- Comment #8 from Joel Hutton --- (In reply to Richard Biener from comment #3)

Hi Richard,

> So the vectorization issue would be that basic-block vectorization doesn't
> catch this in a very nice way - on x86 we pull out the invariant computation
> and have a vectorized (outer) loop storing to d.

Just a small clarification: do you mean to say that there is a difference between the way x86 and aarch64 handle this? As far as I can see, they handle this in the same way.
[Bug target/96827] [10/11 Regression] __m128i from _mm_set_epi32 is backwards with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96827

Joel Hutton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Joel Hutton --- This is fixed on trunk by 97b798d8. Unfortunately, I made a typo in the commit message.
[Bug libgomp/96837] A false if clause in "omp parallel" seriously affects the performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96837

Joel Hutton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joel.hutton at arm dot com

--- Comment #5 from Joel Hutton --- Sorry, my commit does not address this bug. I made a typo in the PR number in the commit message.
[Bug target/96827] [10/11 Regression] __m128i from _mm_set_epi32 is backwards with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96827

Joel Hutton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|REOPENED                    |RESOLVED

--- Comment #14 from Joel Hutton --- Backported to GCC 10.