[Bug tree-optimization/88492] SLP optimization generates ugly code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492

ptomsich at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ptomsich at gcc dot gnu.org

--- Comment #6 from ptomsich at gcc dot gnu.org ---
With the current master, the test case generates (with -mcpu=neoverse-n1):

        .arch armv8.2-a+crc+fp16+rcpc+dotprod+profile
        .file   "pr88492.c"
        .text
        .align  2
        .p2align 5,,15
        .global test_slp
        .type   test_slp, %function
test_slp:
.LFB0:
        .cfi_startproc
        ldr     q2, [x0]
        adrp    x1, .LC0
        ldr     q16, [x1, #:lo12:.LC0]
        uxtl    v4.8h, v2.8b
        uxtl2   v2.8h, v2.16b
        uxtl    v0.4s, v4.4h
        uxtl    v6.4s, v2.4h
        uxtl2   v4.4s, v4.8h
        uxtl2   v2.4s, v2.8h
        mov     v1.16b, v0.16b
        mov     v7.16b, v6.16b
        mov     v5.16b, v4.16b
        mov     v3.16b, v2.16b
        tbl     v0.16b, {v0.16b - v1.16b}, v16.16b
        tbl     v6.16b, {v6.16b - v7.16b}, v16.16b
        tbl     v4.16b, {v4.16b - v5.16b}, v16.16b
        tbl     v2.16b, {v2.16b - v3.16b}, v16.16b
        add     v0.4s, v0.4s, v4.4s
        add     v6.4s, v6.4s, v2.4s
        add     v0.4s, v0.4s, v6.4s
        addv    s0, v0.4s
        fmov    w0, s0
        ret
        .cfi_endproc
.LFE0:
        .size   test_slp, .-test_slp

which contrasts with LLVM13 (with -mcpu=neoverse-n1):

test_slp:                               // @test_slp
        .cfi_startproc
// %bb.0:                               // %entry
        ldr     q0, [x0]
        movi    v1.16b, #1
        movi    v2.2d, #
        udot    v2.4s, v0.16b, v1.16b
        addv    s0, v2.4s
        fmov    w0, s0
        ret
.Lfunc_end0:
        .size   test_slp, .Lfunc_end0-test_slp

or (LLVM13 w/o the mcpu-option):

        .type   test_slp,@function
test_slp:                               // @test_slp
        .cfi_startproc
// %bb.0:                               // %entry
        ldr     q0, [x0]
        ushll2  v1.8h, v0.16b, #0
        ushll   v0.8h, v0.8b, #0
        uaddl2  v2.4s, v0.8h, v1.8h
        uaddl   v0.4s, v0.4h, v1.4h
        add     v0.4s, v0.4s, v2.4s
        addv    s0, v0.4s
        fmov    w0, s0
        ret
.Lfunc_end0:
        .size   test_slp, .Lfunc_end0-test_slp
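
For readers less familiar with the dot-product instruction, the LLVM sequence with -mcpu=neoverse-n1 can be modelled in scalar C. This is only a sketch of what "movi #1; udot; addv" computes, not the test case from the PR. It assumes that the truncated "movi v2.2d, #" zeroes the accumulator, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference: the sum of 16 unsigned bytes. */
static uint32_t sum16_scalar(const uint8_t *b)
{
    uint32_t sum = 0;
    for (int i = 0; i < 16; i++)
        sum += b[i];
    return sum;
}

/* Model of: movi v1.16b, #1; udot v2.4s, v0.16b, v1.16b; addv s0, v2.4s.
   Assumption: the accumulator v2 starts out as zero. */
static uint32_t sum16_udot_model(const uint8_t *b)
{
    uint32_t lane[4] = {0, 0, 0, 0};            /* v2.4s accumulator */
    for (int l = 0; l < 4; l++)                 /* udot: one lane per 4 bytes */
        for (int k = 0; k < 4; k++)
            lane[l] += (uint32_t)b[4 * l + k] * 1u;  /* v1 holds all ones */
    return lane[0] + lane[1] + lane[2] + lane[3];    /* addv: add across lanes */
}

/* Small self-check over one input vector. */
void check_sum16(void)
{
    uint8_t v[16];
    for (int i = 0; i < 16; i++)
        v[i] = (uint8_t)(17 * i + 3);
    assert(sum16_scalar(v) == sum16_udot_model(v));
}
```

Multiplying by a vector of ones turns the dot product into a plain widening sum, which is why the whole reduction fits into a single udot followed by addv.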
[Bug ipa/98748] Increased precision for points-to analysis
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98748

ptomsich at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ptomsich at gcc dot gnu.org

--- Comment #1 from ptomsich at gcc dot gnu.org ---
The link (through the author's website) for [0] is
https://zenodo.org/record/61898/files/cclyzer.pdf
[Bug target/108589] ICE: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1932 with -mtune=ampere1a -fno-split-wide-types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108589

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |ptomsich at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

--- Comment #2 from ptomsich at gcc dot gnu.org ---
We'll handle this one.
[Bug target/108589] ICE: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1932 with -mtune=ampere1a -fno-split-wide-types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108589 --- Comment #3 from ptomsich at gcc dot gnu.org --- I can confirm that the issue reproduces (as expected) with the '--enable-checking=rtl' configuration. I'll retest the fix and submit via the list.
[Bug rtl-optimization/105314] [12 Regression] ifcvt regression in noce_try_store_flag_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105314 --- Comment #5 from ptomsich at gcc dot gnu.org --- The fix addresses the issue and generates no new failures on small test cases. Testing against SPEC is still ongoing and I'll report back once that has completed.
[Bug rtl-optimization/105314] [12 Regression] ifcvt regression in noce_try_store_flag_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105314

--- Comment #7 from ptomsich at gcc dot gnu.org ---
The transformation for
"""
long func2 (long a, long b, long c)
{
  if (c)
    a = 0;
  else
    a = 5;
  return a;
}
"""
into
"""
0006 :
   6:   00163513   seqz  a0,a2
   a:   40a00533   neg   a0,a0
   e:   8915       andi  a0,a0,5
  10:   8082       ret
"""
is correct, as we get
"""
tmp = c ? 0 : -1;
a = tmp & 5;
"""
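
The equivalence claimed in the comment above can be checked with a small C sketch that mirrors the three instructions one by one. The names func2_branch and func2_mask are hypothetical and only serve this illustration.

```c
#include <assert.h>

/* Branchy form, as quoted in the comment. */
static long func2_branch(long a, long b, long c)
{
    (void)b;
    if (c)
        a = 0;
    else
        a = 5;
    return a;
}

/* Mask form, one statement per instruction of the generated code. */
static long func2_mask(long a, long b, long c)
{
    (void)a;
    (void)b;
    long tmp = (c == 0);  /* seqz a0,a2: 1 if c is zero, else 0 */
    tmp = -tmp;           /* neg  a0,a0: -1 (all ones) or 0 */
    return tmp & 5;       /* andi a0,a0,5: 5 or 0 */
}

/* Small self-check over a few values of c. */
void check_func2(void)
{
    const long cs[] = {0, 1, -1, 42, -4096};
    for (unsigned i = 0; i < sizeof cs / sizeof cs[0]; i++)
        assert(func2_branch(7, 9, cs[i]) == func2_mask(7, 9, cs[i]));
}
```

For c == 0 the mask is all ones and the result is 5; for any other c the mask is 0 and the result is 0, which matches the if/else.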
[Bug target/107786] [13 Regression] ICE in extract_insn, at recog.cc:2791 since r13-4151-gacbb5ef06ee978
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107786

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |ptomsich at gcc dot gnu.org

--- Comment #1 from ptomsich at gcc dot gnu.org ---
Looks like a dependency between a recent change of ours and another change that
is still in discussion. I'll take this one.
[Bug target/107786] [13 Regression] ICE in extract_insn, at recog.cc:2791 since r13-4151-gacbb5ef06ee978
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107786

ptomsich at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #4 from ptomsich at gcc dot gnu.org ---
I kicked off some additional testing that will run overnight. Let's close it
for now; I will reopen it if there is further fallout.
[Bug tree-optimization/114010] Unwanted effects of using SSA free lists.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114010

--- Comment #6 from ptomsich at gcc dot gnu.org ---
(In reply to Manolis Tsamis from comment #5)
> On of these happens to precede a relevant vector statement and then
> in one case combine does the umlal transformation but in the other not.

Please attach the A/B dump-files for the combine pass.
[Bug tree-optimization/114010] Unwanted effects of using SSA free lists.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114010

--- Comment #7 from ptomsich at gcc dot gnu.org ---
(In reply to Manolis Tsamis from comment #5)
> On of these happens to precede a relevant vector statement and then
> in one case combine does the umlal transformation but in the other not.

Please attach the A/B dump-files for the combine pass.
[Bug tree-optimization/114010] Unwanted effects of using SSA free lists.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114010

--- Comment #9 from ptomsich at gcc dot gnu.org ---
(In reply to Manolis Tsamis from comment #0)
> E.g. another loop, non canonicalized names:
>
> .L120:
>         ldr     q30, [x0], 16
>         movi    v29.2s, 0
>         ld2     {v26.16b - v27.16b}, [x4], 32
>         movi    v25.4s, 0
>         zip1    v29.16b, v30.16b, v29.16b
>         zip2    v30.16b, v30.16b, v25.16b
>         umlal   v29.8h, v26.8b, v28.8b
>         umlal2  v30.8h, v26.16b, v28.16b
>         uaddw   v31.4s, v31.4s, v29.4h
>         uaddw2  v31.4s, v31.4s, v29.8h
>         uaddw   v31.4s, v31.4s, v30.4h
>         uaddw2  v31.4s, v31.4s, v30.8h
>         cmp     x5, x0
>         bne     .L120

Is it just me, or are the zip1 and zip2 instructions dead?

Philipp.
[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326 --- Comment #2 from ptomsich at gcc dot gnu.org --- To copy the last piece of info from our internal tracker... LLVM learned this new trick only in the run-up to LLVM 18. Up until then, GCC and LLVM performed identically on this snippet.
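
The snippet from the bug report is not quoted in this message. As a hypothetical illustration of the pattern named in the subject line, take A as "x < 10" and B as "x > 5": here !B (x <= 5) implies A, so A || B holds for every x and can be folded to a constant. The predicates and function names below are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* A: x < 10, B: x > 5.  !B means x <= 5, which implies A. */
static bool a_or_b(int x)
{
    return x < 10 || x > 5;
}

/* What the expression folds to once "!B implies A" is known. */
static bool folded(int x)
{
    (void)x;
    return true;
}

/* Small self-check over a range that covers both sides of each bound. */
void check_a_or_b(void)
{
    for (int x = -100; x <= 100; x++)
        assert(a_or_b(x) == folded(x));
}
```

The reasoning is a two-case split: if B holds, A || B is true; if B does not hold, A holds by the implication, so A || B is true again.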
[Bug rtl-optimization/116353] [15 Regression] ICE on glibc-2.39: RTL pass: ce2, in expand_simple_binop, at optabs.cc:1264
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116353 --- Comment #2 from ptomsich at gcc dot gnu.org --- Thanks for providing an isolated case, this is very helpful. Note that despite this being reported as 'ce2', it is the same underlying pass as 'ce1'... just the second time the ce (conditional execution) pass is run.
[Bug rtl-optimization/116353] [15 Regression] ICE on glibc-2.39: RTL pass: ce2, in expand_simple_binop, at optabs.cc:1264 since r15-2890-g72c9b5f438f22c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116353

--- Comment #5 from ptomsich at gcc dot gnu.org ---
To add on to the info provided by Manolis, this is the diff for the proposed
fix:

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 3e25f30b67e..da59c907891 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3938,8 +3938,10 @@ bb_ok_for_noce_convert_multiple_sets (basic_block test_bb, unsigned *cost)
       rtx src = SET_SRC (set);
 
       /* Do not handle anything involving memory loads/stores since it might
-         violate data-race-freedom guarantees.  */
-      if (!REG_P (dest) || contains_mem_rtx_p (src))
+         violate data-race-freedom guarantees.  Make sure we can force SRC
+         to a register as that may be needed in try_emit_cmove_seq.  */
+      if (!REG_P (dest) || contains_mem_rtx_p (src)
+          || !noce_can_force_operand (src))
         return false;
 
       /* Destination and source must be appropriate.  */
[Bug middle-end/116358] [15 Regression] undefined reference to `__umindi3' at -O3 when compiling with SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116358

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |ptomsich at gcc dot gnu.org
[Bug target/108589] ICE: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1932 with -mtune=ampere1a -fno-split-wide-types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108589

ptomsich at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #8 from ptomsich at gcc dot gnu.org ---
The fixes were committed a while ago. Marking as RESOLVED.
[Bug debug/110308] [14 Regression] ICE on audiofile-0.3.6: RTL: vartrack: Segmentation fault in mode_to_precision(machine_mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110308 --- Comment #8 from ptomsich at gcc dot gnu.org --- @mtsamis: Could you attach the proposed patch as an attachment (to allow easy application and testing that this resolves the ICE)?
[Bug debug/110308] [14 Regression] ICE on audiofile-0.3.6: RTL: vartrack: Segmentation fault in mode_to_precision(machine_mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110308

ptomsich at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
[Bug middle-end/116845] gcc.dg/pr109393.c test fails on ilp32 targets (and maybe others)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116845

--- Comment #7 from ptomsich at gcc dot gnu.org ---
Our team will also be busy with other priorities for the next few weeks.
We will attempt to schedule this before the end of stage 1, but might still
have to delay it until stage 3.
[Bug rtl-optimization/117836] [meta-bug] favoid-store-forwarding issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117836
Bug 117836 depends on bug 117816, which changed state.

Bug 117816 Summary: ICE: in rtl_verify_bb_insns, at cfgrtl.cc:2837: flow control insn inside a basic block with -O -favoid-store-forwarding -fnon-call-exceptions -fno-forward-propagate -finstrument-functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117816

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
[Bug rtl-optimization/117816] ICE: in rtl_verify_bb_insns, at cfgrtl.cc:2837: flow control insn inside a basic block with -O -favoid-store-forwarding -fnon-call-exceptions -fno-forward-propagate -fins
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117816

ptomsich at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #4 from ptomsich at gcc dot gnu.org ---
Fix merged onto master.
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696 --- Comment #22 from ptomsich at gcc dot gnu.org --- Agreed. It would be ideal not to have to deal with this in the store-forward avoidance pass (i.e., catching it before or during lowering). Given that the store-forward avoidance pass (mostly) catches this case, it seems reasonable to ensure that cases such as this can be cleaned up in the pass nonetheless.
[Bug rtl-optimization/117712] [15 regression] ICE when building x265: internal compiler error: in expand_fix, at optabs.cc:5936 since r15-2890
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117712

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |konstantinos.eleftheriou@vrull.eu
             Status|NEW                           |ASSIGNED
[Bug rtl-optimization/117872] wrong code with -O -maccumulate-outgoing-args --param=store-forwarding-max-distance=1000 -favoid-store-forwarding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117872 --- Comment #1 from ptomsich at gcc dot gnu.org --- Our patch (see https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671800.html) for PR117835 should also fix this issue.
[Bug rtl-optimization/48696] Horrible bitfield code generation on x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696

ptomsich at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ptomsich at gcc dot gnu.org

--- Comment #20 from ptomsich at gcc dot gnu.org ---
(In reply to Konstantinos Eleftheriou from comment #19)
> The two loads could be optimized, given that the small load is contained in
> the larger one. This could be handled by "reload" if we extend the size of
> the small load to the size of the large one.

Note that we are investigating adding the necessary infrastructure to the
store-forward-avoidance pass to merge the two differently-sized loads.
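
The idea of merging a small load into a larger one that contains it can be sketched at the source level. This is only an illustration of the principle in C, not of the pass or of the code from the PR. The offsets (an 8-byte load with a 4-byte load at bytes 4..7) and all function names are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise. */
static int is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Two loads: an 8-byte load plus a 4-byte load of bytes 4..7. */
static uint32_t narrow_by_second_load(const unsigned char *p, uint64_t *wide)
{
    uint32_t narrow;
    memcpy(wide, p, sizeof *wide);
    memcpy(&narrow, p + 4, sizeof narrow);
    return narrow;
}

/* One load: the 4-byte value is extracted from the 8-byte value. */
static uint32_t narrow_from_wide(const unsigned char *p, uint64_t *wide)
{
    memcpy(wide, p, sizeof *wide);
    /* Bytes 4..7 are the high half on little-endian hosts and the
       low half on big-endian hosts. */
    return is_little_endian() ? (uint32_t)(*wide >> 32) : (uint32_t)*wide;
}

/* Small self-check: both variants must agree on both results. */
void check_merged_load(void)
{
    const unsigned char buf[8] = {0x01, 0x23, 0x45, 0x67,
                                  0x89, 0xab, 0xcd, 0xef};
    uint64_t w1, w2;
    assert(narrow_by_second_load(buf, &w1) == narrow_from_wide(buf, &w2));
    assert(w1 == w2);
}
```

Once the small value is derived from the large one with a shift or truncation, only a single memory access remains.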
[Bug middle-end/116860] [15 Regression] New test case gcc.dg/tree-ssa/fold-xor-and-or.c from r15-3866-ga88d6c6d777ad7 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116860

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
     Ever confirmed|0                             |1
           Assignee|unassigned at gcc dot gnu.org |konstantinos.eleftheriou@vrull.eu
   Last reconfirmed|                              |2025-01-08
             Status|UNCONFIRMED                   |ASSIGNED
[Bug rtl-optimization/117922] [15 Regression] 1000% compilation time slow down on the testcase from pr26854
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117922

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |ptomsich at gcc dot gnu.org
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org
[Bug tree-optimization/117079] [15 Regression] FAIL: gcc.target/i386/pr105493.c since r15-2820-gab18785840d7b8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117079

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |cmuellner at gcc dot gnu.org
             Status|NEW                           |ASSIGNED
[Bug rtl-optimization/119160] wrong code with -O2 -finstrument-functions-once -favoid-store-forwarding -fnon-call-exceptions -fschedule-insns -mgeneral-regs-only
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119160

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |konstantinos.eleftheriou@vrull.eu
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2025-03-11
     Ever confirmed|0                             |1
[Bug testsuite/119862] [16 Regression] gcc.dg/pr119160.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119862

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
                 CC|                              |ptomsich at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |konstantinos.eleftheriou@vrull.eu
[Bug rtl-optimization/119884] [16 Regression] ICE: in emit_move_insn, at expr.cc:4636 with -O2 -fno-dse -favoid-store-forwarding
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119884

ptomsich at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |konstantinos.eleftheriou@vrull.eu
                 CC|                              |ptomsich at gcc dot gnu.org
             Status|UNCONFIRMED                   |ASSIGNED
     Ever confirmed|0                             |1
   Last reconfirmed|                              |2025-04-22
[Bug rtl-optimization/117836] [meta-bug] favoid-store-forwarding issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117836

--- Comment #1 from ptomsich at gcc dot gnu.org ---
We'll continue using this meta-bug to track issues that should be addressed
before we enable -favoid-store-forwarding by default at -O2.
[Bug testsuite/116860] Move optimization from match.pd into tree-ssa-reassoc (optimize_range_tests) where it can be more effective
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116860 --- Comment #13 from ptomsich at gcc dot gnu.org --- v3 has been on the list since March as https://gcc.gnu.org/pipermail/gcc-patches/2025-March/677788.html