[Bug tree-optimization/88492] SLP optimization generates ugly code

2021-04-14 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88492

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ptomsich at gcc dot gnu.org

--- Comment #6 from ptomsich at gcc dot gnu.org ---
With the current master, the test case generates (with -mcpu=neoverse-n1):

.arch armv8.2-a+crc+fp16+rcpc+dotprod+profile
.file   "pr88492.c"
.text
.align  2
.p2align 5,,15
.global test_slp
.type   test_slp, %function
test_slp:
.LFB0:
.cfi_startproc
ldr q2, [x0]
adrp    x1, .LC0
ldr q16, [x1, #:lo12:.LC0]
uxtl    v4.8h, v2.8b
uxtl2   v2.8h, v2.16b
uxtl    v0.4s, v4.4h
uxtl    v6.4s, v2.4h
uxtl2   v4.4s, v4.8h
uxtl2   v2.4s, v2.8h
mov v1.16b, v0.16b
mov v7.16b, v6.16b
mov v5.16b, v4.16b
mov v3.16b, v2.16b
tbl v0.16b, {v0.16b - v1.16b}, v16.16b
tbl v6.16b, {v6.16b - v7.16b}, v16.16b
tbl v4.16b, {v4.16b - v5.16b}, v16.16b
tbl v2.16b, {v2.16b - v3.16b}, v16.16b
add v0.4s, v0.4s, v4.4s
add v6.4s, v6.4s, v2.4s
add v0.4s, v0.4s, v6.4s
addv    s0, v0.4s
fmov    w0, s0
ret
.cfi_endproc
.LFE0:
.size   test_slp, .-test_slp

which contrasts with LLVM13 (with -mcpu=neoverse-n1):

test_slp:   // @test_slp
.cfi_startproc
// %bb.0:   // %entry
ldr q0, [x0]
movi    v1.16b, #1
movi    v2.2d, #
udot    v2.4s, v0.16b, v1.16b
addv    s0, v2.4s
fmov    w0, s0
ret
.Lfunc_end0:
.size   test_slp, .Lfunc_end0-test_slp

or (LLVM13 w/o the mcpu-option):

.type   test_slp,@function
test_slp:   // @test_slp
.cfi_startproc
// %bb.0:   // %entry
ldr q0, [x0]
ushll2  v1.8h, v0.16b, #0
ushll   v0.8h, v0.8b, #0
uaddl2  v2.4s, v0.8h, v1.8h
uaddl   v0.4s, v0.4h, v1.4h
add v0.4s, v0.4s, v2.4s
addv    s0, v0.4s
fmov    w0, s0
ret
.Lfunc_end0:
.size   test_slp, .Lfunc_end0-test_slp
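
For reference, the LLVM listings show what the function computes: the sum of the 16 unsigned bytes at the address in x0, returned as a 32-bit value (the udot variant is a dot product with a vector of ones). The source of pr88492.c is not quoted in this comment, so the following is only a scalar sketch of that semantics; the name sum16_ref is made up for illustration and is not the PR's test case.

```c
#include <stdint.h>

/* Hypothetical scalar reference (NOT the source of pr88492.c): sums the
   16 unsigned bytes at B, widening each byte to 32 bits before adding.  */
uint32_t
sum16_ref (const unsigned char *b)
{
  uint32_t sum = 0;
  for (int i = 0; i < 16; i++)
    sum += b[i];
  return sum;
}
```

The maximum result is 16 * 255 = 4080, so the 32-bit accumulator cannot overflow.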

[Bug ipa/98748] Increased precision for points-to analysis

2021-01-19 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98748

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ptomsich at gcc dot gnu.org

--- Comment #1 from ptomsich at gcc dot gnu.org ---
The link (through the author's website) for [0] is
https://zenodo.org/record/61898/files/cclyzer.pdf

[Bug target/108589] ICE: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1932 with -mtune=ampere1a -fno-split-wide-types

2023-01-30 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108589

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ptomsich at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #2 from ptomsich at gcc dot gnu.org ---
We'll handle this one.

[Bug target/108589] ICE: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1932 with -mtune=ampere1a -fno-split-wide-types

2023-01-30 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108589

--- Comment #3 from ptomsich at gcc dot gnu.org ---
I can confirm that the issue reproduces (as expected) with the
'--enable-checking=rtl' configuration.  I'll retest the fix and submit via the
list.

[Bug rtl-optimization/105314] [12 Regression] ifcvt regression in noce_try_store_flag_mask

2022-04-25 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105314

--- Comment #5 from ptomsich at gcc dot gnu.org ---
The fix addresses the issue and generates no new failures on small test cases.
Testing against SPEC is still ongoing and I'll report back once that has
completed.

[Bug rtl-optimization/105314] [12 Regression] ifcvt regression in noce_try_store_flag_mask

2022-04-25 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105314

--- Comment #7 from ptomsich at gcc dot gnu.org ---
The transformation for 
"""
long func2 (long a, long b, long c)
{
  if (c)
    a = 0;
  else
    a = 5;
  return a;
}
"""
into
"""
0006 :
   6:   00163513    seqz    a0,a2
   a:   40a00533    neg     a0,a0
   e:   8915        andi    a0,a0,5
  10:   8082        ret
"""
is correct, as we get
"""
  tmp = c ? 0 : -1;
  a = tmp & 5;
"""
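
The equivalence can be checked at the source level. The sketch below places the original function next to a branchless variant that mirrors the seqz/neg/andi sequence; func2_branchless is a name made up for this illustration and is not part of the report or of GCC.

```c
/* Original form, as quoted in the comment.  */
long
func2 (long a, long b, long c)
{
  (void) b;
  if (c)
    a = 0;
  else
    a = 5;
  return a;
}

/* Hypothetical branchless form mirroring the generated code.  */
long
func2_branchless (long a, long b, long c)
{
  (void) a;
  (void) b;
  long tmp = -(long) (c == 0);  /* seqz + neg: -1 if c == 0, else 0 */
  return tmp & 5;               /* andi: 5 if c == 0, else 0 */
}
```

Both functions return 5 when c is zero and 0 otherwise.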

[Bug target/107786] [13 Regression] ICE in extract_insn, at recog.cc:2791 since r13-4151-gacbb5ef06ee978

2022-11-21 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107786

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ptomsich at gcc dot gnu.org

--- Comment #1 from ptomsich at gcc dot gnu.org ---
Looks like a dependency between a recent change of ours and another change that
is still in discussion.

I'll take this one.

[Bug target/107786] [13 Regression] ICE in extract_insn, at recog.cc:2791 since r13-4151-gacbb5ef06ee978

2022-11-21 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107786

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from ptomsich at gcc dot gnu.org ---
I kicked off some additional testing that will run overnight.
Let's close it for now; I will reopen it if there is further fallout.

[Bug tree-optimization/114010] Unwanted effects of using SSA free lists.

2024-02-21 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114010

--- Comment #6 from ptomsich at gcc dot gnu.org ---
(In reply to Manolis Tsamis from comment #5)
> On of these happens to precede a relevant vector statement and then
> in one case combine does the umlal transformation but in the other not.

Please attach the A/B dump-files for the combine pass.

[Bug tree-optimization/114010] Unwanted effects of using SSA free lists.

2024-02-21 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114010

--- Comment #7 from ptomsich at gcc dot gnu.org ---
(In reply to Manolis Tsamis from comment #5)
> On of these happens to precede a relevant vector statement and then
> in one case combine does the umlal transformation but in the other not.

Please attach the A/B dump-files for the combine pass.

[Bug tree-optimization/114010] Unwanted effects of using SSA free lists.

2024-02-23 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114010

--- Comment #9 from ptomsich at gcc dot gnu.org ---
(In reply to Manolis Tsamis from comment #0) 
> E.g. another loop, non canonicalized names:
> 
> .L120:
>   ldr q30, [x0], 16
>   movi    v29.2s, 0
>   ld2 {v26.16b - v27.16b}, [x4], 32
>   movi    v25.4s, 0
>   zip1    v29.16b, v30.16b, v29.16b
>   zip2    v30.16b, v30.16b, v25.16b
>   umlal   v29.8h, v26.8b, v28.8b
>   umlal2  v30.8h, v26.16b, v28.16b
>   uaddw   v31.4s, v31.4s, v29.4h
>   uaddw2  v31.4s, v31.4s, v29.8h
>   uaddw   v31.4s, v31.4s, v30.4h
>   uaddw2  v31.4s, v31.4s, v30.8h
>   cmp x5, x0
>   bne .L120

Is it just me, or are the zip1 and zip2 instructions dead?

Philipp.

[Bug tree-optimization/114326] Missed optimization for A || B when !B implies A.

2024-03-13 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114326

--- Comment #2 from ptomsich at gcc dot gnu.org ---
To copy the last piece of info from our internal tracker...

LLVM learned this new trick only in the run-up to LLVM 18.
Up until then, GCC and LLVM performed identically on this snippet.
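
The snippet from the report is not quoted in this comment. As an illustration of the pattern in the bug title only, the sketch below uses a made-up condition where !B implies A, so A || B is true for every input and a compiler may fold it to a constant; a_or_b and its operands are assumptions chosen for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example of "A || B when !B implies A".
   A: x and y agree on the bits selected by MASK.
   B: x != y.
   !B means x == y, which makes (x ^ y) zero and therefore A true,
   so the whole expression is always true.  */
bool
a_or_b (uint64_t x, uint64_t y, uint64_t mask)
{
  return ((x ^ y) & mask) == 0 || x != y;
}
```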

[Bug rtl-optimization/116353] [15 Regression] ICE on glibc-2.39: RTL pass: ce2, in expand_simple_binop, at optabs.cc:1264

2024-08-12 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116353

--- Comment #2 from ptomsich at gcc dot gnu.org ---
Thanks for providing an isolated case; this is very helpful.

Note that despite this being reported as 'ce2', it is the same underlying pass
as 'ce1'... just the second time the ce (conditional execution) pass is run.

[Bug rtl-optimization/116353] [15 Regression] ICE on glibc-2.39: RTL pass: ce2, in expand_simple_binop, at optabs.cc:1264 since r15-2890-g72c9b5f438f22c

2024-08-13 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116353

--- Comment #5 from ptomsich at gcc dot gnu.org ---
To add on to the info provided by Manolis, this is the diff for the proposed
fix:

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 3e25f30b67e..da59c907891 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3938,8 +3938,10 @@ bb_ok_for_noce_convert_multiple_sets (basic_block test_bb, unsigned *cost)
   rtx src = SET_SRC (set);

   /* Do not handle anything involving memory loads/stores since it might
-violate data-race-freedom guarantees.  */
-  if (!REG_P (dest) || contains_mem_rtx_p (src))
+violate data-race-freedom guarantees.  Make sure we can force SRC
+to a register as that may be needed in try_emit_cmove_seq.  */
+  if (!REG_P (dest) || contains_mem_rtx_p (src)
+ || !noce_can_force_operand (src))
return false;

   /* Destination and source must be appropriate.  */

[Bug middle-end/116358] [15 Regression] undefined reference to `__umindi3' at -O3 when compiling with SVE

2024-08-13 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116358

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ptomsich at gcc dot gnu.org

[Bug target/108589] ICE: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1932 with -mtune=ampere1a -fno-split-wide-types

2023-05-09 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108589

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #8 from ptomsich at gcc dot gnu.org ---
The fixes were committed a while ago.
Marking as RESOLVED.

[Bug debug/110308] [14 Regression] ICE on audiofile-0.3.6: RTL: vartrack: Segmentation fault in mode_to_precision(machine_mode)

2023-06-20 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110308

--- Comment #8 from ptomsich at gcc dot gnu.org ---
@mtsamis: Could you attach the proposed patch as an attachment (to allow easy
application and testing that this resolves the ICE)?

[Bug debug/110308] [14 Regression] ICE on audiofile-0.3.6: RTL: vartrack: Segmentation fault in mode_to_precision(machine_mode)

2023-06-20 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110308

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug middle-end/116845] gcc.dg/pr109393.c test fails on ilp32 targets (and maybe others)

2024-09-26 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116845

--- Comment #7 from ptomsich at gcc dot gnu.org ---
Our team will also be busy with other priorities for the next few weeks.
We will attempt to schedule this before the end of stage 1, but might still
have to delay until stage 3.

[Bug rtl-optimization/117836] [meta-bug] favoid-store-forwarding issues

2024-12-06 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117836
Bug 117836 depends on bug 117816, which changed state.

Bug 117816 Summary: ICE: in rtl_verify_bb_insns, at cfgrtl.cc:2837: flow 
control insn inside a basic block with -O -favoid-store-forwarding 
-fnon-call-exceptions -fno-forward-propagate -finstrument-functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117816

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug rtl-optimization/117816] ICE: in rtl_verify_bb_insns, at cfgrtl.cc:2837: flow control insn inside a basic block with -O -favoid-store-forwarding -fnon-call-exceptions -fno-forward-propagate -fins

2024-12-06 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117816

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from ptomsich at gcc dot gnu.org ---
Fix merged onto master.

[Bug rtl-optimization/48696] Horrible bitfield code generation on x86

2024-12-27 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696

--- Comment #22 from ptomsich at gcc dot gnu.org ---
Agreed. It would be ideal not to have to deal with this in the store-forward
avoidance pass (i.e., catching it before or during lowering).

Given that the store-forward avoidance pass (mostly) catches this case, it
seems reasonable to ensure that cases such as this can be cleaned up in the
pass nonetheless.

[Bug rtl-optimization/117712] [15 regression] ICE when building x265: internal compiler error: in expand_fix, at optabs.cc:5936 since r15-2890

2025-02-04 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117712

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |konstantinos.eleftheriou@vrull.eu
 Status|NEW |ASSIGNED

[Bug rtl-optimization/117872] wrong code with -O -maccumulate-outgoing-args --param=store-forwarding-max-distance=1000 -favoid-store-forwarding

2024-12-17 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117872

--- Comment #1 from ptomsich at gcc dot gnu.org ---
Our patch (see
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671800.html) for
PR117835 should also fix this issue.

[Bug rtl-optimization/48696] Horrible bitfield code generation on x86

2024-12-20 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ptomsich at gcc dot gnu.org

--- Comment #20 from ptomsich at gcc dot gnu.org ---
(In reply to Konstantinos Eleftheriou from comment #19)
> The two loads could be optimized, given that the small load is contained in
> the larger one. This could be handled by "reload" if we extend the size of
> the small load to the size of the large one.

Note that we are investigating adding the necessary infrastructure to the
store-forward-avoidance pass to merge the two differently-sized loads.
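
As a source-level analogy only (the pass works on RTL, and this is not its implementation), the sketch below shows why a small load that is contained in a larger one is redundant: the narrow value can be derived from the wide one with a shift and a truncation. The function name byte_from_wide is hypothetical, and the 4-byte/1-byte sizes are assumptions chosen for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration: returns the byte at P[K] (K must be < 4)
   using only one 4-byte load from P, instead of a separate 1-byte load.  */
uint8_t
byte_from_wide (const unsigned char *p, unsigned k)
{
  uint32_t wide;
  memcpy (&wide, p, sizeof wide);       /* the single wide load */

  /* Find out at run time where P[K] lives inside WIDE on this host.  */
  const uint32_t probe = 1;
  int little_endian = *(const unsigned char *) &probe == 1;
  unsigned shift = little_endian ? 8 * k : 8 * (3 - k);

  return (uint8_t) (wide >> shift);     /* extract instead of reloading */
}
```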

[Bug middle-end/116860] [15 Regression] New test case gcc.dg/tree-ssa/fold-xor-and-or.c from r15-3866-ga88d6c6d777ad7 fails

2025-01-08 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116860

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |konstantinos.eleftheriou@vrull.eu
   Last reconfirmed||2025-01-08
 Status|UNCONFIRMED |ASSIGNED

[Bug rtl-optimization/117922] [15 Regression] 1000% compilation time slow down on the testcase from pr26854

2025-01-10 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117922

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ptomsich at gcc dot gnu.org
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |cmuellner at gcc dot gnu.org

[Bug tree-optimization/117079] [15 Regression] FAIL: gcc.target/i386/pr105493.c since r15-2820-gab18785840d7b8

2025-01-10 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117079

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |cmuellner at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug rtl-optimization/119160] wrong code with -O2 -finstrument-functions-once -favoid-store-forwarding -fnon-call-exceptions -fschedule-insns -mgeneral-regs-only

2025-03-11 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119160

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |konstantinos.eleftheriou@vrull.eu
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2025-03-11
 Ever confirmed|0   |1

[Bug testsuite/119862] [16 Regression] gcc.dg/pr119160.c FAILs

2025-04-22 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119862

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||ptomsich at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |konstantinos.eleftheriou@vrull.eu

[Bug rtl-optimization/119884] [16 Regression] ICE: in emit_move_insn, at expr.cc:4636 with -O2 -fno-dse -favoid-store-forwarding

2025-04-22 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119884

ptomsich at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |konstantinos.eleftheriou@vrull.eu
 CC||ptomsich at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2025-04-22

[Bug rtl-optimization/117836] [meta-bug] favoid-store-forwarding issues

2025-04-22 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117836

--- Comment #1 from ptomsich at gcc dot gnu.org ---
We'll continue using this meta-bug to track issues that should be addressed
before we enable -favoid-store-forwarding by default at -O2.

[Bug testsuite/116860] Move optimization from match.pd into tree-ssa-reassoc (optimize_range_tests) where it can be more effective

2025-04-25 Thread ptomsich at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116860

--- Comment #13 from ptomsich at gcc dot gnu.org ---
v3 has been on the list since March as
https://gcc.gnu.org/pipermail/gcc-patches/2025-March/677788.html