[Bug tree-optimization/106293] [13 regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-08-04 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293 Jan Hubicka changed: What|Removed |Added Summary|[13/14 Regression] |[13 regression] 456.hmmer

[Bug middle-end/110857] aarch64-linux-gnu profiledbootstrap broken

2023-08-04 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110857 Jan Hubicka changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed|

[Bug middle-end/110857] aarch64-linux-gnu profiledbootstrap broken

2023-08-06 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110857 Jan Hubicka changed: What|Removed |Added Resolution|--- |FIXED Status|WAITING

[Bug target/110727] [14 Regression] gcc.target/aarch64/sve/aarch64-sve.exp has two new failures since commit 061f74c0673

2023-08-06 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110727 Jan Hubicka changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/94427] 456.hmmer is 8-17% slower when compiled at -Ofast than with GCC 9

2023-08-07 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94427 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #3

[Bug tree-optimization/94427] 456.hmmer is 8-17% slower when compiled at -Ofast than with GCC 9

2023-08-07 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94427 --- Comment #4 from Jan Hubicka --- It is the same loop - it was float only in my mind (since the function returns a float value :) With loop splitting we no longer have the last iteration check, but we still have the underflow checks that are inde

[Bug target/110899] RFE: Attributes preserve_most and preserve_all

2023-08-07 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #6

[Bug tree-optimization/110963] [14 Regression] Dead Code Elimination Regression since r14-2946-g46c8c225455

2023-08-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110963 --- Comment #7 from Jan Hubicka --- We don't consider main cold, but executed once: code out of loops is optimized for size, but anything in loops is optimized according to -O setting. I did not really think of users overwriting it by hot attri

[Bug middle-end/110972] New: 13% fatigue regression between g:bb3ceeb6520c13fc (2023-08-07 21:09) and g:d9dc70cc65becca9 (2023-08-08 13:30)

2023-08-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110972 Bug ID: 110972 Summary: 13% fatigue regression between g:bb3ceeb6520c13fc (2023-08-07 21:09) and g:d9dc70cc65becca9 (2023-08-08 13:30) Product: gcc Version: 13.1

[Bug middle-end/110972] 13% fatigue regression between g:bb3ceeb6520c13fc (2023-08-07 21:09) and g:d9dc70cc65becca9 (2023-08-08 13:30)

2023-08-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110972 --- Comment #1 from Jan Hubicka --- The following patches in the range look like they may cause the difference: commit d9f3ea61fe36e2de3354b90b65ff8245099114c9 Author: Richard Biener Date: Mon Aug 7 14:44:20 2023 +0200 tree-optimization

[Bug middle-end/110973] New: 9% namd regression between g:c2a447d840476dbd (2023-08-03 18:47) and g:73da34a538ddc2ad (2023-08-09 20:17)

2023-08-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110973 Bug ID: 110973 Summary: 9% namd regression between g:c2a447d840476dbd (2023-08-03 18:47) and g:73da34a538ddc2ad (2023-08-09 20:17) Product: gcc Version: 13.1.0

[Bug middle-end/110975] New: Missed unlooping

2023-08-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110975 Bug ID: 110975 Summary: Missed unlooping Product: gcc Version: 13.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee

[Bug tree-optimization/106293] [13 regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-08-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293 Jan Hubicka changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill

[Bug middle-end/110975] Missed unlooping

2023-08-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110975 Jan Hubicka changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/110971] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: in operator/, at sreal.cc:261

2023-08-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110971 --- Comment #1 from Jan Hubicka --- The problem here is that the split conditional is predicted with probability 0: if (b.0_1 > a.4_14) goto ; [0.00%] else goto ; [100.00%] this is caused by jump threading; [local count: 118111

[Bug tree-optimization/110988] [14 regression] ICE when building 523.xalancbmk_r with pgo and lto

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110988 Jan Hubicka changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 110988, which changed state. Bug 110988 Summary: [14 regression] ICE when building 523.xalancbmk_r with pgo and lto https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110988 What|Removed |Add

[Bug tree-optimization/110971] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: in operator/, at sreal.cc:261

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110971 Jan Hubicka changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/110940] [14 Regression] ICE at -O3 on x86_64-linux-gnu: in apply_scale, at profile-count.h:1180

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110940 Jan Hubicka changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/110923] [14 Regression] with-build-config=bootstrap-lto-lean and `make profile-bootstrap` ICEs during build during lsplit pass

2023-08-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110923 Jan Hubicka changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/111064] New: 5-10% regression of parest on icelake between g:d073e2d75d9ed492de9a8dc6970e5b69fae20e5a (Aug 15 2023) and g:9ade70bb86c8744f4416a48bb69cf4705f00905a (Aug 16)

2023-08-18 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111064 Bug ID: 111064 Summary: 5-10% regression of parest on icelake between g:d073e2d75d9ed492de9a8dc6970e5b69fae20e5a (Aug 15 2023) and g:9ade70bb86c8744f4416a48bb69cf4705f00905a

[Bug target/111064] 5-10% regression of parest on icelake between g:d073e2d75d9ed492de9a8dc6970e5b69fae20e5a (Aug 15 2023) and g:9ade70bb86c8744f4416a48bb69cf4705f00905a (Aug 16)

2023-08-18 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111064 --- Comment #1 from Jan Hubicka --- Maybe commit 3064d1f5c48cb6ce1b4133570dd08ecca8abb52d Author: liuhongt Date: Thu Aug 10 11:41:39 2023 +0800 Software mitigation: Disable gather generation in vectorization for GDS affected Intel Proces

[Bug middle-end/111054] [14 Regression] ICE: in to_sreal, at profile-count.cc:472 with -O3 -fno-guess-branch-probability

2023-08-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111054 --- Comment #2 from Jan Hubicka --- This is a missing check for profile presence (we can not convert undefined probability to sreal). I will fix that.
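The guard described in the comment above (check that a profile is present before converting a probability) can be sketched as a toy model. This is only an illustration, not GCC's code: `toy_probability`, `toy_to_real` and `toy_scale` are hypothetical names invented for this sketch, and the fallback value is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a branch probability that may be undefined.
   This is NOT GCC's profile_probability class. */
struct toy_probability {
    bool initialized;   /* false when no profile information is present */
    unsigned value;     /* probability scaled to the range 0..10000 */
};

/* Convert to a real number; only meaningful for a defined probability. */
static double toy_to_real(struct toy_probability p)
{
    return p.value / 10000.0;
}

/* Scale a count by a probability, checking for profile presence first.
   With an undefined probability the conversion is skipped and the
   caller-supplied fallback is returned instead. */
double toy_scale(double count, struct toy_probability p, double fallback)
{
    if (!p.initialized)
        return fallback;
    return count * toy_to_real(p);
}
```

Skipping the conversion for an undefined value is the point of the sketch; how the real fix chooses its fallback is not stated in the comment.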

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-26 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157 --- Comment #4 from Jan Hubicka --- So here ipa-modref declares the field dead, while ipa-prop determines its value even if it is unused and makes it used later? I think dead argument is probably better than optimizing out one store, so I think

[Bug middle-end/110973] 9% 444.namd regression between g:c2a447d840476dbd (2023-08-03 18:47) and g:73da34a538ddc2ad (2023-08-09 20:17)

2023-08-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110973 --- Comment #5 from Jan Hubicka --- Note that some (not all?) namd scores seem to be back to pre-regression https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=798.120.0 https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=791.120.0 ht

[Bug tree-optimization/111498] New: 951% profile quality regression between g:93996cfb308ffc63 (2023-09-18 03:40) and g:95d2ce05fb32e663 (2023-09-19 03:22)

2023-09-20 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111498 Bug ID: 111498 Summary: 951% profile quality regression between g:93996cfb308ffc63 (2023-09-18 03:40) and g:95d2ce05fb32e663 (2023-09-19 03:22) Product: gcc Vers

[Bug middle-end/111551] New: Fix for PR106081 is not working with profile feedback on imagemagick

2023-09-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111551 Bug ID: 111551 Summary: Fix for PR106081 is not working with profile feedback on imagemagick Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal

[Bug middle-end/111552] New: 549.fotonik3d_r regression with -O2 -flto -march=native on zen between g:85d613da341b7630 (2022-06-21 15:51) and g:ecd11acacd6be57a (2022-07-01 16:07)

2023-09-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111552 Bug ID: 111552 Summary: 549.fotonik3d_r regression with -O2 -flto -march=native on zen between g:85d613da341b7630 (2022-06-21 15:51) and g:ecd11acacd6be57a (2022-07-01

[Bug middle-end/111573] New: lambda functions often not inlined and optimized out

2023-09-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111573 Bug ID: 111573 Summary: lambda functions often not inlined and optimized out Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Compon

[Bug ipa/59948] Optimize std::function

2023-09-25 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59948 --- Comment #8 from Jan Hubicka --- Trunk optimized stuff return 0, but fails to optimize out functions which become unused after indirect inlining. With -fno-early-inlining we end up with: int m () { void * D.48296; int __args#0; struct

[Bug libstdc++/116140] [15 Regression] 5-35% slowdown of 483.xalancbmk and 523.xalancbmk_r since r15-2356-ge69456ff9a54ba

2024-08-01 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116140 Jan Hubicka changed: What|Removed |Added Last reconfirmed||2024-08-01 Status|UNCONFIRMED

[Bug ipa/116296] [13/14/15 Regression] internal compiler error: in merge, at ipa-modref-tree.cc:176 at -O3

2024-08-12 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116296 --- Comment #2 from Jan Hubicka --- It is most likely some problem with computing bit offsets for the alias oracle. I guess multiplying that number by sizeof (long) * 11 * 11 * 8 triggers overflow. Probably harmless for -fdisable-checking gener

[Bug middle-end/116582] New: gather is a win in some cases on zen CPUs

2024-09-03 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116582 Bug ID: 116582 Summary: gather is a win in some cases on zen CPUs Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle

[Bug target/116582] gather is a win in some cases on zen CPUs

2024-09-03 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116582 --- Comment #2 from Jan Hubicka --- It is mysterious. I was looking into why in some cases the gather is a win in a micro-benchmark and a loss in a real benchmark. Indeed the distribution of indices makes a difference. If I make indices random then the pe

[Bug target/116582] gather is a win in some cases on zen CPUs

2024-09-03 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116582 --- Comment #3 from Jan Hubicka --- Just for completeness the codegen for parest sparse matrix multiply is: 0.31 │320: kmovb %k1,%k4 0.25 │ kmovb %k1,%k5 0.28 │ vmovdqu32 (%rcx,%rax,1),%zmm0 0.32 │

[Bug tree-optimization/109213] [13 Regression] -Os generates significantly more code since r13-723

2023-03-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109213 --- Comment #8 from Jan Hubicka --- We have large-stack-frame-growth that is relative, so yes, increasing stack size of caller makes gcc think that it is heavy and making it even heavier will not hurt that much. We originally ran into stack

[Bug ipa/109341] [12/13 Regression] ICE in merge, at ipa-modref-tree.cc:176 since r12-3142-g5c85f29537662f

2023-03-30 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109341 Jan Hubicka changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |hubicka at gcc dot gnu.org

[Bug target/109137] [12 regression] Compiling ffmpeg with -m32 on x86_64-pc-linux-gnu hangs on libavcodec/h264_cabac.c since r12-9086-g489c81db7d4f75

2023-03-30 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109137 --- Comment #21 from Jan Hubicka --- Zen 1-3 changes were intentional in the original tuning patch (it is also briefly mentioned in the commit message). By allowing 256 bit AVX moves instead of 64bit integer moves (or 128bit) we can move bigger

[Bug ipa/109509] Huge compile time with forced inlining

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109509 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #3

[Bug rtl-optimization/108086] [11 Regression] internal compiler error: in set_accesses, at rtl-ssa/internals.inl:449

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108086 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #1

[Bug tree-optimization/109491] [11/12 Regression] Segfault in tree-ssa-sccvn.cc:expressions_equal_p()

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109491 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #1

[Bug ipa/109509] Huge compile time with forced inlining

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109509 --- Comment #5 from Jan Hubicka --- For a summary - PR109491 does not seem to be about integration time. most time is RTL PRE. - PR108086 has 10% spent in integration and seems to be operand scan issue - PR99785 is hard to judge given that

[Bug c++/79416] Internal compiler error for recursive template expansion

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79416 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #4

[Bug target/109137] [12 regression] Compiling ffmpeg with -m32 on x86_64-pc-linux-gnu hangs on libavcodec/h264_cabac.c since r12-9086-g489c81db7d4f75

2023-04-14 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109137 --- Comment #26 from Jan Hubicka --- reverted the znver1-3 change on gcc-12 branch. We still may want to fix IRA to avoid the problem on core_avx512 targets.

[Bug tree-optimization/109595] New: Missed upper bound on number of iterations

2023-04-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109595 Bug ID: 109595 Summary: Missed upper bound on number of iterations Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-

[Bug tree-optimization/109605] New: -fno-tree-vectorize does not disable vectorizer

2023-04-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109605 Bug ID: 109605 Summary: -fno-tree-vectorize does not disable vectorizer Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug tree-optimization/109690] New: bad SLP vectorization on zen

2023-05-01 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109690 Bug ID: 109690 Summary: bad SLP vectorization on zen Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization

[Bug target/109690] bad SLP vectorization on zen

2023-05-05 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109690 --- Comment #7 from Jan Hubicka --- Thanks a lot! There however still seems to be a problem with vectorization. On zen4 I now get: jh@ryzen4:~/gcc/build/gcc> ./xgcc -B ./ -O2 -march=native slp.c ; perf stat ./a.out Performance counter stats fo

[Bug c++/106943] GCC building clang/llvm with LTO flags causes ICE in clang

2023-05-12 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #1

[Bug middle-end/109849] New: suboptimal code for vector walking loop

2023-05-13 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 Bug ID: 109849 Summary: suboptimal code for vector walking loop Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-e

[Bug target/109811] libxjl 0.7 is a lot slower in GCC 13.1 vs Clang 16

2023-05-17 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org Ever confi

[Bug ipa/113359] [13/14 Regression] LTO miscompilation of ceph on aarch64 and x86_64

2024-04-04 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #23 from Jan Hubicka --- The patch looks reasonable. We probably could hash the padding vectors at summary generation time to reduce WPA overhead, but that can be done incrementally next stage1. I however wonder if we really guarant

[Bug ipa/113291] [14 Regression] compilation never (?) finishes with recursive always_inline functions at -O and above since r14-2172

2024-04-09 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113291 --- Comment #8 from Jan Hubicka --- I am not sure this ought to be P1: - the compilation technically is finite, but not in reasonable time - it is possible to adjust the testcase (do early inlining manually) and get same infinite build on relea

[Bug lto/113208] [14 Regression] lto1: error: Alias and target's comdat groups differs since r14-5979-g99d114c15523e0

2024-04-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113208 --- Comment #25 from Jan Hubicka --- So we have comdat groups that diverge in t1.o and t2.o. In one object it has an alias in it while in the other object it does not. Merging nodes for _ZN6vectorI12QualityValueEC2ERKS1_. Candidates: _ZN6vectorI12Qua

[Bug lto/113208] [14 Regression] lto1: error: Alias and target's comdat groups differs since r14-5979-g99d114c15523e0

2024-04-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113208 --- Comment #27 from Jan Hubicka --- OK, but the problem is same. Having comdats with same key defining different set of public symbols is IMO not a good situation for both non-LTO and LTO builds. Unless the additional alias is never used by val

[Bug lto/113208] [14 Regression] lto1: error: Alias and target's comdat groups differs since r14-5979-g99d114c15523e0

2024-04-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113208 --- Comment #28 from Jan Hubicka --- So the main problem is that in t2 we have _ZN6vectorI12QualityValueEC1ERKS1_/7 (vector<_Tp>::vector(const vector<_Tp>&) [with _Tp = QualityValue]) Type: function definition analyzed alias cpp_implicit_alia

[Bug testsuite/109596] [14 Regression] Lots of guality testcase fails on x86_64 after r14-162-gcda246f8b421ba

2024-04-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109596 --- Comment #19 from Jan Hubicka --- I looked into the remaining exit/nonexit rename discussed here earlier before the PR was closed. The following patch would restore the code to do the same calls as before my patch PR tree-optimization

[Bug middle-end/114774] New: Missed DSE in simple code

2024-04-18 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114774 Bug ID: 114774 Summary: Missed DSE in simple code Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end A

[Bug middle-end/114774] Missed DSE in simple code due to interleaving sotres

2024-04-18 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114774 Jan Hubicka changed: What|Removed |Added Summary|Missed DSE in simple code |Missed DSE in simple code

[Bug tree-optimization/114779] __builtin_constant_p does not work in inline functions

2024-04-19 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114779 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #7

[Bug c++/93008] Need a way to make inlining heuristics ignore whether a function is inline

2024-04-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93008 --- Comment #8 from Jan Hubicka --- Note that cold attribute is also quite strong since it turns optimize_size codegen that is often a lot slower. Reading the discussion again, I don't think we have a way to make inline keyword ignored by inline

[Bug tree-optimization/114787] [13/14 Regression] wrong code at -O1 on x86_64-linux-gnu (the generated code hangs)

2024-04-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114787 --- Comment #13 from Jan Hubicka --- -fdump-tree-all-all changing generated code is also bad. We probably should avoid dumping loop bounds when they are not recorded. I added dumping of loop bounds and this may be unexpected side effect. Will

[Bug libstdc++/114821] New: _M_realloc_append should use memcpy instead of loop to copy data when possible

2024-04-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114821 Bug ID: 114821 Summary: _M_realloc_append should use memcpy instead of loop to copy data when possible Product: gcc Version: 14.0 Status: UNCONFIRMED Severity:

[Bug libstdc++/114821] _M_realloc_append should use memcpy instead of loop to copy data when possible

2024-04-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114821 --- Comment #2 from Jan Hubicka --- What I am shooting for is to optimize it later in loop distribution. We can recognize memcpy loop if we can figure out that source and destination memory are different. We can help here with restrict, but I w
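As a minimal sketch of the idea in the comment above, here is an element-wise copy loop whose pointers are marked `restrict`, so the compiler may assume source and destination do not overlap. The type `pair_t` and the function `copy_pairs` are hypothetical names chosen for illustration; whether GCC actually turns this loop into a `memcpy` call depends on the version and the optimization flags in use.

```c
#include <stddef.h>

/* Hypothetical element type standing in for a trivially copyable pair. */
struct pair_t {
    int first;
    int second;
};

/* Copy n elements one by one.  The restrict qualifiers promise that dst
   and src point to different memory, which is the fact loop distribution
   needs before it may treat the loop as a memcpy candidate. */
void copy_pairs(struct pair_t *restrict dst,
                const struct pair_t *restrict src,
                size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

Without the `restrict` qualifiers the compiler has to allow for overlapping buffers, so at best it can fall back to `memmove` semantics or keep the loop.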

[Bug middle-end/114822] New: ldist should produce memcpy/memset/memmove histograms based on loop information converted

2024-04-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114822 Bug ID: 114822 Summary: ldist should produce memcpy/memset/memmove histograms based on loop information converted Product: gcc Version: 14.0 Status: UNCONFIRMED

[Bug libstdc++/114821] _M_realloc_append should use memcpy instead of loop to copy data when possible

2024-04-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114821 --- Comment #6 from Jan Hubicka --- Thanks. I thought the relocate_a only cares about the fact if the pointed-to type can be bitwise copied. It would be nice to early produce memcpy from libstdc++ for std::pair, so the second patch makes sense t

[Bug libstdc++/114821] _M_realloc_append should use memcpy instead of loop to copy data when possible

2024-04-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114821 --- Comment #8 from Jan Hubicka --- I had wrong noexcept specifier. This version works, but I still need to inline relocate_object_a into the loop diff --git a/libstdc++-v3/include/bits/stl_uninitialized.h b/libstdc++-v3/include/bits/stl_unini

[Bug libstdc++/114821] _M_realloc_append should use memcpy instead of loop to copy data when possible

2024-04-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114821 --- Comment #9 from Jan Hubicka --- Your patch gives me error compiling testcase jh@ryzen3:/tmp> ~/trunk-install/bin/g++ -O3 ~/t.C In file included from /home/jh/trunk-install/include/c++/14.0.1/vector:65, from /home/jh/t.C:1:

[Bug libstdc++/114821] _M_realloc_append should use memcpy instead of loop to copy data when possible

2024-04-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114821 --- Comment #13 from Jan Hubicka --- Thanks a lot, looks great! Do we still auto-detect memmove when the copy constructor turns out to be memcpy equivalent after optimization?

[Bug tree-optimization/114787] [13 Regression] wrong code at -O1 on x86_64-linux-gnu (the generated code hangs)

2024-04-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114787 --- Comment #18 from Jan Hubicka --- predict.cc queries number of iterations using number_of_iterations_exit and loop_niter_by_eval and finally using estimated_stmt_executions. The first two queries are not updating the upper bounds datastructu

[Bug target/113236] WebP benchmark is 20% slower vs. Clang on AMD Zen 4

2024-04-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113236 --- Comment #3 from Jan Hubicka --- Seems this performance difference is still there on zen4 https://www.phoronix.com/review/gcc14-clang18-amd-zen4/3

[Bug target/113235] SMHasher SHA3-256 benchmark is almost 40% slower vs. Clang (not enough complete loop peeling)

2024-04-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113235 --- Comment #9 from Jan Hubicka --- Phoronix still claims the difference https://www.phoronix.com/review/gcc14-clang18-amd-zen4/2

[Bug middle-end/114852] New: jpegxl 10.0.1 is faster with clang18 then with gcc14

2024-04-25 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114852 Bug ID: 114852 Summary: jpegxl 10.0.1 is faster with clang18 then with gcc14 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Compon

[Bug ipa/114985] [15 regression] internal compiler error: in discriminator_fail during stage2

2024-05-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114985 --- Comment #14 from Jan Hubicka --- So this is a problem in ipa_value_range_from_jfunc? It is Martin's code, I hope he will know why types are wrong here. One can get type compatibility problem on mismatched declarations and LTO, but it seems th

[Bug middle-end/115036] New: division is not shortened based on value range

2024-05-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115036 Bug ID: 115036 Summary: division is not shortened based on value range Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: m

[Bug middle-end/115037] New: Unused std::vector is not optimized away.

2024-05-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115037 Bug ID: 115037 Summary: Unused std::vector is not optimized away. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle

[Bug middle-end/115037] Unused std::vector is not optimized away.

2024-05-10 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115037 Jan Hubicka changed: What|Removed |Added CC||jason at redhat dot com,

[Bug libstdc++/109442] Dead local copy of std::vector not removed from function

2024-05-11 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442 --- Comment #19 from Jan Hubicka --- Note that the testcase from PR115037 also shows that we are not able to optimize out dead stores to the vector, which is another quite noticeable problem. void test() { std::vector test; test

[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-05-16 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 Jan Hubicka changed: What|Removed |Added Summary|[12/13/14/15 Regression]|[12/13/14 Regression] Wrong

[Bug middle-end/115277] New: ICF needs to match loop bound estimates

2024-05-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277 Bug ID: 115277 Summary: ICF needs to match loop bound estimates Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-e

[Bug middle-end/115277] [13/14/15 regression] ICF needs to match loop bound estimates

2024-05-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277 Jan Hubicka changed: What|Removed |Added Summary|ICF needs to match loop |[13/14/15 regression] ICF

[Bug ipa/67051] symtab_node::equal_address_to too conservative?

2024-06-04 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67051 --- Comment #2 from Jan Hubicka --- I believe that there was some discussion on this in the past. I would be quite happy to change the predicate to be more aggressive. Current code basically duplicates what original fold-const.c did. One proble

[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16

2023-11-02 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 --- Comment #13 from Jan Hubicka --- So I re-tested it with current mainline and clang 16/17 For mainline I get (megapixels per second, bigger is better): 13.39 13.38 13.42 clang 16: 20.06 20.06 1

[Bug tree-optimization/110641] [14 Regression] ICE in adjust_loop_info_after_peeling, at tree-ssa-loop-ivcanon.cc:1023 since r14-2230-g7e904d6c7f2

2023-11-06 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110641 Jan Hubicka changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #3 from Jan Hubicka

[Bug tree-optimization/112618] New: internal compiler error: in expand_MASK_CALL, at internal-fn.cc:4529

2023-11-19 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112618 Bug ID: 112618 Summary: internal compiler error: in expand_MASK_CALL, at internal-fn.cc:4529 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal

[Bug libstdc++/110287] _M_check_len is expensive

2023-11-19 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287 --- Comment #8 from Jan Hubicka --- With return value range propagation https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637265.html reduces --param max-inline-insns-auto needed for _M_realloc_insert to be inlined on my testcase from 39 t

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-11-19 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 --- Comment #21 from Jan Hubicka --- Patch https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637265.html gets us closer to inlining _M_realloc_insert at -O3 (3 insns away) Patch https://gcc.gnu.org/pipermail/gcc-patches/2023-November/6369

[Bug libstdc++/110287] _M_check_len is expensive

2023-11-19 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287 --- Comment #9 from Jan Hubicka --- This is _M_realloc_insert at release_ssa time: Released 63 names, 165.79%, removed 63 holes void std::vector::_M_realloc_insert (struct vector * const this, struct iterator __position, const struct pair_t & __

[Bug libstdc++/110287] _M_check_len is expensive

2023-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287 Jan Hubicka changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0

[Bug middle-end/112653] New: We should optimize memmove to memcpy using alias oracle

2023-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653 Bug ID: 112653 Summary: We should optimize memmove to memcpy using alias oracle Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Pr

[Bug middle-end/110377] Early VRP and IPA-PROP should work out value ranges from __builtin_unreachable

2023-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110377 Jan Hubicka changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug libstdc++/110287] _M_check_len is expensive

2023-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287 Bug 110287 depends on bug 110377, which changed state. Bug 110377 Summary: Early VRP and IPA-PROP should work out value ranges from __builtin_unreachable https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110377 What|Removed

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 Bug 109849 depends on bug 110377, which changed state. Bug 110377 Summary: Early VRP and IPA-PROP should work out value ranges from __builtin_unreachable https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110377 What|Removed

[Bug middle-end/112653] We should optimize memmove to memcpy using alias oracle

2023-11-21 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653 --- Comment #3 from Jan Hubicka --- The PR82898 testcases seem to be about type-based alias analysis. However, PTA should be usable here.

[Bug middle-end/88345] -Os overrides -falign-functions=N on the command line

2023-11-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org --- Comment #17

[Bug ipa/98925] Extend ipa-prop to handle return functions for slot optimization

2023-11-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98925 --- Comment #3 from Jan Hubicka --- Return value range propagation was added in r:53ba8d669550d3a1f809048428b97ca607f95cf5 however it works on scalar return values only for now. Extending it to aggregates is a logical next step and should not be

[Bug rtl-optimization/112657] [13/14 Regression] missed optimization: cmove not used with multiple returns

2023-11-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112657 --- Comment #8 from Jan Hubicka --- The negative return value branch predictor is set to have 98% hitrate (measured on SPEC2k17 some time ago). There is --param predictable-branch-outcome that is also set to 2% so indeed we consider the branch

[Bug middle-end/112653] PTA should handle correctly escape information of values returned by a function

2023-11-23 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653 --- Comment #7 from Jan Hubicka --- Thanks for the explanation. I think it is quite a common pattern that a new object is constructed, worked on, and later returned, so I think we ought to handle this correctly. Another example just came up in https:

[Bug middle-end/112706] New: missed simplification in FRE

2023-11-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112706 Bug ID: 112706 Summary: missed simplification in FRE Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end

[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16

2023-11-24 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 --- Comment #15 from Jan Hubicka --- With SRA improvements r:aae723d360ca26cd9fd0b039fb0a616bd0eae363 we finally get good performance at -O2. Improvements to the push_back implementation also help a bit. Mainline with default flags (-O2): Inpu
