[Bug debug/89528] New: Wrong debug info generated at -Og [gcc-trunk]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89528

Bug ID: 89528
Summary: Wrong debug info generated at -Og [gcc-trunk]
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: debug
Assignee: unassigned at gcc dot gnu.org
Reporter: dccitaliano at gmail dot com
Target Milestone: ---

$ cat 5.c
char b;
int d, e;
static int i = 1;
void a(l) { printf("", l); }
char(c)(char l) { return l || b && l == 1 ? b : b % l; }
short(f)(l, m) { return l * m; }
short(g)(short l, short m) { return m || l == 767 && m == 1; }
int(h)(l, m) { return (l ^ m & l ^ (m & 647) - m ^ m) < m; }
static int(j)(l) { return d == 0 || l == 647 && d == 1 ? l : l % d; }
short(k)(l) { return l >= 2 >> l; }
static short n(void) {
  int l_1127 = ~j(9 || 0) ^ 65535;
  optimize_me_not();
  f(l_1127, i && e ^ 4) && g(0, 0);
  e = 0;
  return 5;
}
int main() { n(); }

$ cat outer.c
void optimize_me_not() {}

### -Og
Reading symbols from ./a.out...
(gdb) b 13
Breakpoint 1 at 0x400590: file 5.c, line 13.
(gdb) r
Starting program: /home/davide/finished-reducing-gcc/a.out

Breakpoint 1, n () at 5.c:13
13        optimize_me_not();
(gdb) info locals
l_1127 = -4258335

### -O0
Reading symbols from ./a.out...
(gdb) b 13
Breakpoint 1 at 0x40064c: file 5.c, line 13.
(gdb) r
Starting program: /home/davide/finished-reducing-gcc/a.out

Breakpoint 1, n () at 5.c:13
13        optimize_me_not();
(gdb) info locals
l_1127 = -65535

$ gcc-trunk --version
gcc-trunk (GCC) 9.0.1 20190227 (experimental) [trunk revision 269248]
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gdb-trunk --version
GNU gdb (GDB) 8.3.50.20190227-git
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
[Bug debug/89529] New: Wrong debug info generated at -Og [gcc-trunk]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89529

Bug ID: 89529
Summary: Wrong debug info generated at -Og [gcc-trunk]
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: debug
Assignee: unassigned at gcc dot gnu.org
Reporter: dccitaliano at gmail dot com
Target Milestone: ---

$ cat 2.c
int a;
void b() {
  short l_1862 = 19071;
  a = 0;
  for (; 0;)
    optimize_me_not();
  --l_1862;
}
int main() { b(); }

$ cat outer.c
void optimize_me_not() {}

### -Og
Reading symbols from ./a.out...
(gdb) b 6
Breakpoint 1 at 0x40048c: file 2.c, line 7.
(gdb) r
Starting program: /home/davide/finished-reducing-gcc/a.out

Breakpoint 1, b () at 2.c:7
7         --l_1862;
(gdb) frame var
No symbol "var" in current context.
(gdb) info locals
l_1862 = 19070

### -O0
Reading symbols from ./a.out...
(gdb) b 6
Breakpoint 1 at 0x400497: file 2.c, line 7.
(gdb) r
Starting program: /home/davide/finished-reducing-gcc/a.out

Breakpoint 1, b () at 2.c:7
7         --l_1862;
(gdb) info locals
l_1862 = 19071

$ gcc-trunk --version
gcc-trunk (GCC) 9.0.1 20190227 (experimental) [trunk revision 269248]
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gdb-trunk --version
GNU gdb (GDB) 8.3.50.20190227-git
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
[Bug debug/89529] Wrong debug info generated at -Og [gcc-trunk]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89529

--- Comment #1 from dcci ---
The breakpoint is set on the line where the decrement happens, so at -Og `info locals` should probably print the value before the decrement (19071, as at -O0) rather than 19070.
[Bug debug/89530] New: Wrong debug informations for C array generated at -Og [gcc-trunk]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89530

Bug ID: 89530
Summary: Wrong debug informations for C array generated at -Og [gcc-trunk]
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: debug
Assignee: unassigned at gcc dot gnu.org
Reporter: dccitaliano at gmail dot com
Target Milestone: ---

$ cat a.c
int e, g;
short f;
char h;
short(a)(b) {}
int(c)(d) {}
void i() {
  int j, k;
  unsigned short l_1404[3][9] = {58143, 8, 5, 80};
  for (; f; f = 0)
    if (h)
      e = l_1404[0][7];
    else {
      for (; g;)
        j = k = c(j);
      e = a(k);
    }
  optimize_me_not();
}
int main() { i(); }

$ cat outer.c
void optimize_me_not() {}

### -O0
Reading symbols from ./a.out...
(gdb) b 17
Breakpoint 1 at 0x40055c: file 3.c, line 17.
(gdb) r
Starting program: /home/davide/finished-reducing-gcc/a.out

Breakpoint 1, i () at 3.c:17
17        optimize_me_not();
(gdb) p l_1404[0][0]
$1 = 58143

### -Og
Reading symbols from ./a.out...
(gdb) b 17
Breakpoint 1 at 0x4004f5: file 3.c, line 17.
(gdb) r
Starting program: /home/davide/finished-reducing-gcc/a.out

Breakpoint 1, i () at 3.c:17
17        optimize_me_not();
(gdb) p l_1404[0][0]
$1 = 9

$ gcc-trunk --version
gcc-trunk (GCC) 9.0.1 20190227 (experimental) [trunk revision 269248]
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gdb-trunk --version
GNU gdb (GDB) 8.3.50.20190227-git
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
[Bug debug/89530] Wrong debug informations for C array generated at -Og [gcc-trunk]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89530

--- Comment #2 from dcci ---
Thanks Jakub. We're trying to report more of these, but it's hard to filter out duplicates. One approach we considered is stopping at some point in the pass pipeline (i.e. running only a subset of the optimizations) to identify where the debug info breaks, something like https://llvm.org/docs/OptBisect.html. Is there an easy way to do this in GCC? I skimmed through the docs and haven't found one.
[Bug debug/89530] Wrong debug informations for C array generated at -Og [gcc-trunk]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89530

--- Comment #5 from dcci ---
(In reply to Jakub Jelinek from comment #4)
> Also, we usually bisect which gcc revision introduced a problem and from
> that change we can often see what goes wrong quickly. Both Red Hat and SUSE
> have terabytes of built gcc revisions to make such bisection faster.

I see. Is this data available for the masses?
[Bug debug/89530] Wrong debug informations for C array generated at -Og [gcc-trunk]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89530

--- Comment #7 from dcci ---
(In reply to Jakub Jelinek from comment #6)
> (In reply to dcci from comment #5)
> > (In reply to Jakub Jelinek from comment #4)
> > > Also, we usually bisect which gcc revision introduced a problem and from
> > > that change we can often see what goes wrong quickly. Both Red Hat and
> > > SUSE have terabytes of built gcc revisions to make such bisection faster.
> >
> > I see. Is this data available for the masses?
>
> No, it is behind VPNs etc. One can do a git-bisect or write his own script
> of course, the bisect seeds we have are just for those who do this several
> times a day and don't want to wait until stuff builds again and again.

I guess I'll just look at the dump, then.
[Bug tree-optimization/117033] GCC trunk emits larger code at -Os compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117033 --- Comment #6 from Davide Italiano --- I noticed you linked the LLVM bug I found. As part of my search/analysis, I found out there are cases that sometimes clang gets but GCC doesn't (unsurprisingly, FWIW). Here's a simple one. https://godbolt.org/z/fjjzf9xds
[Bug tree-optimization/117033] GCC trunk emits larger code at -Os compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117033

--- Comment #8 from Davide Italiano ---
(In reply to Andrew Pinski from comment #7)
> (In reply to Davide Italiano from comment #6)
> > I noticed you linked the LLVM bug I found.
> > As part of my search/analysis, I found out there are cases that sometimes
> > clang gets but GCC doesn't (unsurprisingly, FWIW).
> >
> > Here's a simple one.
> > https://godbolt.org/z/fjjzf9xds
>
> here is a slightly modified example which shows the opposite way, GCC
> figures it at -Os but LLVM does not:
> ```
> long patatino() {
>     long x = 0;
>     for (int i = 0; i < 5; ++i) {
>         if (x < 10)
>             while (x < 10) {
>                 x += 1;
>             }
>     }
>     return x;
> }
> ```

Yes, indeed. I found examples that go both ways. From what I've seen so far (FWIW, I'm still very early in my investigation), GCC's -Oz outperforms LLVM's -Oz in many cases.
[Bug middle-end/117123] New: [12/13/14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123 Bug ID: 117123 Summary: [12/13/14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: dccitaliano at gmail dot com Target Milestone: --- https://godbolt.org/z/ern8cbPzr Inline code: struct Potato { int size; bool isMashed; }; int dont_be_here(); int patatino(int a) { if (a > 5 && a % 2 == 0 && a != 10) { return a * 2; } else { Potato spud; spud.size = a; spud.isMashed = false; for (int i = 0; i < 10; i++) { if (i > 10 && i < 5 && a == -1) { for (int j = 0; j < 5; j++) { spud.size += j; } } } // Added a loop that never gets executed with a complex condition and statement inside for (int k = 0; k < 10 && a == -100 && spud.size > 1000; k++) { for (int l = 0; l < 5; l++) { spud.size += l * k; } } // Modified to add a loop that never gets executed with a call to dont_be_here() for (int m = 0; m < 10 && a == -1000 && spud.size < -1000; m++) { dont_be_here(); } // Modified function with added conditional statement if (a > 1 && spud.size < -1) { spud.size *= 2; } // Added another loop that never gets executed with a complex condition and statement inside for (int n = 0; n < 10 && a == -2000 && spud.size > 2000; n++) { for (int o = 0; o < 5; o++) { spud.size -= o * n; } } return spud.size; } }
[Bug middle-end/117128] New: [15 regression] GCC trunk generates larger code than GCC 14 at -Os/OZ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117128 Bug ID: 117128 Summary: [15 regression] GCC trunk generates larger code than GCC 14 at -Os/OZ Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: dccitaliano at gmail dot com Target Milestone: --- int f(int a) { int result = 0; for (int i = 0; i < a && (i % 2 == 0 || a > 10); i++) { result += i * (a - i); for (int j = i; j < a / 2 && (j % 3 != 0 || i < 5); j++) { result -= j * (i + j); } } return result; } Godbolt: https://godbolt.org/z/qWKWWYb99
[Bug target/117103] GCC trunk emits push + pop at -Oz when a mov could suffice
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117103

--- Comment #3 from Davide Italiano ---
> Note if you are doing code size comparison, then looking at the # of
> instructions for a target like x86 is not the way to go. You need to actually
> look at the assembled instruction output.

Oops, my bad. I was using `size` in my calculations and looking at the size of .text. I'll refine the search. Would you be interested in a follow-up for the loop being removed entirely?
[Bug target/117128] [15 regression] GCC trunk generates larger code than GCC 14 at -Os/Oz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117128 Davide Italiano changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #2 from Davide Italiano --- Unless I screwed up the bisection, Richard, this points to: 237e83e2158a3d9b875f8775805d04d97e8b36c1 is the first bad commit commit 237e83e2158a3d9b875f8775805d04d97e8b36c1 Author: Richard Biener Date: Wed Jun 28 13:36:59 2023 +0200 tree-optimization/110451 - hoist invariant compare after interchange The following adjusts the cost model of invariant motion to consider [VEC_]COND_EXPRs and comparisons producing a data value as expensive. For 503.bwaves_r this avoids an unnecessarily high vectorization factor because of an integer comparison besides data operations on double. PR tree-optimization/110451 * tree-ssa-loop-im.cc (stmt_cost): [VEC_]COND_EXPR and tcc_comparison are expensive. * gfortran.dg/vect/pr110451.f: New testcase. gcc/testsuite/gfortran.dg/vect/pr110451.f | 51 +++ gcc/tree-ssa-loop-im.cc | 11 ++- 2 files changed, 61 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gfortran.dg/vect/pr110451.f
[Bug target/117103] New: GCC trunk emits push + pop at -Oz when a mov could suffice
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117103

Bug ID: 117103
Summary: GCC trunk emits push + pop at -Oz when a mov could suffice
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: dccitaliano at gmail dot com
Target Milestone: ---

Not a big deal, and maybe there are cases where it's not profitable, but I found this while testing:

https://clang.godbolt.org/z/6K18T1a7a

```
int f(int a, int b, int increment) {
    int i = 0;
    while (i < a) {
        if (i % 2 == 0) {
            i += 2;
        } else {
            i++;
        }
    }
    int j = 0;
    while (j < b) {
        j += increment;
        if (j > b / 2) {
            j += 1;
        }
    }
    return 0;
}
```

-Os emits:

mov ecx, 2

-Oz emits:

push 2
[...]
pop rcx

Another interesting (maybe?, not sure) bit is that the loop could be elided entirely. I can file another bug if there's interest, or just reuse this one.
[Bug middle-end/117033] New: GCC trunk emits larger code at -Oz compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117033

Bug ID: 117033
Summary: GCC trunk emits larger code at -Oz compared to -O2
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: dccitaliano at gmail dot com
Target Milestone: ---

long patatino() {
    long x = 0;
    for (int i = 0; i < 5; ++i) {
        while (x < 37) {
            if (x % 4 == 0) {
                x += 4;
            }
        }
    }
    return x;
}

-Oz:

patatino():
        pushq   $5
        xorl    %eax, %eax
        popq    %rdx
.L5:
        cmpq    $36, %rax
        jg      .L7
        addq    $4, %rax
        jmp     .L5
.L7:
        decl    %edx
        jne     .L5
        ret

-O2:

patatino():
        movl    $40, %eax
        ret

https://godbolt.org/z/969sf9cfb

Believe this is a pass ordering problem -- might be that simplifying the loop isn't always beneficial.
[Bug tree-optimization/117033] GCC trunk emits larger code at -Oz compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117033 --- Comment #3 from Davide Italiano --- (In reply to Davide Italiano from comment #2) > Sorry, modified the title. This is `-Oz`, not `-Os` Writing this because you mentioned it blocks `-Os`, fwiw.
[Bug tree-optimization/117033] GCC trunk emits larger code at -Oz compared to -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117033

Davide Italiano changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|103916                      |
            Summary|GCC trunk emits larger code |GCC trunk emits larger code
                   |at -Os compared to -O2      |at -Oz compared to -O2

--- Comment #2 from Davide Italiano ---
Sorry, modified the title. This is `-Oz`, not `-Os`


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103916
[Bug 103916] [meta-bug] -Os vs loop header copy
[Bug middle-end/116994] New: [15 regression] GCC trunk generates larger code than GCC 14 at -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116994 Bug ID: 116994 Summary: [15 regression] GCC trunk generates larger code than GCC 14 at -Os Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: dccitaliano at gmail dot com Target Milestone: --- I found this at -Os and I figured I could report it. For this specific (reduced) small example, GCC trunk generates code that's 7% larger than GCC 14. long patatino() { long x = 0; for (int i = 0; i < 5; ++i) { while (x < 10) { x += ((x % 2 == 0 && x % 3 != 0 && x % 7 != 0) || (x % 2 != 0 && x % 5 == 0 && x % 11 != 0)) ? 2 : 1; } } return x; } Godbolt example: https://godbolt.org/z/3bqfq4nM7
[Bug target/117006] New: [15 regression] GCC trunk generates larger code than GCC 14 at -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117006

Bug ID: 117006
Summary: [15 regression] GCC trunk generates larger code than GCC 14 at -Os
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: dccitaliano at gmail dot com
Target Milestone: ---

Similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116994

https://godbolt.org/z/64bxGvnrh

long patatino() {
    long x = 0;
    for (int i = 0; i < 5; ++i) {
        while (x < 10) {
            if ((x % 2 == 0 && x % 3 != 0) || (x % 5 == 0 && x > 5)) {
                if (x > 7) {
                    x += 4;
                } else {
                    x += 2;
                }
            } else {
                x += 1;
            }
            while (x % 4 == 0) {
                x += 3;
            }
        }
    }
    return x;
}

In particular, trunk generates

.L4:
        lea     rax, [rcx+4]
        lea     rdx, [rcx+2]
        cmp     rcx, 8
        cmovl   rax, rdx
        mov     rcx, rax
        jmp     .L7

instead of:

.L4:
        cmp     rcx, 7
        jle     .L6
        add     rcx, 4
        jmp     .L7
.L6:
        add     rcx, 2
        jmp     .L7
[Bug tree-optimization/117253] New: [14/15 regression] Generated code at -Os on trunk is larger than GCC 13.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117253 Bug ID: 117253 Summary: [14/15 regression] Generated code at -Os on trunk is larger than GCC 13.3 Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: dccitaliano at gmail dot com Target Milestone: --- https://gcc.godbolt.org/z/cqvMzPd4d Inline code: int f(int a) { int* ptr = &a; int result = *ptr; int* arr = new int[10]; for (int i = 0; i < 10; i++) { arr[i] = i; } union Data { int i; float f; } data; for (int i = 0; i < 10; i++) { for (int j = 0; j < 10; j++) { if ((i + j) % 3 == 0 && (i * j) % 2 != 0) { if ((i * j) > 20 && (i + j) < 15) { result += i * j; } result += i + j; *ptr += arr[i] + arr[j]; data.i = i + j; data.f = (float)data.i / 2.0f; result += data.i; for (int k = 0; k < 5; k++) { if ((k * i) % 4 == 0 && (j * k) % 5 != 0) { result += k * i * j; } } } } } delete[] arr; return result; } This points to: commit 55fcaa9a8bd9c8ce97ca929fc902c88cf92786a0 Author: Andrew Pinski Date: Wed Jun 7 09:05:15 2023 -0700 Add Plus to the op list of `(zero_one == 0) ? y : z y` pattern This adds plus to the op list of `(zero_one == 0) ? y : z y` patterns which currently has bit_ior and bit_xor. This shows up now in GCC after the boolization work that Uroš has been doing. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/97711 PR tree-optimization/110155 gcc/ChangeLog: * match.pd ((zero_one == 0) ? y : z y): Add plus to the op. ((zero_one != 0) ? z y : y): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/branchless-cond-add-2.c: New test. * gcc.dg/tree-ssa/branchless-cond-add.c: New test. 
gcc/match.pd | 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/branchless-cond-add-2.c | 8 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond-add.c | 18 ++ 3 files changed, 28 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond-add-2.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond-add.c
[Bug target/117253] [14/15 regression] Generated code at -Os on trunk is larger than GCC 13.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117253 Davide Italiano changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #2 from Davide Italiano --- (In reply to Andrew Pinski from comment #1) > aarch64 does NOT show a regression. > > But basically the issue: > if ((i * j) > 20 && (i + j) < 15) { > result += i * j; > } > > is converted to result += (i * j) * ((i * j) > 20 && (i + j) < 15). > > And then selects the multiply due to code size and it just goes down hill > from there. > > This is 100% a synthetic test so I am not sure we are worried about the code > size increase here for x86_64; especially for x86_64. Thanks for taking a look Andrew. Not extremely familiar with this code, but I wonder if the fact this shows a regression only on some arches suggests this peephole is kind-of target-dependent?
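
To make the rewrite Andrew describes concrete, here is a minimal C sketch of the two forms. The function names `add_branchy` and `add_branchless` are hypothetical and exist only for this illustration; the real transformation is done by GCC on its internal representation, not on source. Both functions compute the same value, but the second replaces the branch with a multiply by a 0/1 condition, which may or may not be smaller depending on the target.

```c
/* Guarded addition as written in the testcase. */
int add_branchy(int result, int i, int j) {
    if ((i * j) > 20 && (i + j) < 15) {
        result += i * j;
    }
    return result;
}

/* Branchless form: the condition evaluates to 0 or 1 and scales the addend. */
int add_branchless(int result, int i, int j) {
    int cond = (i * j) > 20 && (i + j) < 15;
    return result + (i * j) * cond;
}
```

Whether the multiply form is a size win presumably depends on how cheaply the target can materialize the 0/1 condition, which would be consistent with the regression showing up on x86_64 but not on aarch64.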
[Bug rtl-optimization/117128] [15 regression] GCC trunk generates larger code than GCC 14 at -Os/Oz since r14-2161-g237e83e2158a3d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117128 --- Comment #5 from Davide Italiano --- Another example: int f(int a) { int arr[5]; for (int i = 0; i < 5; i++) { if (i % 2 == 0 && a > 5 || i % 3 == 0 && a < -2) { arr[i] = a + i * 2; } else { arr[i] = a + i; } } return arr[2]; }
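
For reference, only `arr[2]` is ever returned by this example, so the whole loop collapses to a single select on `a`. The sketch below shows the reported function next to a hand-folded version; the names `f_reference` and `f_folded` are hypothetical, and the folded form is what one could hope for from an optimizer, not a claim about what any GCC version emits.

```c
/* Loop from the comment, kept as reported (parentheses added for clarity). */
int f_reference(int a) {
    int arr[5];
    for (int i = 0; i < 5; i++) {
        if ((i % 2 == 0 && a > 5) || (i % 3 == 0 && a < -2)) {
            arr[i] = a + i * 2;
        } else {
            arr[i] = a + i;
        }
    }
    return arr[2];
}

/* For i == 2: i % 2 == 0 holds and i % 3 == 0 does not,
   so the condition reduces to a > 5. */
int f_folded(int a) {
    return a > 5 ? a + 4 : a + 2;
}
```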
[Bug target/117253] [14/15 regression] Generated code at -Os on trunk is larger than GCC 13.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117253

--- Comment #6 from Davide Italiano ---
(In reply to Richard Biener from comment #4)
> So probably IVOPTs related. With -fno-ivopts code generated by GCC 13 and
> trunk are about the same size.

Here is a slightly less contrived example that points to the same commit (and hopefully the same regression), in case it's useful:

int f(int a) {
    if (a > 5) {
        a *= 2;
        if (a % 3 == 0 && a < 20) {
            a += 5;
        }
    }
    return a;
}

https://godbolt.org/z/Wj85rscYK
[Bug target/117253] [14/15 regression] Generated code at -Os on trunk is larger than GCC 13.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117253

--- Comment #5 from Davide Italiano ---
(In reply to Sam James from comment #3)
> Davide, was this reduced from a real application or are you
> fuzzing/experimenting? (The reports are welcome either way.)

Hi Sam, these are not reduced from a real application; they're the result of a stress tester that looks for perf/size bugs and regressions.
[Bug middle-end/117271] New: [13/14/15 regression] GCC trunk emits larger code at -Os than 12.4.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117271 Bug ID: 117271 Summary: [13/14/15 regression] GCC trunk emits larger code at -Os than 12.4.0 Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: dccitaliano at gmail dot com Target Milestone: --- Testcase (-Os): int f(int* a) { if (*a > 5 && ((*a % 2 == 0 && *a < 20) || (*a > 10 && *a % 3 != 0))) { *a = *a * 2; } int arr[5]; for (int i = 0; i < 5; i++) { arr[i] = *a + i; } for (int j = 0; j < 3; j++) { for (int k = 0; k < 2 && arr[j] % 2 == 0; k++) { arr[j] += k * 2; } } int sum = 0; for (int m = 0; m < 5 && (arr[m] % 3 == 0 || arr[m] > 10); m++) { sum += arr[m] * (m % 2 == 0 ? 2 : 3); for (int n = 0; n < 2 && (sum % 2 != 0 || m > 2); n++) { sum -= n * (m % 2 == 0 ? 1 : 2); if (n > 0 && sum > 10 && (arr[m] % 2 != 0 || m < 3)) { sum += arr[m] / 2; } } if (sum > 20 && m > 1) { sum *= 2; } } arr[0] += sum; return arr[0]; } Bisects to: commit aadc5c07feb0ab08729ab25d0d896b55860ad9e6 Author: Andrew Pinski Date: Mon Aug 7 00:05:21 2023 -0700 VR-VALUES [PR28794]: optimize compare assignments also This patch fixes the oldish (2006) bug where VRP was not optimizing the comparison for assignments while handling them for GIMPLE_COND only. It just happens to also solves PR 103281 due to allowing to optimize `c < 1` to `c == 0` and then we get `(c == 0) == c` (which was handled by r14-2501-g285c9d04). OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/103281 PR tree-optimization/28794 gcc/ChangeLog: * vr-values.cc (simplify_using_ranges::simplify_cond_using_ranges_1): Split out majority to ... (simplify_using_ranges::simplify_compare_using_ranges_1): Here. (simplify_using_ranges::simplify_casted_cond): Rename to ... (simplify_using_ranges::simplify_casted_compare): This and change arguments to take op0 and op1. (simplify_using_ranges::simplify_compare_assign_using_ranges_1): New method. 
(simplify_using_ranges::simplify): For tcc_comparison assignments call simplify_compare_assign_using_ranges_1. * vr-values.h (simplify_using_ranges): Add new methods, simplify_compare_using_ranges_1 and simplify_compare_assign_using_ranges_1. Rename simplify_casted_cond and simplify_casted_compare and update argument types. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr103281-1.c: New test. * gcc.dg/tree-ssa/vrp-compare-1.c: New test. gcc/testsuite/gcc.dg/tree-ssa/pr103281-1.c| 19 +++ gcc/testsuite/gcc.dg/tree-ssa/vrp-compare-1.c | 13 +++ gcc/vr-values.cc | 160 -- gcc/vr-values.h | 4 +- 4 files changed, 134 insertions(+), 62 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103281-1.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-compare-1.c
[Bug middle-end/117271] [13/14/15 regression] GCC trunk emits larger code at -Os than 12.4.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117271 --- Comment #1 from Davide Italiano --- https://godbolt.org/z/eExfPsjzs
[Bug middle-end/117123] [12/13/14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123

--- Comment #1 from Davide Italiano ---
I'd like to point out that in GCC-13.3 this seems to emit much shorter code, i.e. https://godbolt.org/z/YEvbs14PY

_Z8patatinoi:
        movl    %edi, %eax
        cmpl    $5, %edi
        jle     .L2
        testl   $1, %edi
        jne     .L2
        cmpl    $10, %edi
        je      .L2
        addl    %eax, %eax
.L2:
        ret
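
Read back into C, the short GCC 13.3 output corresponds to the reduced function below. This is a sketch for illustration only: `patatino_original` is the testcase from the report ported to C (hence `struct Potato` and `<stdbool.h>`), `patatino_reduced` is a hypothetical name for the hand-simplified form, and the body of `dont_be_here` is a stub assumed here so the example links (the report only declares it).

```c
#include <stdbool.h>

struct Potato { int size; bool isMashed; };

/* Stub (assumption): the report only declares this function. */
int dont_be_here(void) { return 0; }

/* Testcase from the report, ported to C. Every loop body is unreachable. */
int patatino_original(int a) {
    if (a > 5 && a % 2 == 0 && a != 10) {
        return a * 2;
    } else {
        struct Potato spud;
        spud.size = a;
        spud.isMashed = false;
        for (int i = 0; i < 10; i++) {
            if (i > 10 && i < 5 && a == -1) {
                for (int j = 0; j < 5; j++) spud.size += j;
            }
        }
        for (int k = 0; k < 10 && a == -100 && spud.size > 1000; k++) {
            for (int l = 0; l < 5; l++) spud.size += l * k;
        }
        for (int m = 0; m < 10 && a == -1000 && spud.size < -1000; m++) {
            dont_be_here();
        }
        /* spud.size == a here, so a > 1 && spud.size < -1 cannot hold. */
        if (a > 1 && spud.size < -1) {
            spud.size *= 2;
        }
        for (int n = 0; n < 10 && a == -2000 && spud.size > 2000; n++) {
            for (int o = 0; o < 5; o++) spud.size -= o * n;
        }
        return spud.size;
    }
}

/* What the GCC 13.3 assembly computes: return a, doubled if the guard holds. */
int patatino_reduced(int a) {
    if (a > 5 && a % 2 == 0 && a != 10) {
        return a * 2;
    }
    return a;
}
```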
[Bug rtl-optimization/117278] New: [12/13/14/15 regression] Code at -Os is larger on trunk than GCC 11.4.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117278 Bug ID: 117278 Summary: [12/13/14/15 regression] Code at -Os is larger on trunk than GCC 11.4.0 Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: dccitaliano at gmail dot com Target Milestone: --- Looks like this regressed in GCC 12 and has never recovered since. Testcase (https://godbolt.org/z/T1vvzWjvK) int patatino(int* a) { if (*a > 5 && *a % 2 == 0) { if ((*a / 2) % 3 != 0) { *a = *a / 2; for (int i = 0; i < *a; i++) { if (i % 2 == 0 && (*a - i) % 3 != 0 && (i + (*a - i)) % 11 != 0) { for (int j = i; j < *a; j += 2) { *a += j % 5; if (j > *a / 3 && (*a + j) % 7 == 0 && (j * 2 + *a) % 13 == 0) { *a -= j / 2; } } } } } else { *a = 0; } } else { *a = *a + 1; } return 0; } Bisects to: dc1969dab392661cdac1170bbb8c9f83f388580d is the first bad commit commit dc1969dab392661cdac1170bbb8c9f83f388580d Author: Xionghu Luo Date: Wed Dec 29 20:02:12 2021 -0600 loop-invariant: Don't move cold bb instructions to preheader in RTL gcc/ChangeLog: 2021-12-30 Xionghu Luo * loop-invariant.c (find_invariants_bb): Check profile count before motion. (find_invariants_body): Add argument. gcc/testsuite/ChangeLog: 2021-12-30 Xionghu Luo * gcc.dg/loop-invariant-2.c: New. gcc/loop-invariant.c| 17 ++--- gcc/testsuite/gcc.dg/loop-invariant-2.c | 20 2 files changed, 34 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/loop-invariant-2.c bisect found first bad commit
[Bug target/117253] [14/15 regression] Generated code at -Os on trunk is larger than GCC 13.3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117253 --- Comment #7 from Davide Italiano --- (In reply to Richard Biener from comment #4) > So probably IVOPTs related. With -fno-ivopts code generated by GCC 13 and > trunk are about the same size. For the second example (see code above) -- `-fno-ivopts` doesn't seem to help. Maybe a different issue, I don't know.
[Bug middle-end/116751] New: GCC trunk (-O3) doesn't optimize a loop that can be folded into a constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116751

Bug ID: 116751
Summary: GCC trunk (-O3) doesn't optimize a loop that can be folded into a constant
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: dccitaliano at gmail dot com
Target Milestone: ---

https://godbolt.org/z/qq1bbdMKn

```
#include <cstdint>

int64_t patatino() {
    int64_t result = 0;
    for (int r = 0; r < 7; r++) {
        for (int s = 0; s < 10; s++) {
            for (int t = 0; t < 10; t++) {
                if (t + s + r == 10) {
                    result += 1;
                }
            }
        }
    }
    return result;
}
```

Changing the condition from `t + s + r == 10` to `t + s == 10` changes the generated code into:

```
patatino():
        mov     eax, 63
        ret
```
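
As a sanity check on the constants involved, the following C sketch counts the iterations that satisfy each condition. The helper name `count_hits` and its `use_r` parameter are hypothetical, introduced only for this illustration. Counting by hand, the `t + s == 10` variant has 9 matching (s, t) pairs for each of the 7 values of `r`, giving the 63 shown above, and the original `t + s + r == 10` condition also always yields a fixed value, 54.

```c
#include <stdint.h>

/* Counts the (r, s, t) triples with 0 <= r < r_max and 0 <= s, t < 10 that
   satisfy t + s + r == 10 (use_r != 0) or t + s == 10 (use_r == 0). */
int64_t count_hits(int r_max, int use_r) {
    int64_t result = 0;
    for (int r = 0; r < r_max; r++) {
        for (int s = 0; s < 10; s++) {
            for (int t = 0; t < 10; t++) {
                if (t + s + (use_r ? r : 0) == 10) {
                    result += 1;
                }
            }
        }
    }
    return result;
}
```

`count_hits(7, 0)` returns 63 and `count_hits(7, 1)` returns 54, so both versions of the function are in principle foldable to a constant.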
[Bug middle-end/116753] New: [regression from GCC 12.4] GCC trunk (-O3) can't fold a loop into a constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116753 Bug ID: 116753 Summary: [regression from GCC 12.4] GCC trunk (-O3) can't fold a loop into a constant Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: dccitaliano at gmail dot com Target Milestone: --- https://godbolt.org/z/xGh9vY57b ``` long patatino() { long x = 0; for (int i = 0; i < 5; ++i) { while (x < 10) { if (x % 2 == 0) { x += 2; } else { x += 1; } // Dead while loop while ((x > 20) && (i % 3 == 0) && (x % 5 == 0)) { x -= 5; } // Dead while loop while ((x < -5) && (i % 2 == 0) && (x % 3 == 0)) { x += 3; } } } return x; } ``` GCC trunk emits: patatino(): mov eax, 2 .L2: add rax, 2 cmp rax, 9 jle .L2 ret GCC 12.4 emits: patatino(): mov eax, 10 ret
[Bug middle-end/116753] [13/14/15 Regression] GCC trunk (-O3) can't fold a loop into a constant
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116753

--- Comment #3 from dcci ---
Slightly easier example that still fails (no nested loop):

```
long patatino() {
    long x = 0;
    while (x < 10) {
        if (x % 2 == 0) {
            x += 2;
        } else {
            x += 1;
        }
        // Dead
        if ((x > 20) && (x % 5 == 0)) {
            x -= 5;
        }
        // Dead
        if ((x < -5) && (x % 3 == 0)) {
            x += 3;
        }
    }
    return x;
}
```

I am not an expert on GCC's implementation, but I would imagine some sort of range analysis could prove that the two conditions are dead and remove them before unrolling. FWIW, LLVM seems to get this case right.
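
To illustrate the range argument, here is a small C sketch that runs the same loop and records the smallest and largest value of `x` at the point where the two "dead" conditions are evaluated. The `struct x_range` type and the `observe_range` function are hypothetical names for this illustration, and this only observes one concrete execution; it is not an implementation of range analysis. The static argument is that `x` is in [0, 9] on loop entry and grows by at most 2, so it can never exceed 20 or drop below -5 at the checks.

```c
#include <limits.h>

struct x_range {
    long min;    /* smallest x seen at the dead checks */
    long max;    /* largest x seen at the dead checks */
    long result; /* value the function would return */
};

struct x_range observe_range(void) {
    struct x_range r = { LONG_MAX, LONG_MIN, 0 };
    long x = 0;
    while (x < 10) {
        if (x % 2 == 0) {
            x += 2;
        } else {
            x += 1;
        }
        /* Record x right before the two conditions from the report. */
        if (x < r.min) r.min = x;
        if (x > r.max) r.max = x;
        if ((x > 20) && (x % 5 == 0)) {
            x -= 5;
        }
        if ((x < -5) && (x % 3 == 0)) {
            x += 3;
        }
    }
    r.result = x;
    return r;
}
```

Here `observe_range()` reports a minimum of 2, a maximum of 10, and a result of 10, so neither guarded statement ever runs.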
[Bug middle-end/116868] New: GCC trunk doesn't eliminate a superfluous new/delete pair
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116868

Bug ID: 116868
Summary: GCC trunk doesn't eliminate a superfluous new/delete pair
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: dccitaliano at gmail dot com
Target Milestone: ---

Maybe this is known, but I wasn't able to find an exact duplicate. This is the closest I found (apologies for the churn, if any):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78104

https://godbolt.org/z/579rGG95z

#include <vector>

int sumVector() {
    const std::vector<int> vec = {1};
    int sum = 0;
    for (int i = 0; i < vec.size(); i++) {
        sum += vec[i];
    }
    return sum;
}

At -O3 emits:

sumVector():
        sub     rsp, 8
        mov     edi, 4
        call    operator new(unsigned long)
        mov     esi, 4
        mov     DWORD PTR [rax], 1
        mov     rdi, rax
        call    operator delete(void*, unsigned long)
        mov     eax, 1
        add     rsp, 8
        ret

I'd expect it to emit something like this (clang does):

sumVector():
        mov     eax, 1
        ret
[Bug target/117278] [12/13/14/15 regression] Code at -Os is larger on trunk than GCC 11.4.0 since r12-6149-gdc1969dab39266
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117278 --- Comment #4 from Davide Italiano --- (In reply to Davide Italiano from comment #3) > Another example that I found while looking at this: > > int f(int* a) { > int sum = *a; > for (int i = 0; i < 10; i++) { > if (i % 2 == 0 && (i > 3 || *a < 5)) { > for (int j = 0; j < 5; j++) { > if (j > 2 && sum > 0) { > sum += i + j; > } > for (int k = 0; k < 3; k++) { > if ((k * j) % 2 != 0 && i > 5) { > sum -= k; > } > if (k > 1 && j < 4 && (sum % (i + 1)) == 0) { > sum += k * j; > } > } > } > } > } > return sum; > } Sorry, wrong comment, ignore this one
[Bug middle-end/117160] New: [15 regression] GCC trunk generates larger code than GCC 14 at -Os/-Oz (progressed in 14)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117160

            Bug ID: 117160
           Summary: [15 regression] GCC trunk generates larger code than
                    GCC 14 at -Os/-Oz (progressed in 14)
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dccitaliano at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/764Gs8Tvf

Inline code:

struct X {
  int size;
  char color;
};

int f(int a) {
  X* p = new X();
  p->size = a;
  p->color = 'R';
  for (int i = 0; i < p->size && (p->size % 2 == 0 || i < 5); i++) {
    if (i > 2 && p->size > 10) {
      p->size -= i;
    } else {
      p->size += i;
    }
    for (int j = 0; j < i && (p->size * j) % 3 == 0; j++) {
      p->size += j * 2;
    }
  }
  int result = p->size;
  delete p;
  return result;
}

Points to:

3e1bd6470e4deba1a3ad14621037098311ad1350 is the first bad commit
commit 3e1bd6470e4deba1a3ad14621037098311ad1350
Author: Richard Biener
Date:   Tue Oct 1 10:37:16 2024 +0200

    tree-optimization/116906 - unsafe PRE with never executed edges

    When we're computing ANTIC for PRE we treat edges to not yet visited
    blocks as having a maximum ANTIC solution to get at an optimistic
    solution in the iteration. That assumes the edges visted eventually
    execute. This is a wrong assumption that can lead to wrong code (and
    not only non-optimality) when possibly trapping expressions are
    involved as the testcases in the PR show. The following mitigates
    this by pruning trapping expressions from ANTIC computed when
    maximum sets are involved.

            PR tree-optimization/116906
            * tree-ssa-pre.cc (prune_clobbered_mems): Add clean_traps
            argument.
            (compute_antic_aux): Direct prune_clobbered_mems to prune
            all traps when any MAX solution was involved in the ANTIC
            computation.
            (compute_partial_antic_aux): Adjust.

            * gcc.dg/pr116906-1.c: New testcase.
            * gcc.dg/pr116906-2.c: Likewise.

 gcc/testsuite/gcc.dg/pr116906-1.c | 43 +++
 gcc/testsuite/gcc.dg/pr116906-2.c | 40
 gcc/tree-ssa-pre.cc               | 16 +--
 3 files changed, 93 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr116906-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr116906-2.c
bisect found first bad commit
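To make the hazard described in the commit message above more concrete, here is a minimal sketch. It is not one of the testcases from PR 116906, and the function name `guarded_div` is made up for illustration. The division can trap when `b` is zero, but it sits behind a guard that may never be true. A transformation that evaluated `a / b` ahead of the guard, on the assumption that the guarded path eventually executes, would introduce a trap that the source program never performs.

```cpp
// Hypothetical illustration; not taken from PR 116906.
// `a / b` is a possibly trapping expression (division by zero), so it may
// only be evaluated on paths where the source program evaluates it.
int guarded_div(int a, int b, bool take) {
  int r = 0;
  for (int i = 0; i < 3; ++i) {
    if (take) {
      r += a / b;  // Only reached when `take` is true.
    }
  }
  return r;
}
```

Calling `guarded_div(1, 0, false)` is well defined in the source because the division is never reached; that is the property an optimizer has to preserve.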
[Bug target/116994] [15 regression] GCC trunk generates larger code than GCC 14 at -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116994

Davide Italiano changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lingling.kong7 at gmail dot com

--- Comment #2 from Davide Italiano ---
This bisects to:

commit d826f7945609046f922732b138fb90795d5b1985
Author: konglin1
Date:   Wed May 8 15:46:10 2024 +0800

    x86: Fix cmov cost model issue [PR109549]

    (if_then_else:SI (eq (reg:CCZ 17 flags)
        (const_int 0 [0]))
      (reg/v:SI 101 [ e ])
      (reg:SI 102))
    The cost is 8 for the rtx, the cost for
    (eq (reg:CCZ 17 flags) (const_int 0 [0])) is 4,
    but this is just an operator do not need to compute it's cost in cmov.

    gcc/ChangeLog:

            PR target/109549
            * config/i386/i386.cc (ix86_rtx_costs): The XEXP (x, 0) for
            cmov is an operator do not need to compute cost.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/cmov6.c: Fixed.

 gcc/config/i386/i386.cc               | 2 +-
 gcc/testsuite/gcc.target/i386/cmov6.c | 5 +
 2 files changed, 2 insertions(+), 5 deletions(-)
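For readers unfamiliar with the RTL quoted in the commit message above: an `if_then_else` of that shape is roughly what a source-level conditional select can become. The sketch below is a hypothetical example (the function `pick_if_zero` does not come from the bug report or the commit). Whether a compiler actually emits a `cmov` for it depends on the target, the flags, and the cost model that the commit adjusts.

```cpp
// Hypothetical example of a conditional select.
int pick_if_zero(int cond, int e, int other) {
  // Select `e` when `cond` is zero, otherwise `other`. Both operands are
  // already available, so the choice can be made without a branch.
  return cond == 0 ? e : other;
}
```

Comparing the assembly for a function like this at -Os across compiler versions is one way to see the effect of a cost-model change, assuming the target has a conditional-move instruction.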
[Bug middle-end/117123] [14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123

Davide Italiano changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pheeck at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #2 from Davide Italiano ---
This points to:

commit cd794c3961017703a4d2ca0e854ea23b3d4b6373
Author: Filip Kastl
Date:   Thu Dec 14 11:29:31 2023 +0100

    A new copy propagation and PHI elimination pass

    This patch adds the strongly-connected copy propagation (SCCOPY) pass.
    It is a lightweight GIMPLE copy propagation pass that also removes some
    redundant PHI statements. It handles degenerate PHIs, e.g.:

      _5 = PHI <_1>;
      _6 = PHI <_6, _6, _1, _1>;
      _7 = PHI <16, _7>;
      // Replaces occurences of _5 and _6 by _1 and _7 by 16

    It also handles more complicated situations, e.g.:

      _8 = PHI <_9, _10>;
      _9 = PHI <_8, _10>;
      _10 = PHI <_8, _9, _1>;
      // Replaces occurences of _8, _9 and _10 by _1

    gcc/ChangeLog:

            * Makefile.in: Added sccopy pass.
            * passes.def: Added sccopy pass before LTO streaming and before
            RTL expansion.
            * tree-pass.h (make_pass_sccopy): Added sccopy pass.
            * gimple-ssa-sccopy.cc: New file.

    gcc/testsuite/ChangeLog:

            * gcc.dg/sccopy-1.c: New test.

    Signed-off-by: Filip Kastl

 gcc/Makefile.in                 |   1 +
 gcc/gimple-ssa-sccopy.cc        | 680
 gcc/passes.def                  |   2 +
 gcc/testsuite/gcc.dg/sccopy-1.c |  78 +
 gcc/tree-pass.h                 |   1 +
 5 files changed, 762 insertions(+)
 create mode 100644 gcc/gimple-ssa-sccopy.cc
 create mode 100644 gcc/testsuite/gcc.dg/sccopy-1.c
bisect found first bad commit
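The degenerate-PHI case described in the commit message above can be sketched with a toy model. This is not the code in gimple-ssa-sccopy.cc: the names `PhiMap`, `resolve`, and `eliminate_degenerate_phis` are hypothetical, SSA names and constants are modelled as plain strings, and only the first (degenerate) example is covered. The mutually referencing `_8`/`_9`/`_10` example needs analysis of strongly connected components, which this sketch deliberately omits.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model: each PHI maps its result name to its argument names.
// Constants such as "16" are treated as opaque names.
using PhiMap = std::map<std::string, std::vector<std::string>>;
using Replacements = std::map<std::string, std::string>;

// Follow recorded replacements until reaching a name that has none.
static std::string resolve(const Replacements& repl, std::string name) {
  for (auto it = repl.find(name); it != repl.end(); it = repl.find(name)) {
    name = it->second;
  }
  return name;
}

// Repeatedly replace any PHI whose arguments, ignoring references to the
// PHI itself, all resolve to one single value. Iterates to a fixed point.
Replacements eliminate_degenerate_phis(const PhiMap& phis) {
  Replacements repl;
  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto& phi : phis) {
      const std::string& result = phi.first;
      if (repl.count(result)) continue;  // Already replaced.
      std::string unique;
      bool degenerate = true;
      for (const auto& arg : phi.second) {
        const std::string value = resolve(repl, arg);
        if (value == result) continue;  // Self-reference adds no new value.
        if (unique.empty()) {
          unique = value;
        } else if (unique != value) {
          degenerate = false;  // Two distinct incoming values: keep the PHI.
          break;
        }
      }
      if (degenerate && !unique.empty()) {
        repl[result] = unique;
        changed = true;
      }
    }
  }
  // Flatten chains so every entry points at its final replacement.
  for (auto& entry : repl) {
    entry.second = resolve(repl, entry.second);
  }
  return repl;
}
```

Fed the three PHIs from the first example, this model maps `_5` and `_6` to `_1` and `_7` to `16`. Fed the second example, it finds nothing to replace, which is the gap that the strongly-connected-component handling in the real pass is described as closing.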
[Bug target/117278] [12/13/14/15 regression] Code at -Os is larger on trunk than GCC 11.4.0 since r12-6149-gdc1969dab39266
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117278

--- Comment #3 from Davide Italiano ---
Another example that I found while looking at this:

int f(int* a) {
  int sum = *a;
  for (int i = 0; i < 10; i++) {
    if (i % 2 == 0 && (i > 3 || *a < 5)) {
      for (int j = 0; j < 5; j++) {
        if (j > 2 && sum > 0) {
          sum += i + j;
        }
        for (int k = 0; k < 3; k++) {
          if ((k * j) % 2 != 0 && i > 5) {
            sum -= k;
          }
          if (k > 1 && j < 4 && (sum % (i + 1)) == 0) {
            sum += k * j;
          }
        }
      }
    }
  }
  return sum;
}
[Bug rtl-optimization/117128] [15 regression] GCC trunk generates larger code than GCC 14 at -Os/Oz since r14-2161-g237e83e2158a3d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117128

--- Comment #6 from Davide Italiano ---
Yet another example:

int f(int* a) {
  int sum = *a;
  for (int i = 0; i < 10; i++) {
    if (i % 2 == 0 && (i > 3 || *a < 5)) {
      for (int j = 0; j < 5; j++) {
        if (j > 2 && sum > 0) {
          sum += i + j;
        }
        for (int k = 0; k < 3; k++) {
          if ((k * j) % 2 != 0 && i > 5) {
            sum -= k;
          }
          if (k > 1 && j < 4 && (sum % (i + 1)) == 0) {
            sum += k * j;
          }
        }
      }
    }
  }
  return sum;
}