[Bug c/111000] New: Wrong code at -O3 on x86_64-linux-gnu since r14-2944-g3d48c11ad08
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111000

            Bug ID: 111000
           Summary: Wrong code at -O3 on x86_64-linux-gnu since r14-2944-g3d48c11ad08
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: shaohua.li at inf dot ethz.ch
  Target Milestone: ---

gcc at -O3 produces the wrong code.

Bisected to r14-2944-g3d48c11ad08

Compiler explorer: https://godbolt.org/z/b67W17Gvb

$ cat a.c
int printf(const char *, ...);
long a = 68;
int b, d, e;
int main() {
  for (; d <= 6; d++) {
    b = 0;
    for (; b <= 6; b++) {
      int c = a;
      e = c >= 32 || d > 647 >> c ? d : 0;
    }
  }
  printf("%d\n", e);
}
$
$ gcc -O0 a.c && ./a.out
6
$ gcc -O3 a.c && ./a.out
4
$
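For reference, the expected result of the reduced test case can be worked out by hand. The sketch below is not part of the report; `expected_e` is a hypothetical helper that replays the two loops with explicit parentheses. Because `c >= 32` short-circuits the `||`, the shift `647 >> c` is never evaluated with an out-of-range count when `a` is 68, so the program is well defined and the last assignment stores `d == 6`, matching the -O0 output.

```cpp
#include <cassert>

// Hypothetical helper (not from the report): replays the loops of a.c
// for a given value of the global `a` and returns the final value of `e`.
int expected_e(long a_value) {
    assert(a_value >= 0);  // the sketch only models non-negative values
    int e = 0;
    for (int d = 0; d <= 6; d++) {
        for (int b = 0; b <= 6; b++) {
            int c = static_cast<int>(a_value);
            // Same precedence as the original expression:
            //   (c >= 32 || (d > (647 >> c))) ? d : 0
            // The shift is only evaluated when c < 32.
            e = (c >= 32 || d > (647 >> c)) ? d : 0;
        }
    }
    return e;
}
```

With `a_value == 68` every iteration takes the `c >= 32` branch, so `e` simply tracks `d` and ends at 6.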
[Bug tree-optimization/111000] [14 Regression] Wrong code at -O3 on x86_64-linux-gnu since r14-2944-g3d48c11ad08
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111000

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |tree-optimization
   Target Milestone|---                         |14.0
           Keywords|                            |wrong-code
            Summary|Wrong code at -O3 on        |[14 Regression] Wrong code
                   |x86_64-linux-gnu since      |at -O3 on x86_64-linux-gnu
                   |r14-2944-g3d48c11ad08       |since r14-2944-g3d48c11ad08
[Bug middle-end/110986] [14 Regression] aarch64 has support for conditional not (and vectors can do conditional not still) after r14-3110-g7fb65f10285
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110986

--- Comment #14 from Andrew Pinski ---
The next and final part for cond_unary_5.c:

(for unpack (vec_unpack_lo vec_unpack_hi)
 (simplify
  (negate (unpack (vec_cond @0 uniform_integer_cst_p@1 uniform_integer_cst_p@2)))
  (with {
    tree outer_mask_type = truth_type_for (type);
    tree allones = build_minus_one_cst (type);
    tree zeros = build_zero_cst (type);
   }
   (if (integer_onep (@1) && integer_zerop (@2))
    (vec_cond (unpack:outer_mask_type @0) { allones; } { zeros; } )
   (if (integer_onep (@2) && integer_zerop (@1))
    (vec_cond (unpack:outer_mask_type @0) { zeros; } { allones; } ))

(simplify
 (negate (vec_pack_trunc
          (vec_cond @0 uniform_integer_cst_p@1 uniform_integer_cst_p@2)
          (vec_cond @3 @1 @2)))
 (with {
   tree outer_mask_type = truth_type_for (type);
   tree allones = build_minus_one_cst (type);
   tree zeros = build_zero_cst (type);
  }
  (if (integer_onep (@1) && integer_zerop (@2))
   (vec_cond (vec_pack_trunc:outer_mask_type @0 @3) { allones; } { zeros; } )
  (if (integer_onep (@2) && integer_zerop (@1))
   (vec_cond (vec_pack_trunc:outer_mask_type @0 @3) { zeros; } { allones; } )
[Bug target/111001] New: SH: ICE during RTL pass: sh_treg_combine2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111001

            Bug ID: 111001
           Summary: SH: ICE during RTL pass: sh_treg_combine2
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wbx at openadk dot org
  Target Milestone: ---

Created attachment 55731
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55731&action=edit
preprocessed source code of rw_bitmaps.c

Hi,

the following compile error occurs for Buildroot targeting sh4-linux-gnu for the e2fsprogs package:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
during RTL pass: sh_treg_combine2
rw_bitmaps.c: In function ‘read_bitmaps_range_start’:
rw_bitmaps.c:447:1: internal compiler error: Aborted
  447 | }
      | ^
0x7fe95c926f8f ???
        ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x7fe95c975ccc __pthread_kill_implementation
        ./nptl/pthread_kill.c:44
0x7fe95c926ef1 __GI_raise
        ../sysdeps/posix/raise.c:26
0x7fe95c911471 __GI_abort
        ./stdlib/abort.c:79
0x7fe95c912189 __libc_start_call_main
        ../sysdeps/nptl/libc_start_call_main.h:58
0x7fe95c912244 __libc_start_main_impl
        ../csu/libc-start.c:381
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.

Attached is the preprocessed rw_bitmaps.c file.

The problem exists only for -O1/-O2/-O3; using -Os/-O0 does not trigger the ICE. It is a new issue in gcc 13.2.0; it does not happen with 12.3.0.

If you need more information, do not hesitate to ask.

best regards
 Waldemar
[Bug target/111001] SH: ICE during RTL pass: sh_treg_combine2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111001 --- Comment #1 from Waldemar Brodkorb --- /home/browa22-ext/e2fsprogs/output/host/bin/sh4-buildroot-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/home/browa22-ext/e2fsprogs/output/host/bin/sh4-buildroot-linux-gnu-gcc.br_real COLLECT_LTO_WRAPPER=/home/browa22-ext/e2fsprogs/output/host/libexec/gcc/sh4-buildroot-linux-gnu/13.2.0/lto-wrapper Target: sh4-buildroot-linux-gnu Configured with: ./configure --prefix=/home/browa22-ext/e2fsprogs/output/host --sysconfdir=/home/browa22-ext/e2fsprogs/output/host/etc --enable-static --target=sh4-buildroot-linux-gnu --with-sysroot=/home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot --enable-__cxa_atexit --with-gnu-ld --disable-libssp --disable-multilib --disable-decimal-float --enable-plugins --enable-lto --with-gmp=/home/browa22-ext/e2fsprogs/output/host --with-mpc=/home/browa22-ext/e2fsprogs/output/host --with-mpfr=/home/browa22-ext/e2fsprogs/output/host --with-pkgversion='Buildroot 2023.08-rc1-68-g27dc493780-dirty' --with-bugurl=http://bugs.buildroot.net/ --without-zstd --disable-libquadmath --disable-libquadmath-support --enable-tls --enable-threads --without-isl --without-cloog --enable-languages=c --with-build-time-tools=/home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/bin --with-multilib-list=m4,m4-nofpu --enable-shared --disable-libgomp Thread model: posix Supported LTO compression algorithms: zlib gcc version 13.2.0 (Buildroot 2023.08-rc1-68-g27dc493780-dirty) 
COMPILER_PATH=/home/browa22-ext/e2fsprogs/output/host/libexec/gcc/sh4-buildroot-linux-gnu/13.2.0/:/home/browa22-ext/e2fsprogs/output/host/libexec/gcc/sh4-buildroot-linux-gnu/13.2.0/:/home/browa22-ext/e2fsprogs/output/host/libexec/gcc/sh4-buildroot-linux-gnu/:/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/:/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/:/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/../../../../sh4-buildroot-linux-gnu/bin/ LIBRARY_PATH=/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/:/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/../../../../sh4-buildroot-linux-gnu/lib/!m4/:/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/../../../../sh4-buildroot-linux-gnu/lib/:/home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot/lib/:/home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot/usr/lib/ COLLECT_GCC_OPTIONS='--sysroot=/home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot' '-fstack-protector-strong' '-fPIE' '-pie' '-v' '-dumpdir' 'a.' 
/home/browa22-ext/e2fsprogs/output/host/libexec/gcc/sh4-buildroot-linux-gnu/13.2.0/collect2 -plugin /home/browa22-ext/e2fsprogs/output/host/libexec/gcc/sh4-buildroot-linux-gnu/13.2.0/liblto_plugin.so -plugin-opt=/home/browa22-ext/e2fsprogs/output/host/libexec/gcc/sh4-buildroot-linux-gnu/13.2.0/lto-wrapper -plugin-opt=-fresolution=/tmp/ccQ7cSBB.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --sysroot=/home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot --eh-frame-hdr -m shlelf_linux -dynamic-linker /lib/ld-linux.so.2 -pie /home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot/usr/lib/Scrt1.o /home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot/usr/lib/crti.o /home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/crtbeginS.o -L/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0 -L/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/../../../../sh4-buildroot-linux-gnu/lib/!m4 -L/home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/../../../../sh4-buildroot-linux-gnu/lib -L/home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot/lib -L/home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot/usr/lib -z now -z relro -lgcc --push-state --as-needed -lgcc_s --pop-state -lc -lgcc --push-state --as-needed -lgcc_s --pop-state /home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/crtendS.o /home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot/usr/lib/crtn.o /home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/../../../../sh4-buildroot-linux-gnu/bin/ld: /home/browa22-ext/e2fsprogs/output/host/sh4-buildroot-linux-gnu/sysroot/usr/lib/Scrt1.o: in function `L_main': start.os:(.text+0x1c): undefined reference to 
`main' /home/browa22-ext/e2fsprogs/output/host/lib/gcc/sh4-buildroot-linux-gnu/13.2.0/../../../../sh4-buildroot-linux-gnu/bin/ld: BFD (GNU Binutils) 2.40 assertion fail elf32-sh.c:3924 collect2: error: ld returned 1 exit status browa22-ext@lxwbrodk:~$ /home/browa22-ext/e2fsprogs/output
[Bug tree-optimization/111002] New: Code generation for vectorized -(a[i] != 0) with number of elements chang could be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111002

            Bug ID: 111002
           Summary: Code generation for vectorized -(a[i] != 0) with number of elements chang could be improved
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: pinskia at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Testcase:
```
void __attribute__ ((noipa))
f (int *__restrict r,
   int *__restrict a,
   short *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
    r[i] = pred[i] != 0 ? -1 : 0;
}
```

Kind of a patch:
```
/* Sink unary operations to branches, but only if we do fold both.  */
(for op (negate bit_not abs absu)
 (simplify
  (op (view_convert? (vec_cond:s @0 @1 @2)))
  (if (element_precision (type) == element_precision (@1))
   (vec_cond @0 (op! (view_convert @1)) (op! (view_convert @2))
```

That is, the `Sink unary operations` pattern needs to gain support for view_convert ...

I noticed this while working on PR 110986 (but it is not needed for that issue).
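As a scalar illustration of why the bug title talks about `-(a[i] != 0)` while the testcase is written as a select, the sketch below checks that the two spellings agree element by element. The function names are hypothetical and this says nothing about how the vectorizer represents either form internally.

```cpp
#include <cassert>
#include <cstddef>

// Select form, as written in the testcase.
inline int select_form(short p) { return p != 0 ? -1 : 0; }

// Negate form, as in the bug title: the comparison yields 0 or 1,
// and negating it yields 0 or -1.
inline int negate_form(short p) { return -static_cast<int>(p != 0); }

// Hypothetical helper: runs the testcase loop over n elements using the
// negate form and checks each result against the select form.
void fill_mask(int *r, const short *pred, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        r[i] = negate_form(pred[i]);
        assert(r[i] == select_form(pred[i]));
    }
}
```

The extra negate that the report wants folded away corresponds to the `-` in `negate_form`; sinking it into the two constant arms of the select gives `select_form` directly.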
[Bug tree-optimization/111002] Code generation for vectorized -(a[i] != 0) with number of elements chang could be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111002

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-08-12
             Status|UNCONFIRMED                 |ASSIGNED

--- Comment #1 from Andrew Pinski ---
Note it looks like 4.7 used to produce the code without the secondary negate there ...
[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

--- Comment #12 from Paul Dreik ---
The last fix is unfortunately not sufficient either, because for abs(__v)<1 log10 becomes negative and that won't convert gracefully to size_t.

I implemented the following fix, which avoids log10 and uses frexp instead since we only need an approximation anyway. That also removes the need for handling the sign. The magic constant 4004U / 13301U is obtained from GNU Octave rats(log10(2),11) which I chose because it is slightly conservative. I think integer math is good here, to avoid conversions.

I experimented with bitwise fiddling to get the exponent but I think that is less portable and readable than frexp. It does however avoid a function call to frexp and is twice as fast (I benchmarked it).

What do you think of the following patch?

commit a7b133fb073ebd7f6ba686f31530ed20d656bd57
Author: Paul Dreik
Date:   Sat Aug 12 13:16:30 2023 +0200

    libstdc++: Avoid problematic use of log10 in std::format [PR110860]

    If abs(__v) is smaller than one, the result will be of the form
    0.x. It is only if the magnitude is large that more digits
    are needed before the decimal dot.

    This uses frexp instead of log10 which should be less expensive
    and have sufficient precision for the desired purpose.

    It removes the problematic cases where log10 will be negative or not
    fit in an int.

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 23da6b008..8d147abe9 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -1490,14 +1490,22 @@ namespace __format
          // If the buffer is too small it's probably because of a large
          // precision, or a very large value in fixed format.
          size_t __guess = 8 + __prec;
-         if (__fmt == chars_format::fixed && __v != 0) // +ddd.prec
+         if (__fmt == chars_format::fixed) // +ddd.prec
            {
-             if constexpr (is_same_v<_Fp, float>)
-               __guess += __builtin_log10f(__v < 0.0f ? -__v : __v);
-             else if constexpr (is_same_v<_Fp, double>)
-               __guess += __builtin_log10(__v < 0.0 ? -__v : __v);
-             else if constexpr (is_same_v<_Fp, long double>)
-               __guess += __builtin_log10l(__v < 0.0l ? -__v : __v);
+             if constexpr (is_same_v<_Fp, float> || is_same_v<_Fp, double> || is_same_v<_Fp, long double>)
+             {
+               // the number of digits to the left of the decimal point
+               // is floor(log10(max(abs(__v),1)))+1
+               int __exp{};
+               if constexpr (is_same_v<_Fp, float>)
+                 __builtin_frexpf(__v, &__exp);
+               else if constexpr (is_same_v<_Fp, double>)
+                 __builtin_frexp(__v, &__exp);
+               else if constexpr (is_same_v<_Fp, long double>)
+                 __builtin_frexpl(__v, &__exp);
+               if (__exp>0)
+                 __guess += 1U + __exp * 4004U / 13301U; // log10(2) approx.
+             }
              else
                __guess += numeric_limits<_Fp>::max_exponent10;
            }
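The arithmetic in the proposed patch can be tried out in isolation. The following is only a sketch for `double`, using the public `std::frexp` rather than the builtins; `integer_digits_guess` is a hypothetical name and the function returns just the amount the patch would add to `__guess`, not the full buffer size. It assumes a finite input, since the exponent stored by `frexp` is unspecified for infinities and NaN.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-alone version of the estimate in the patch:
// an upper bound on the number of decimal digits before the decimal
// point of v, derived from the binary exponent.
std::size_t integer_digits_guess(double v) {
    assert(std::isfinite(v));  // assumption of this sketch
    int exp = 0;
    // For finite nonzero v: v == m * 2^exp with 0.5 <= |m| < 1.
    // The sign of v only affects m, so no abs() is needed.
    std::frexp(v, &exp);
    if (exp <= 0)
        return 0;  // |v| < 1 (or v == 0): result has the form 0.x
    // 4004/13301 is slightly larger than log10(2), so the estimate
    // errs on the side of too many digits rather than too few.
    return 1U + static_cast<unsigned>(exp) * 4004U / 13301U;
}
```

For the value in the bug title, `integer_digits_guess(2e304)` gives 305, which is the number of digits `2e304` has in fixed notation. For 999.0 it gives 4, one more than needed, which is the kind of slack the "slightly conservative" constant and the binary exponent allow.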
[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

--- Comment #13 from Jonathan Wakely ---
Oh dear, I should stop trying to do so many things at once.

I did consider using __v < some_constant because we only need that code for large values, not for 0 and not for anything less than 1. We could maybe use numeric_limits::digits10 or something like that.

Your approach looks good though. I'm not sure if frexpf and frexpl are universally supported on all targets we care about, but maybe they're present on all targets where our std::to_chars is enabled.
[Bug libstdc++/110860] std::format("{:f}",2e304) invokes undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110860

--- Comment #14 from Jonathan Wakely ---
That portability concern already applied to log10f and log10l anyway.
[Bug modula2/108119] m2rte plugin should be disabled by default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108119

--- Comment #6 from CVS Commits ---
The master branch has been updated by Gaius Mulley:

https://gcc.gnu.org/g:46905fcde00fd84eb06b6bd1a6e788171d32865b

commit r14-3179-g46905fcde00fd84eb06b6bd1a6e788171d32865b
Author: Gaius Mulley
Date:   Sat Aug 12 13:43:14 2023 +0100

    PR modula2/108119 disable m2rte plugin by default

    This patch disables the m2rte plugin by default.  The driver will only
    append the -fplugin=m2rte command line option for cc1gm2 if
    -fm2-plugin is present.  It only enabled providing ENABLE_PLUGIN is
    defined.  gcc/m2/Make-file.in will only build and install m2rte if
    enable_plugin is yes.

    gcc/m2/ChangeLog:

            PR modula2/108119
            * Make-lang.in (M2RTE_PLUGIN_SO): Assigned to
            plugin/m2rte$(exeext).so if enable_plugin is yes.
            (m2.all.cross): Replace plugin/m2rte$(soext) with
            $(M2RTE_PLUGIN_SO).
            (m2.all.encap): Replace plugin/m2rte$(soext) with
            $(M2RTE_PLUGIN_SO).
            (m2.install-plugin): Add dummy rule when enable_plugin
            is not yes.
            (plugin/m2rte$(exeext).so): Add dummy rule when enable_plugin
            is not yes.
            (m2/stage2/cc1gm2$(exeext)): Replace plugin/m2rte$(soext) with
            $(M2RTE_PLUGIN_SO).
            (m2/stage1/cc1gm2$(exeext)): Replace plugin/m2rte$(soext) with
            $(M2RTE_PLUGIN_SO).
            * gm2spec.cc (lang_specific_driver): Set need_plugin to false
            by default.

    gcc/testsuite/ChangeLog:

            PR modula2/108119
            * gm2/iso/check/fail/iso-check-fail.exp (gm2_init_iso): Add
            -fm2-plugin.
            * gm2/switches/auto-init/fail/switches-auto-init-fail.exp
            (gm2_init_iso): Add -fm2-plugin.
            * gm2/switches/check-all/pim2/fail/switches-check-all-pim2-fail.exp
            (gm2_init_pim2): Add -fm2-plugin.
            * gm2/switches/check-all/plugin/iso/fail/switches-check-all-plugin-iso-fail.exp
            (gm2_init_iso): Add -fm2-plugin.
            * gm2/switches/check-all/plugin/pim2/fail/switches-check-all-plugin-pim2-fail.exp
            (gm2_init_pim2): Add -fm2-plugin.

    Signed-off-by: Gaius Mulley
[Bug modula2/108119] m2rte plugin should be disabled by default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108119

--- Comment #7 from CVS Commits ---
The releases/gcc-13 branch has been updated by Gaius Mulley:

https://gcc.gnu.org/g:131d5ffd42497b84c2a3329b02e075732e4fd6d2

commit r13-7715-g131d5ffd42497b84c2a3329b02e075732e4fd6d2
Author: Gaius Mulley
Date:   Sat Aug 12 13:53:32 2023 +0100

    PR modula2/108119 disable m2rte plugin by default

    This patch disables the m2rte plugin by default.  The driver will only
    append the -fplugin=m2rte command line option for cc1gm2 if
    -fm2-plugin is present.  It only enabled providing ENABLE_PLUGIN is
    defined.  gcc/m2/Make-file.in will only build and install m2rte if
    enable_plugin is yes.

    gcc/m2/ChangeLog:

            PR modula2/108119
            * Make-lang.in (M2RTE_PLUGIN_SO): Assigned to
            plugin/m2rte$(exeext).so if enable_plugin is yes.
            (m2.all.cross): Replace plugin/m2rte$(soext) with
            $(M2RTE_PLUGIN_SO).
            (m2.all.encap): Replace plugin/m2rte$(soext) with
            $(M2RTE_PLUGIN_SO).
            (m2.install-plugin): Add dummy rule when enable_plugin
            is not yes.
            (plugin/m2rte$(exeext).so): Add dummy rule when enable_plugin
            is not yes.
            (m2/stage2/cc1gm2$(exeext)): Replace plugin/m2rte$(soext) with
            $(M2RTE_PLUGIN_SO).
            (m2/stage1/cc1gm2$(exeext)): Replace plugin/m2rte$(soext) with
            $(M2RTE_PLUGIN_SO).
            * gm2spec.cc (lang_specific_driver): Set need_plugin to false
            by default.

    gcc/testsuite/ChangeLog:

            PR modula2/108119
            * gm2/iso/check/fail/iso-check-fail.exp (gm2_init_iso): Add
            -fm2-plugin.
            * gm2/switches/auto-init/fail/switches-auto-init-fail.exp
            (gm2_init_iso): Add -fm2-plugin.
            * gm2/switches/check-all/pim2/fail/switches-check-all-pim2-fail.exp
            (gm2_init_pim2): Add -fm2-plugin.
            * gm2/switches/check-all/plugin/iso/fail/switches-check-all-plugin-iso-fail.exp
            (gm2_init_iso): Add -fm2-plugin.
            * gm2/switches/check-all/plugin/pim2/fail/switches-check-all-plugin-pim2-fail.exp
            (gm2_init_pim2): Add -fm2-plugin.

    Signed-off-by: Gaius Mulley
[Bug modula2/108119] m2rte plugin should be disabled by default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108119

Gaius Mulley changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Gaius Mulley ---
Closing now that the patch has been applied on gcc-13 and bootstrapped successfully on ppc64le. Patch also applied to gcc-14 and bootstrapped on x86_64 and aarch64 successfully.
[Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504

Eric Gallager changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Eric Gallager ---
(In reply to Hongtao.liu from comment #8)
> (In reply to Eric Gallager from comment #7)
> >
> > Did this fix it?
>
> Yes.

OK, closing, then
[Bug tree-optimization/111003] New: [14 Regression] Dead Code Elimination Regression at -O3 since r14-2161-g237e83e2158
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111003

            Bug ID: 111003
           Summary: [14 Regression] Dead Code Elimination Regression at -O3 since r14-2161-g237e83e2158
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: scherrer.sv at gmail dot com
  Target Milestone: ---

static int c, d, e, f;
static short g;
static int *h = &c;
void foo(void);
short(a)();
static unsigned b(unsigned char j, int l) { return j > l ? j : j << l; }
static int *i();
static void k(int j, unsigned char l) {
  i();
  g = f;
  f = g;
  for (; g;) {
    int m;
    d = a();
    for (; d;) {
      if (l)
        if (!(j >= -639457069 && j <= -639457069))
          if (m)
            foo();
      m = (10 != (l ^ b(j, 6))) < (0 > e);
    }
  }
}
static int *i() {
  for (; e; e = a(e, 6))
    ;
  return h;
}
int main() { k(c, c); }

gcc-8441841a1b9 (trunk) -O3 cannot eliminate the call to foo but gcc-releases/gcc-13.1.0 -O3 can.

---
gcc-8441841a1b985d68245954af1ff023db121b0635 -O3 case.c -S -o case.s
- OUTPUT -
main:
.LFB3:
        .cfi_startproc
        pushq   %r13
        .cfi_def_cfa_offset 16
        .cfi_offset 13, -16
        pushq   %r12
        .cfi_def_cfa_offset 24
        .cfi_offset 12, -24
        pushq   %rbp
        .cfi_def_cfa_offset 32
        .cfi_offset 6, -32
        pushq   %rbx
        .cfi_def_cfa_offset 40
        .cfi_offset 3, -40
        subq    $8, %rsp
        .cfi_def_cfa_offset 48
        movl    e(%rip), %edi
        movl    c(%rip), %ebx
        testl   %edi, %edi
        je      .L5
        .p2align 4,,10
        .p2align 3
.L2:
        movl    $6, %esi
        xorl    %eax, %eax
        call    a
        movswl  %ax, %edi
        movl    %edi, e(%rip)
        testl   %edi, %edi
        jne     .L2
.L5:
        movl    f(%rip), %eax
        movswl  %ax, %edx
        movw    %ax, g(%rip)
        movl    %edx, f(%rip)
        testw   %ax, %ax
        je      .L36
        movzbl  %bl, %ebp
        testb   %bl, %bl
        je      .L39
        movl    %ebp, %eax
        sall    $6, %eax
        xorl    %ebp, %eax
        cmpl    $10, %eax
        setne   %r12b
        .p2align 4,,10
        .p2align 3
.L18:
        xorl    %eax, %eax
        call    a
        cwtl
        movl    %eax, d(%rip)
        testl   %eax, %eax
        je      .L14
        cmpl    $-639457069, %ebx
        jne     .L17
.L19:
        jmp     .L19
        .p2align 4,,10
        .p2align 3
.L16:
        cmpl    $6, %ebp
        jg      .L20
        movl    e(%rip), %eax
        xorl    %r13d, %r13d
        shrl    $31, %eax
        cmpb    %al, %r12b
        setb    %r13b
        .p2align 4,,10
        .p2align 3
.L17:
        testl   %r13d, %r13d
        je      .L16
        call    foo
        movl    e(%rip), %eax
        shrl    $31, %eax
        cmpl    $6, %ebp
        setg    %dl
        xorl    %r13d, %r13d
        orl     %r12d, %edx
        cmpb    %al, %dl
        movl    d(%rip), %eax
        setb    %r13b
        testl   %eax, %eax
        jne     .L17
.L14:
        cmpw    $0, g(%rip)
        jne     .L18
.L36:
        addq    $8, %rsp
        .cfi_remember_state
        .cfi_def_cfa_offset 40
        xorl    %eax, %eax
        popq    %rbx
        .cfi_def_cfa_offset 32
        popq    %rbp
        .cfi_def_cfa_offset 24
        popq    %r12
        .cfi_def_cfa_offset 16
        popq    %r13
        .cfi_def_cfa_offset 8
        ret
.L39:
        .cfi_restore_state
        cmpl    $6, %ebp
        jg      .L7
        .p2align 4,,10
        .p2align 3
.L10:
        xorl    %eax, %eax
        call    a
        cwtl
        movl    %eax, d(%rip)
        testl   %eax, %eax
        je      .L8
.L9:
        jmp     .L9
        .p2align 4,,10
        .p2align 3
.L12:
        cmpw    $0, g(%rip)
        je      .L36
.L7:
        xorl    %eax, %eax
        call    a
        cwtl
        movl    %eax, d(%rip)
        testl   %eax, %eax
        je      .L12
.L13:
        jmp     .L13
        .p2align 4,,10
        .p2align 3
.L8:
        cmpw    $0, g(%rip)
        jne     .L10
        jmp     .L36
        .p2align 4,,10
        .p2align 3
.L20:
        cmpl    $-639457069, %ebx
        jne     .L20
        jmp     .L19
-- END OUTPUT -

---
gcc-2b98cc24d6af0432a74f6dad1c722ce21c1f7458 -O3 case.c -S -o case.s
- OUTPUT -
main:
.LFB3:
        .cfi_startproc
        movl    e(%rip), %edi
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        movl    c(%rip), %ebx
        testl   %edi, %edi
        je      .L5
        .p2align 4,,10
        .p2align 3
.L2:
        movl    $6, %esi
        xorl    %eax
[Bug testsuite/103324] RFE: Add a `make quickcheck` or `make smoketest` Makefile target to allow only running a portion of the testsuite
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103324

--- Comment #9 from Eric Gallager ---
(In reply to Sam James from comment #8)
> Using make synchronisation can help a bit:
> https://www.gnu.org/software/make/manual/html_node/Parallel-Output.html.
> It's made our build logs in Gentoo a lot more readable for GCC, FWIW.

So, I'm finally getting around to trying this, and it makes it seem as if the testsuite is hanging while waiting for output to be synchronized...
[Bug modula2/110779] SysClock can not read the clock
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110779

Iain Sandoe changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iains at gcc dot gnu.org

--- Comment #13 from Iain Sandoe ---
(In reply to Gaius Mulley from comment #12)
> Created attachment 55717 [details]
> Another patch for Darwin
>
> More portability configure checks and fixes to wrapclock.cc.

This patch fixes bootstrap on the affected versions - I've tried it on old and new Darwin + a cross.

There are some time-related test fails, but that can be handled separately.
[Bug modula2/110779] SysClock can not read the clock
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110779

--- Comment #14 from CVS Commits ---
The master branch has been updated by Gaius Mulley:

https://gcc.gnu.org/g:63fb0bedb8077ac1e6b6337f198b4eae30813fbc

commit r14-3180-g63fb0bedb8077ac1e6b6337f198b4eae30813fbc
Author: Gaius Mulley
Date:   Sat Aug 12 18:17:41 2023 +0100

    PR modula2/110779 SysClock can not read the clock (Darwin portability fixes)

    This patch adds corrections to defensively check against glibc functions,
    structures and contains fallbacks.  These fixes were required under Darwin.

    gcc/m2/ChangeLog:

            PR modula2/110779
            * gm2-libs-iso/SysClock.mod (EpochTime): New procedure.
            (GetClock): Call EpochTime if the C time functions are
            unavailable.
            * gm2-libs-iso/wrapclock.def (istimezone): New function
            definition.

    libgm2/ChangeLog:

            PR modula2/110779
            * configure: Regenerate.
            * configure.ac: Provide special case test for Darwin cross
            configuration.
            (GLIBCXX_CONFIGURE): New statement.
            (GLIBCXX_CHECK_GETTIMEOFDAY): New statement.
            (GLIBCXX_ENABLE_LIBSTDCXX_TIME): New statement.
            * libm2iso/wrapclock.cc: New sys/time.h conditional include.
            (sys/syscall.h): Conditional include.
            (unistd.h): Conditional include.
            (GetTimeRealtime): Re-implement.
            (SetTimeRealtime): Re-implement.
            (timezone): Re-implement.
            (istimezone): New function.
            (daylight): Re-implement.
            (isdst): Re-implement.
            (tzname): Re-implement.

    Signed-off-by: Gaius Mulley
[Bug tree-optimization/111003] [14 Regression] Dead Code Elimination Regression at -O3 since r14-2161-g237e83e2158
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111003

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
[Bug tree-optimization/111003] [14 Regression] Dead Code Elimination Regression at -O3 since r14-2161-g237e83e2158
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111003

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-08-12
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski ---
Confirmed.

The good news is that the issue can still be reproduced with `<` replaced by `!` and `&`, and with m initialized:
```
static int c, d, e, f;
static short g;
static int *h = &c;
void foo(void);
short(a)();
static unsigned b(unsigned char j, int l) { return j > l ? j : j << l; }
static int *i();
static void k(int j, unsigned char l) {
  i();
  g = f;
  f = g;
  for (; g;) {
    int m = 0;
    d = a();
    for (; d;) {
      if (l)
        if (!(j >= -639457069 && j <= -639457069))
          if (m)
            foo();
      m = !(10 != (l ^ b(j, 6))) & (0 > e);
    }
  }
}
static int *i() {
  for (; e; e = a(e, 6))
    ;
  return h;
}
int main() { k(c, c); }
```
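The rewrite of `<` into `!` and `&` in the reduced testcase relies on both operands being comparison results, that is, 0 or 1. A small sketch of that equivalence follows; the function names are hypothetical and stand in for the two spellings of the assignment to `m`.

```cpp
#include <cassert>

// Original spelling: m = lhs < rhs, where lhs and rhs are 0/1 values.
inline int m_original(int lhs, int rhs) { return lhs < rhs; }

// Rewritten spelling from comment #1: m = !lhs & rhs.
inline int m_rewritten(int lhs, int rhs) { return !lhs & rhs; }

// Checks that both spellings agree on every pair of boolean inputs.
void check_all_boolean_inputs() {
    for (int lhs = 0; lhs <= 1; ++lhs)
        for (int rhs = 0; rhs <= 1; ++rhs)
            assert(m_original(lhs, rhs) == m_rewritten(lhs, rhs));
}
```

Only the pair `lhs == 0`, `rhs == 1` yields 1 in either form, so the reduced testcase preserves the value of `m` while removing the relational operator.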
[Bug tree-optimization/110991] [14 Regression] Dead Code Elimination Regression at -O2 since r14-1135-gc53f51005de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110991

--- Comment #2 from Andrew Pinski ---
What is interesting is that -O3 unrolls the loop in cunroll and the loop then becomes a no-op, as almost everything can be constant folded away ...

Maybe that is something which can be tuned for -O2 and unrolling ...
[Bug gcov-profile/110988] [14 regression] ICE when building 523.xalancbmk_r with pgo and lto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110988

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
            Summary|ICE when building           |[14 regression] ICE when
                   |523.xalancbmk_r with pgo    |building 523.xalancbmk_r
                   |and lto                     |with pgo and lto
           Keywords|                            |ice-on-valid-code
[Bug c++/111004] New: Visitor and concept error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111004

            Bug ID: 111004
           Summary: Visitor and concept error message
           Product: gcc
           Version: 13.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: deco33000 at yandex dot com
  Target Milestone: ---

The error message for the following situation is not clear.

The error is that struct A and B don't have a "bool activated;" member.

The issue is that the diagnostic for the variant does not explain the error in a helpful way.

To ease your life, here is the godbolt: https://godbolt.org/z/Wdr4zn5E1

Do you think it is possible to improve the diagnostic?

---

Reduced test case:

#include
#include
#include
using namespace std;

template
concept My_concept = requires(T a) {
  { a.activated } -> std::same_as;
};

struct A { bool not_activated; };
struct B { bool not_activated; int other; };

auto test(variant &v) -> void {
  std::visit([](My_concept auto &&arg) { std::cout << "OK\n"; }, v);
}

int main() {
  variant v;
  v = A();
  test(v);
  return 0;
}

Thanks
[Bug modula2/110779] SysClock can not read the clock
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110779

--- Comment #15 from CVS Commits ---
The releases/gcc-13 branch has been updated by Gaius Mulley:

https://gcc.gnu.org/g:a11ca333df2b6abb4187b39f32bb35a195d8fb33

commit r13-7716-ga11ca333df2b6abb4187b39f32bb35a195d8fb33
Author: Gaius Mulley
Date:   Sat Aug 12 20:20:45 2023 +0100

    PR modula2/110779 SysClock can not read the clock (Darwin fixes)

    This patch adds corrections to defensively check against glibc functions,
    structures and contains fallbacks.  These fixes were required under Darwin.

    gcc/m2/ChangeLog:

            PR modula2/110779
            * gm2-libs-iso/SysClock.mod (EpochTime): New procedure.
            (GetClock): Call EpochTime if the C time functions are
            unavailable.
            * gm2-libs-iso/wrapclock.def (istimezone): New function
            definition.

    libgm2/ChangeLog:

            PR modula2/110779
            * configure: Regenerate.
            * configure.ac: Provide special case test for Darwin cross
            configuration.
            (GLIBCXX_CONFIGURE): New statement.
            (GLIBCXX_CHECK_GETTIMEOFDAY): New statement.
            (GLIBCXX_ENABLE_LIBSTDCXX_TIME): New statement.
            * libm2iso/wrapclock.cc: New sys/time.h conditional include.
            (sys/syscall.h): Conditional include.
            (unistd.h): Conditional include.
            (GetTimeRealtime): Re-implement.
            (SetTimeRealtime): Re-implement.
            (timezone): Re-implement.
            (istimezone): New function.
            (daylight): Re-implement.
            (isdst): Re-implement.
            (tzname): Re-implement.

    Signed-off-by: Gaius Mulley
[Bug modula2/110779] SysClock can not read the clock
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110779

Gaius Mulley changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|REOPENED                    |RESOLVED

--- Comment #16 from Gaius Mulley ---
Many thanks for all the testing - closing now that the patches have been applied.
[Bug c++/111004] Visitor and concept error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111004

--- Comment #1 from Andrew Pinski ---
Clang's error message is similarly "bad" with GCC's libstdc++:
```
:24:5: error: no matching function for call to 'visit'
   24 |     std::visit([](My_concept auto &&arg) { std::cout << "OK\n"; }, v);
      |     ^~
/opt/compiler-explorer/gcc-snapshot/lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/variant:1867:5: note: candidate template ignored: couldn't infer template argument '_Res'
 1867 |     visit(_Visitor&& __visitor, _Variants&&... __variants)
      |     ^
/opt/compiler-explorer/gcc-snapshot/lib/gcc/x86_64-linux-gnu/14.0.0/../../../../include/c++/14.0.0/variant:1827:5: note: candidate template ignored: substitution failure [with _Visitor = (lambda at :24:16), _Variants = &>]: no type named 'type' in 'std::invoke_result<(lambda at :24:16), A &>'
 1827 |     visit(_Visitor&& __visitor, _Variants&&... __variants)
      |     ^
```

Now LLVM's libc++ produces something which might be helpful:
```
In file included from :2:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/iostream:43:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/ios:222:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/__locale:21:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/mutex:192:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/__condition_variable/condition_variable.h:17:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/__mutex/unique_lock.h:17:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/__system_error/system_error.h:14:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/__system_error/error_category.h:15:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/string:622:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/string_view:1059:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/algorithm:1960:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/iterator:683:
In file included from /opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/__iterator/common_iterator.h:31:
/opt/compiler-explorer/clang-trunk-20230812/bin/../include/c++/v1/variant:680:19: error: static assertion failed due to requirement 'is_invocable_v<(lambda at :24:16), A &>': `std::visit` requires the visitor to be exhaustive.
  680 |     static_assert(is_invocable_v<_Visitor, _Values...>,
      |                   ^~~~
...
```

But there is still no mention of why.
[Bug middle-end/110986] [14 Regression] aarch64 has support for conditional not (and vectorized conditional not ) after r14-3110-g7fb65f10285
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110986 --- Comment #15 from Andrew Pinski --- Note that the main issue with the two different types is a separate issue (even though my patches improve the situation, other issues show up). I will file a few testcases for that ...
[Bug target/111005] New: SVE produced code for different type sizes (smaller than int) with comparison in a loop can be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111005

Bug ID: 111005
Summary: SVE produced code for different type sizes (smaller than int) with comparison in a loop can be improved
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64

Take:
```
void __attribute__ ((noipa))
f0 (int *__restrict r, int *__restrict a, short *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
    {
      int p = pred[i]?-1:0;
      r[i] = p ;
    }
}

void __attribute__ ((noipa))
f1 (int *__restrict r, int *__restrict a, short *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
    {
      int p = pred[i];
      r[i] = p ;
    }
}
```
f1 produces:
```
.L6:
        ld1sh   z31.s, p7/z, [x2, x1, lsl 1]
        st1w    z31.s, p7, [x0, x1, lsl 2]
        incw    x1
        whilelo p7.s, w1, w3
        b.any   .L6
```
While f0 produces:
```
.L2:
        ld1h    z0.h, p0/z, [x2, x1, lsl 1]
        punpklo p2.h, p0.b
        cmpne   p3.h, p1/z, z0.h, #0
        punpkhi p0.h, p0.b
        mov     z0.h, p3/z, #1
        neg     z0.h, p1/m, z0.h
        sunpklo z1.s, z0.h
        sunpkhi z0.s, z0.h
        st1w    z1.s, p2, [x0, x1, lsl 2]
        st1w    z0.s, p0, [x4, x1, lsl 2]
        inch    x1
        whilelo p0.h, w1, w3
        b.any   .L2
```
While it should produce:
```
.L6:
        ld1sh   z31.s, p7/z, [x2, x1, lsl 1]
        cmpne   p1.s, p7/z, z31.s, #0
        mov     z31.s, p1/z, #-1 // =0x
        st1w    z31.s, p7, [x0, x1, lsl 2]
        incw    x1
        whilelo p7.s, w1, w3
        b.any   .L6
```
That is:
  sign extend load
  compare-not-equal to 0; setting p1
  set z31 to -1 or 0 based on p1
  store z31

But instead we push to do unpacking from VN2HI to VHI ...
[Bug modula2/108485] CppArg is broken for whitespaces
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108485

Gaius Mulley changed:

           What    |Removed      |Added
   Last reconfirmed|             |2023-08-12
     Ever confirmed|0            |1
             Status|UNCONFIRMED  |ASSIGNED

--- Comment #1 from Gaius Mulley --- Indeed - thanks for spotting and reporting the bug.
[Bug tree-optimization/111006] New: [SVE] Extra neg for storing to short from int comparison
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

Bug ID: 111006
Summary: [SVE] Extra neg for storing to short from int comparison
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---

Take:
```
void __attribute__ ((noipa))
f0 (unsigned short *__restrict r, int *__restrict a, int *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
    {
      int p = pred[i]?-1:0;
      r[i] = p ;
    }
}
```
Compile with `-march=armv8.5+sve2 -O3`. Currently we get:
```
.L2:
        ld1w    z31.s, p7/z, [x2, x1, lsl 2]
        cmpne   p15.s, p6/z, z31.s, #0
        mov     z31.s, p15/z, #1
        neg     z31.h, p6/m, z31.h
        st1h    z31.s, p7, [x0, x1, lsl 1]
        incw    x1
        whilelo p7.s, w1, w3
        b.any   .L2
```
But we should just get:
```
.L2:
        ld1w    z31.s, p7/z, [x2, x1, lsl 2]
        cmpne   p15.s, p6/z, z31.s, #0
        mov     z31.s, p15/z, #-1
        st1h    z31.s, p7, [x0, x1, lsl 1]
        incw    x1
        whilelo p7.s, w1, w3
        b.any   .L2
```
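To see why the `neg` looks redundant, here is a minimal scalar sketch in C++ of a single 32-bit lane under the two sequences above. The names `lane_current` and `lane_wanted` are hypothetical, the model assumes the governing predicate of the `neg` is active for the lane, and it only mirrors the arithmetic, not real SVE predication.

```cpp
#include <cstdint>

// Hypothetical single-lane model of the two sequences; not GCC or SVE code.

// Current sequence: select 1 or 0 into a 32-bit lane, negate each 16-bit
// half of that lane, then store only the low half (st1h from a .s element).
constexpr std::uint16_t lane_current(int pred) {
    std::uint32_t lane = (pred != 0) ? 1u : 0u;                     // mov z31.s, p15/z, #1
    std::uint16_t lo = static_cast<std::uint16_t>(lane & 0xFFFFu);  // low .h element
    std::uint16_t hi = static_cast<std::uint16_t>(lane >> 16);      // high .h element
    lo = static_cast<std::uint16_t>(0u - lo);                       // neg z31.h (assumed active)
    hi = static_cast<std::uint16_t>(0u - hi);
    lane = (static_cast<std::uint32_t>(hi) << 16) | lo;
    return static_cast<std::uint16_t>(lane & 0xFFFFu);              // st1h keeps the low half
}

// Wanted sequence: select -1 or 0 directly, then store the low half.
constexpr std::uint16_t lane_wanted(int pred) {
    std::uint32_t lane = (pred != 0) ? 0xFFFFFFFFu : 0u;            // mov z31.s, p15/z, #-1
    return static_cast<std::uint16_t>(lane & 0xFFFFu);              // st1h keeps the low half
}
```

Under these assumptions both functions store the same halfword for any `pred`, which is the scalar reading of the request to drop the `neg`.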
[Bug middle-end/110986] [14 Regression] aarch64 has support for conditional not (and vectorized conditional not ) after r14-3110-g7fb65f10285
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110986

Andrew Pinski changed:

           What    |Removed |Added
           See Also|        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111005,
                   |        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

--- Comment #16 from Andrew Pinski --- (In reply to Andrew Pinski from comment #15)
> Note that the main issue with the two different types is a separate issue
> (even though my patches improve the situation, other issues show up). I will
> file a few testcases for that ...

PR 111005 and PR 111006.
[Bug target/111005] SVE produced code for different type sizes (smaller than int) with comparison in a loop can be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111005 --- Comment #1 from Andrew Pinski --- I forgot to say: compile with `-march=armv8.5+sve2 -O3`.
[Bug tree-optimization/111006] [SVE] Extra neg for storing to short from int comparison
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006

Andrew Pinski changed:

           What    |Removed                       |Added
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2023-08-12
     Ever confirmed|0                             |1

--- Comment #1 from Andrew Pinski --- We have:
  vect_patt_30.10_67 = VEC_COND_EXPR ;
  vect_patt_29.11_68 = (vector([4,4]) signed short) vect_patt_30.10_67;
  vect_patt_28.12_69 = -vect_patt_29.11_68;

So:
/* Sink convert to branches, but only if we do fold both
   and @0 is the truth type for the new version too.  */
(simplify
 (convert (vec_cond:s @0 @1 @2))
 (if (is_truth_type_for (type, TREE_TYPE (@0)))
  (vec_cond @0 (convert! @1) (convert! @2))))
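The pattern above rests on a lane-wise identity: converting the result of a select gives the same value as selecting between the already-converted arms. A minimal single-lane illustration in C++ follows. The helper names are hypothetical, and the sketch ignores the mask-type condition that the `is_truth_type_for` check guards.

```cpp
#include <cstdint>

// Hypothetical single-lane model of the fold; not GCC code.

// Before the fold: select in the wide type, then convert the result.
constexpr std::int16_t convert_of_select(bool mask, std::int32_t a, std::int32_t b) {
    return static_cast<std::int16_t>(mask ? a : b);
}

// After the fold: convert both arms, then select in the narrow type.
constexpr std::int16_t select_of_converts(bool mask, std::int32_t a, std::int32_t b) {
    return mask ? static_cast<std::int16_t>(a) : static_cast<std::int16_t>(b);
}
```

When both arms are constants such as 1 and 0, the two conversions fold away, which presumably is what exposes the select directly to the following negate.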
[Bug tree-optimization/111006] [SVE] Extra neg for storing to short from int comparison
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111006 --- Comment #2 from Andrew Pinski --- Note the non-SVE code generation can be improved too. With:
(simplify
 (negate (vec_pack_trunc:s (vec_cond:s @0 uniform_integer_cst_p@1 uniform_integer_cst_p@2)
                           (vec_cond:s @3 @1 @2)))
 (with
  {
    tree outer_mask_type = truth_type_for (type);
    tree allones = build_minus_one_cst (type);
    tree zeros = build_zero_cst (type);
  }
  (if (integer_onep (@1) && integer_zerop (@2))
   (vec_cond (vec_pack_trunc:outer_mask_type @0 @3) { allones; } { zeros; })
   (if (integer_onep (@2) && integer_zerop (@1))
    (vec_cond (vec_pack_trunc:outer_mask_type @0 @3) { zeros; } { allones; })))))

I will submit both later next week.
[Bug libstdc++/96733] std::clamp for floats and doubles produces worse code than a combo of std::min / std::max
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96733 Kacper changed: What|Removed |Added CC||cosiekvfj at o2 dot pl --- Comment #10 from Kacper --- Still not fixed? https://godbolt.org/z/1ehf9EsEa
[Bug libstdc++/111004] Visitor and concept error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111004

Jonathan Wakely changed:

           What    |Removed      |Added
     Ever confirmed|0            |1
          Component|c++          |libstdc++
             Status|UNCONFIRMED  |NEW
   Last reconfirmed|             |2023-08-13

--- Comment #2 from Jonathan Wakely --- The static assert message for libc++ does say why, but not in user-friendly language. I think we could change libstdc++ to use decltype(auto) for the return type of visit, and then use a nice static assert (with a better message) to diagnose the invalid cases. We currently constrain the function using SFINAE, which isn't actually required by the standard, and makes it harder to give a good error here.
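The trade-off described in that comment can be sketched with two hypothetical single-argument stand-ins for `visit`. Neither `call_sfinae` nor `call_checked` is libstdc++ code, and the assertion message is invented for illustration; the sketch assumes C++17.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for std::visit with one argument; not libstdc++ code.

// SFINAE-constrained: an uncallable visitor removes this overload, so the
// compiler can only report that no matching function was found.
template <typename Visitor, typename Arg,
          typename = std::enable_if_t<std::is_invocable_v<Visitor, Arg>>>
constexpr decltype(auto) call_sfinae(Visitor&& vis, Arg&& arg) {
    return std::forward<Visitor>(vis)(std::forward<Arg>(arg));
}

// Unconstrained with a deduced return type: the overload is always viable,
// and an uncallable visitor trips the static_assert with a chosen message.
template <typename Visitor, typename Arg>
constexpr decltype(auto) call_checked(Visitor&& vis, Arg&& arg) {
    static_assert(std::is_invocable_v<Visitor, Arg>,
                  "visitor must be callable with the given alternative");
    return std::forward<Visitor>(vis)(std::forward<Arg>(arg));
}

// Small visitor used to exercise both versions.
struct AddOne {
    constexpr int operator()(int x) const { return x + 1; }
};
```

Both versions behave identically for a valid visitor such as `AddOne`. They differ only when the visitor cannot be called: the first yields a "no matching function" error, while the second reports the text of the static assertion.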