[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

--- Comment #7 from cqwrteur ---
configure:3736: $? = 0
configure:3725: /home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/xgcc -B/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/ -L/mingw64/x86_64-w64-mingw32/lib -L/mingw64/lib -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/include -B/mingw64/x86_64-w64-mingw32/bin/ -B/mingw64/x86_64-w64-mingw32/lib/ -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/x86_64-w64-mingw32/sys-include -V >&5
xgcc.exe: error: unrecognized command-line option '-V'
xgcc.exe: fatal error: no input files
compilation terminated.
configure:3736: $? = 1
configure:3725: /home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/xgcc -B/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/ -L/mingw64/x86_64-w64-mingw32/lib -L/mingw64/lib -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/include -B/mingw64/x86_64-w64-mingw32/bin/ -B/mingw64/x86_64-w64-mingw32/lib/ -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/x86_64-w64-mingw32/sys-include -qversion >&5
xgcc.exe: error: unrecognized command-line option '-qversion'; did you mean '--version'?
xgcc.exe: fatal error: no input files
compilation terminated.
configure:3736: $? = 1
configure:3756: checking whether the C compiler works
configure:3778: /home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/xgcc -B/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/ -L/mingw64/x86_64-w64-mingw32/lib -L/mingw64/lib -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/include -B/mingw64/x86_64-w64-mingw32/bin/ -B/mingw64/x86_64-w64-mingw32/lib/ -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/x86_64-w64-mingw32/sys-include -g -march=x86-64 -mtune=generic -O2 -pipe -pipe -Wl,--dynamicbase,--high-entropy-va,--nxcompat,--default-image-base-high conftest.c >&5
configure:3782: $? = 0
configure:3830: result: yes
configure:3833: checking for C compiler default output file name
configure:3835: result: a.exe
configure:3841: checking for suffix of executables
configure:3848: /home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/xgcc -B/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/ -L/mingw64/x86_64-w64-mingw32/lib -L/mingw64/lib -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/include -B/mingw64/x86_64-w64-mingw32/bin/ -B/mingw64/x86_64-w64-mingw32/lib/ -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/x86_64-w64-mingw32/sys-include -o conftest.exe -g -march=x86-64 -mtune=generic -O2 -pipe -pipe -Wl,--dynamicbase,--high-entropy-va,--nxcompat,--default-image-base-high conftest.c >&5
configure:3852: $? = 0
configure:3874: result: .exe
configure:3896: checking whether we are cross compiling
configure:3904: /home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/xgcc -B/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/./gcc/ -L/mingw64/x86_64-w64-mingw32/lib -L/mingw64/lib -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/include -B/mingw64/x86_64-w64-mingw32/bin/ -B/mingw64/x86_64-w64-mingw32/lib/ -isystem /mingw64/x86_64-w64-mingw32/include -isystem /mingw64/x86_64-w64-mingw32/sys-include -o conftest.exe -g -march=x86-64 -mtune=generic -O2 -pipe -pipe -Wl,--dynamicbase,--high-entropy-va,--nxcompat,--default-image-base-high conftest.c >&5
configure:3908: $? = 0
configure:3915: ./conftest.exe
/home/unlvs/mcf_build/src/gcc-git/libatomic/configure: line 3917: ./conftest.exe: cannot execute binary file: Exec format error
configure:3919: $? = 126
configure:3926: error: in `/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libatomic':
configure:3928: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
[Bug ipa/98594] [11 Regression] IPA modref codegen bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98594

Richard Biener changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |INVALID

--- Comment #5 from Richard Biener ---
Hmm, I don't think struct X { int i; } and struct X { unsigned int i; } have the same alias-set (int and unsigned int do). So a reinterpret-cast from one to the other variant violates TBAA rules.

That means that glm::vec needs a CTOR from the signed/unsigned variant rather than using a hack like this - and it seems it has one if Honza's fix still lets things compile? I see

  template<typename T, qualifier Q>
  template<typename U, qualifier P>
  inline constexpr vec<1, T, Q>::vec(vec<1, U, P> const& v)
    : x(static_cast<T>(v.x))
  {}

which probably applies and the used CTOR in the bogus case is the compiler-generated one.
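
To illustrate the distinction drawn in the comment above, here is a minimal sketch. The types `SignedVec` and `UnsignedVec` and the function `convert` are hypothetical stand-ins invented for this example, not glm's actual classes. The sketch shows a converting constructor that copies the member by value, which is the well-defined alternative to reading one struct through a pointer to the other.

```cpp
// Hypothetical stand-ins for the signed and unsigned variants discussed above.
struct SignedVec {
    int x;
};

struct UnsignedVec {
    unsigned int x;

    UnsignedVec() : x(0) {}

    // Converting constructor: reads v.x as an int and converts the value,
    // so no object is accessed through an lvalue of an unrelated struct type.
    explicit UnsignedVec(const SignedVec& v)
        : x(static_cast<unsigned int>(v.x)) {}
};

// Well-defined conversion that goes through the constructor above.
inline UnsignedVec convert(const SignedVec& v) {
    return UnsignedVec(v);
}

// By contrast, an expression such as
//   *reinterpret_cast<const UnsignedVec*>(&v)
// reads a SignedVec object through an UnsignedVec lvalue, which is the
// pattern the comment describes as violating TBAA rules.
```

The value conversion may cost a copy, but it leaves the optimizer free to assume that the two struct types never alias.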
[Bug rtl-optimization/80960] [8/9/10/11 Regression] Huge memory use when compiling a very large test case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80960 --- Comment #27 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:a523add327c6cfdd68cf9b788ea808068d0f508c commit r11-6948-ga523add327c6cfdd68cf9b788ea808068d0f508c Author: Richard Biener Date: Wed Jan 27 15:35:52 2021 +0100 rtl-optimization/80960 - avoid creating garbage RTL in DSE The following avoids repeatedly turning VALUE RTXen into sth useful and re-applying a constant offset through get_addr via DSE check_mem_read_rtx. Instead perform this once for all stores to be visited in check_mem_read_rtx. This avoids allocating 1.6GB of garbage PLUS RTXen on the PR80960 testcase, fixing the memory usage regression from old GCC. 2021-01-27 Richard Biener PR rtl-optimization/80960 * dse.c (check_mem_read_rtx): Call get_addr on the offsetted address.
[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #3 from Martin Liška --- (In reply to Richard Biener from comment #2) > The cxx bench Botan doesn't know --cxxflags, what Botan version are you > looking at? I used this fixed version: https://gitlab.suse.de/marxin/cpp-benchmarks/-/tree/master/botan
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #8 from cqwrteur --- I tried to build this commit and it is successful. https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0411ae7f08e0f5a8b02ff313d26d27a0f6d1bb34 which means it is another commit that breaks it
[Bug fortran/86470] [8/9/10/11 Regression] [OOP] ICE with OMP
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86470 --- Comment #9 from CVS Commits --- The master branch has been updated by Harald Anlauf : https://gcc.gnu.org/g:33a7a93218b1393d0135e3c4a9ad9ced87808f5e commit r11-6950-g33a7a93218b1393d0135e3c4a9ad9ced87808f5e Author: Harald Anlauf Date: Thu Jan 28 10:13:46 2021 +0100 PR fortran/86470 - ICE with OpenMP, class(*) allocatable gfc_call_malloc should malloc an area of size 1 if no size given. gcc/fortran/ChangeLog: PR fortran/86470 * trans.c (gfc_call_malloc): Allocate area of size 1 if passed size is NULL (as documented). gcc/testsuite/ChangeLog: PR fortran/86470 * gfortran.dg/gomp/pr86470.f90: New test.
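
As a rough runtime analogue of the rule stated in the commit message above (allocate an area of size 1 when no size is passed), consider the sketch below. It is only an illustration: the real gfc_call_malloc builds GCC trees at compile time, and the names `effective_size` and `malloc_with_default` are hypothetical. A null pointer models "no size given"; rounding a zero size up to 1 is an extra assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: decide how many bytes to request.
// A null `size` models "no size given" and yields 1.
// Rounding 0 up to 1 is an additional choice made by this sketch.
std::size_t effective_size(const std::size_t* size) {
    if (size == nullptr || *size == 0) {
        return 1;
    }
    return *size;
}

// Hypothetical wrapper: always asks malloc for at least one byte.
void* malloc_with_default(const std::size_t* size) {
    return std::malloc(effective_size(size));
}
```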
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #9 from cqwrteur --- I built this commit and the build succeeds: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0411ae7f08e0f5a8b02ff313d26d27a0f6d1bb34 so a different commit must have broken it. The daily bump of 2021-01-18 fails: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4c9bcd5c81a010c06d177089176499926550625c However, the daily bump of 2021-01-17 succeeds: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=59cf67d1cf77e9594e58fd2848ac94d505546546 So the breakage must be in one of the 6 patches in that range.
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 Jakub Jelinek changed: What|Removed |Added CC|jakub at redhat dot com| --- Comment #10 from Jakub Jelinek --- In that range, I guess the only important change was the switch from -gdwarf-4 to -gdwarf-5 by default. So either a bug in your assembler or linker, or something else that doesn't like DWARF 5?
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #11 from cqwrteur --- (In reply to Jakub Jelinek from comment #10) > In that range guess the only important change was the switch from -gdwarf-4 > to -gdwarf-5 by default. So either a bug in your assembler or linker or > something else that doesn't like DWARF 5? MinGW-w64. I do not know. Maybe Binutils do not work with gdwarf-5 whatever?
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #12 from Jakub Jelinek --- So you need to figure it out, most people here don't have any access to Windows. If mingw64 is using binutils, there were important DWARF 5 related bugs in binutils 2.35, but for the known ones we've added workarounds. And much older binutils don't really cope with DWARF 5 at all, but that on Linux certainly doesn't result in inability to properly link programs.
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #13 from cqwrteur --- $ as --version GNU assembler (GNU Binutils) 2.35.1 Copyright (C) 2020 Free Software Foundation, Inc. This program is free software; you may redistribute it under the terms of the GNU General Public License version 3 or later. This program has absolutely no warranty. This assembler was configured for a target of `x86_64-w64-mingw32'.
[Bug c/98294] [9/10/11 Regression] ICE in calculate_line_spans, at diagnostic-show-locus.c:1296 since r6-6901-g876217ae71cf0b34
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98294 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- David, do you think you could have a look?
[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856

--- Comment #4 from Richard Biener ---
Slow:

Samples: 4K of event 'cycles:u', Event count (approx.): 4565667242
Overhead  Samples  Command  Shared Object     Symbol
  30.88%     1252  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan
  30.24%     1235  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan
  26.04%     1055  botan    libbotan-2.so.17  [.] Botan::poly_double_n_le

Fast:

Samples: 4K of event 'cycles:u', Event count (approx.): 4427277434
Overhead  Samples  Command  Shared Object     Symbol
  33.59%     1372  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Bo
  33.16%     1356  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Bo
  18.71%      765  botan    libbotan-2.so.17  [.] Botan::poly_double_n_le

also fast on trunk when not vectorizing, so the rev does what it was intended to (more vectorization). I'll look into what we do to poly_double_n_le.
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #14 from cqwrteur --- (In reply to Jakub Jelinek from comment #12) > So you need to figure it out, most people here don't have any access to > Windows. > If mingw64 is using binutils, there were important DWARF 5 related bugs in > in binutils 2.35, but for the known ones we've added workarounds. And much > older binutils don't really cope with DWARF 5 at all, but that on Linux > certainly doesn't result in inability to properly link programs. https://github.com/msys2/MINGW-packages-dev/blob/master/mingw-w64-binutils-git/PKGBUILD I think MSYS2 uses binutils of this.
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #15 from Liu Hao --- Why did you add me in CC without asking for my acknowledgement? If you had asked MSYS2 people, I am pretty sure you would have received more constructive suggestions.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #1 from Jonathan Wakely --- (In reply to cqwrteur from comment #0) > The mailing list requires me to request the feature here. I put it here. > https://www.mail-archive.com/gcc@gcc.gnu.org/msg94104.html > "However, I desperately need that feature since current C++ exceptions > are totally unusable." Ridiculous claims like "totally unusable" aren't going to convince anybody. > http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p0709r4.pdf This is just a proposal, one of many. It hasn't been approved by the committee and is not without critics. It might be suitable to implement in a branch, as a proof of concept, but most G++ developers are already busy working on things that have actually been approved for inclusion in C++.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #2 from cqwrteur --- (In reply to Jonathan Wakely from comment #1) > (In reply to cqwrteur from comment #0) > > The mailing list requires me to request the feature here. I put it here. > > https://www.mail-archive.com/gcc@gcc.gnu.org/msg94104.html > > > "However, I desperately need that feature since current C++ exceptions > > are totally unusable." > > Ridiculous claims like "totally unusable" aren't going to convince anybody. > > > http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p0709r4.pdf > > This is just a proposal, one of many. It hasn't been approved by the > committee and is not without critics. It might be suitable to implement in a > branch, as a proof of concept, but most G++ developers are already busy > working on things that have actually been approved for inclusion in C++. How to start a branch? It can be an experimental branch, right?
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #3 from cqwrteur --- > Ridiculous claims like "totally unusable" aren't going to convince anybody. It is totally unusable. Binary bloat of runtime in bare-metal systems. Relying on stdio.h even stdio.h is not freestanding. C++ EH is thousands of times slower than syscalls on Linux. Any EH thrown is basically a DDOS vulnerability. The worst part is that vector would throw std::length_error/std::bad_alloc/std::bad_array_length which are completely useless tbh.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #4 from cqwrteur --- (In reply to cqwrteur from comment #3) > > Ridiculous claims like "totally unusable" aren't going to convince anybody. > > It is totally unusable. Binary bloat of runtime in bare-metal systems. > Relying on stdio.h even stdio.h is not freestanding. > > C++ EH is thousands of times slower than syscalls on Linux. Any EH thrown is > basically a DDOS vulnerability. > > The worst part is that vector would throw > std::length_error/std::bad_alloc/std::bad_array_length which are completely > useless tbh. BTW. std::terminate() is not thread-safe which is terrible.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #5 from cqwrteur --- (In reply to cqwrteur from comment #4) > (In reply to cqwrteur from comment #3) > > > Ridiculous claims like "totally unusable" aren't going to convince > > > anybody. > > > > It is totally unusable. Binary bloat of runtime in bare-metal systems. > > Relying on stdio.h even stdio.h is not freestanding. > > > > C++ EH is thousands of times slower than syscalls on Linux. Any EH thrown is > > basically a DDOS vulnerability. > > > > The worst part is that vector would throw > > std::length_error/std::bad_alloc/std::bad_array_length which are completely > > useless tbh. > > BTW. std::terminate() is not thread-safe which is terrible. I ban C++ EH every day.
[Bug lto/85574] [8/9 Regression] LTO bootstapped binaries differ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85574 --- Comment #39 from CVS Commits --- The master branch has been updated by Eric Botcazou : https://gcc.gnu.org/g:f7a6d314e7f7eeb6240a4f62511c189c90ef300c commit r11-6951-gf7a6d314e7f7eeb6240a4f62511c189c90ef300c Author: Eric Botcazou Date: Thu Jan 28 11:31:35 2021 +0100 Fix LTO bootstrap on Windows The latest fix introduced a comparison of executables and this cannot directly work on Windows because they are timestamped. Moreover nobody sets $(exeext) at top level, at least on MinGW, so you get weird behavior because some tools add the implicit .exe suffix and others do not. contrib/ PR lto/85574 * compare-lto: Deal with PE-COFF executables specifically.
[Bug lto/85574] [8/9 Regression] LTO bootstapped binaries differ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85574 --- Comment #40 from CVS Commits --- The releases/gcc-10 branch has been updated by Eric Botcazou : https://gcc.gnu.org/g:4be929be0317b2baf1c67b430ad0a2fbaed05152 commit r10-9307-g4be929be0317b2baf1c67b430ad0a2fbaed05152 Author: Eric Botcazou Date: Thu Jan 28 11:31:35 2021 +0100 Fix LTO bootstrap on Windows The latest fix introduced a comparison of executables and this cannot directly work on Windows because they are timestamped. Moreover nobody sets $(exeext) at top level, at least on MinGW, so you get weird behavior because some tools add the implicit .exe suffix and others do not. contrib/ PR lto/85574 * compare-lto: Deal with PE-COFF executables specifically.
[Bug lto/85574] [8/9 Regression] LTO bootstapped binaries differ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85574 --- Comment #41 from CVS Commits --- The releases/gcc-9 branch has been updated by Eric Botcazou : https://gcc.gnu.org/g:faed344ee5f17b9a19961b3b1f8ea0ed10db6f2d commit r9-9208-gfaed344ee5f17b9a19961b3b1f8ea0ed10db6f2d Author: Eric Botcazou Date: Thu Jan 28 11:31:35 2021 +0100 Fix LTO bootstrap on Windows The latest fix introduced a comparison of executables and this cannot directly work on Windows because they are timestamped. Moreover nobody sets $(exeext) at top level, at least on MinGW, so you get weird behavior because some tools add the implicit .exe suffix and others do not. contrib/ PR lto/85574 * compare-lto: Deal with PE-COFF executables specifically.
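
The commit messages above note that Windows executables cannot be compared directly because they are timestamped. The sketch below shows one way such a comparison could be made tolerant of that, under stated assumptions: it skips only the 4-byte TimeDateStamp field of the COFF file header (located 8 bytes past the "PE\0\0" signature, whose offset is stored at 0x3c). It is not the logic of contrib/compare-lto, and `timestamp_offset`, `same_ignoring_timestamp`, and `fake_image` are hypothetical names. Other build-dependent fields in a real image are not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Returns the file offset of the COFF TimeDateStamp field,
// or -1 if the buffer does not look like a PE image.
std::int64_t timestamp_offset(const Bytes& img) {
    if (img.size() < 0x40 || img[0] != 'M' || img[1] != 'Z') {
        return -1;
    }
    // Little-endian 32-bit offset of the PE signature, stored at 0x3c.
    const std::uint64_t pe = std::uint64_t(img[0x3c])
                           | (std::uint64_t(img[0x3d]) << 8)
                           | (std::uint64_t(img[0x3e]) << 16)
                           | (std::uint64_t(img[0x3f]) << 24);
    if (pe + 12 > img.size()) {
        return -1;
    }
    if (img[pe] != 'P' || img[pe + 1] != 'E' || img[pe + 2] != 0 || img[pe + 3] != 0) {
        return -1;
    }
    // Signature (4) + Machine (2) + NumberOfSections (2) precede the stamp.
    return static_cast<std::int64_t>(pe + 8);
}

// Byte-wise comparison that ignores the 4 timestamp bytes when both
// inputs are PE images with the field at the same offset.
bool same_ignoring_timestamp(const Bytes& a, const Bytes& b) {
    if (a.size() != b.size()) {
        return false;
    }
    const std::int64_t ta = timestamp_offset(a);
    const std::int64_t tb = timestamp_offset(b);
    if (ta < 0 || ta != tb) {
        return a == b;  // Not comparable as PE images: plain comparison.
    }
    const std::size_t begin = static_cast<std::size_t>(ta);
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (i >= begin && i < begin + 4) {
            continue;  // Skip TimeDateStamp.
        }
        if (a[i] != b[i]) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: a minimal fake image for demonstration only.
Bytes fake_image(std::uint32_t stamp, std::uint8_t payload) {
    Bytes img(0x60, 0);
    img[0] = 'M';
    img[1] = 'Z';
    img[0x3c] = 0x40;  // PE signature starts at 0x40.
    img[0x40] = 'P';
    img[0x41] = 'E';
    for (int i = 0; i < 4; ++i) {
        img[0x48 + i] = static_cast<std::uint8_t>((stamp >> (8 * i)) & 0xff);
    }
    img[0x5f] = payload;
    return img;
}
```

Two fake images that differ only in `stamp` compare equal under `same_ignoring_timestamp`, while a difference in `payload` is still reported.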
[Bug tree-optimization/98499] [11 Regression] Possibly bad std::string initialization in constructors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98499

Jan Hubicka changed:

           What    |Removed                       |Added
           Assignee|unassigned at gcc dot gnu.org |hubicka at gcc dot gnu.org
                 CC|                              |hubicka at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

--- Comment #9 from Jan Hubicka ---
Thanks for all the detailed analysis and sorry for getting into this late.

> Oh, thank you! Only after many printf() attempts it sunk in that
> `036t.ealias` is using data from seemingly later `043t.modref1` pass. It is
> so confusing!

This is because it is an inter-procedural analysis. We compile in topological order and propagate info from function to callers.

Here I think the problem is:

void Importer::Importer (struct Importer * const this)
{
  struct string * _1;

  :
  *this_3(D) ={v} {CLOBBER};
  *this_3(D).base_path = dir_name (); [return slot optimization]
  return;
}

We get
  parm 0 flags: direct noescape nodirectescape

While dir_name does:

struct string dir_name ()
{
  :
  string::string (_2(D));
  return _2(D);
}

and that gets to

void string::string (struct string * const this)
{
  char[16] * _1;

  :
  *this_3(D) ={v} {CLOBBER};
  _1 = &this_3(D)->_M_local_buf;
  *this_3(D)._M_buf = _1;
  return;
}

which indeed conflicts with noescape.

So the problem here is that return slot optimized variables are behaving kind of like parameters. Since modref does not track EAF flags for them, I think your conservative fix makes sense. It is also relatively easy to track the EAF flags here; I will try to get quick stats on how often this makes a difference (and whether we want to add tracking now or next stage1).

Honza
[Bug c++/98331] [9/10/11 Regression] ICE in haifa_luid_for_non_insn, at haifa-sched.c:7845 since r8-5479-g67a8d7199fe4e474
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98331

Jakub Jelinek changed:

           What    |Removed                     |Added
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek ---
Reduced testcase, only -g -O2 -m32 -march=x86-64 is needed:

void bar (const char *);
unsigned long long x;

void
foo (void)
{
  bar ("foo");
  __atomic_fetch_add (&x, 1, 0);
  __builtin_unreachable ();
}
[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856

--- Comment #5 from Richard Biener ---
Looks like STLF issues. There's a ls_stlf counter, with SLP vectorization disabled I see

  34.39%     1417  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan::BlockCip
  32.27%     1333  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan::BlockCip
   7.31%      306  botan    libbotan-2.so.17  [.] Botan::poly_double_n_le

while with SLP vectorization enabled there's

Samples: 4K of event 'ls_stlf:u', Event count (approx.): 723886942
Overhead  Samples  Command  Shared Object     Symbol
  32.41%     1320  botan    libbotan-2.so.17  [.] Botan::poly_double_n_le
  27.23%     1114  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan::BlockCip
  27.06%     1107  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan::BlockCip

but then the register docs suggest that the unnamed cpu/event=0x24,umask=0x2/u is supposed to be the forwarding fails due to incomplete/misaligned data.

Unvectorized:

Samples: 4K of event 'cpu/event=0x24,umask=0x2/u', Event count (approx.): 1024347253
Overhead  Samples  Command  Shared Object     Symbol
  33.56%     1382  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan::BlockCip
  30.32%     1246  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan::BlockCip
  23.18%      953  botan    libbotan-2.so.17  [.] Botan::poly_double_n_le

vectorized:

Samples: 4K of event 'cpu/event=0x24,umask=0x2/u', Event count (approx.): 489384781
Overhead  Samples  Command  Shared Object     Symbol
  30.17%     1229  botan    libbotan-2.so.17  [.] Botan::poly_double_n_le
  29.40%     1203  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan::BlockCip
  28.09%     1147  botan    libbotan-2.so.17  [.] Botan::Block_Cipher_Fixed_Params<16ul, 16ul, 0ul, 1ul, Botan::BlockCip

but the masking doesn't work as expected since I get hits for either bit on

   4.05 |       vmovdqa %xmm4,0x10(%rsp)
        |     const uint64_t carry = POLY * (W[LIMBS-1] >> 63);
  12.24 |       mov     0x18(%rsp),%rdx
        |     W[0] = (W[0] << 1) ^ carry;
  24.00 |       vmovdqa 0x10(%rsp),%xmm5

which should only happen for bit 2 (data not ready). Of course this code-gen is weird since 0x10(%rsp) is available in %xmm4. Well, changing the above doesn't make a difference. I guess the event hit is just quite delayed - that makes perf quite useless here.

As a general optimization remark we fail to scalarize 'W' in poly_double_le for the larger sizes, but the relevant differences likely appear for the cases we expand the memcpy inline on GIMPLE, specifically

  [local count: 1431655747]:
  _60 = MEM <__int128 unsigned> [(char * {ref-all})in_6(D)];
  _61 = BIT_FIELD_REF <_60, 64, 64>;
  _62 = _61 >> 63;
  carry_63 = _62 * 135;
  _308 = _61 << 1;
  _228 = (long unsigned int) _60;
  _310 = _228 >> 63;
  _311 = _308 ^ _310;
  _71 = _228 << 1;
  _72 = carry_63 ^ _71;
  MEM [(char * {ref-all})out_5(D)] = _72;
  MEM [(char * {ref-all})out_5(D) + 8B] = _311;

this is turned into

  [local count: 1431655747]:
  _60 = MEM <__int128 unsigned> [(char * {ref-all})in_6(D)];
  _114 = VIEW_CONVERT_EXPR(_60);
  vect__71.335_298 = _114 << 1;
  _61 = BIT_FIELD_REF <_60, 64, 64>;
  _62 = _61 >> 63;
  carry_63 = _62 * 135;
  _228 = (long unsigned int) _60;
  _310 = _228 >> 63;
  _147 = {carry_63, _310};
  vect__72.336_173 = _147 ^ vect__71.335_298;
  MEM [(char * {ref-all})out_5(D)] = vect__72.336_173;

after the patch which is

build/include/botan/mem_ops.h:148:15: note: Basic block will be vectorized using SLP
build/include/botan/mem_ops.h:148:15: note: Vectorizing SLP tree:
build/include/botan/mem_ops.h:148:15: note: node 0x275d8e8 (max_nunits=2, refcnt=1)
build/include/botan/mem_ops.h:148:15: note: op template: MEM [(char * {ref-all})out_5(D)] = _72;
build/include/botan/mem_ops.h:148:15: note: stmt 0 MEM [(char * {ref-all})out_5(D)] = _72;
build/include/botan/mem_ops.h:148:15: note: stmt 1 MEM [(char * {ref-all})out_5(D) + 8B] = _311;
build/include/botan/mem_ops.h:148:15: note: children 0x275d960
build/include/botan/mem_ops.h:148:15: note: node 0x275d960 (max_nunits=2, refcnt=1)
build/include/botan/mem_ops.h:148:15: note: op te
[Bug c++/98331] [8/9/10/11 Regression] ICE in haifa_luid_for_non_insn, at haifa-sched.c:7845 since r8-5479-g67a8d7199fe4e474
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98331

Jakub Jelinek changed:

           What    |Removed                       |Added
   Target Milestone|9.4                           |8.5
             Status|NEW                           |ASSIGNED
            Summary|[9/10/11 Regression] ICE in   |[8/9/10/11 Regression] ICE
                   |haifa_luid_for_non_insn, at   |in haifa_luid_for_non_insn,
                   |haifa-sched.c:7845 since      |at haifa-sched.c:7845 since
                   |r8-5479-g67a8d7199fe4e474     |r8-5479-g67a8d7199fe4e474
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek ---
When I slightly tweak the testcase, it started with r8-3191-gc5f597633467c2fc0fa80afa19c99f049fac8466

void bar (const char *);
unsigned long long x;

void
foo (void)
{
  int a = 1;
  bar ("foo");
  int b = 2;
  __atomic_fetch_add (&x, 1, 0);
  int c = 3;
  __builtin_unreachable ();
}
[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856

--- Comment #6 from Richard Biener ---
The following testcase reproduces the assembly:

typedef __UINT64_TYPE__ uint64_t;
void poly_double_le2 (unsigned char *out, const unsigned char *in)
{
  uint64_t W[2];
  __builtin_memcpy (&W, in, 16);
  uint64_t carry = (W[1] >> 63) * 135;
  W[1] = (W[1] << 1) ^ (W[0] >> 63);
  W[0] = (W[0] << 1) ^ carry;
  __builtin_memcpy (out, &W[0], 8);
  __builtin_memcpy (out + 8, &W[1], 8);
}
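
For anyone who wants to sanity-check the arithmetic of the testcase above independently of the code generation, here is a small portable sketch. It restates the same two-limb doubling with std::memcpy in place of the builtins. The names `poly_double_le2_ref` and `double_limbs` are hypothetical and exist only for this example. The helper works on host-order 64-bit limbs, so the check does not depend on byte order.

```cpp
#include <cstdint>
#include <cstring>

// Same arithmetic as the reduced testcase, written with std::memcpy.
void poly_double_le2_ref(unsigned char* out, const unsigned char* in) {
    std::uint64_t W[2];
    std::memcpy(W, in, 16);
    // The top bit of the high limb decides whether the constant 135 is folded in.
    const std::uint64_t carry = (W[1] >> 63) * 135;
    W[1] = (W[1] << 1) ^ (W[0] >> 63);
    W[0] = (W[0] << 1) ^ carry;
    std::memcpy(out, W, 16);
}

// Hypothetical helper: feed two limbs through the function and
// return limb `idx` (0 = low, 1 = high) of the result.
std::uint64_t double_limbs(std::uint64_t lo, std::uint64_t hi, int idx) {
    const std::uint64_t in[2] = {lo, hi};
    unsigned char ibuf[16];
    unsigned char obuf[16];
    std::memcpy(ibuf, in, 16);
    poly_double_le2_ref(obuf, ibuf);
    std::uint64_t res[2];
    std::memcpy(res, obuf, 16);
    return res[idx];
}
```

With only the top bit of the high limb set, the shift clears both limbs and the carry leaves 135 in the low limb, which makes a convenient spot check.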
[Bug rtl-optimization/98863] New: WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 Bug ID: 98863 Summary: WRF with LTO consumes a lot of memory in split2 pass Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: marxin at gcc dot gnu.org Blocks: 26163 Target Milestone: --- Created attachment 50072 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50072&action=edit CPU and memory usage Using -flto and -Ofast -march=znver needs >20GB for a single huge ltrans. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 [Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863

Martin Liška changed:

           What    |Removed                     |Added
     Ever confirmed|0                           |1
                 CC|                            |jamborm at gcc dot gnu.org
   Last reconfirmed|                            |2021-01-28
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Martin Liška ---
We have an ltrans that needs > 1000s to compile and has a memory hog here:

1034s: current pass = gcse2 (306)
1034s: current pass = split2 (307)
{'ltrans': {'memory': 1.8760414123535156, 'cpu': 6.25}}
{'ltrans': {'memory': 3.2761878967285156, 'cpu': 6.25}}
{'ltrans': {'memory': 6.182369232177734, 'cpu': 6.25}}
{'ltrans': {'memory': 9.13412094116211, 'cpu': 6.25}}
{'ltrans': {'memory': 12.164928436279297, 'cpu': 6.25}}
{'ltrans': {'memory': 15.184154510498047, 'cpu': 6.25}}
{'ltrans': {'memory': 18.196331024169922, 'cpu': 6.25}}
{'ltrans': {'memory': 21.150096893310547, 'cpu': 6.25}}
{'ltrans': {'memory': 21.467578887939453, 'cpu': 6.24375}}
{'ltrans': {'memory': 21.467578887939453, 'cpu': 6.25}}
{'ltrans': {'memory': 21.468082427978516, 'cpu': 6.25}}
1045s: current pass = ree (308)
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974

--- Comment #8 from Stam Markianos-Wright ---
I have a little bit more progress here, but I have a question about vect_get_smallest_scalar_type.

If we look at the comment before the function:

>/* Return the smallest scalar part of STMT_INFO.
>   This is used to determine the vectype of the stmt. We generally set the
>   vectype according to the type of the result (lhs). For stmts whose
>   result-type is different than the type of the arguments (e.g., demotion,
>   promotion), vectype will be reset appropriately (later). Note that we have
>   to visit the smallest datatype in this function, because that determines the
>   VF. If the smallest datatype in the loop is present only as the rhs of a
>   promotion operation - we'd miss it.

Would this be "smallest datatype in all cases", or is this more like "the smallest datatype within the same promotion/demotion chain"?

i.e. how should we react if we detect a smallest datatype on the rhs of "float" when everything else in the stmt has been in the integer chain (int or, like in this case, long int)?

>   Such a case, where a variable of this datatype does not appear in the lhs
>   anywhere in the loop, can only occur if it's an invariant: e.g.:
>   'int_x = (int) short_inv', which we'd expect to have been optimized away by
>   invariant motion. However, we cannot rely on invariant motion to always
>   take invariants out of the loop, and so in the case of promotion we also
>   have to check the rhs.
>   LHS_SIZE_UNIT and RHS_SIZE_UNIT contain the sizes of the corresponding
>   types. */

I have found that this is why we end up with a smaller number in:

  TYPE_VECTOR_SUBPARTS (nunits_vectype) == 4

than in:

  TYPE_VECTOR_SUBPARTS (*stmt_vectype_out) == 8

So I'm thinking that either

A) We shouldn't allow this, and add in some check maybe for "GET_MODE_CLASS (x) == GET_MODE_CLASS (y)"

or

B) Some of the logic that generates stmt_vectype_out is deficient and it should also be detecting the existence of a "float" type to get the VF.
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 Stam Markianos-Wright changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |stammark at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #7 from Richard Biener --- OK, and the spill is likely because we expand as

  (insn 7 6 0 (set (reg:TI 84 [ _9 ]) (mem:TI (reg/v/f:DI 93 [ in ]) [0 MEM <__int128 unsigned> [(char * {ref-all})in_8(D)]+0 S16 A8])) -1 (nil))
  (insn 8 7 9 (parallel [ (set (reg:DI 95) (lshiftrt:DI (subreg:DI (reg:TI 84 [ _9 ]) 8) (const_int 63 [0x3f]))) (clobber (reg:CC 17 flags)) ]) "t.c":7:26 -1 (nil))
      ^^^ (subreg:DI (reg:TI 84 [ _9 ]) 8)
  ...
  (insn 12 11 13 (set (reg:V2DI 98 [ vect__5.3 ]) (ashift:V2DI (subreg:V2DI (reg:TI 84 [ _9 ]) 0) (const_int 1 [0x1]))) "t.c":9:16 -1 (nil))
      ^^^ (subreg:V2DI (reg:TI 84 [ _9 ]) 0)

LRA then does

  Choosing alt 4 in insn 7: (0) v (1) vm {*movti_internal}
  Creating newreg=103 from oldreg=84, assigning class ALL_SSE_REGS to r103
  7: r103:TI=[r101:DI]
  REG_DEAD r101:DI
  Inserting insn reload after:
  20: r84:TI=r103:TI
  Choosing alt 0 in insn 8: (0) =rm (1) 0 (2) cJ {*lshrdi3_1}
  Creating newreg=104 from oldreg=95, assigning class GENERAL_REGS to r104
  Inserting insn reload before:
  21: r104:DI=r84:TI#8

but somehow this means the reload 20 is used for the reload 21 instead of avoiding the reload 20 and doing a movhlps / movq combo? (I guess there's no high part xmm extract to gpr)

As said the assembly is a bit weird:

  poly_double_le2:
  .LFB0:
      .cfi_startproc
      vmovdqu (%rsi), %xmm2
      vmovdqa %xmm2, -24(%rsp)
      movq -16(%rsp), %rax          ok, well ...
      vmovdqa -24(%rsp), %xmm3      ???
      shrq $63, %rax
      imulq $135, %rax, %rax
      vmovq %rax, %xmm0
      movq -24(%rsp), %rax          ??? movq %xmm2/3, %rax
      vpsllq $1, %xmm3, %xmm1
      shrq $63, %rax
      vpinsrq $1, %rax, %xmm0, %xmm0
      vpxor %xmm1, %xmm0, %xmm0
      vmovdqu %xmm0, (%rdi)

note even with -march=core-avx2 (and thus inter-unit moves not pessimized) we get

  poly_double_le2:
  .LFB0:
      .cfi_startproc
      vmovdqu (%rsi), %xmm2
      vmovdqa %xmm2, -24(%rsp)
      movq -16(%rsp), %rax
      vmovdqa -24(%rsp), %xmm3
      shrq $63, %rax
      vpsllq $1, %xmm3, %xmm1
      imulq $135, %rax, %rax
      vmovq %rax, %xmm0
      movq -24(%rsp), %rax
      shrq $63, %rax
      vpinsrq $1, %rax, %xmm0, %xmm0
      vpxor %xmm1, %xmm0, %xmm0
      vmovdqu %xmm0, (%rdi)

with

  .L56:
      .cfi_restore_state
      vmovdqu (%rsi), %xmm4
      movq 8(%rsi), %rdx
      shrq $63, %rdx
      imulq $135, %rdx, %rdi
      movq 8(%rsi), %rdx
      vmovq %rdi, %xmm0
      vpsllq $1, %xmm4, %xmm1
      shrq $63, %rdx
      vpinsrq $1, %rdx, %xmm0, %xmm0
      vpxor %xmm1, %xmm0, %xmm0
      vmovdqu %xmm0, (%rax)
      jmp .L53

we arrive at

  AES-128/XTS 672043 key schedule/sec; 0.00 ms/op 4978.00 cycles/op (2 ops in 0.00 ms)
  AES-128/XTS encrypt buffer size 1024 bytes: 843.310 MiB/sec 4.18 cycles/byte (421.66 MiB in 500.00 ms)
  AES-128/XTS decrypt buffer size 1024 bytes: 847.215 MiB/sec 4.16 cycles/byte (421.66 MiB in 497.70 ms)

a variant using movhlps isn't any faster than spilling unfortunately :/ I guess re-materializing from a load is too much to be asked from LRA. On the vectorizer side costing is 52 scalar vs. 40 vector (as usual the vectorized store alone leads to a big boost).
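For readers following the assembly in the comment above, here is a minimal C++ sketch of the computation the listing appears to implement: doubling a 128-bit little-endian value, with the constant 135 from the imulq folded into the low half when the top bit is shifted out. The struct and function names are hypothetical and are not taken from Botan or from the bug report; only the constant and the shift / insert / xor structure come from the listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value held as two 64-bit halves (low word first).
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Sketch of the doubling step suggested by the listing:
//   - both halves are shifted left by one (vpsllq $1)
//   - the bit shifted out of lo is moved into hi (shrq $63 + vpinsrq $1)
//   - the bit shifted out of hi, times 135, is xored into lo (imulq $135)
inline U128 poly_double_sketch(U128 in) {
    const std::uint64_t carry = in.hi >> 63;
    U128 out;
    out.lo = (in.lo << 1) ^ (carry * 135u);
    out.hi = (in.hi << 1) ^ (in.lo >> 63);
    return out;
}

// A few spot checks of the sketch.
inline void check_poly_double_sketch() {
    const U128 wrapped = poly_double_sketch(U128{0, 0x8000000000000000ULL});
    assert(wrapped.lo == 135 && wrapped.hi == 0);
    const U128 carried = poly_double_sketch(U128{0x8000000000000000ULL, 0});
    assert(carried.lo == 0 && carried.hi == 1);
}
```

The sketch reads each half of the input once, which is the data flow the comment would prefer over the spill and reload seen in the generated code.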
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 --- Comment #9 from rsandifo at gcc dot gnu.org --- (In reply to Stam Markianos-Wright from comment #8) > I have a liiitle bit more progress here, but I have a question about > vect_get_smallest_scalar_type. > > If we look at the comment before the function: > > >/* Return the smallest scalar part of STMT_INFO. > > This is used to determine the vectype of the stmt. We generally set the > > vectype according to the type of the result (lhs). For stmts whose > > result-type is different than the type of the arguments (e.g., demotion, > > promotion), vectype will be reset appropriately (later). Note that we > > have > > to visit the smallest datatype in this function, because that determines > > the > > VF. If the smallest datatype in the loop is present only as the rhs of a > > promotion operation - we'd miss it. > > Would this be "smallest datatype in all cases", or is this more like "the > smallest > datatype within the same promotion/demotion chain"? > > i.e. how should we react if we detect a smallest datatype on the rhs of > "float" > when everything else in the stmt has been in the integer chain (int or, > like in this case, long int)? Not sure if this is really answering the question (let me know if it's not), but: for an lhs of int and an rhs of float, the choice doesn't really matter, since they're the same size. But for an lhs of long int and an rhs of float, the smallest vectype has to be taken from the float.
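To make the size argument in the reply above concrete, here is a small sketch of the lane arithmetic. It assumes a fixed 128-bit vector (the bug itself concerns SVE, where the length is variable) and an LP64 target where long int is 64 bits and int and float are 32 bits. The helper name and constants are hypothetical and are not GCC APIs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: how many elements of element_bits fit in a vector
// of vector_bits.
constexpr std::size_t lanes(std::size_t vector_bits, std::size_t element_bits) {
    return vector_bits / element_bits;
}

constexpr std::size_t kVectorBits = 128;  // assumption: fixed-width vectors
constexpr std::size_t kLongBits = 64;     // assumption: LP64 long int
constexpr std::size_t kIntBits = 32;      // assumption: 32-bit int
constexpr std::size_t kFloatBits = 32;    // assumption: 32-bit float

// int vs float: same size, so either choice gives the same lane count.
static_assert(lanes(kVectorBits, kIntBits) == lanes(kVectorBits, kFloatBits),
              "int and float give the same number of lanes");

// long int vs float: float is smaller, so it yields more lanes per vector.
static_assert(lanes(kVectorBits, kLongBits) == 2, "two long ints per vector");
static_assert(lanes(kVectorBits, kFloatBits) == 4, "four floats per vector");

inline void check_lanes() {
    assert(lanes(kVectorBits, kFloatBits) > lanes(kVectorBits, kLongBits));
}
```

Under these assumptions the float operand is the one that produces the larger lane count, which is why it has to be taken into account when the lhs is long int.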
[Bug c++/98770] [modules] including certain stdlib headers in the global module fragment of different modules causes conflicting global module declarations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98770 --- Comment #1 from CVS Commits --- The master branch has been updated by Nathan Sidwell : https://gcc.gnu.org/g:af66f4f1b06f5e0c099dfced2fcf7b1b23fa53e7 commit r11-6954-gaf66f4f1b06f5e0c099dfced2fcf7b1b23fa53e7 Author: Nathan Sidwell Date: Thu Jan 28 04:48:33 2021 -0800 c++: header unit template alias merging [PR 98770] Typedefs are streamed by streaming the underlying type, and then recreating the typedef. But this breaks checking a duplicate is the same as the original when it is a template alias -- we end up checking a template alias (eg __void_t) against the underlying type (void). And those are not the same template alias. This stops pretending that the underlying type is the typedef for that checking and tells is_matching_decl 'you have a typedef', so it knows what to do. (We do not want to recreate the typedef of the duplicate, because that whole set of nodes is going to go away.) PR c++/98770 gcc/cp/ * module.cc (trees_out::decl_value): Swap is_typedef & TYPE_NAME check order. (trees_in::decl_value): Do typedef frobbing only when installing a new typedef, adjust is_matching_decl call. Swap is_typedef & TYPE_NAME check. (trees_in::is_matching_decl): Add is_typedef parm. Adjust variable names and deal with typedef checking. gcc/testsuite/ * g++.dg/modules/pr98770_a.C: New. * g++.dg/modules/pr98770_b.C: New.
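The distinction the commit message draws can be seen at the source level. In this sketch my_void_t is a hypothetical stand-in for the library's __void_t, and the detection trait is only there to show a typical use. Every specialization of the alias names the type void, yet the alias template is a declaration of its own, separate from that underlying type. This illustrates the language-level situation only and says nothing about how module.cc streams it.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for __void_t: an alias template whose
// specializations all denote void.
template <typename...>
using my_void_t = void;

// The type named by a specialization is void ...
static_assert(std::is_same<my_void_t<int, long>, void>::value,
              "the alias names void");

// ... but the alias template itself is what drives SFINAE here.
template <typename T, typename = void>
struct has_value_type : std::false_type {};

template <typename T>
struct has_value_type<T, my_void_t<typename T::value_type>> : std::true_type {};

struct with_member { using value_type = int; };
struct without_member {};

inline void check_detection() {
    assert(has_value_type<with_member>::value);
    assert(!has_value_type<without_member>::value);
}
```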
[Bug c++/98770] [modules] including certain stdlib headers in the global module fragment of different modules causes conflicting global module declarations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98770 Nathan Sidwell changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #2 from Nathan Sidwell --- af66f4f1b06 2021-01-28 | c++: header unit template alias merging [PR 98770]
[Bug c++/98761] [modules] use of a module causes SIGSEGV at runtime
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98761 Nathan Sidwell changed: What|Removed |Added Last reconfirmed||2021-01-28 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED --- Comment #2 from Nathan Sidwell --- well this is exciting, I think the closest I got to code generation issues were with incorrect vtables on occasion.
[Bug target/98862] Complex reduction support in offload region
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98862 Jakub Jelinek changed: What|Removed |Added CC||burnus at gcc dot gnu.org, ||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- libstdc++-v3 isn't supported ATM on either nvptx* or amdgcn* offloading, so if one needs anything from libstdc++, it will not work. As for the 16 byte atomics, I thought this was meant to be solved through -latomic, but I might misremember.
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 --- Comment #10 from Richard Biener --- Still reproduces on trunk with -fdisable-tree-fre4. So the issue is that for the case of 'long int' and 'float' get_vectype_for_scalar_type when passed 8 as group_size returns V8DI and V4SF - where eventually V16SF does not exist. The vinfos vector mode is E_VNx4SFmode. We're rejecting that because it's 11122 && maybe_ge (TYPE_VECTOR_SUBPARTS (vectype), group_size)) and iterating with get_related_vectype_for_scalar_type arriving at v4sf. OTOH for long int we start with vector([2,2]) long int and iterate to v8di. I guess the assert is really misplaced and it should instead try to find a proper related type based on the nunits vectype? Or give up when we arrive at such incompatible choices for input/output vector types.
[Bug c++/98864] New: Warning for unnecessary final keyword
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98864 Bug ID: 98864 Summary: Warning for unnecessary final keyword Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: drepper.fsp+rhbz at gmail dot com Target Milestone: --- Compile the following code: struct foo { virtual void f(); }; struct bar final : foo { void f() final override; }; It is correct and should compile but the function bar::f is annotated with 'final' even though the entire class is also annotated with 'final'. This adds nothing and might be an indication of misunderstanding or leftovers from previous versions of the code. Perhaps a warning can be added to point out the issue.
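A compilable variant of the reported code may help show why the per-function 'final' carries no extra meaning. This sketch adapts the report's types (f returns int here so the behaviour can be observed, and baz is a hypothetical third type added for comparison). Since nothing can derive from a final class, nothing can override its virtual functions either.

```cpp
#include <cassert>
#include <type_traits>

struct foo {
    virtual ~foo() = default;
    virtual int f() { return 1; }
};

// As in the report: the class is final and f is marked final as well.
struct bar final : foo {
    int f() final override { return 2; }
};

// Hypothetical comparison type: the class is final, f is not marked final.
struct baz final : foo {
    int f() override { return 3; }
};

static_assert(std::is_final<bar>::value, "bar cannot be derived from");
static_assert(std::is_final<baz>::value, "baz cannot be derived from");

// Virtual dispatch behaves the same way for both spellings.
inline void check_dispatch() {
    bar b;
    baz z;
    foo& rb = b;
    foo& rz = z;
    assert(rb.f() == 2);
    assert(rz.f() == 3);
}
```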
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #2 from Richard Biener --- Huh. I guess you need to trace that with detailed mem stats; split itself should really be OK, as it should be linear in the number of (split) insns.
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #3 from Martin Liška --- It's 521.wrf_r from SPEC 2017.
[Bug debug/98331] [8/9/10/11 Regression] ICE in haifa_luid_for_non_insn, at haifa-sched.c:7845 since r8-5479-g67a8d7199fe4e474
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98331 --- Comment #4 from Jakub Jelinek --- Created attachment 50073 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50073&action=edit gcc11-pr98331.patch Untested fix. If there are debug insns in between a flow control insn and barrier after it and that barrier is later followed by some normal insn other than label (i.e. dead insn), without -g we'd split after the barrier, which puts the barrier in between bbs. But with -g, the code would incorrectly split before the first debug insn following the control flow, which results in a barrier inside of a bb rather than in between and also different behavior from -g0. If there are debug insns after the barrier, we should obviously split before the first of those debug insns (i.e. after the barrier) like we'd do with -g0.
[Bug middle-end/98865] New: Missed transform of (a >> 63) * b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865 Bug ID: 98865 Summary: Missed transform of (a >> 63) * b Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: ---

  unsigned long foo (unsigned long a, unsigned long b)
  {
    return (a >> 63) * b;
  }

generates

  foo:
  .LFB0:
      .cfi_startproc
      shrq $63, %rdi
      movq %rdi, %rax
      imulq %rsi, %rax
      ret

but we can do (like llvm):

  foo: # @foo
      .cfi_startproc
  # %bb.0:
      movq %rdi, %rax
      sarq $63, %rax
      andq %rsi, %rax
      retq
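The equivalence behind the requested transform can be checked with a short sketch. The function names are hypothetical. To stay within well-defined C++ on any compiler, the mask is built here with unsigned negation rather than with the signed arithmetic shift that the llvm output uses (sarq $63); both produce all ones when the top bit of a is set and zero otherwise. The sketch assumes a 64-bit operand, as in the report.

```cpp
#include <cassert>
#include <cstdint>

// Multiply form, as written in the report.
inline std::uint64_t mul_form(std::uint64_t a, std::uint64_t b) {
    return (a >> 63) * b;
}

// Mask form: (a >> 63) is 0 or 1, so 0 - (a >> 63) is 0 or all ones.
inline std::uint64_t and_form(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t mask = std::uint64_t{0} - (a >> 63);
    return mask & b;
}

// Spot check over a few boundary values.
inline void check_forms_agree() {
    const std::uint64_t vals[] = {0u, 1u, 135u, 0x7fffffffffffffffULL,
                                  0x8000000000000000ULL, 0xffffffffffffffffULL};
    for (std::uint64_t a : vals) {
        for (std::uint64_t b : vals) {
            assert(mul_form(a, b) == and_form(a, b));
        }
    }
}
```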
[Bug middle-end/98865] Missed transform of (a >> 63) * b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Target||x86_64-*-* --- Comment #1 from Richard Biener --- Happens in Botan AES-128/XTS (seen in PR98856). Probably something for RTL expansion or even match.pd and not target specific. Quite a bit faster for wider-than-word_mode arithmetic (only the upper part needs shifting and can be shared for the bitwise and) - but that's then really for RTL expansion.
[Bug tree-optimization/98866] New: [11 Regresion] Compile time hog in VRP since r11-3685-gfcae5121154d1c33
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98866 Bug ID: 98866 Summary: [11 Regresion] Compile time hog in VRP since r11-3685-gfcae5121154d1c33 Product: gcc Version: 11.0 Status: UNCONFIRMED Keywords: compile-time-hog Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: marxin at gcc dot gnu.org CC: aldyh at gcc dot gnu.org, amacleod at redhat dot com Target Milestone: --- Created attachment 50074 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50074&action=edit test-case Since the revision, g++ -Ofast -fno-checking pr12392.cpp -c is slower by about 50%. Note that the source file comes from a different compile time hog PR.
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #4 from Martin Liška --- Created attachment 50075 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50075&action=edit time and memory report
[Bug target/98862] Complex reduction support in offload region
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98862 --- Comment #2 from Tobias Burnus --- (In reply to Jakub Jelinek from comment #1) > libstdc++-v3 isn't supported ATM on either nvptx* or amdgcn* offloading, so > if one needs anything from libstdc++, it will not work. I can confirm that it does not work with '-O0', showing that the symbol _ZNSt7complexIfEC1Eff alias std::complex::complex(float, float) is missing. But: > As for the 16 byte atomics, I thought this was meant to be solved through > -latomic, but I might misremember. Yes, $ g++ -fopenmp -O2 complex_reduction.cpp -foffload=-latomic works – both compiling and running (on nvptx). Note the added '-foffload=-latomic' (and -O2). See also: https://gcc.gnu.org/wiki/Offloading#Compilation_options
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #5 from Richard Biener --- So GC memory according to -ftime-report isn't so bad. tail of sorted (after time):

  TOTAL :  25.40  0.32  25.75   244M
  TOTAL :  26.66  0.22  26.90   130M
  TOTAL :  67.58  1.49  69.11   834M
  TOTAL :  98.32  2.98 101.36  1342M
  TOTAL : 671.02  9.38 680.77  2576M

the outlier is ltrans34 for me which is also the biggest unit by far.
[Bug c/98294] [9/10/11 Regression] ICE in calculate_line_spans, at diagnostic-show-locus.c:1296 since r6-6901-g876217ae71cf0b34
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98294 David Malcolm changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |dmalcolm at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #6 from Richard Biener ---

  df-problems.c:228 (df_rd_alloc)              702M:  6.9%   705M   18M: 0.7% 0 0 heap
  df-problems.c:509 (df_rd_transfer_function) 3709M: 36.6%  3709M  188M: 7.1% 0 0 heap
  df-problems.c:227 (df_rd_alloc)             4417M: 43.6%  4417M  111M: 4.2% 0 0 heap

that's not 20Gb but quite a bit. For GC memory complete unrolling is the biggest offender (but "only" 500MB).
[Bug fortran/83927] Type-Bound Procedure on element of Derived Type PARAMETER Array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83927 Thomas Koenig changed: What|Removed |Added Status|WAITING |ASSIGNED CC||tkoenig at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |tkoenig at gcc dot gnu.org --- Comment #4 from Thomas Koenig --- Seems fixed... I'll try to commit the test case this evening.
[Bug middle-end/98865] Missed transform of (a >> 63) * b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98865 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek ---

  PR middle-end/98865
  * match.pd (a * (b >> (prec-1)) to ((signed)b >> (prec-1)) & a): New simplification.

  --- gcc/match.pd.jj 2021-01-22 11:50:09.882909120 +0100
  +++ gcc/match.pd 2021-01-28 15:20:20.536238614 +0100
  @@ -793,6 +793,16 @@ (define_operator_list COND_TERNARY
   && tree_nop_conversion_p (type, TREE_TYPE (@1)))
   (lshift @0 @2)))
  
  +/* Fold (a * (b >> (prec-1))) with logical shift into
  + ((signed)b >> (prec-1)) & a. */
  +(simplify
  + (mult:c @0 (nop_convert? (rshift @1 INTEGER_CST@2)))
  + (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
  + && TYPE_UNSIGNED (TREE_TYPE (@1))
  + && wi::to_widest (@2) + 1 == TYPE_PRECISION (TREE_TYPE (@1)))
  + (with { tree stype = signed_type_for (TREE_TYPE (@1)); }
  +(bit_and (convert:type (rshift (convert:stype @1) @2)) @0
  +
   /* Fold (1 << (C - x)) where C = precision(type) - 1
   into ((1 << C) >> x). */
   (simplify

(completely untested) does that. It doesn't handle vector types; whether that is a good idea or not depends on how we deal with the issue of match.pd simplifications after the last veclower pass. And, given:

  unsigned long long foo (unsigned long long a, unsigned long long b) { return (a >> 63) * b; }
  long long bar (long long a, long long b) { return -(a >> 63) * b; }
  long long baz (long long a, long long b) { long long c = a >> 63; long long d = -c; return d * b; }

we optimize with it foo and bar but not baz; apparently the -(a >> 63) arithmetic to (a >> 63) logical shift is done only in GENERIC folding and not later.
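The folding mentioned at the end of the comment, from a negated arithmetic shift to a logical shift, rests on a simple identity that can be sketched as follows. The function names are hypothetical. The sketch assumes a 64-bit long long and that >> on a negative signed value is an arithmetic shift, which GCC documents and which C++20 requires; before C++20 this is implementation-defined.

```cpp
#include <cassert>
#include <climits>

// Signed form used in bar and baz: a >> 63 is 0 or -1, so the negation is 0 or 1.
inline long long neg_of_arith_shift(long long a) {
    return -(a >> 63);
}

// Logical-shift form on the unsigned representation: also 0 or 1.
inline long long logical_shift(long long a) {
    return static_cast<long long>(static_cast<unsigned long long>(a) >> 63);
}

// Spot check over boundary values.
inline void check_shift_identity() {
    const long long vals[] = {0, 1, -1, 7, -7, LLONG_MAX, LLONG_MIN};
    for (long long a : vals) {
        assert(neg_of_arith_shift(a) == logical_shift(a));
    }
}
```

With that identity, bar and baz both reduce to the shape of foo, which is the pattern the proposed simplification matches.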
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 cqwrteur changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #16 from cqwrteur --- I updated to the latest binutils and it still fails to build. ... yes checking for C compiler default output file name... a.exe checking for suffix of executables... .exe configure: error: in `/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libgomp': configure: error: cannot run C compiled programs. If you meant to cross compile, use `--host'. See `config.log' for more details configure: error: in `/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libatomic': configure: error: cannot run C compiled programs. If you meant to cross compile, use `--host'. See `config.log' for more details .exe checking whether we are cross compiling... checking whether we are cross compiling... make[1]: *** [Makefile:15606: configure-target-libgomp] Error 1 make[1]: *** Waiting for unfinished jobs make[1]: *** [Makefile:16174: configure-target-libatomic] Error 1 configure: error: in `/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libssp': configure: error: cannot run C compiled programs. If you meant to cross compile, use `--host'. See `config.log' for more details configure: error: in `/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libquadmath': configure: error: cannot run C compiled programs. If you meant to cross compile, use `--host'. See `config.log' for more details make[1]: *** [Makefile:13329: configure-target-libssp] Error 1 make[1]: *** [Makefile:14375: configure-target-libquadmath] Error 1 make[1]: Leaving directory '/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32' make: *** [Makefile:973: all] Error 2 ==> ERROR: A failure occurred in build(). Aborting...
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 Jakub Jelinek changed: What|Removed |Added CC|jakub at gcc dot gnu.org | --- Comment #17 from Jakub Jelinek --- Why do you keep CCing me on this? I have nothing to do with Windows.
[Bug target/98867] New: Failure to use SRI instruction for shift-right-and-insert vector operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98867 Bug ID: 98867 Summary: Failure to use SRI instruction for shift-right-and-insert vector operations Product: gcc Version: unknown Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ktkachov at gcc dot gnu.org Target Milestone: --- Target: aarch64

  #define N 1024
  unsigned char in[N];
  unsigned char out[N];
  #define SHIFT 6
  void foo (void)
  {
    for (int i = 0; i < N; i++)
      {
        unsigned char mask = 255u >> SHIFT;
        unsigned char shifted = in[i] >> SHIFT;
        out[i] = (out[i] & ~mask) | shifted;
      }
  }

at -O3 generates:

  foo:
      adrp x1, .LANCHOR0
      add x1, x1, :lo12:.LANCHOR0
      movi v2.16b, 0xfffc
      add x2, x1, 1024
      mov x0, 0
  .L2:
      ldr q0, [x1, x0]
      ldr q1, [x0, x2]
      and v0.16b, v0.16b, v2.16b
      ushr v1.16b, v1.16b, 6
      orr v0.16b, v0.16b, v1.16b
      str q0, [x1, x0]
      add x0, x0, 16
      cmp x0, 1024
      bne .L2
      ret

whereas it could use the SRI instruction as clang does (unrolled 2x):

  foo: // @foo
      adrp x9, in
      adrp x10, out
      mov x8, xzr
      add x9, x9, :lo12:in
      add x10, x10, :lo12:out
  .LBB0_1: // %vector.body
      add x11, x9, x8
      add x12, x10, x8
      ldp q0, q1, [x11]
      ldp q2, q3, [x12]
      add x8, x8, #32 // =32
      cmp x8, #1024 // =1024
      sri v2.16b, v0.16b, #6
      sri v3.16b, v1.16b, #6
      stp q2, q3, [x12]
      b.ne .LBB0_1

This may be a bit too complex for combine to match though
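To see why a single SRI can replace the and / ushr / orr sequence, it may help to compare the loop body with a scalar model of one 8-bit lane of the instruction. The function names are hypothetical, and sri_lane_model is a hand-written model of shift-right-and-insert (the shifted source is inserted while the top 'shift' bits of the destination are kept), not an intrinsic.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kShift = 6;  // SHIFT from the report

// The loop body from the report, applied to one element.
inline std::uint8_t loop_body(std::uint8_t out, std::uint8_t in) {
    const std::uint8_t mask = 255u >> kShift;
    const std::uint8_t shifted = in >> kShift;
    return static_cast<std::uint8_t>((out & ~mask) | shifted);
}

// Scalar model of one 8-bit SRI lane (valid for shift in 1..8): the low
// (8 - shift) bits come from src >> shift, the top shift bits stay from dst.
inline std::uint8_t sri_lane_model(std::uint8_t dst, std::uint8_t src,
                                   unsigned shift) {
    const std::uint8_t keep = static_cast<std::uint8_t>(~(0xFFu >> shift));
    return static_cast<std::uint8_t>((dst & keep) | (src >> shift));
}

// Exhaustive check over all pairs of byte values.
inline void check_sri_matches_loop() {
    for (unsigned d = 0; d < 256; ++d) {
        for (unsigned s = 0; s < 256; ++s) {
            const std::uint8_t dst = static_cast<std::uint8_t>(d);
            const std::uint8_t src = static_cast<std::uint8_t>(s);
            assert(loop_body(dst, src) == sri_lane_model(dst, src, kShift));
        }
    }
}
```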
[Bug bootstrap/98318] [11 Regression] libcody breaks DragonFly bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98318 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #13 from Jakub Jelinek --- So anything left to do here? This seemed to be marked as fixed, then reopened for the testsuite, but the testsuite has been removed.
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #7 from Martin Liška --- I set up ulimit -v 16GB and I attached gdb. Allocation failure happens here: (gdb) p *current_pass $1 = { = { type = RTL_PASS, name = 0x19f43ae "ree", (gdb) bt #0 xmalloc_failed (size=size@entry=65536) at /home/marxin/Programming/gcc/libiberty/xmalloc.c:119 #1 0x019ac968 in xmalloc (size=65536) at /home/marxin/Programming/gcc/libiberty/xmalloc.c:149 #2 0x019a3158 in call_chunkfun (size=, h=0x51b5240) at /home/marxin/Programming/gcc/libiberty/obstack.c:94 #3 _obstack_newchunk (h=h@entry=0x51b5240, length=length@entry=40) at /home/marxin/Programming/gcc/libiberty/obstack.c:206 #4 0x00853cde in bitmap_element_allocate (head=0x5908d020, head=0x5908d020) at /home/marxin/Programming/gcc/gcc/bitmap.c:123 #5 bitmap_list_insert_element_after (head=0x5908d020, elt=0x3d861a4f8, indx=3548, node=) at /home/marxin/Programming/gcc/gcc/bitmap.c:312 #6 0x00855b34 in bitmap_set_range (count=, start=0, head=0x5908d020) at /home/marxin/Programming/gcc/gcc/bitmap.c:1624 #7 bitmap_set_range (head=0x5908d020, start=0, count=) at /home/marxin/Programming/gcc/gcc/bitmap.c:1577 #8 0x0090926f in df_mir_alloc (all_blocks=) at /home/marxin/Programming/gcc/gcc/df-problems.c:1921 #9 0x00901ef6 in df_analyze_problem (dflow=0x3ce62b0, blocks_to_consider=0x445a888, postorder=0x4a821208, n_blocks=56609) at /home/marxin/Programming/gcc/gcc/df-core.c:1162 #10 0x00902e42 in df_analyze_1 () at /home/marxin/Programming/gcc/gcc/df-core.c:1228 #11 0x017bb663 in find_and_remove_re () at /home/marxin/Programming/gcc/gcc/ree.c:1290 #12 rest_of_handle_ree () at /home/marxin/Programming/gcc/gcc/ree.c:1384 #13 (anonymous namespace)::pass_ree::execute (this=) at /home/marxin/Programming/gcc/gcc/ree.c:1412 #14 0x00c62a38 in execute_one_pass (pass=0x24e5d90) at /home/marxin/Programming/gcc/gcc/passes.c:2567 #15 0x00c63413 in execute_pass_list_1 (pass=0x24e5d90) at /home/marxin/Programming/gcc/gcc/passes.c:2656 #16 0x00c63425 in 
execute_pass_list_1 (pass=0x24e5c10) at /home/marxin/Programming/gcc/gcc/passes.c:2657 #17 0x00c63425 in execute_pass_list_1 (pass=0x24e4810) at /home/marxin/Programming/gcc/gcc/passes.c:2657 #18 0x00c63456 in execute_pass_list (fn=0x76b457e8, pass=) at /home/marxin/Programming/gcc/gcc/passes.c:2667 #19 0x008e2235 in cgraph_node::expand (this=0x771c8990) at /home/marxin/Programming/gcc/gcc/context.h:48 #20 0x008e389f in expand_all_functions () at /home/marxin/Programming/gcc/gcc/cgraphunit.c:1995 #21 symbol_table::compile (this=) at /home/marxin/Programming/gcc/gcc/cgraphunit.c:2359 #22 symbol_table::compile (this=) at /home/marxin/Programming/gcc/gcc/cgraphunit.c:2270 #23 0x0082d575 in lto_main () at /home/marxin/Programming/gcc/gcc/lto/lto.c:653 #24 0x00d3b83e in compile_file () at /home/marxin/Programming/gcc/gcc/toplev.c:457 #25 0x00801a80 in do_compile () at /home/marxin/Programming/gcc/gcc/toplev.c:2193 #26 toplev::main (this=this@entry=0x7fffdd4e, argc=, argc@entry=20, argv=, argv@entry=0x7fffde58) at /home/marxin/Programming/gcc/gcc/toplev.c:2332 #27 0x00805c35 in main (argc=20, argv=0x7fffde58) at /home/marxin/Programming/gcc/gcc/main.c:39 Does it help?
[Bug inline-asm/98847] Miscompilation with c++17, templates, and register keyword
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98847 --- Comment #5 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6bb207b468da36d9d99c63409dc4098514759c90 commit r11-6958-g6bb207b468da36d9d99c63409dc4098514759c90 Author: Jakub Jelinek Date: Thu Jan 28 16:13:11 2021 +0100 c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847] As the testcase shows, for vars appearing in templates, we don't attach the asm spec string to the pattern decls, nor pass it back to cp_finish_decl during instantiation. The following patch does that. 2021-01-28 Jakub Jelinek PR c++/33661 PR c++/98847 * decl.c (cp_finish_decl): For register vars with asmspec in templates call set_user_assembler_name and set DECL_HARD_REGISTER. * pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars, pass asmspec_tree to cp_finish_decl. * g++.target/i386/pr98847.C: New test.
[Bug c++/33661] template methods forget explicit local register asm vars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33661 --- Comment #18 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:6bb207b468da36d9d99c63409dc4098514759c90 commit r11-6958-g6bb207b468da36d9d99c63409dc4098514759c90 Author: Jakub Jelinek Date: Thu Jan 28 16:13:11 2021 +0100 c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847] As the testcase shows, for vars appearing in templates, we don't attach the asm spec string to the pattern decls, nor pass it back to cp_finish_decl during instantiation. The following patch does that. 2021-01-28 Jakub Jelinek PR c++/33661 PR c++/98847 * decl.c (cp_finish_decl): For register vars with asmspec in templates call set_user_assembler_name and set DECL_HARD_REGISTER. * pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars, pass asmspec_tree to cp_finish_decl. * g++.target/i386/pr98847.C: New test.
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #8 from Richard Biener --- So -fno-ree doesn't help (just figured it might be given the DF numbers). But confirmed:

> /usr/bin/time
> /home/rguenther/install/gcc-11.0/usr/local/bin/../lib64/gcc/../../lib/gcc/x86_64-pc-linux-gnu/11.0.0/lto1
> -quiet -dumpbase ./wrf_r.ltrans34.ltrans -march=znver2 -g0 -Ofast -Ofast
> -version -fno-openacc -fno-pie -fcf-protection=none -fno-openmp -ftime-report
> -fltrans @./wrf_r.ltrans34.ltrans.args.0 -o ./wrf_r.ltrans34.ltrans.s

GNU GIMPLE (GCC) version 11.0.0 20210128 (experimental) (x86_64-pc-linux-gnu) compiled by GNU C version 11.0.0 20210128 (experimental), GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.18-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU GIMPLE (GCC) version 11.0.0 20210128 (experimental) (x86_64-pc-linux-gnu) compiled by GNU C version 11.0.0 20210128 (experimental), GMP version 6.1.2, MPFR version 4.0.1, MPC version 1.1.0, isl version isl-0.18-GMP
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072

  df reaching defs         :  26.41 (  4%)  1.49 ( 16%)  28.07 (  4%)     0 ( 0%)
  df live regs             :  81.46 ( 12%)  0.12 (  1%)  81.73 ( 12%)     0 ( 0%)
  df live&initialized regs :  83.78 ( 13%)  0.06 (  1%)  83.77 ( 13%)     0 ( 0%)
  ...
  PRE                      : 214.60 ( 33%)  1.35 ( 15%) 216.04 ( 32%) 2619k ( 0%)
  ...
  LRA create live ranges   :  30.87 (  5%)  0.00 (  0%)  30.85 (  5%) 4168k ( 0%)
  ...
  TOTAL                    : 657.16         9.30        666.82        2576M

657.16user 9.35system 11:06.87elapsed 99%CPU (0avgtext+0avgdata 25834184maxresident)k 0inputs+21088outputs (0major+11450874minor)pagefaults 0swaps

but there isn't really anything in the mem-report that explains the 25GB max-rss. Some int overflows might result in spectacular (but unused) mallocs but then those shouldn't show up in resident size. Need to rebuild GCC with dwarf4 to be able to leak-check with valgrind (will need the whole night I guess ;))
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #9 from Richard Biener --- Ah, I guess -fno-ree on the lto1 command-line gets ignored :/ So yeah, there are known issues with REE (PR80930 and PR98144).
[Bug c++/98570] [8/9/10/11 Regression] ICE: canonical types differ for identical types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98570 Patrick Palka changed: What|Removed |Added CC||ppalka at gcc dot gnu.org --- Comment #4 from Patrick Palka --- FWIW, here's a slightly more reduced ICE-on-valid testcase: template struct h {}; struct b { static constexpr int c = true; }; template using i = b; template h::c> m(); template using k = bool; template h...>::c> m(); Bisection points to r7-7375 as the commit that introduced the ICE for this particular testcase.
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 --- Comment #10 from Richard Biener --- But I wonder why the mem-report doesn't show these? They don't sum up to 20GB for me.
[Bug target/97827] bootstrap error building the amdgcn-amdhsa offload compiler with LLVM 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827 Tobias Burnus changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED |RESOLVED --- Comment #16 from Tobias Burnus --- (In reply to Tobias Burnus from comment #15) > I unfortunately missed in my LLVM patch that '.rodata' implies flags and > messed up the check. Should be fixed by: https://reviews.llvm.org/D94072 Now merged to LLVM 12/trunk (see ↑ or next link); for LLVM 11 backporting, see: https://bugs.llvm.org/show_bug.cgi?id=48922 For the related Debian bug, see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=975692 Close as FIXED – hopefully, now for real.
[Bug tree-optimization/98868] New: [8/9/10/11 Regression] polyhedron rnflow.f90 regression since r8-2555-g344be1fd47d7d64e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98868 Bug ID: 98868 Summary: [8/9/10/11 Regression] polyhedron rnflow.f90 regression since r8-2555-g344be1fd47d7d64e Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: marxin at gcc dot gnu.org Target Milestone: --- Since the revision the benchmark is much slower: $ gfortran rnflow.f90 -Ofast -march=znver1 && time ./a.out >/dev/null 0m7.690s -> 0m13.121s One can see it here: https://lnt.opensuse.org/db_default/v4/CPP/graph?plot.0=194.791.0&plot.1=188.791.0&plot.2=202.791.0&plot.3=154.791.0&plot.4=245.791.0&plot.5=171.791.0&;
[Bug inline-asm/98847] Miscompilation with c++17, templates, and register keyword
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98847 Andreas Krebbel changed: What|Removed |Added CC||krebbel at gcc dot gnu.org --- Comment #6 from Andreas Krebbel --- Thanks for fixing this. When I had a look at it in 2015 I found that template instantiation explicitly zeroes out the asm name. Solution for me was to prevent that for hard reg decls. Not sure what approach is preferable here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33661#c13
[Bug rtl-optimization/98863] WRF with LTO consumes a lot of memory in split2 pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98863 Richard Biener changed: What|Removed |Added Depends on||80930 --- Comment #11 from Richard Biener --- Confirmed, -fno-ree improves it. Peak is way down to 8GB which is still too much of course. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80930 [Bug 80930] REE pass causes high memory usage via df_mir_alloc() with ASAN+UBSAN turned on
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #18 from cqwrteur --- I can confirm it is the commit that breaks the bootstrap on windows 10 https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/doc/invoke.texi;h=c290b6f4938f45e8d491d76d4940060e7b59be36;hp=3f30230b0c244245e23e29ea409277528cfc390c;hb=3804e937b0e252a7e42632fe6d9f898f1851a49c;hpb=59cf67d1cf77e9594e58fd2848ac94d505546546
[Bug c++/98864] Warning for unnecessary final keyword
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98864 Marek Polacek changed: What|Removed |Added Status|UNCONFIRMED |NEW CC||mpolacek at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2021-01-28 --- Comment #1 from Marek Polacek --- Confirmed.
[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860 --- Comment #19 from cqwrteur --- After reverting the change, the compilation succeeds. However, this is a temporary solution. https://bitbucket.org/ejsvifq_mabmip/mingw-gcc-mcf-gthread/src/master/9000-Revert-testsuite-Skip-DWARF-5-testcases-on-AIX.patch
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org --- Comment #6 from Marek Polacek --- I don't think this is a useful bug report. If/when the proposal gets accepted, it will be useful to track who's working on it, but until then I see no point.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #7 from cqwrteur --- (In reply to Marek Polacek from comment #6) > I don't think this is a useful bug report. If/when the proposal gets > accepted, it will be useful to track who's working on it, but until then I > see no point. I would like to work on it tbh. However, I do not know how to start this branch.
[Bug c++/94775] [8/9/10/11 Regression] ICE in strip_typedefs, at cp/tree.c:1734 since r8-4668-g8a5ee94a082b3d48
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94775 --- Comment #17 from Marek Polacek --- Another attempt: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564461.html
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #8 from cqwrteur --- (In reply to Marek Polacek from comment #6) > I don't think this is a useful bug report. If/when the proposal gets > accepted, it will be useful to track who's working on it, but until then I > see no point. Because current C++ exceptions are completely unusable for me. I suffer from it every day. I desperately need to do this on my own.
[Bug target/98849] [11 Regression] ICE in expand_shift_1, at expmed.c:2658 since g:7432f255b70811dafaf325d9403
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98849 --- Comment #12 from Jakub Jelinek --- Created attachment 50076 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50076&action=edit gcc11-pr98849.patch So like this then? From quick skimming of iwmmxt.md, it does have the vector by scalar shifts, but doesn't have vector by vector shifts, so it seems correct to me to do what this patch does, plus if somebody cared about IWMMXT, it could announce its shifts through the standard pattern names.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #9 from Marek Polacek --- You can clone the gcc repo as explained here https://gcc.gnu.org/git.html and then start your own local branch.
[Bug c++/96045] [11 Regression] Wrong line and column diagnostic message in a class template instantiation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96045 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Jakub Jelinek --- Assuming fixed: ./cc1plus.20200929 -quiet pr96045.C pr96045.C:3: error: expected unqualified-id at end of input ./cc1plus.20210107 -quiet pr96045.C pr96045.C:2:15: error: expected unqualified-id at end of input 2 | struct A | ^
[Bug bootstrap/98338] [11 Regression] profiledbootstrap failure on x86_64-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98338 --- Comment #10 from Jakub Jelinek --- Honza, any ideas on this?
[Bug c++/91849] [8/9/10/11 Regression] Misleading diagnostic message when binding reference to unrelated type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91849 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #5 from Jakub Jelinek --- Regressed with r5-601-gd02f620dc0bb3bea393d04b8639a1f4748ad8821
[Bug libstdc++/92546] [10/11 Regression] Large increase in preprocessed file sizes in C++2a mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92546 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #17 from Jakub Jelinek --- Do we want to do further header reorganizations for GCC11 at this point, or defer to GCC12?
[Bug tree-optimization/92005] [10/11 Regression] switch code generation regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92005 --- Comment #5 from Jakub Jelinek --- But ideally we should be able to adjust already converted switches if e.g. better range info allows us to optimize them further.
[Bug libstdc++/58909] C++11's condition variables fail with static linking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58909 Keegan Saunders changed: What|Removed |Added CC||ksaunders at nowsecure dot com --- Comment #12 from Keegan Saunders --- This behaviour does not occur with clang when statically linking libc++. It is, however, still an issue on clang and gcc when using libstdc++. It is still possible to reproduce this issue with: clang version 11.0.0 (Fedora 11.0.0-2.fc33) and g++ (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) As libc++ links against glibc, this appears to be strictly an issue with libstdc++ rather than glibc.
[Bug libstdc++/58909] C++11's condition variables fail with static linking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58909 --- Comment #13 from Jakub Jelinek --- Maybe libc++ doesn't bother with supporting not linking against -lpthread.
[Bug tree-optimization/98848] [9/10/11 regression] vectorizer failed to reduce max pattern since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98848 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Summary|[9/10/11 regression]|[9/10/11 regression] |vectorizer failed to reduce |vectorizer failed to reduce |max pattern.|max pattern since r9-1590 --- Comment #3 from Jakub Jelinek --- Regressed with r9-1590-g370c2ebe8fa20e0812cd2d533d4ed38ee2d37c85
[Bug libstdc++/92546] Large increase in preprocessed file sizes in C++2a mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92546 Jonathan Wakely changed: What|Removed |Added Target Milestone|10.3|12.0 Summary|[10/11 Regression] Large|Large increase in |increase in preprocessed|preprocessed file sizes in |file sizes in C++2a mode|C++2a mode --- Comment #18 from Jonathan Wakely --- Definitely defer. I think we could remove the regression marker now too.
[Bug tree-optimization/98848] [9/10/11 regression] vectorizer failed to reduce max pattern since r9-1590
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98848 --- Comment #4 from Jakub Jelinek --- Alternatively, couldn't we support truncation in the reductions if SSA_NAME_RANGE_INFO suggests that the values are always in the narrower range?
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #10 from Jonathan Wakely --- (In reply to cqwrteur from comment #3) > Relying on stdio.h even stdio.h is not freestanding. Nonsense. (In reply to cqwrteur from comment #4) > BTW. std::terminate() is not thread-safe which is terrible. Nonsense.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #11 from cqwrteur --- (In reply to Jonathan Wakely from comment #10) > (In reply to cqwrteur from comment #3) > > Relying on stdio.h even stdio.h is not freestanding. > > Nonsense. > > (In reply to cqwrteur from comment #4) > > BTW. std::terminate() is not thread-safe which is terrible. > > Nonsense. Functions without thread-safety are always terrible. Like all functions in cctype. They should be avoided like plague.
[Bug rtl-optimization/97684] [11 Regression] ICE in reg_preferred_class, at reginfo.c:789 by r11-4577
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97684 seurer at gcc dot gnu.org changed: What|Removed |Added CC||seurer at gcc dot gnu.org --- Comment #7 from seurer at gcc dot gnu.org --- The two bits of code shown seem to compile fine now on powerpc. Is this done then?
[Bug libstdc++/58909] C++11's condition variables fail with static linking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58909 --- Comment #14 from Jonathan Wakely --- (In reply to Jakub Jelinek from comment #13) > Maybe libc++ doesn't bother with supporting not linking against -lpthread. libc++ is linked to libpthread.so unconditionally.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #12 from cqwrteur --- (In reply to Jonathan Wakely from comment #10) > (In reply to cqwrteur from comment #3) > > Relying on stdio.h even stdio.h is not freestanding. > > Nonsense. stdio.h should not get included in any circumstances for EH. You are implementing the operating system, but you need to enable EH by the standard and EH relies on stdio. Chicken-egg problem.
[Bug target/98065] [11 Regression] ICE in rs6000_expand_vector_set, at config/rs6000/rs6000.c:7024 since r11-5457
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98065 seurer at gcc dot gnu.org changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED CC||seurer at gcc dot gnu.org --- Comment #6 from seurer at gcc dot gnu.org --- This appears to be fixed from testing with latest trunk.
[Bug libstdc++/58909] C++11's condition variables fail with static linking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58909 --- Comment #15 from Jonathan Wakely --- And the static libc++.a doesn't use weak symbols, so you need to link to libpthread yourself, and the necessary symbols get pulled in correctly. The problem for libstdc++.a is that the weak symbols do not cause the libpthread.a symbols to get linked into the binary.
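The weak-symbol behaviour described in comment #15 can be sketched roughly as follows. This is only an illustration, assuming GCC or Clang on an ELF target; it is not libstdc++'s actual code, and `hypothetical_thread_hook`, `hook_is_linked` and `call_hook_or_fallback` are made-up names. The point is that a weak undefined reference does not make the linker extract a member from a static archive, so the reference quietly resolves to null.

```cpp
// Sketch only: hypothetical_thread_hook is a made-up symbol, not a real
// pthread or libstdc++ function. Assumes GCC or Clang on an ELF target.

// A weak declaration: the linker does not treat a missing definition as an
// error, and it will not pull an archive member out of a static library
// just to satisfy this reference.
extern "C" int hypothetical_thread_hook() __attribute__((weak));

// Returns true only if some object that was linked for another reason
// defines the symbol; otherwise the address resolves to null.
bool hook_is_linked()
{
    return &hypothetical_thread_hook != nullptr;
}

// Calls the hook when present and falls back to -1 when it was never linked.
int call_hook_or_fallback()
{
    return hook_is_linked() ? hypothetical_thread_hook() : -1;
}
```

Since nothing defines the hypothetical symbol here, `hook_is_linked()` is expected to return false and the fallback path is taken. Presumably something similar happens to the weak pthread references in a static link, which is why a non-weak (strong) reference is needed to force the symbols in.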
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #13 from Jonathan Wakely --- (In reply to cqwrteur from comment #11) > Functions without thread-safety are always terrible. Like all functions in > cctype. They should be avoided like plague. It's thread-safe though. What are you talking about? (In reply to cqwrteur from comment #12) > stdio.h should not get included in any circumstances for EH. You are > implementing the operating system, but you need to enable EH by the standard > and EH relies on stdio. Chicken-egg problem. It doesn't depend on stdio though. What are you talking about?
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #14 from cqwrteur --- (In reply to Jonathan Wakely from comment #13) > (In reply to cqwrteur from comment #11) > > Functions without thread-safety are always terrible. Like all functions in > > cctype. They should be avoided like plague. > > It's thread-safe though. What are you talking about? It calls std::abort() and std::abort() will close FILE* and FILE* might not be thread-safe. BTW std::terminate() is slow compared to compiler intrinsic like __builtin_trap(). > (In reply to cqwrteur from comment #12) > > stdio.h should not get included in any circumstances for EH. You are > > implementing the operating system, but you need to enable EH by the standard > > and EH relies on stdio. Chicken-egg problem. > > It doesn't depend on stdio though. What are you talking about? https://github.com/gcc-mirror/gcc/blob/e11e5d3889f9e54c547efee50fa1b72b50f0f265/libstdc%2B%2B-v3/libsupc%2B%2B/vterminate.cc#L93
[Bug libstdc++/58909] C++11's condition variables fail with static linking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58909 --- Comment #16 from Jakub Jelinek --- Are those weak refs from libstdc++.a objects or from the user *.o files? If the former, perhaps we could declare some libstdc++ APIs (related to threading) as requiring linking of -lpthread and make them non-weak in libstdc++.a. I don't really see how one can reproduce this on Fedora/RHEL/CentOS, where libpthread.a contains a single libpthread.o and therefore you either link no thread support at all or link it completely.
[Bug target/98730] vceqzq_p64 does not generate vceq with immediate 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98730 --- Comment #5 from CVS Commits --- The master branch has been updated by Christophe Lyon : https://gcc.gnu.org/g:31a0ab9213f780d2fa1da6e4879df214c0f247f9 commit r11-6961-g31a0ab9213f780d2fa1da6e4879df214c0f247f9 Author: Christophe Lyon Date: Thu Jan 28 17:55:45 2021 + arm: Adjust cost of vector of constant zero Neon vector comparisons have a dedicated version when comparing with constant zero: it means its cost is free. Adjust the cost in arm_rtx_costs_internal accordingly, for Neon only, since MVE does not support this. 2021-01-28 Christophe Lyon gcc/ PR target/98730 * config/arm/arm.c (arm_rtx_costs_internal): Adjust cost of vector of constant zero for comparisons. gcc/testsuite/ PR target/98730 * gcc.target/arm/simd/vceqzq_p64.c: Update expected result.
[Bug c++/98861] I want deterministic exceptions (Herbception)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861 --- Comment #15 from Jonathan Wakely --- > > (In reply to cqwrteur from comment #12) > > > stdio.h should not get included in any circumstances for EH. You are > > > implementing the operating system, but you need to enable EH by the standard > > > and EH relies on stdio. Chicken-egg problem. > > > > It doesn't depend on stdio though. What are you talking about? > > https://github.com/gcc-mirror/gcc/blob/e11e5d3889f9e54c547efee50fa1b72b50f0f265/libstdc%2B%2B-v3/libsupc%2B%2B/vterminate.cc#L93 Which is disabled in freestanding, and can be optionally disabled in hosted. This is an OPTIONAL feature of libstdc++, not something that exception handling intrinsically relies on. If you don't want it, you don't have to use it. And if you're implementing the OS then you're using freestanding and it's automatically disabled. Look at line 27 in that file. Once again you are talking out of your backside. Stop wasting our time.
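For readers following the terminate-handler discussion, here is a minimal sketch of how a program can install its own handler so that termination never goes through a verbose, stdio-based path. It is an illustration under stated assumptions, not what libstdc++ does internally: `trap_on_terminate` and `install_minimal_terminate` are hypothetical names, and `__builtin_trap()` is the GCC/Clang builtin mentioned in comment #14.

```cpp
#include <exception>

// Sketch only: trap_on_terminate and install_minimal_terminate are
// illustrative names, not part of libstdc++. __builtin_trap() is a GCC/Clang
// builtin that stops the program abnormally without touching stdio.
[[noreturn]] void trap_on_terminate()
{
    __builtin_trap();
}

// Installs the handler and returns the one that was active before, so a
// caller can restore it later.
std::terminate_handler install_minimal_terminate()
{
    return std::set_terminate(trap_on_terminate);
}
```

Calling `install_minimal_terminate()` early in `main` would replace whatever handler was active, which is one way to opt out of the optional behaviour discussed above without rebuilding the library.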
[Bug target/98730] vceqzq_p64 does not generate vceq with immediate 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98730 Christophe Lyon changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #6 from Christophe Lyon --- Now fixed on trunk