[Bug target/104593] Problem with va_list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593 Jonathan Wakely changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |INVALID --- Comment #7 from Jonathan Wakely --- .
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org CC||rsandifo at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug libstdc++/104592] Problem with std::basic_ostream
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592 Jonathan Wakely changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |INVALID --- Comment #7 from Jonathan Wakely --- Not a bug.
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 --- Comment #13 from Richard Biener --- Created attachment 52476 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52476&action=edit minimal patch This is a minimal untested patch adjusting APIs to allow for the cost hook to receive a slp_node in addition to a stmt_vec_info and make the x86 backend use it and successfully disregard the vectorization that's not doing a CTOR from memory. Other targets need minimal adjustments as well of course and some of the cleanups (additional overloads for record/add_stmt_cost for scalar and branch stmts and two fixes using scalar_stmt rather than vector_stmt kinds for versioning costs) can and will be split out. Richard - any comments? Would you object to doing this for GCC 12 (given we changed the costing API anyway)?
[Bug c++/104000] Ordinary constructor cannot delegate to `consteval` constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104000 --- Comment #5 from Fedor Chelnokov --- Based on a Stack Overflow answer, a modified example was found with the delegation to a consteval constructor: ``` struct A { int i = 0; consteval A() = default; A(const A&) = delete; A(int) : A(A()) {} }; ``` which is accepted in GCC. Demo: https://gcc.godbolt.org/z/5PjraK5ox Clang rejects it until one removes `A(const A&) = delete`, which is probably another issue.
[Bug tree-optimization/104595] unvectorized loop due to bool condition loaded from memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104595 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2022-02-18 Blocks||53947 Keywords||missed-optimization Status|UNCONFIRMED |NEW --- Comment #1 from Richard Biener --- Confirmed. I think this is an omission somewhere in bool pattern recog since we need a tem = pb[i] != 0 ? -1 : 0; kind of computation to generate a mask suitable for vectorization here. We might have a duplicate bug report as well. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
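As a rough illustration of the mask computation mentioned in comment #1, the sketch below contrasts a loop whose select condition is a bool loaded from memory with a hand-written form of the `pb[i] != 0 ? -1 : 0` mask. The function names and the loop body are hypothetical; the reporter's testcase is not quoted in this message, so this only shows the general shape of such a loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical loop shape: the select condition is a bool loaded from memory.
void select_bool(const bool *pb, const std::int32_t *a, const std::int32_t *b,
                 std::int32_t *out, std::size_t n)
{
  assert(n == 0 || (pb && a && b && out));
  for (std::size_t i = 0; i < n; ++i)
    out[i] = pb[i] ? a[i] : b[i];
}

// The same computation with the mask written out by hand:
// tem is all-ones when pb[i] is set and all-zeros otherwise.
void select_mask(const bool *pb, const std::int32_t *a, const std::int32_t *b,
                 std::int32_t *out, std::size_t n)
{
  assert(n == 0 || (pb && a && b && out));
  for (std::size_t i = 0; i < n; ++i)
    {
      std::int32_t tem = pb[i] != 0 ? -1 : 0;
      out[i] = (a[i] & tem) | (b[i] & ~tem);
    }
}
```

Both functions compute the same result; the second only makes explicit the mask that, per the comment, bool pattern recognition would need to produce.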
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 --- Comment #14 from Richard Biener --- Another testcase is struct S { double a, b; } s; void foo (double a, double b) { s.a = a; s.b = b; } which also receives the same costs and compiles vectorized to unpcklpd %xmm1,%xmm0 movaps %xmm0,0x0(%rip) ret which is also smaller than unvectorized.
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 --- Comment #15 from Richard Biener --- The patch will cause FAIL: gcc.target/i386/pr91446.c scan-assembler-times vmovdqa[^\\n\\r]*xmm[0-9] 2 FAIL: gcc.target/i386/pr92658-avx512bw-2.c scan-assembler-times pmovsxdq 2 FAIL: gcc.target/i386/pr92658-sse4-2.c scan-assembler-times pmovsxbq 2 FAIL: gcc.target/i386/pr92658-sse4-2.c scan-assembler-times pmovsxdq 2 FAIL: gcc.target/i386/pr92658-sse4-2.c scan-assembler-times pmovsxwq 2 FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxbq 2 FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxdq 2 FAIL: gcc.target/i386/pr92658-sse4.c scan-assembler-times pmovzxwq 2 XPASS: gcc.target/i386/pr99881.c scan-assembler-not xmm[0-9] I have to look into some of them. The pr92658 one seems to be cases like void bar_u32_u64 (v2di * dst, v4si src) { unsigned long long tem[2]; tem[0] = src[0]; tem[1] = src[1]; dst[0] = *(v2di *) tem; } where we fail to recognize the BIT_FIELD_REF as accessing a pre-existing vector (we only support a subset of cases during SLP discovery): _1 = BIT_FIELD_REF ; _2 = (long long unsigned int) _1; tem[0] = _2; _3 = BIT_FIELD_REF ; _4 = (long long unsigned int) _3; tem[1] = _4; but when vectorizing just store and the conversion as [local count: 1073741824]: _1 = BIT_FIELD_REF ; _3 = BIT_FIELD_REF ; _13 = {_1, _3}; vect__2.110_14 = (vector(2) long long unsigned int) _13; MEM [(long long unsigned int *)&tem] = vect__2.110_14; we can recover things on the RTL side. So we just realize that costing is a difficult thing. 
Cost model analysis:
_2 1 times scalar_store costs 12 in body
_4 1 times scalar_store costs 12 in body
(long long unsigned int) _1 1 times scalar_stmt costs 4 in body
(long long unsigned int) _3 1 times scalar_stmt costs 4 in body
(long long unsigned int) _1 1 times vector_stmt costs 4 in body
node 0x415e268 1 times vec_construct costs 20 in prologue
_2 1 times vector_store costs 16 in body
Cost model analysis for part in loop 0:
  Vector cost: 40
  Scalar cost: 32
not vectorized: vectorization is not profitable.

Note this uses icelake-server costs, which have an unusually high sse_to_integer cost. The fix here would best be to recognize the BIT_FIELD_REF vector use of course.
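The totals in the cost dump can be re-derived by summing the individual entries, as in the following sketch. The `CostEntry` type and the helper names are hypothetical, and the final comparison is a simplification; the vectorizer's real profitability check is not quoted in this message.

```cpp
#include <cassert>
#include <vector>

// One line of the dump: the statement kind and the cost charged for it.
struct CostEntry
{
  const char *kind;
  int cost;
};

// Sum the costs of all entries.
int total_cost(const std::vector<CostEntry> &entries)
{
  int total = 0;
  for (const CostEntry &e : entries)
    {
      assert(e.cost >= 0);
      total += e.cost;
    }
  return total;
}

// Entries copied from the dump: two scalar stores and two conversions.
std::vector<CostEntry> scalar_entries()
{
  return {{"scalar_store", 12}, {"scalar_store", 12},
          {"scalar_stmt", 4}, {"scalar_stmt", 4}};
}

// Entries copied from the dump: conversion, vector construction, store.
std::vector<CostEntry> vector_entries()
{
  return {{"vector_stmt", 4}, {"vec_construct", 20}, {"vector_store", 16}};
}

// Simplified assumption: vectorize only if strictly cheaper than scalar.
bool looks_profitable()
{
  return total_cost(vector_entries()) < total_cost(scalar_entries());
}
```

The scalar side adds up to 12 + 12 + 4 + 4 = 32 and the vector side to 4 + 20 + 16 = 40, which matches the dump. The 20-unit vec_construct entry alone accounts for half of the vector cost.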
[Bug rtl-optimization/104596] New: Means to add a comment in the assembly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104596 Bug ID: 104596 Summary: Means to add a comment in the assembly Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vries at gcc dot gnu.org Target Milestone: --- I wanted to mark some insns in a way that is visible in the assembly, without having to tinker with the .md file. The user-level equivalent would be something like: ... asm ("// Start: added by x") ... asm ("// End: added by x") ... So I wrote: ... static rtx gen_comment (const char *s) { const char *sep = " "; size_t len = strlen (ASM_COMMENT_START) + strlen (sep) + strlen (s) + 1; char *comment = (char *) alloca (len); snprintf (comment, len, "%s%s%s", ASM_COMMENT_START, sep, s); return gen_rtx_ASM_INPUT_loc (VOIDmode, ggc_strdup (comment), cfun->function_start_locus); } ... and used it to generate comments like this: ... // #APP // 2 "pr53465.c" 1 // Start: Added by -minit-regs=3: // #NO_APP mov.u32 %r25, 0; // #APP // 2 "pr53465.c" 1 // End: Added by -minit-regs=3: // #NO_APP ... This however is a bit verbose. The APP/NO_APP is there to separate user insn from compiler insn, but in this case, the compiler added the comments. Furthermore, the file info is not meaningful either, we just use cfun->function_start_locus because with UNKNOWN_LOCATION we run into a segfault. Both these issues are addressed by: ... 
diff --git a/gcc/final.cc b/gcc/final.cc index a9868861bd2c..5d47f3d5ba0e 100644 --- a/gcc/final.cc +++ b/gcc/final.cc @@ -2642,15 +2642,20 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED, if (string[0]) { expanded_location loc; - - app_enable (); - loc = expand_location (ASM_INPUT_SOURCE_LOCATION (body)); - if (*loc.file && loc.line) - fprintf (asm_out_file, "%s %i \"%s\" 1\n", - ASM_COMMENT_START, loc.line, loc.file); + bool unknown_loc_p + = ASM_INPUT_SOURCE_LOCATION (body) == UNKNOWN_LOCATION; + + if (!unknown_loc_p) + { + app_enable (); + loc = expand_location (ASM_INPUT_SOURCE_LOCATION (body)); + if (*loc.file && loc.line) + fprintf (asm_out_file, "%s %i \"%s\" 1\n", + ASM_COMMENT_START, loc.line, loc.file); + } fprintf (asm_out_file, "\t%s\n", string); #if HAVE_AS_LINE_ZERO - if (*loc.file && loc.line) + if (!unknown_loc_p && loc.file && *loc.file && loc.line) fprintf (asm_out_file, "%s 0 \"\" 2\n", ASM_COMMENT_START); #endif } ... after which we can do in gen_comment: ... - return gen_rtx_ASM_INPUT_loc (VOIDmode, ggc_strdup (comment), - cfun->function_start_locus); + return gen_rtx_ASM_INPUT (VOIDmode, ggc_strdup (comment)); ... and have the less verbose: ... // Start: Added by -minit-regs=3: mov.u32 %r25, 0; // End: Added by -minit-regs=3: ...
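The string-building part of gen_comment can be tried outside the compiler with a sketch like the one below. `make_comment` is a hypothetical stand-alone helper, and its `comment_start` parameter stands in for GCC's target macro ASM_COMMENT_START; the RTL construction (gen_rtx_ASM_INPUT, ggc_strdup) is deliberately left out because it only exists inside GCC.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Build "<comment_start> <s>", mirroring the snprintf call in gen_comment.
// comment_start is an assumption standing in for ASM_COMMENT_START.
std::string make_comment(const char *comment_start, const char *s)
{
  assert(comment_start && s);
  const char *sep = " ";
  // +1 for the terminating NUL, as in the original length computation.
  std::size_t len = std::strlen(comment_start) + std::strlen(sep)
                    + std::strlen(s) + 1;
  std::string buf(len, '\0');
  std::snprintf(&buf[0], len, "%s%s%s", comment_start, sep, s);
  buf.resize(len - 1); // drop the NUL that snprintf wrote
  return buf;
}
```

For example, `make_comment("//", "Start: Added by -minit-regs=3:")` yields the comment text shown in the less verbose output above.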
[Bug target/104590] ppc64: even/odd permutation for VSX 64-bit to 32-bit conversions is no longer necessary.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104590 Segher Boessenkool changed: What|Removed |Added Status|UNCONFIRMED |NEW Severity|normal |enhancement Ever confirmed|0 |1 Last reconfirmed||2022-02-18 --- Comment #2 from Segher Boessenkool --- Please send the patch to gcc-patches@ if you want it included. It cannot be committed until stage 1 opens though (but feel free to send it). Thanks!
[Bug c++/104588] memset loses alignment infomation in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104588 --- Comment #3 from LIU Hao --- Sounds so. Changing `char a[32]` to `long a[4]` or `void* a[4]` makes GCC generate MOVAPS like Clang, but `int a[8]` or `short a[16]` does not.
[Bug c++/104597] New: LTO does not inline indirect call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104597 Bug ID: 104597 Summary: LTO does not inline indirect call Product: gcc Version: 11.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: m.cencora at gmail dot com Target Milestone: --- Given the following files: // main.cpp using intfunc = int (*)(); intfunc getIntFunc(int i); namespace { int test() { auto func = getIntFunc(1); return func(); } } int main() { return test(); } // lib1.cpp namespace { int getInt0() { return 0; } int getInt1() { return 1; } int getInt2() { return 2; } } using intfunc = int (*)(); intfunc getIntFunc(int i) { if (i == 0) { return getInt0; } else if (i == 1) { return getInt1; } else if (i == 2) { return getInt2; } __builtin_abort(); } and compilation with: g++ -std=c++20 -Wall -Wextra -O3 -flto -fvisibility=hidden -fvisibility-inlines-hidden -ffunction-sections -Wl,-gc-sections main.cpp lib1.cpp -o test Call to getInt1 does not get inlined: Dump of assembler code for function main: 0x1040 <+0>: endbr64 0x1044 <+4>: jmp 0x1140 <_ZN12_GLOBAL__N_17getInt1Ev>
[Bug c++/104597] LTO does not inline indirect call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104597 --- Comment #1 from m.cencora at gmail dot com --- clang-12 optimizes it to: Dump of assembler code for function main: 0x00401110 <+0>: mov $0x1,%eax 0x00401115 <+5>: ret
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 Richard Biener changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=99881 --- Comment #16 from Richard Biener --- See also PR99881 where this XPASSes its testcase for eventual fallout in x264_r on CLX and 538.imagick_r on Kabylake. Unlike the fix for that PR I'm simply re-using x86_cost->sse_to_integer here.
[Bug c++/102286] [constexpr] construct_at incorrectly starts union array lifetime in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102286 Artur Bać changed: What|Removed |Added CC||gcc at ebasoft dot com.pl --- Comment #4 from Artur Bać --- This is not legal in c++20 but gcc allows accessing not active member of a union in constexpr https://godbolt.org/z/KsEhffeEa clang refuses and it is right "construction of subobject of member 'object' of union with active member 'init' is not allowed in a constant expression" With c++20 there is no way to have aligned_storage for not trivial type in constexpr (storage without initialization, not std::array that requires construction of non trivial objects upon member activation) https://godbolt.org/z/9EGrf8fbr #include struct foo { int * i; constexpr foo() { i = new int; } constexpr ~foo() { delete(i); } }; union storage { constexpr storage() : init{} {} constexpr ~storage() {} foo object[1]; char init = '\0'; }; consteval bool test() { storage u; auto p = std::addressof(u.object[0]); std::construct_at(p); std::destroy_at(p); return true; } int main() { static_assert( test() ); return test(); }
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 --- Comment #17 from Richard Biener --- For FAIL: gcc.target/i386/pr91446.c scan-assembler-times vmovdqa[^\\n\\r]*xmm[0-9] 2 we used to produce:
   0: 48 83 ec 28          sub     $0x28,%rsp
   4: c4 e1 f9 6e d7       vmovq   %rdi,%xmm2
   9: c4 e1 f9 6e da       vmovq   %rdx,%xmm3
   e: c4 e3 e9 22 ce 01    vpinsrq $0x1,%rsi,%xmm2,%xmm1
  14: c4 e3 e1 22 c1 01    vpinsrq $0x1,%rcx,%xmm3,%xmm0
  1a: 48 89 e7             mov     %rsp,%rdi
  1d: c5 f9 7f 0c 24       vmovdqa %xmm1,(%rsp)
  22: c5 f9 7f 44 24 10    vmovdqa %xmm0,0x10(%rsp)
  28: e8 00 00 00 00       call    2d
  2d: 48 83 c4 28          add     $0x28,%rsp
  31: c3                   ret
but now reject this on costing grounds. The scalar code is:
   0: 48 83 ec 28          sub     $0x28,%rsp
   4: 48 89 3c 24          mov     %rdi,(%rsp)
   8: 48 89 e7             mov     %rsp,%rdi
   b: 48 89 74 24 08       mov     %rsi,0x8(%rsp)
  10: 48 89 54 24 10       mov     %rdx,0x10(%rsp)
  15: 48 89 4c 24 18       mov     %rcx,0x18(%rsp)
  1a: e8 00 00 00 00       call    1f
  1f: 48 83 c4 28          add     $0x28,%rsp
  23: c3                   ret
I think the scalar variant is 5 uops up to the call while the vector variant is 9 uops. The scalar variant can also execute 4 of the uops in parallel (well, I guess only up to 3 with 3 store ports). I think the scalar variant is better and so I'm inclined to adjust the testcase.
[Bug c++/104597] LTO does not inline indirect call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104597 --- Comment #2 from m.cencora at gmail dot com --- Similarly, when the indirect call is the result of a virtual function call, gcc cannot optimize it, while clang can: // main.cpp struct foo { virtual int getInt0() const = 0; virtual int getInt1() const = 0; }; const foo& getFooInstance(); namespace { int test() { auto& foo = getFooInstance(); return foo.getInt1(); } } int main() { return test(); } // lib1.cpp struct foo { virtual int getInt0() const = 0; virtual int getInt1() const = 0; }; namespace { struct bar final : foo { int getInt0() const override { return 0; } int getInt1() const override { return 1; } }; constexpr bar b; } const foo& getFooInstance() { return b; } gcc-11 output: Dump of assembler code for function main: 0x1040 <+0>: endbr64 0x1044 <+4>: lea 0x2d75(%rip),%rdi # 0x3dc0 <_ZN12_GLOBAL__N_1L1bE> 0x104b <+11>: jmp 0x1150 <_ZNK12_GLOBAL__N_13bar7getInt1Ev> clang-12 output: Dump of assembler code for function main: 0x00401110 <+0>: mov $0x1,%eax 0x00401115 <+5>: ret
[Bug target/104121] [12 Regression] v850: Infinite loop in find_reload_regno_insns() since r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121 Alexandre Oliva changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |aoliva at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #9 from Alexandre Oliva --- Thanks, I've succeeded in duplicating the problem with the preprocessed testcase, both with the earlier tree and with a more recent one. Now I can look into it.
[Bug target/104121] [12 Regression] v850: Infinite loop in find_reload_regno_insns() since r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121 --- Comment #10 from Alexandre Oliva --- and then, as I reduced it myself down to the following and compared with the minimized test, I've finally turned on both of my neurons ;-) and it finally hit me: "only with -mv850e2v3" didn't mean "not with other multilibs", but rather "without any optimization". of course, none of the minimized test would survive with optimization. doh! this one triggers with -O2 -g -mv850e2v3: typedef float DFtype __attribute__ ((mode (DF))); typedef _Complex float DCtype __attribute__ ((mode (DC))); DCtype __muldc3 (DFtype a, DFtype b, DFtype c, DFtype d) { DFtype x = __builtin_huge_val () * (a * c - b * d); DFtype y = __builtin_huge_val () * (a * d + b * c); DCtype res; __real__ res = x; __imag__ res = y; return res; }
[Bug target/104598] New: [12 regression] g++.dg/ext/undef-bool-1.C has excess errors after r12-7284-gefbb17db52afd8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104598 Bug ID: 104598 Summary: [12 regression] g++.dg/ext/undef-bool-1.C has excess errors after r12-7284-gefbb17db52afd8 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: seurer at gcc dot gnu.org Target Milestone: --- g:efbb17db52afd802300c4dcce208fab326ec2915, r12-7284-gefbb17db52afd8 make -k check-gcc RUNTESTFLAGS="dg.exp=g++.dg/ext/undef-bool-1.C" FAIL: g++.dg/ext/undef-bool-1.C -std=gnu++98 (test for excess errors) FAIL: g++.dg/ext/undef-bool-1.C -std=gnu++14 (test for excess errors) FAIL: g++.dg/ext/undef-bool-1.C -std=gnu++17 (test for excess errors) FAIL: g++.dg/ext/undef-bool-1.C -std=gnu++20 (test for excess errors) # of unexpected failures 4 Excess errors: /home/seurer/gcc/git/build/gcc-test/gcc/include/mm_malloc.h:50:7: error: '__posix_memalign' was not declared in this scope; did you mean 'posix_memalign'? commit efbb17db52afd802300c4dcce208fab326ec2915 (HEAD, refs/bisect/bad) Author: Paul A. Clarke Date: Wed Feb 16 20:01:41 2022 -0600 rs6000: __Uglify non-uglified local variables in headers
[Bug target/104598] [12 regression] g++.dg/ext/undef-bool-1.C has excess errors after r12-7284-gefbb17db52afd8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104598 --- Comment #1 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:df5ed150ee5fbcb8255e05eed978c4af2b3d9bcc commit r12-7294-gdf5ed150ee5fbcb8255e05eed978c4af2b3d9bcc Author: Jakub Jelinek Date: Fri Feb 18 17:21:43 2022 +0100 rs6000: Fix up posix_memalign call in _mm_malloc [PR104598] The uglification changes went in one spot too far and uglified also the name of the function; posix_memalign should be called like that and not a non-existent function instead of it. 2022-02-18 Jakub Jelinek PR target/104257 PR target/104598 * config/rs6000/mm_malloc.h (_mm_malloc): Call posix_memalign rather than __posix_memalign.
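To make the distinction in the commit message concrete, here is a minimal sketch of an aligned allocator built on posix_memalign. It is not the text of config/rs6000/mm_malloc.h; `sketch_mm_malloc` and `sketch_mm_free` are hypothetical names. The sketch uses ordinary local names because double-underscore identifiers are reserved for the implementation, which is exactly why a system header spells its locals that way. The library function, by contrast, has one fixed name and must be called as `posix_memalign`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h> // posix_memalign, free (POSIX)

// Hypothetical helper: allocate size bytes aligned to alignment.
// Returns nullptr on failure.
void *sketch_mm_malloc(std::size_t size, std::size_t alignment)
{
  // posix_memalign requires a power of two that is also a multiple of
  // sizeof(void *); bump small alignments up to that minimum.
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  if (alignment < sizeof(void *))
    alignment = sizeof(void *);

  void *ptr = nullptr;
  // Only locals may be renamed freely; the callee keeps its real name.
  if (posix_memalign(&ptr, alignment, size) == 0)
    return ptr;
  return nullptr;
}

// Memory obtained from posix_memalign is released with free.
void sketch_mm_free(void *ptr)
{
  free(ptr);
}
```

A caller would pair the two, for instance `void *p = sketch_mm_malloc(100, 64);` followed later by `sketch_mm_free(p);`.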
[Bug target/104257] rs6000/*intrin.h headers using non-uglified automatic variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104257 --- Comment #2 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:df5ed150ee5fbcb8255e05eed978c4af2b3d9bcc commit r12-7294-gdf5ed150ee5fbcb8255e05eed978c4af2b3d9bcc Author: Jakub Jelinek Date: Fri Feb 18 17:21:43 2022 +0100 rs6000: Fix up posix_memalign call in _mm_malloc [PR104598] The uglification changes went in one spot too far and uglified also the name of the function; posix_memalign should be called like that and not a non-existent function instead of it. 2022-02-18 Jakub Jelinek PR target/104257 PR target/104598 * config/rs6000/mm_malloc.h (_mm_malloc): Call posix_memalign rather than __posix_memalign.
[Bug target/104598] [12 regression] g++.dg/ext/undef-bool-1.C has excess errors after r12-7284-gefbb17db52afd8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104598 Jakub Jelinek changed: What|Removed |Added Target Milestone|--- |12.0 Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED Priority|P3 |P1 CC||jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- Fixed now.
[Bug tree-optimization/104595] unvectorized loop due to bool condition loaded from memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104595 --- Comment #2 from Segher Boessenkool --- This is exactly the same as the char case here though, so it is a bit silly that we miss it :-)
[Bug testsuite/104599] New: [12 regression] gcc.dg/deprecated.c has excess errors after r12-7287-g1b71bc7c8b18bd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104599 Bug ID: 104599 Summary: [12 regression] gcc.dg/deprecated.c has excess errors after r12-7287-g1b71bc7c8b18bd Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: seurer at gcc dot gnu.org Target Milestone: --- g:1b71bc7c8b18bd1b22debfde155f175fd1654942, r12-7287-g1b71bc7c8b18bd make -k check-gcc RUNTESTFLAGS="dg.exp=gcc.dg/deprecated.c" FAIL: gcc.dg/deprecated.c (test for warnings, line 28) FAIL: gcc.dg/deprecated.c (test for excess errors) # of expected passes 22 # of unexpected failures 2 Excess errors: /home/seurer/gcc/git/gcc-test/gcc/testsuite/gcc.dg/deprecated.c:28:1: warning: type is deprecated [-Wdeprecated-declarations] commit 1b71bc7c8b18bd1b22debfde155f175fd1654942 (HEAD, refs/bisect/bad) Author: Jason Merrill Date: Tue Feb 15 19:17:03 2022 -0500 tree: tweak warn_deprecated_use
[Bug middle-end/104550] bogus warning from -Wuninitialized + -ftrivial-auto-var-init=pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104550 --- Comment #17 from qinzhao at gcc dot gnu.org --- So, based on the discussion so far, I'd like to take the following steps: 1. In GCC12, I will take a conservative solution to fix this bug, i.e: mark the load "MEM" as not needing a warning during __builtin_clear_padding folding phase; this should resolve this issue and has lowest risk to introduce more issues. 2. In GCC13, seeking a better way to do padding initialization. right now, based on the discussion so far, there is no conclusion on which way is better yet. let me know if you have other comments or suggestions.
[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623 --- Comment #27 from Segher Boessenkool --- OTOH, it makes no sense to test if we have hard float. The pack and unpack builtins should work (and work the same) whenever long double is double-double.
[Bug target/104024] ICE in curr_insn_transform with -O1 -mpower10-fusion -mpower10-fusion-2logical with __int128_t and __builtin_add_overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104024 --- Comment #3 from Segher Boessenkool --- Most of those options were removed. Does this problem (adjusted properly, those options are now enabled iff you use -mcpu=power10 or later) still happen on trunk?
[Bug go/103573] [12 Regression] trunk 20211203 fails to build libgo on i686-gnu (hurd)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103573 Ian Lance Taylor changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW |RESOLVED --- Comment #2 from Ian Lance Taylor --- Closing in favor of PR 104290. *** This bug has been marked as a duplicate of bug 104290 ***
[Bug go/104290] [12 Regression] trunk 20220126 fails to build libgo on i686-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104290 --- Comment #21 from Ian Lance Taylor --- *** Bug 103573 has been marked as a duplicate of this bug. ***
[Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581 --- Comment #12 from CVS Commits --- The master branch has been updated by H.J. Lu : https://gcc.gnu.org/g:1931cbad498e625b1e24452dcfffe02539b12224 commit r12-7295-g1931cbad498e625b1e24452dcfffe02539b12224 Author: H.J. Lu Date: Fri Feb 18 10:36:53 2022 -0800 pieces-memset-21.c: Expect vzeroupper for ia32 Update gcc.target/i386/pieces-memset-21.c to expect vzeroupper for ia32 caused by commit fe79d652c96b53384ddfa43e312cb0010251391b Author: Richard Biener Date: Thu Feb 17 14:40:16 2022 +0100 target/104581 - compile-time regression in mode-switching PR target/104581 * gcc.target/i386/pieces-memset-21.c: Expect vzeroupper for ia32.
[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623 --- Comment #28 from Peter Bergner --- (In reply to Segher Boessenkool from comment #27) > OTOH, it makes no sense to test if we have hard float. The pack and unpack > builtins should work (and work the same) whenever long double is > double-double. Agreed. For soft-float, the value would be a GPR pair versus an FPR pair (for -m64). It's a little trickier for -m32 -msoft-float compiles, since a 128-bit long double would live in 4 32-bit GPRs, so more regs than it takes to hold them in FPRs. Not much of a complication, but it just needs to be tested on 32-bit to ensure it works as expected.
[Bug middle-end/104550] bogus warning from -Wuninitialized + -ftrivial-auto-var-init=pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104550 --- Comment #18 from qinzhao at gcc dot gnu.org --- One question here, for the following testing case: [opc@qinzhao-ol7u9 104550]$ cat t1.c struct vx_audio_level { int has_monitor_level : 1; }; void vx_set_monitor_level() { struct vx_audio_level info; __builtin_clear_padding (&info); } [opc@qinzhao-ol7u9 104550]$ sh t /home/opc/Install/latest/bin/gcc -O -Wuninitialized -Wall t1.c -S t1.c: In function ‘vx_set_monitor_level’: t1.c:7:2: warning: ‘info’ is used uninitialized [-Wuninitialized] 7 | __builtin_clear_padding (&info); | ^~~ t1.c:6:24: note: ‘info’ declared here 6 | struct vx_audio_level info; |^~~~ We can see that the compiler emitted the exactly same warning as with -ftrivial-auto-var-init=pattern. my question is, is the "info" in __builtin_clear_padding(&info) a REAL use of "info"? is it correct to report the uninitialized use message for it?
[Bug target/104121] [12 Regression] v850: Infinite loop in find_reload_regno_insns() since r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121 Alexandre Oliva changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=103302 --- Comment #11 from Alexandre Oliva --- Ok, now I think the patch for bug 103302, that brought us this regression, is wrong. Unlike the old reload, lra computes live ranges for reload pseudos, and without the clobbers, they end up much longer, possibly overlapping, to the point that assignments become impossible. But this is unrelated with the loop. find_reload_regno_insns assumes single-insn input and output reloads, and it won't find sequences like those emitted by emit_move_multi_word (or emit_move_complex_parts, for that matter). That was fine when we had sequences that amounted to a clobber plus a pair of moves, because those plus start_insn added up to more than 3, the cut-off for find_reload_regno_insns before entering the endless loop. But an expander for a reload insn that issued two insns could, AFAICT, trigger the problem in which we find a first_insn and then loop forever looking for the second_insn after next_insn became NULL and prev_insn isn't looked at any more, or vice-versa for an output reload. Alas, neither of the fixes for that solve the problem: - getting the loop to terminate and return false when we won't find all of the reload insns with the current logic gets us an infinite loop one level up, as we attempt to spill the reg and assign it again indefinitely. - getting the loop to recognize the entire contiguous sequences, which is what we should probably do, enables progress, but then, we issue more reloads, and because of the extended live ranges, we also fail to assign them, and so on, until we hit the lra max iteration count. Restoring the clobber renders these changes unnecessary, and I guess that's what we should do. 
It will however bring back the obscure reloading problem we had on risc-v, that likely affects v850 as well, in which a shared register assignment crossing such a clobber could end up killing the source assigned to the same hardware register before copying it to the reload destination. That is far less common, but far more painful when it silently hits.
[Bug c++/96445] extern template results in missing constructor symbol
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96445 --- Comment #2 from tyu at eridex dot org --- The extern template and constant are what would appear in the header file for class C. The explicit instantiation would appear in the source file: // -- C.h template class C { private: constexpr C(T){}; public: constexpr static C make(T t) { return C(t); } }; extern template class C; inline constexpr C constant = C::make(0); // -- C.cpp -- template class C; // -- main.cpp --- int main() { return 0; } // ---
[Bug tree-optimization/104600] New: VCE(vector){} should be converted (or expanded) into BIT_INSERT_EXPR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104600 Bug ID: 104600 Summary: VCE(vector){} should be converted (or expanded) into BIT_INSERT_EXPR Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: pinskia at gcc dot gnu.org Target Milestone: --- When I looked at PR 104582, I noticed that we had: _14 = {_1, _5}; _8 = VIEW_CONVERT_EXPR<__int128>(_14); Which can be converted into (with the ordering corrected for endianness): t1 = (__int128)_1 _8 = BIT_INSERT_EXPR(t1, 64, _5); You can see this by taking the following testcases: #define vector __attribute__((vector_size(16))) __int128 f(long a, long b) { vector long t = {a, b}; return (__int128)t; } void f1(__int128 *t1, long a, long b) { vector long t = {a, b}; *t1 = (__int128)t; } void f2(__int128 *t1, long a, long b) { vector long t = {a, b}; *t1 = ((__int128)t) + 1; } f2 is really bad for x86_64 as GCC does a store to the stack and then loads back. Note if you use | instead of +, GCC even does the right thing.
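The equivalence behind the proposed transformation can be checked in user code with a sketch like the following. The helper names are hypothetical, `unsigned __int128` is a GCC/Clang extension on 64-bit targets, and the reinterpretation is done with memcpy rather than a vector cast so that the example does not depend on vector extensions. This only models the value computed; it says nothing about how GCC would implement BIT_INSERT_EXPR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using u128 = unsigned __int128; // GCC/Clang extension, 64-bit targets

// Model of VIEW_CONVERT_EXPR<__int128>({a, b}): reinterpret the bytes of
// a two-element array as one 128-bit integer.
u128 via_view_convert(std::uint64_t a, std::uint64_t b)
{
  std::uint64_t elems[2] = {a, b};
  u128 result;
  static_assert(sizeof(result) == sizeof(elems), "sizes must match");
  std::memcpy(&result, elems, sizeof(result));
  return result;
}

// Model of the widen-then-insert form: widen one element and place the
// other at bit 64, with the order swapped on big-endian targets.
u128 via_bit_insert(std::uint64_t a, std::uint64_t b)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  std::uint64_t lo = b, hi = a; // element 0 holds the high half
#else
  std::uint64_t lo = a, hi = b; // element 0 holds the low half
#endif
  u128 t1 = lo;
  return t1 | (static_cast<u128>(hi) << 64);
}

// Convenience check that both forms agree for a given pair.
void check_equivalent(std::uint64_t a, std::uint64_t b)
{
  assert(via_view_convert(a, b) == via_bit_insert(a, b));
}
```

The endianness switch corresponds to the remark in the report that the ordering has to be corrected for endianness.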
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 --- Comment #18 from Andrew Pinski --- (In reply to Andrew Pinski from comment #6) > Hmm: > _14 = {_1, _5}; > _8 = VIEW_CONVERT_EXPR<__int128>(_14); > > Wouldn't it better to convert that to just (hopefully I got the order > correct): > t1 = (__128)_1 > _8 = BIT_INSERT_EXPR(t1, 64, _5); > > ? I filed that as PR 104600 since it might be useful in the general case too.
[Bug c/104506] [12 Regression] ICE: tree check: expected class 'type', have 'exceptional' (error_mark) in useless_type_conversion_p, at gimple-expr.cc:87 on invalid symbol redeclaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104506 Andrew Pinski changed: What|Removed |Added URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2022-Februar ||y/590595.html Keywords||patch --- Comment #5 from Andrew Pinski --- Patch submitted: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590595.html
[Bug rtl-optimization/104596] Means to add a comment in the assembly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104596 --- Comment #1 from Andrew Pinski --- I am trying to understand what you are trying to do. You want to mark an insn with a comment, emitted during prologue generation, as being generated because of a specific option? And you don't want to add some extra patterns to do the marking? Is there a reason why you want to annotate the instruction in the assembly, besides making it easier to see whether it was emitted because of that option, or is there some assembler reason? If it is just for debugging, why not, while emitting the prologue, print out the instruction # that was added (if details dump is enabled) and then use -dP to see the instruction in the assembler output? The other thing you could do is have an INSN_NOTE which takes a string which you then output during the final scan. This requires adding some extra stuff to the rest of the compiler but it should work.
[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623 --- Comment #29 from Segher Boessenkool --- (In reply to Peter Bergner from comment #28) > (In reply to Segher Boessenkool from comment #27) > > OTOH, it makes no sense to test if we have hard float. The pack and unpack > > builtins should work (and work the same) whenever long double is > > double-double. > > Agreed. For soft-float, the value would be a GPR pair versus a FPR pair > (for -m64). It's a little trickier for -m32 -msoft-float compiles, since a > 128-bit long double would live in 4 32-bit GPRs, so more regs than it takes > to hold them in FPRs. Not much of a complication, but just needs to be > tested on 32-bit to ensure it works as expected. It can be in memory, even; it doesn't matter. But it is boring data movement, and in many cases it doesn't generate any code even :-)
[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623 --- Comment #30 from Segher Boessenkool --- Btw, does this issue exist for the corresponding __builtin_{un,}pack_ibm128 builtins as well?
[Bug gcov-profile/100289] [11/12 Regression] libgcc/libgcov.h: bootstrap failure due to missing #include
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100289 Joerg Wunsch changed: What|Removed |Added CC||j at uriah dot heep.sax.de --- Comment #16 from Joerg Wunsch --- Can confirm this bug when building an AVR cross-compiler (11.2) on FreeBSD. To get it working, I'm now patching it to #undef HAVE_SYS_MMAN_H in libgcov.h before starting.
[Bug c++/104601] New: [11 Regression] Invalid branch elimination at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601 Bug ID: 104601 Summary: [11 Regression] Invalid branch elimination at -O2 Product: gcc Version: 11.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: markus.boeck02 at gmail dot com Target Milestone: --- The following code was produced via reduction with `creduce`. When compiled with `-O2`, GCC 11 and later versions incorrectly print `f`, while with `-O1` or lower, or with an older version of GCC, it correctly prints `t`. #include #include #include inline std::optional a(std::vector::iterator b, std::vector::iterator c, std::optional h(int)) { std::optional d; find_if(b, c, [&](auto e) { d = h(e); return d; }); return d; } std::optional f(int) { return 1; } main() { std::vector g(100); auto e = a(g.begin(), g.end(), f); printf("%c", e ? 't' : 'f'); } For the sake of completeness, this was the original code: https://godbolt.org/z/enx19v7E5
[Bug go/104290] [12 Regression] trunk 20220126 fails to build libgo on i686-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104290 --- Comment #22 from Ian Lance Taylor --- Thanks. I'll commit your patches #1 through #8. Your patch #9 is to a generated file. The fix there can't be to patch just the top-level Makefile.in. It has to be to patch whatever is causing Makefile.in to be generated the way that it is. I don't myself know what is going wrong there.
[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601 Andrew Pinski changed: What|Removed |Added Summary|[11 Regression] Invalid |[11/12 Regression] Invalid |branch elimination at -O2 |branch elimination at -O2 Target Milestone|--- |11.3 --- Comment #1 from Andrew Pinski --- From fre3 (with details): Value numbering stmt = *__pred$__d_53 = _58 (); Setting value number of .MEM_143 to .MEM_135 (changed) Value numbering stmt = SR.60_59 = MEM [(const struct optional &)__pred$__d_53 + 4]; Setting value number of SR.60_59 to 0 (changed) Value numbering stmt = _60 = VIEW_CONVERT_EXPR(SR.60_59); Match-and-simplified VIEW_CONVERT_EXPR(SR.60_59) to 0 RHS VIEW_CONVERT_EXPR(SR.60_59) simplified to 0 Hmm, somehow *__pred$__d_53 is missed.
[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601 Jakub Jelinek changed: What|Removed |Added Last reconfirmed||2022-02-18 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 CC||hubicka at gcc dot gnu.org, ||jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- This changed with r11-3408-ge977dd5edbcc3a3b88c3bd7efa1026c845af7487
[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601 --- Comment #3 from Jakub Jelinek --- Testcase without the unneeded which aborts if miscompiled. #include #include inline std::optional a(std::vector::iterator b, std::vector::iterator c, std::optional h(int)) { std::optional d; find_if(b, c, [&](auto e) { d = h(e); return d; }); return d; } std::optional f(int) { return 1; } int main() { std::vector g(100); auto b = g.begin(); auto c = g.end(); auto e = a(b, c, f); if (!e) __builtin_abort(); }
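The archive has stripped everything in angle brackets from the testcase above. A compilable reconstruction might look like the sketch below. Note the assumptions: the header names and every `<int>` template argument are guesses, and the `main` has been turned into a hypothetical helper, `testcase_passes`, so that the check can be called from elsewhere.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Assumption: every stripped template argument was <int>.
inline std::optional<int> a(std::vector<int>::iterator b,
                            std::vector<int>::iterator c,
                            std::optional<int> h(int)) {
    std::optional<int> d;
    // The predicate stores h(e) in d and returns it; an engaged optional
    // converts to true, so the search stops at the first element.
    std::find_if(b, c, [&](auto e) {
        d = h(e);
        return d;
    });
    return d;
}

std::optional<int> f(int) { return 1; }

// Hypothetical wrapper replacing main(): true means the result is engaged,
// which is the behaviour a correct compiler should produce.
bool testcase_passes() {
    std::vector<int> g(100);
    auto e = a(g.begin(), g.end(), f);
    return e.has_value();
}
```

With a compiler affected by this PR at -O2, `testcase_passes()` would be expected to return false, matching the abort in the testcase above.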
[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601 --- Comment #4 from Andrew Pinski --- (In reply to Jakub Jelinek from comment #2) > This changed with r11-3408-ge977dd5edbcc3a3b88c3bd7efa1026c845af7487 Hmm, even -fno-ipa-modref does not prevent the wrong code from showing up.
[Bug tree-optimization/104601] [11/12 Regression] Invalid branch elimination at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104601 --- Comment #5 from Andrew Pinski --- But adding noipa to f does though: [[gnu::noipa]] std::optional f() { return 1; }
[Bug go/104290] [12 Regression] trunk 20220214 fails to build libgo on i686-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104290 --- Comment #23 from CVS Commits --- The master branch has been updated by Ian Lance Taylor : https://gcc.gnu.org/g:3343e7e2c4cd2cd111cda86737f539cc6eda49ff commit r12-7298-g3343e7e2c4cd2cd111cda86737f539cc6eda49ff Author: Ian Lance Taylor Date: Fri Feb 18 15:04:00 2022 -0800 libgo: update Hurd support Patches from Svante Signell for PR go/104290. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/386797
[Bug ipa/104597] LTO does not inline indirect call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104597 Andrew Pinski changed: What|Removed |Added Keywords||lto, missed-optimization Severity|normal |enhancement CC||marxin at gcc dot gnu.org Component|c++ |ipa --- Comment #3 from Andrew Pinski --- I suspect this is just the standard issue where we don't inline again after some optimizations. There was another bug like that before. clang does inline it, though.
[Bug libstdc++/104602] New: std::source_location::current uses cast from void*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104602 Bug ID: 104602 Summary: std::source_location::current uses cast from void* Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: foom at fuhm dot net Target Milestone: --- I'm working on implementing __builtin_source_location() in Clang (https://reviews.llvm.org/D120159). In testing it against the libstdc++ header, I ran into a minor issue. "current()" in GNU libstdc++ is defined as so: static consteval source_location current(const void* __p = __builtin_source_location()) noexcept { source_location __ret; __ret._M_impl = static_cast <const __impl*>(__p); return __ret; } But! A static_cast from a `const void*` parameter to `const __impl*` is not permitted in constexpr evaluation: """ 5. An expression E is a core constant expression unless the evaluation of E, [...] would evaluate one of the following: [...] 5.15. a conversion from type cv void* to a pointer-to-object type; """ http://eel.is/c++draft/expr.const#5.15 Clang diagnoses violations of this rule, but GCC apparently does not. (it's not really clear to me why this rule really needs to exist in the standard -- why bother to police which kinds of pointer casts you're allowed to do, instead of just raising an error upon _access_ through the wrong type?) Anyhow, to work around this issue, I plan to simply hardcode an exception to the check in Clang for casts which occur in a "std::source_location::current" method. Yet, although it's perhaps too late to avoid this workaround, it'd be nice if libstdc++ didn't require the use of an invalid cast. In clang (in my proposed change), __builtin_source_location already returns the expected `const __impl*` type, rather than `const void*` as it does in GCC. So, the issue is only the cast TO `void*` and back again in libstdc++. ISTM this would be fixed by moving the `static_cast <const __impl*>` into the default parameter expression.
That would then be a no-op cast on clang, and an (invalid but undiagnosed) cast from void in GCC.
[Bug libstdc++/104602] std::source_location::current uses cast from void*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104602 --- Comment #1 from Andrew Pinski --- https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534374.html Explains why it is currently this way.
[Bug c++/104603] New: wrong detection of g++ -Warray-bounds about downcast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603 Bug ID: 104603 Summary: wrong detection of g++ -Warray-bounds about downcast Product: gcc Version: 10.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: herumi at nifty dot com Target Milestone: --- Created attachment 52477 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52477&action=edit a minimal sample of the bug g++-10.3.0 and g++-11.2 -O2 -Warray-bounds on Ubuntu 20.04.3 LTS show wrong warnings to the attachment code. g++ -O2 -Warray-bounds -DA t.cpp does not show the warnings. g++-9 -O2 -Warray-bounds does not, too. --- >cat t.cpp struct Base { bool isX_; Base(bool isX = false) : isX_(isX) { } bool isX() const { return isX_; } bool operator==(const Base& rhs) const; }; struct X : public Base { X(const Base& b) : Base(true), b_(b) { } bool operator==(const X& rhs) const { return b_ == rhs.b_; } Base b_; }; inline bool Base::operator==(const Base& rhs) const { return isX() && rhs.isX() && static_cast(*this) == static_cast(rhs); } Base base; #ifndef A void f() { X(base) == X(base); } #endif int main() { #ifdef A X(base) == X(base); #endif } --- --- % g++-10 --version g++-10 (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 % g++-10 -O2 -Warray-bounds array-bounds-bug.cpp array-bounds-bug.cpp: In function 'void f()': array-bounds-bug.cpp:18:29: warning: array subscript 2 is outside array bounds of 'X [1]' [-Warray-bounds] 18 | bool isX() const { return isX_; } | ^~~~ array-bounds-bug.cpp:38:9: note: while referencing '' 38 | X(base) == X(base); | ^ array-bounds-bug.cpp:18:29: warning: array subscript 2 is outside array bounds of 'X [1]' [-Warray-bounds] 18 | bool isX() const { return isX_; } | ^~~~ array-bounds-bug.cpp:38:20: note: while referencing '' 38 | X(base) == X(base); |^ array-bounds-bug.cpp:18:29: warning: array subscript 3 is outside array bounds of 'X [1]' [-Warray-bounds] 18 | bool isX() const { return isX_; } | ^~~~ 
array-bounds-bug.cpp:38:9: note: while referencing '' 38 | X(base) == X(base); | ^ array-bounds-bug.cpp:18:29: warning: array subscript 3 is outside array bounds of 'X [1]' [-Warray-bounds] 18 | bool isX() const { return isX_; } | ^~~~ array-bounds-bug.cpp:38:20: note: while referencing '' 38 | X(base) == X(base); |^ array-bounds-bug.cpp:18:29: warning: array subscript 4 is outside array bounds of 'X [1]' [-Warray-bounds] 18 | bool isX() const { return isX_; } | ^~~~ array-bounds-bug.cpp:38:9: note: while referencing '' 38 | X(base) == X(base); | ^ array-bounds-bug.cpp:18:29: warning: array subscript 4 is outside array bounds of 'X [1]' [-Warray-bounds] 18 | bool isX() const { return isX_; } | ^~~~ array-bounds-bug.cpp:38:20: note: while referencing '' 38 | X(base) == X(base); |^ array-bounds-bug.cpp:18:29: warning: array subscript 5 is outside array bounds of 'X [1]' [-Warray-bounds] 18 | bool isX() const { return isX_; } | ^~~~ array-bounds-bug.cpp:38:9: note: while referencing '' 38 | X(base) == X(base); | ^ array-bounds-bug.cpp:18:29: warning: array subscript 5 is outside array bounds of 'X [1]' [-Warray-bounds] 18 | bool isX() const { return isX_; } | ^~~~ array-bounds-bug.cpp:38:20: note: while referencing '' 38 | X(base) == X(base); |^ array-bounds-bug.cpp:24:51: warning: array subscript 2 is outside array bounds of 'X [1]' [-Warray-bounds] 24 | bool operator==(const X& rhs) const { return b_ == rhs.b_; } |~~~^ array-bounds-bug.cpp:38:9: note: while referencing '' 38 | X(base) == X(base); | ^ array-bounds-bug.cpp:24:58: warning: array subscript 2 is outside array bounds of 'X [1]' [-Warray-bounds] 24 | bool operator==(const X& rhs) const { return b_ == rhs.b_; } | ^~ array-bounds-bug.cpp:38:20: note: while referencing '' 38 | X(base) == X(base); ---
[Bug c++/104603] wrong detection of g++ -Warray-bounds about downcast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603 --- Comment #1 from Andrew Pinski --- -DA just changes inlining. This is just an inlining mess which you can see from the diagnostic on the trunk: In member function 'bool Base::isX() const', inlined from 'bool Base::operator==(const Base&) const' at :16:15, inlined from 'bool X::operator==(const X&) const' at :10:51, inlined from 'bool Base::operator==(const Base&) const' at :16:63, inlined from 'bool X::operator==(const X&) const' at :10:51, inlined from 'void f()' at :24:11: :4:29: warning: array subscript 2 is outside array bounds of 'X [1]' [-Warray-bounds] 4 | bool isX() const { return isX_; } | ^~~~ The warning happens before some other optimizations happen which allows GCC to prove the function will just always return false ...
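To make the "will just always return false" remark concrete, here is a compilable reconstruction of the attached sample. It is a sketch under one assumption: the cast target that the archive stripped is taken to be `<const X&>`. The helper `compare_two_wrappers` is a made-up name standing in for the body of `f()`.

```cpp
#include <cassert>

struct Base {
    bool isX_;
    Base(bool isX = false) : isX_(isX) {}
    bool isX() const { return isX_; }
    bool operator==(const Base& rhs) const;
};

struct X : public Base {
    X(const Base& b) : Base(true), b_(b) {}
    // Compares the wrapped Base members, which re-enters Base::operator==.
    bool operator==(const X& rhs) const { return b_ == rhs.b_; }
    Base b_;
};

// Assumption: the stripped cast target was <const X&>.
inline bool Base::operator==(const Base& rhs) const {
    return isX() && rhs.isX()
        && static_cast<const X&>(*this) == static_cast<const X&>(rhs);
}

Base base;

// Hypothetical helper: X::operator== compares the b_ members, whose isX_
// is false, so Base::operator== short-circuits before any downcast happens.
bool compare_two_wrappers() { return X(base) == X(base); }
```

The downcast is therefore never executed at run time in this reduced case; the warning comes from inlined paths that the optimizers only later prove dead.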
[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603 Andrew Pinski changed: What|Removed |Added Summary|wrong detection of g++ |[10/11/12 Regression] wrong |-Warray-bounds about|detection of |downcast|-Warray-bounds for ||interesting tail resusive ||case Component|c++ |tree-optimization Target Milestone|--- |10.4 --- Comment #2 from Andrew Pinski --- Someone else needs to look into this further than I can, because the warning only happens because there are cases where the access can happen but the accesses are not really used. Also, if this is from some larger code, it might be useful to have the non-reduced testcase since the reduced testcase might be showing something different.
[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603 --- Comment #3 from herumi --- >Also if this is from some larger code, >it might be useful to have the non-reduced testcase >since the reduced testcase might be showing something different. I made this code because of the following issue: https://github.com/herumi/xbyak/issues/137
[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603 --- Comment #4 from Andrew Pinski --- (In reply to herumi from comment #3) > The reason why I made this code is from the issue: > https://github.com/herumi/xbyak/issues/137 Can you file a separate issue with the preprocessed source (-save-temps), since it really does look like a separate issue altogether? If it is not a separate issue in the end, at least we will have recorded the issue with the original source.
[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603 --- Comment #5 from herumi --- >Can you file a separate issue with the preprocessed source (-save-temps) since >it really does look like a separate issue altogether. May I attach a zipped a.ii which is generated by the following commands? The size of a.ii is over 1700KiB. --- >cat a.cpp #include using namespace Xbyak::util; void f() { ptr[eax] == ptr[eax]; } --- --- g++-11.2 -O2 -I ../ -Warray-bounds -c a.cpp -save-temps ---
[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603 --- Comment #6 from Andrew Pinski --- (In reply to herumi from comment #5) > >Can you file a separate issue with the preprocessed source (-save-temps) > >since it really does look like a separate issue altogether. > > May I attach a zipped a.ii which is generated by the following commands? Zipped is perfect.
[Bug rtl-optimization/104596] Means to add a comment in the assembly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104596 --- Comment #2 from Tom de Vries --- (In reply to Andrew Pinski from comment #1) > I am trying to understand what you are trying to do. > You want to mark an insn with a comment One or more insns, yes. > which is emitted during formation of > the prologue generation as being generated because of a specific option? Not necessarily in the prologue, it could be anywhere. > and you don't want to add some extra patterns to do the marking? > Yes. Instead, ideally I'd like gcc to provide a gen_comment (or some alternative, like the ability to tag insns themselves with a comment that is output before the insn). [ Both approaches raise questions in the context of optimizations. But I'm planning to use this late enough in the compiler not to have to bother with those questions. ] > Is there a reason why you want to annotate the instruction in the assembly > besides just easier to see if it was emitted because of that option or is > there some assembler reason? > No, it's just the former. > If it is just for debugging, why not while emitting the prologue, print out > the instruction # that was added (if details dump is enabled) and then use > -dP to see the instruction in the assembler output. > That's feasible if you're interested in, say, one insn (and it still requires you to go and reproduce the command line to generate the code, add the dump flags, find the relevant line in the dump file and then find the corresponding insn in the assembly. If the compiler already emits the comment, all you need to do is read the assembly). Another scenario is: I have assembly for an entire executable, including all libraries, and I want to easily be able to find all insns that were introduced because of a compiler pass, without having to diff against a version with that pass disabled. Note that for nvptx, a library or executable is assembly, so those comments can still be useful when 'disassembling' an executable. F.i.
there are various workarounds in the nvptx port that introduce insns, and an executable is hand-editable, so we can do this sort of tinkering: the executable fails, this code introduced by the workaround looks suspicious, let's disable it and see if it passes. > The other thing you could do is have an INSN_NOTE which takes a string which you > then output during the final scan. This requires adding some extra stuff to > the rest of the compiler but it should work. Sure, that would work as well. Though I think the concept just maps very well onto the user-level 'asm ("// comment")'.
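As a point of comparison, the user-level idiom mentioned at the end of this comment can be sketched as follows. This is only an illustration of the source-level analogue, not of the proposed gen_comment facility. It assumes GCC or Clang with a GNU-style assembler, and it uses a C-style /* ... */ comment because the line-comment character ("//", "#", ...) differs between targets. The function name `add_with_marker` is made up.

```cpp
#include <cassert>

// Assumption: GCC or Clang with a GNU-style assembler that accepts
// C-style /* ... */ comments in its input.
int add_with_marker(int a, int b) {
    int sum = a + b;
    // The asm statement contains only a comment, so it shows up in the
    // generated assembly as a marker without adding an instruction.
    asm volatile("/* marker: hypothetical workaround inserted here */");
    return sum;
}
```

Compiling with -S and searching the output for the marker text is then enough to find the spot, which is the workflow the comment argues for at the RTL level.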
[Bug tree-optimization/104603] [10/11/12 Regression] wrong detection of -Warray-bounds for interesting tail resusive case
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104603 --- Comment #7 from herumi --- Created attachment 52478 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52478&action=edit an original code The array-bounds.zip file is a slightly stripped-down version of the original issue. array-bounds% g++-11.2 -O2 -c a.cpp -Warray-bounds -save-temps
[Bug tree-optimization/104604] New: wrong code with -O2 -fconserve-stack --param=vrp1-mode=ranger
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104604 Bug ID: 104604 Summary: wrong code with -O2 -fconserve-stack --param=vrp1-mode=ranger Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 52479 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52479&action=edit reduced testcase Output: $ x86_64-pc-linux-gnu-gcc -O2 -fconserve-stack --param=vrp1-mode=ranger testcase.c -Wno-psabi $ ./a.out Aborted The value of x[] is {3, 3, 3, 3, ... } It seems "i /= c;" got optimised out. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r12-7293-20220218075854-gfe79d652c96-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/12.0.1/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r12-7293-20220218075854-gfe79d652c96-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 12.0.1 20220218 (experimental) (GCC)
[Bug tree-optimization/96522] [9 Regression] Incorrect with with -O -fno-tree-pta
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96522 --- Comment #10 from CVS Commits --- The releases/gcc-9 branch has been updated by Richard Biener : https://gcc.gnu.org/g:308de43fe48321588dca9830a617b92441779429 commit r9-9958-g308de43fe48321588dca9830a617b92441779429 Author: Richard Biener Date: Thu Aug 27 11:48:15 2020 +0200 tree-optimization/96522 - transfer of flow-sensitive info in copy_ref_info This removes the bogus tranfer of flow-sensitive info in copy_ref_info plus fixes one oversight in FRE when flow-sensitive non-NULLness was added to points-to info. 2020-08-27 Richard Biener PR tree-optimization/96522 * tree-ssa-address.c (copy_ref_info): Reset flow-sensitive info of the copied points-to. Transfer bigger alignment via the access type. * tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt): Reset all flow-sensitive info. * gcc.dg/torture/pr96522.c: New testcase. (cherry picked from commit eb68d9d828f94d28afa5900fbf3072bbcd64ba8a)
[Bug tree-optimization/97043] latent wrong-code with SLP vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043 --- Comment #4 from CVS Commits --- The releases/gcc-9 branch has been updated by Richard Biener : https://gcc.gnu.org/g:e75e5d2c41d294c4da4adfe610204ce5d97c3a4e commit r9-9959-ge75e5d2c41d294c4da4adfe610204ce5d97c3a4e Author: Richard Biener Date: Mon Sep 14 11:25:04 2020 +0200 tree-optimization/97043 - fix latent wrong-code with SLP vectorization When the unrolling decision comes late and would have prevented eliding a SLP load permutation we can end up generating aligned loads when the load is in fact unaligned. Most of the time alignment analysis figures out the load is in fact unaligned but that cannot be relied upon. The following removes the SLP load permutation eliding based on the still premature vectorization factor. 2020-09-14 Richard Biener PR tree-optimization/97043 * tree-vect-slp.c (vect_analyze_slp_instance): Do not elide a load permutation if the current vectorization factor is one. (cherry picked from commit e93428a8b056aed83a7678d4dc8272131ab671ba)
[Bug tree-optimization/96522] [9 Regression] Incorrect with with -O -fno-tree-pta
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96522 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED Known to fail||9.4.0 Known to work||9.4.1 --- Comment #11 from Richard Biener --- Fixed.
[Bug tree-optimization/102798] [9 Regression] wrong code with -O3 -fno-tree-pta by r9-2475
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102798 Bug 102798 depends on bug 96522, which changed state. Bug 96522 Summary: [9 Regression] Incorrect with with -O -fno-tree-pta https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96522 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/102798] [9 Regression] wrong code with -O3 -fno-tree-pta by r9-2475
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102798 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #20 from Richard Biener --- Now really fixed.
[Bug target/84201] 549.fotonik3d_r from SPEC2017 fails verification with recent Intel and AMD CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84201 --- Comment #15 from Richard Biener --- OK, let me try cooking sth up.
[Bug target/104590] New: ppc64: even/odd permutation for VSX 64-bit to 32-bit conversions is no longer necessary.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104590 Bug ID: 104590 Summary: ppc64: even/odd permutation for VSX 64-bit to 32-bit conversions is no longer necessary. Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: seiko at imavr dot com Target Milestone: --- Created attachment 52473 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52473&action=edit proposal patch remove the extra shuffles The following VSX intrinsics did not need to exist in the first place, and they must be replaced or renamed by convinced names suitable to the mapped instructions or at least modified by removing the unnecessary shuffles: - vector unsigned int vec_unsignede(vector double) -> xvcvdpuxws - vector unsigned int vec_unsignedo(vector double) -> xvcvdpuxws - vector float vec_floate(vector double) -> xvcvdpsp - vector float vec_floato(vector double) -> xvcvdpsp - vector float vec_floate(vector signed long long) -> xvcvsxdsp - vector float vec_floato(vector signed long long) -> xvcvsxdsp - vector float vec_floate(vector unsigned long long) -> xvcvuxdsp - vector float vec_floato(vector unsigned long long) -> xvcvuxdsp According to the latest update of ISA 3.1: Previous versions of the architecture allowed the contents of bits 32:63 of each doubleword in the result register to be undefined, however, all processors that support this instruction write the result into bits 32:63 of each doubleword in the result register as well as into bits 0:31, as is required by this version of the architecture.
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 --- Comment #8 from Jakub Jelinek --- Just trying a dumb microbenchmark: struct S { unsigned long a, b; } s; __attribute__((noipa)) void foo (unsigned long a, unsigned long b) { s.a = a; s.b = b; } int main () { int i; for (i = 0; i < 10; i++) foo (42, 43); return 0; } the GCC 11 vs. GCC 12 code: - movq%rdi, s(%rip) - movq%rsi, s+8(%rip) + movq%rdi, %xmm0 + movq%rsi, %xmm1 + punpcklqdq %xmm1, %xmm0 + movaps %xmm0, s(%rip) seems to be exactly the same speed (on i9-7960X) and the GCC 11 code is 7 bytes smaller.
[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623 --- Comment #25 from Kewen Lin --- The key difference from the previous bif support is that: previously we checked TARGET_HARD_FLOAT but now we didn't. I think we still need to check it, as the document here https://gcc.gnu.org/onlinedocs/gcc/Basic-PowerPC-Built-in-Functions-Available-on-ISA-2_002e05.html, these bifs requires "-mhard-float" option. And all the alternatives of unpack_nodm and pack with mode iterator FMOVE128 will use constraint d which only takes effect with -mhard-float. Just a record for the guards in the previous support: /* 128-bit long double floating point builtins. */ #define BU_LDBL128_2(ENUM, NAME, ATTR, ICODE) \ RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM, /* ENUM */ \ "__builtin_" NAME, /* NAME */ \ (RS6000_BTM_HARD_FLOAT /* MASK */ \ | RS6000_BTM_LDBL128), \ (RS6000_BTC_ ## ATTR/* ATTR */ \ | RS6000_BTC_BINARY), \ CODE_FOR_ ## ICODE) /* ICODE */ /* 128-bit __ibm128 floating point builtins (use -mfloat128 to indicate that __ibm128 is available). */ #define BU_IBM128_2(ENUM, NAME, ATTR, ICODE)\ RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM, /* ENUM */ \ "__builtin_" NAME, /* NAME */ \ (RS6000_BTM_HARD_FLOAT /* MASK */ \ | RS6000_BTM_FLOAT128),\ (RS6000_BTC_ ## ATTR/* ATTR */ \ | RS6000_BTC_BINARY), \ CODE_FOR_ ## ICODE) /* ICODE */
[Bug c++/104591] New: Problem with unary_function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104591 Bug ID: 104591 Summary: Problem with unary_function Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: lukaszcz18 at wp dot pl Target Milestone: --- I use library vvenc c++14 https://github.com/fraunhoferhhi/vvenc/commit/69469d7ac5de882d9f5e12b24ee87f376df20262 or jvetvvc c++11 https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/commit/ab0bea02235bb876c9d3bd8a9d3b2fca7ad1b8eb and https://github.com/Jamaika1/mingw_std_threads and http://msystem.waw.pl/x265/mingw-gcc1201-20220206.7z In file included from Unit.h:53, from AdaptiveLoopFilter.h:54: Common.h:184:41: warning: 'template struct std::unary_function' is deprecated [-Wdeprecated-declarations] 184 | struct hash : public unary_function | ^~ In file included from c:\msys1200\include\c++\12.0.1\string:48, from c:\msys1200\include\c++\12.0.1\bits\locale_classes.h:40, from c:\msys1200\include\c++\12.0.1\bits\ios_base.h:41, from c:\msys1200\include\c++\12.0.1\ios:42, from c:\msys1200\include\c++\12.0.1\ostream:38, from c:\msys1200\include\c++\12.0.1\iostream:39, from CommonDef.h:53: c:\msys1200\include\c++\12.0.1\bits\stl_function.h:117:12: note: declared here 117 | struct unary_function |^~
[Bug target/104590] ppc64: even/odd permutation for VSX 64-bit to 32-bit conversions is no longer necessary.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104590 --- Comment #1 from Sayed Adel --- I forgot to mention: - vector signed int vec_signede(vector double) -> xvcvdpsxws - vector signed int vec_signedo(vector double) -> xvcvdpsxws
[Bug libstdc++/104591] Problem with unary_function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104591 Andrew Pinski changed: What|Removed |Added Resolution|--- |INVALID Component|c++ |libstdc++ Status|UNCONFIRMED |RESOLVED --- Comment #1 from Andrew Pinski --- std::unary_function has been deprecated since C++11 and was removed in C++17. GCC before 12 just did not warn about the deprecation.
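For code that hits this warning, the usual remedy is to drop the std::unary_function base and, if anything still relies on them, declare the member typedefs directly. A minimal sketch follows. `Position` and `PositionHash` are hypothetical names, since the archive stripped the template arguments of the `hash` struct in Common.h, and the mixing constant is illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical key type standing in for whatever the reported code hashes.
struct Position {
    int x;
    int y;
};

// Previously this might have derived from
// std::unary_function<Position, std::size_t>; the typedefs that base
// class provided are now spelled out by hand.
struct PositionHash {
    using argument_type = Position;
    using result_type = std::size_t;

    result_type operator()(const argument_type& p) const {
        // Simple hash combine of the two coordinates.
        std::size_t h = std::hash<int>{}(p.x);
        return h ^ (std::hash<int>{}(p.y) + 0x9e3779b9u + (h << 6) + (h >> 2));
    }
};
```

If nothing reads `argument_type` or `result_type`, the two aliases can simply be removed as well.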
[Bug target/104353] ppc64le: Apparent reliance on undefined behavior of xvcvdpsxws
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104353 Sayed Adel changed: What|Removed |Added CC||seiko at imavr dot com --- Comment #3 from Sayed Adel --- Closing this leads to another bug, since GCC follows the previous versions of the ISA in several places, for example the VSX intrinsics vec_floate, vec_floato, etc. I filed a bug for it, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104590
[Bug target/103623] [12 Regression] error: unable to generate reloads (ICE in curr_insn_transform, at lra-constraints.c:4132), or error: insn does not satisfy its constraints (ICE in extract_constrain
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103623 --- Comment #26 from Kewen Lin --- Created attachment 52474 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52474&action=edit Untested patch
[Bug libstdc++/104591] Problem with unary_function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104591 Andrew Pinski changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91260 --- Comment #2 from Andrew Pinski --- See PR 91260.
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #9 from Richard Biener ---
(In reply to Jakub Jelinek from comment #8)
> Just trying a dumb microbenchmark:
> struct S { unsigned long a, b; } s;
>
> __attribute__((noipa)) void
> foo (unsigned long a, unsigned long b)
> {
>   s.a = a;
>   s.b = b;
> }
>
> int
> main ()
> {
>   int i;
>   for (i = 0; i < 10; i++)
>     foo (42, 43);
>   return 0;
> }
> the GCC 11 vs. GCC 12 code:
> -       movq    %rdi, s(%rip)
> -       movq    %rsi, s+8(%rip)
> +       movq    %rdi, %xmm0
> +       movq    %rsi, %xmm1
> +       punpcklqdq      %xmm1, %xmm0
> +       movaps  %xmm0, s(%rip)
> seems to be exactly the same speed (on i9-7960X) and the GCC 11 code is 7
> bytes smaller.

The GCC 12 code is 30% slower on Zen 2 (the gpr -> xmm move is comparatively more costly there). As said we fail to account for that. But as I said the cost is not there if it's

struct S { unsigned long a, b; } s;

__attribute__((noipa)) void
foo (unsigned long *a, unsigned long *b)
{
  unsigned long a_ = *a;
  unsigned long b_ = *b;
  s.a = a_;
  s.b = b_;
}

which vectorizes to

        movq    (%rdi), %xmm0
        movhps  (%rsi), %xmm0
        movaps  %xmm0, s(%rip)
        ret

which is _smaller_ than the scalar code. So it's important to be able to distinguish those cases. The above is also

a__3 1 times scalar_store costs 12 in body
b__5 1 times scalar_store costs 12 in body
a__3 1 times vector_store costs 12 in body
 1 times vec_construct costs 8 in prologue
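To compare the two cases from this comment side by side, both variants can be put in one translation unit and compiled with `-O2 -S`. This is a sketch: the function and variable names are made up for illustration, and `noipa` is a GCC-specific attribute that keeps the calls from being inlined or otherwise optimized across.

```cpp
struct S { unsigned long a, b; };

// Separate destinations so the two variants do not interfere.
S s_from_regs, s_from_mem;

// Case 1: the values arrive in general-purpose registers, so building a
// vector requires gpr -> xmm moves before the single vector store.
__attribute__((noipa)) void store_from_regs(unsigned long a, unsigned long b) {
    s_from_regs.a = a;
    s_from_regs.b = b;
}

// Case 2: the values are loaded from memory, so they can be loaded
// straight into a vector register and stored with one instruction.
__attribute__((noipa)) void store_from_mem(const unsigned long* a,
                                           const unsigned long* b) {
    unsigned long a_ = *a;
    unsigned long b_ = *b;
    s_from_mem.a = a_;
    s_from_mem.b = b_;
}
```

Both functions store the same two values; only where the operands come from differs, which is the distinction the cost model currently cannot see.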
[Bug target/104024] ICE in curr_insn_transform with -O1 -mpower10-fusion -mpower10-fusion-2logical with __int128_t and __builtin_add_overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104024 --- Comment #2 from Kewen Lin --- Created attachment 52475 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52475&action=edit Tested patch
[Bug c++/104592] New: Problem with std::basic_ostream
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592 Bug ID: 104592 Summary: Problem with std::basic_ostream Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: lukaszcz18 at wp dot pl Target Milestone: --- I use boost 1.78 c++11 https://boostorg.jfrog.io/artifactory/main/release/1.78.0/source/boost_1_78_0.zip and charls-tools https://github.com/malaterre/charls-tools/commit/b27c6071a42996b26ca91fbb015d4b42238d13cf and https://github.com/team-charls/charls/commit/662d4f2a0238357ccc4d89cd14b1fa67d2597ff1 jplsinfo.cpp:26:12: error: no match for 'operator<<' (operand types are 'std::stringstream' {aka 'std::__cxx11::basic_stringstream'} and 'const charls::spiff_profile_id') 26 | ss << val; | ~~~^~ In file included from c:\msys1200\include\c++\12.0.1\istream:39, from c:\msys1200\include\c++\12.0.1\fstream:38, from jplsinfo.cpp:6: c:\msys1200\include\c++\12.0.1\ostream:108:7: note: candidate: 'std::basic_ostream<_CharT, _Traits>::__ostream_type& std::basic_ostream<_CharT, _Traits>::operator<<(__ostream_type& (*)(__ostream_type&)) [with _CharT = char; _Traits = std::char_traits; __ostream_type = std::basic_ostream]' 108 | operator<<(__ostream_type& (*__pf)(__ostream_type&)) | ^~~~ c:\msys1200\include\c++\12.0.1\ostream:108:36: note: no known conversion for argument 1 from 'const charls::spiff_profile_id' to 'std::basic_ostream::__ostream_type& (*)(std::basic_ostream::__ostream_type&)' {aka 'std::basic_ostream& (*)(std::basic_ostream&)'} 108 | operator<<(__ostream_type& (*__pf)(__ostream_type&)) | ~~^~
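The report does not show how `charls::spiff_profile_id` is declared. Assuming it is a scoped enumeration (an assumption, not confirmed in this thread), the diagnostic is expected: scoped enums do not convert implicitly to an integer, so no standard `operator<<` overload matches. A minimal sketch of the two usual remedies, using a hypothetical `demo::profile_id` type:

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

namespace demo {  // hypothetical namespace and enumerators, for illustration only

enum class profile_id : std::int32_t { none = 0, other = 1 };

// Remedy 1: provide an overload in the enum's namespace so ADL finds it.
std::ostream& operator<<(std::ostream& os, profile_id id) {
    return os << static_cast<std::int32_t>(id);
}

}  // namespace demo

// Uses the overload above.
std::string to_text(demo::profile_id id) {
    std::stringstream ss;
    ss << id;
    return ss.str();
}

// Remedy 2: cast explicitly at the call site; no overload is needed.
std::string to_text_with_cast(demo::profile_id id) {
    std::stringstream ss;
    ss << static_cast<std::int32_t>(id);
    return ss.str();
}
```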
[Bug target/104024] ICE in curr_insn_transform with -O1 -mpower10-fusion -mpower10-fusion-2logical with __int128_t and __builtin_add_overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104024 Kewen Lin changed: What|Removed |Added Status|NEW |ASSIGNED
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 --- Comment #10 from Richard Biener --- Btw, I think it makes sense to build libgcc with -mno-sse, maybe even -mgeneral-regs-only. Or globally with -fno-tree-vectorize (but we likely do not want %xmm uses for parameter setup either with the move-by-pieces changes - IIRC I've seen uses in the unwinder code trapping because of a misaligned stack in an executable).
[Bug libstdc++/104592] Problem with std::basic_ostream
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2022-02-18 Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Can you provide the preprocessed source?
[Bug c++/104593] New: Problem with va_list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593 Bug ID: 104593 Summary: Problem with va_list Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: lukaszcz18 at wp dot pl Target Milestone: --- I use vvenc/vvdec c++14 https://github.com/fraunhoferhhi/vvenc/commit/69469d7ac5de882d9f5e12b24ee87f376df20262 In file included from AffineGradientSearch.h:53, from AffineGradientSearch.cpp:57: CommonDef.h:595:62: warning: ignoring attributes on template argument 'void(void*, int, const char*, va_list)' {aka 'void(void*, int, const char*, char*)'} [-Wignored-attributes] 595 | extern std::function g_msgFnc;
[Bug c++/104593] Problem with va_list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2022-02-18 Status|UNCONFIRMED |WAITING Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Can you provide the preprocessed source?
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582 --- Comment #11 from Jakub Jelinek --- True. So another option is to try to undo some of those short vectorization cases during isel, expansion or later, though e.g. for the negdi2 case it will go already during expansion into memory.
[Bug target/104593] Problem with va_list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593 Andrew Pinski changed: What|Removed |Added Keywords||diagnostic Component|c++ |target --- Comment #2 from Andrew Pinski --- Also what exact target is this on?
[Bug target/104593] Problem with va_list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593 --- Comment #3 from Andrew Pinski --- #include #include extern std::function g_msgFnc; Does not warn for me on x86_64-linux-gnu.
[Bug libstdc++/104592] Problem with std::basic_ostream
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592 --- Comment #2 from Jamaika --- http://msystem.waw.pl/x265/mingw-gcc1201-20220206.7z
[Bug target/104593] Problem with va_list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104593 --- Comment #4 from Jamaika --- (In reply to Andrew Pinski from comment #1) > Can you provide the preprocessed source? http://msystem.waw.pl/x265/mingw-gcc1201-20220206.7z
[Bug libstdc++/104592] Problem with std::basic_ostream
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104592 --- Comment #3 from Andrew Pinski --- (In reply to Jamaika from comment #2) > http://msystem.waw.pl/x265/mingw-gcc1201-20220206.7z That is the GCC binary. Please read https://gcc.gnu.org/bugs/ and provide the preprocessed source for what you are compiling. And also all of the options.
[Bug c++/104594] New: narrowing conversion of -1 to unsigned char at compile time not detected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104594 Bug ID: 104594 Summary: narrowing conversion of -1 to unsigned char at compile time not detected Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: raffael at casagrande dot ch Target Milestone: ---

The current gcc trunk compiles the following piece of code:

template <unsigned char DIM_FROM>
concept Geometry = (DIM_FROM == -1);

template <class INIT>
requires Geometry<INIT::n>
auto GaussNewton(const INIT& init) -> void {}

template <int N>
struct X {
  static constexpr int n = N;
};

int main() { GaussNewton(X<-1>{}); }

--
I think this should NOT compile since it entails a narrowing conversion of -1 to an unsigned char type at compile time. Clang as well as MSVC fail to compile the code. (In many other cases, gcc also fails to compile if such a narrowing conversion happens at compile time.)
[Bug tree-optimization/104582] [11/12 Regression] Unoptimal code for __negdi2 (and others) from libgcc2 due to unwanted vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582

--- Comment #12 from rguenther at suse dot de ---
On Fri, 18 Feb 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104582
>
> --- Comment #11 from Jakub Jelinek ---
> True.
> So another option is to try to undo some of those short vectorization cases
> during isel, expansion or later, though e.g. for the negdi2 case it will go
> already during expansion into memory.

Yes, there are duplicates about this issue and it's really hard to solve generally. There's the possibility to try improving on the costing side but currently the cost hooks just see

ix86_vector_costs::add_stmt_cost (this=0x41b88c0, count=1, kind=vec_construct, stmt_info=0x0, vectype=, misalign=0, where=vect_prologue)

so they have no idea about the feeding stmts. The cost entry is generated by vect_prologue_cost_for_slp which knows the scalar operands but we do not pass the SLP node down to the cost hooks (that's something on my list but my idea was to push it back when we only have SLP nodes and thus could go w/o the stmt_info then).

The other possibility is (for the original testcase) to anticipate that RTL expansion will expand 'w' to a TImode register and take that as a reason to pessimize vectorization (but we don't know how it's going to be used, so that's probably a flawed attempt).

The only short-term fixes are
 a) biasing the costing, regressing the from memory case,
 b) pass down the SLP node where available and look at the defs of the CTOR components, costing a gpr->xmm move where it can be anticipated.

b) is more future-proof, if we'd take that at this point I can see how intrusive it would be.
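The effect of option b) can be illustrated with a toy cost comparison. This is only a sketch: the types and function names are hypothetical and are not GCC's vectorizer API. The store and construct costs are the ones printed in the dump in comment #9; the per-element gpr->xmm cost of 6 is an assumed value chosen purely for illustration.

```cpp
// Toy model only: nothing here corresponds to real GCC interfaces.
enum class OperandSource { Memory, GeneralRegister };

struct StoreGroupCosts {
    int scalar_store = 12;   // per scalar store (from the dump in comment #9)
    int vector_store = 12;   // one vector store (same dump)
    int vec_construct = 8;   // building the vector (same dump)
    int gpr_to_xmm = 6;      // ASSUMED extra cost per element coming from a GPR
};

// Cost of storing every lane separately.
int scalar_cost(const StoreGroupCosts& c, int lanes) {
    return lanes * c.scalar_store;
}

// Cost of constructing a vector and storing it once, charging an extra
// move for each component whose definition lives in a general register.
int vector_cost(const StoreGroupCosts& c, const OperandSource* defs, int lanes) {
    int cost = c.vector_store + c.vec_construct;
    for (int i = 0; i < lanes; ++i) {
        if (defs[i] == OperandSource::GeneralRegister) cost += c.gpr_to_xmm;
    }
    return cost;
}

bool should_vectorize(const StoreGroupCosts& c, const OperandSource* defs, int lanes) {
    return vector_cost(c, defs, lanes) < scalar_cost(c, lanes);
}
```

With these numbers, two components loaded from memory cost 20 as a vector against 24 as scalars, so vectorization still wins; two components coming from general registers cost 32, so the scalar stores are kept. Without the per-operand information both cases look identical to the hook.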
[Bug c++/104594] narrowing of -1 to unsigned char not detected with requires concepts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104594 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2022-02-18 Summary|narrowing conversion of -1 to unsigned char at compile time not detected |narrowing of -1 to unsigned char not detected with requires concepts Ever confirmed|0 |1

--- Comment #1 from Andrew Pinski ---
Confirmed. GCC correctly detects:

template <unsigned char DIM_FROM>
constexpr bool Geometry = (DIM_FROM == -1);

template <class INIT>
constexpr bool tt1 = Geometry<INIT::n>;

template <int N>
struct X {
  static constexpr int n = N;
};

bool t = tt1<X<-1>>;
[Bug tree-optimization/100464] [11 Regression] emitted binary code changes when -g is enabled at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100464 --- Comment #16 from CVS Commits --- The releases/gcc-11 branch has been updated by Richard Biener : https://gcc.gnu.org/g:462900ba21f5fdf865c93f693083da3179dd3151 commit r11-9591-g462900ba21f5fdf865c93f693083da3179dd3151 Author: Richard Biener Date: Fri May 7 09:51:18 2021 +0200 middle-end/100464 - avoid spurious TREE_ADDRESSABLE in folding debug stmts canonicalize_constructor_val was setting TREE_ADDRESSABLE on bases of ADDR_EXPRs but that's futile when we're dealing with CTOR values in debug stmts. This rips out the code which was added for Java and should have been an assertion when we didn't have debug stmts. To not regress g++.dg/tree-ssa/array-temp1.C we have to adjust the testcase to not look for a no longer applied invalid optimization. 2021-05-10 Richard Biener PR middle-end/100464 PR c++/100468 gcc/ * gimple-fold.c (canonicalize_constructor_val): Do not set TREE_ADDRESSABLE. gcc/cp/ * call.c (set_up_extended_ref_temp): Mark the temporary addressable if the TARGET_EXPR was. gcc/testsuite/ * gcc.dg/pr100464.c: New testcase. * g++.dg/tree-ssa/array-temp1.C: Adjust. (cherry picked from commit a076632e274abe344ca7648b7c7f299273d4cbe0)
[Bug c++/100468] set_up_extended_ref_temp via extend_ref_init_temps_1 drops TREE_ADDRESSABLE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100468 --- Comment #7 from CVS Commits --- The releases/gcc-11 branch has been updated by Richard Biener : https://gcc.gnu.org/g:462900ba21f5fdf865c93f693083da3179dd3151 commit r11-9591-g462900ba21f5fdf865c93f693083da3179dd3151 Author: Richard Biener Date: Fri May 7 09:51:18 2021 +0200 middle-end/100464 - avoid spurious TREE_ADDRESSABLE in folding debug stmts canonicalize_constructor_val was setting TREE_ADDRESSABLE on bases of ADDR_EXPRs but that's futile when we're dealing with CTOR values in debug stmts. This rips out the code which was added for Java and should have been an assertion when we didn't have debug stmts. To not regress g++.dg/tree-ssa/array-temp1.C we have to adjust the testcase to not look for a no longer applied invalid optimization. 2021-05-10 Richard Biener PR middle-end/100464 PR c++/100468 gcc/ * gimple-fold.c (canonicalize_constructor_val): Do not set TREE_ADDRESSABLE. gcc/cp/ * call.c (set_up_extended_ref_temp): Mark the temporary addressable if the TARGET_EXPR was. gcc/testsuite/ * gcc.dg/pr100464.c: New testcase. * g++.dg/tree-ssa/array-temp1.C: Adjust. (cherry picked from commit a076632e274abe344ca7648b7c7f299273d4cbe0)
[Bug go/100537] [12 Regression] Bootstrap-O3 and bootstrap-debug fail on 32-bit ARM after gcc-12-657-ga076632e274a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100537 --- Comment #21 from CVS Commits --- The releases/gcc-11 branch has been updated by Richard Biener : https://gcc.gnu.org/g:8a1e92ff45e8e254fb557d20dcfa54a88d354329 commit r11-9592-g8a1e92ff45e8e254fb557d20dcfa54a88d354329 Author: Ian Lance Taylor Date: Sat May 22 19:19:13 2021 -0700 compiler: mark global variables whose address is taken To implement this, change the backend to use flag bits for variables. Fixes https://gcc.gnu.org/PR100537 PR go/100537 * go-gcc.cc (class Gcc_backend): Update methods that create variables to take a flags parameter. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/322129 (cherry picked from commit 358832c46a378e5a0b8a2fa3c2739125e3e680c7)
[Bug tree-optimization/100464] [11 Regression] emitted binary code changes when -g is enabled at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100464 Richard Biener changed: What|Removed |Added Priority|P3 |P2 Status|ASSIGNED|RESOLVED Known to fail||11.2.0 Resolution|--- |FIXED Known to work||11.2.1 --- Comment #17 from Richard Biener --- Fixed.
[Bug target/104121] [12 Regression] v850: Infinite loop in find_reload_regno_insns() since r12-5852-g50e8b0c9bca6cdc57804f860ec5311b641753fbb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104121 --- Comment #8 from Sebastian Huber --- I can't reproduce the issue with the reduced test case, however, compiling the preprocessed file still results in an infinite loop.