[Bug tree-optimization/105883] Memcmp folded only when size is a power of two
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105883

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-06-14
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Richard Biener ---
memcmp is folded when it can be turned into two loads and a comparison - that
doesn't work for non-power-of-two sizes. Fully constant folding the loads
isn't attempted - in theory value-numbering could use partial def tracking to
prune equal prefixes and fold on different known bytes but I think this is all
hardly worth the trouble? Who will write such code anyway.
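As a rough illustration of the "two loads and a comparison" shape described in the comment, here is a minimal C sketch. The function names are hypothetical and the hand-written form is only an assumption about what the folded code is equivalent to, not a dump of what GCC emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Equality test over 4 bytes: a power-of-two size, so each side fits
   in one 32-bit integer load. */
static int equal4_memcmp(const void *a, const void *b)
{
    return memcmp(a, b, 4) == 0;
}

/* Hand-written equivalent: two loads and one integer comparison.
   memcpy into a local is the portable way to spell an unaligned load. */
static int equal4_loads(const void *a, const void *b)
{
    uint32_t x, y;
    memcpy(&x, a, sizeof x);
    memcpy(&y, b, sizeof y);
    return x == y;
}

/* Equality test over 3 bytes: no integer type has this width, so there
   is no one-load-per-side equivalent. */
static int equal3_memcmp(const void *a, const void *b)
{
    return memcmp(a, b, 3) == 0;
}

/* Small usage check: the buffers differ only in the fourth byte. */
static void self_check(void)
{
    const unsigned char p[4] = { 1, 2, 3, 4 };
    const unsigned char q[4] = { 1, 2, 3, 5 };

    assert(equal4_loads(p, p) == equal4_memcmp(p, p));
    assert(equal4_loads(p, q) == equal4_memcmp(p, q));
    assert(equal3_memcmp(p, q));
}
```

Only the equality result is modelled here; an ordering result from memcmp would additionally depend on byte order.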
[Bug c++/105885] [12/13 Regression] the address of 'template argument' will never be NULL warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105885

--- Comment #2 from Richard Biener ---
We diagnose only after template substitution where we cannot distinguish
literal

  if (nullptr == nullptr)

from

  if (ARG == nullptr)

I think. I guess the reporter's reasoning is that ARG is defaulted to nullptr
and that's the reason the diagnostic is unwanted?
[Bug middle-end/101836] __builtin_object_size(P->M, 1) where M is an array and the last member of a struct fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #24 from Jakub Jelinek ---
For the default, a complication is that standard C++ allows neither flexible
array members nor zero-sized arrays, so unless one uses extensions one can only
write [1].
I think differentiating between only allowing [] as flex, or [] and [0], or [],
[0] and [1], or any trailing array is useful.
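To make the trailing-array spellings concrete, here is a small C sketch of the `[]` and `[1]` idioms and the allocation size each implies. The struct and function names are hypothetical, and the `[0]` form is only mentioned in a comment because it is a GNU extension. How `__builtin_object_size (p->data, 1)` should treat each form is exactly the policy question under discussion, so the sketch does not assert any particular answer for it.

```c
#include <assert.h>
#include <stddef.h>

/* C99 flexible array member: the trailing array has no declared bound. */
struct msg_flex {
    size_t len;
    char data[];
};

/* Older "struct hack": a one-element trailing array that callers
   over-allocate. A zero-length `char data[0]` is the GNU-extension
   variant of the same idea. */
struct msg_one {
    size_t len;
    char data[1];
};

/* Bytes needed so that data can hold n characters. */
static size_t msg_flex_bytes(size_t n)
{
    return offsetof(struct msg_flex, data) + n;
}

static size_t msg_one_bytes(size_t n)
{
    return offsetof(struct msg_one, data) + n;
}

static void self_check(void)
{
    /* With no payload, the flexible form needs no more than the header. */
    assert(msg_flex_bytes(0) <= sizeof(struct msg_flex));
    /* One character already fits inside the declared [1] struct ... */
    assert(msg_one_bytes(1) <= sizeof(struct msg_one));
    /* ... but a real payload extends past the declared type. */
    assert(msg_one_bytes(16) > sizeof(struct msg_one));
}
```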
[Bug c++/105967] New: Forming a pointer to ref-qualified member function using a function typedef ignores the qualifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105967

            Bug ID: 105967
           Summary: Forming a pointer to ref-qualified member function using
                    a function typedef ignores the qualifier
           Product: gcc
           Version: 12.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iamsupermouse at mail dot ru
  Target Milestone: ---

Consider the following code:

    #include
    struct A {};
    using F = void() &;
    static_assert(std::is_same_v);

GCC fails the static_assert, while Clang and MSVC accept it.

Apparently `F A::*` becomes `void(A::*)()`, without `&`. The same happens for
`&&`.
[Bug ipa/105917] [10/11/12/13 regression] Missed passthru jump function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105917

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.4
            Version|unknown                     |13.0
           Keywords|                            |missed-optimization
[Bug c++/105967] Forming a pointer to ref-qualified member function using a function typedef ignores the qualifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105967

--- Comment #1 from Andrew Pinski ---
Note it looks like the pointer to member function type is where it loses the
ref-qualifier and not earlier.
That is, GCC correctly rejects:

using F = void() &;
F t;
[Bug c++/105967] Forming a pointer to ref-qualified member function using a function typedef ignores the qualifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105967

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
      Known to fail|                            |4.8.1, 5.1.0
   Last reconfirmed|                            |2022-06-14

--- Comment #2 from Andrew Pinski ---
Here is a testcase without using static_assert:

struct A {void g()&;};
using F = void() &;
F A::* t = &A::g;

Confirmed. Not a regression.
[Bug target/105922] autovectorizer does not handle fp exceptions correctly for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105922

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code

--- Comment #2 from Richard Biener ---
(In reply to Andrew Pinski from comment #1)
> Confirmed. The division should have been predicated on the same as the
> load/store but currently GCC does not do that.
>
> GCC does not really support looking into fpu status bits or exceptions while
> vectorizing either.

It effectively "supports" it by failing to vectorize when exception state
builtins are used in the vectorized region and otherwise it just accumulates
exception bits (but it doesn't support in-order traps if you enable exceptions
to trap).

Note there's a bit of confusion as to what exactly controls FP exception bit
correctness and the documentation should probably be clarified.
[Bug c/105923] unsupported return type ‘complex double’ for simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105923

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |ABI
   Last reconfirmed|                            |2022-06-14
             Target|x86-64                      |x86_64-*-*

--- Comment #2 from Richard Biener ---
Confirmed.
[Bug tree-optimization/105736] [12/13 Regression] ICE in force_gimple_operand_1, at gimplify-me.cc:79 since r13-222-g28896b38fabce818
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105736

--- Comment #3 from Siddhesh Poyarekar ---
Here we go, I'll put it into builtin-dynamic-object-size-0.c, bootstrap and
post a patch.

struct TV4
{
  __attribute__((vector_size (sizeof (int) * 4))) int v;
};

struct TV4 val3;

int *
f1 (struct TV4 *a)
{
  return &a->v[0];
}

int
f2 (void)
{
  int *t = f1 (&val3);

  if (__builtin_dynamic_object_size (t, 0) != -1)
    __builtin_abort ();

  return 0;
}
[Bug c/105923] unsupported return type ‘complex double’ for simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105923

--- Comment #3 from Hongtao.liu ---
An alternative is treating a vector of complex values as a 2*N-length vector
(just like the vectorizer does).
But __attribute__ ((__simd__ ("notinbranch"))) needs to be extended for that.
[Bug rust/105913] gccrs doesn't compile on 32-bit targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105913

Thomas Schwinge changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |dkm at gcc dot gnu.org
           See Also|                            |https://github.com/Rust-GCC/gccrs/pull/1308
[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #15 from Richard Biener ---
So we feed DImode rotates into RA which constrains register allocation enough
to require spills (all 4 DImode vals are live across the kernel, not even
-fschedule-insns can do anything here).

I wonder if it ever makes sense to not split wide ops before reload.
[Bug target/105932] Small structures returned incorrectly in i386 Microsoft ABI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105932

--- Comment #2 from Richard Biener ---
(In reply to Andrew Pinski from comment #1)
> I suspect this is a dup of bug 81943.

That's for a 64bit target though.
[Bug lto/105933] LTO ltrans object files does not have proper st_bind and st_visibility
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105933

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener ---
You are referring to the LTRANS objects created from the LTRANS compile phase.
These should be perfectly valid to link and have correct st_bind/visibility,
but not necessarily the same as originally since link optimization combines
multiple TUs and distributes them to multiple LTRANS units, requiring former
local symbols to refer to each other from different LTRANS units.

Do you have a testcase that shows linking not-through-the-plugin doesn't work?
[Bug libstdc++/105934] [10/11/12/13 Regression] C++11 pointer versions of atomic_fetch_add missing because of P0558
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105934

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[9/10/11/12/13 Regression]  |[10/11/12/13 Regression]
                   |C++11 pointer versions of   |C++11 pointer versions of
                   |atomic_fetch_add missing    |atomic_fetch_add missing
                   |because of P0558            |because of P0558
   Target Milestone|---                         |10.4
[Bug target/105938] [12/13 Regression] ICE in get_insn_temp late, at final.cc:2050 on nvptx-none
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105938

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection
   Target Milestone|---                         |12.2
            Summary|[12 Regression] ICE in      |[12/13 Regression] ICE in
                   |get_insn_temp late, at      |get_insn_temp late, at
                   |final.cc:2050 on nvptx-none |final.cc:2050 on nvptx-none
[Bug target/105938] [12/13 Regression] ICE in get_insn_temp late, at final.cc:2050 on nvptx-none
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105938

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug d/105942] [12/13 Regression] d: internal compiler error: in visit, at d/expr.cc:945
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105942

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.2
           Priority|P3                          |P4
            Summary|[12 Regression] d: internal |[12/13 Regression] d:
                   |compiler error: in visit,   |internal compiler error: in
                   |at d/expr.cc:945            |visit, at d/expr.cc:945
[Bug tree-optimization/105943] [12/13 Regression] ICE in expand_LOOP_VECTORIZED, at internal-fn.cc:2640
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105943

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Richard Biener ---
Yeah, that's not expected to work. Use -fdisable-tree-vect with care.
[Bug c/105944] [10/11/12/13 Regression] ICE in expand_LOOP_DIST_ALIAS, at internal-fn.cc:2648
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105944

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Richard Biener ---
Similar.
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #9 from Richard Biener ---
Note that GCC 9 is no longer supported. Note one common error resulting in
SIGILL is when you fall through to an unreachable place which could be padding
(like when there's a missing return in a function).
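For readers unfamiliar with the "missing return" pattern mentioned above, here is a minimal C sketch. The function names are hypothetical and have nothing to do with the reporter's code. The buggy shape is kept inside a comment because calling it and using its result is undefined behavior.

```c
#include <assert.h>

/* Buggy shape, shown only as a comment: when x == 0, control reaches
 * the closing brace without a return statement, and the caller then
 * uses a value that was never produced.
 *
 *   int sign_buggy(int x)
 *   {
 *       if (x > 0) return 1;
 *       if (x < 0) return -1;
 *   }
 */

/* Corrected shape: every path through the function returns a value. */
static int sign_of(int x)
{
    if (x > 0)
        return 1;
    if (x < 0)
        return -1;
    return 0;
}

static void self_check(void)
{
    assert(sign_of(7) == 1);
    assert(sign_of(-7) == -1);
    assert(sign_of(0) == 0);
}
```

GCC's -Wreturn-type warning (part of -Wall) reports functions of the buggy shape at compile time.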
[Bug c/105945] [12/13 Regression] ICE in maybe_gen_insn, at optabs.cc:7956
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105945

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Richard Biener ---
Likewise - ifcvt creates masked load ops expected to be elided by the
vectorizer.
[Bug middle-end/105951] [12/13 Regression] ICE in emit_store_flag, at expmed.cc:6027
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105951

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
[Bug rtl-optimization/105952] [12/13 Regression] ICE in sel_redirect_edge_and_branch, at sel-sched-ir.cc:5680
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105952

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.2
[Bug target/105953] [12/13 Regression] ICE in extract_insn, at recog.cc:2791
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105953

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
[Bug target/105965] x86: single-element vectors don't have scalar FMA insns used anymore
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965

Hongtao.liu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #1 from Hongtao.liu ---
It looks like a regression since GCC9

typedef float v1sf __attribute__((vector_size(4)));

v1sf
foo43 (v1sf a, v1sf b, v1sf c)
{
  return a * b + c;
}

gcc9 also doesn't generate vfmaddXXXss.

        pushq   %rbp
        movq    %rdi, %rax
        movq    %rsp, %rbp
        andq    $-32, %rsp
        vmovss  24(%rbp), %xmm0
        vmulss  16(%rbp), %xmm0, %xmm0
        vaddss  32(%rbp), %xmm0, %xmm0
        vmovss  %xmm0, -64(%rsp)
        movl    -64(%rsp), %edx
        movl    %edx, (%rdi)
        leave
        ret
[Bug tree-optimization/97185] inconsistent builtin elimination for impossible range
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97185

--- Comment #1 from Siddhesh Poyarekar ---
While the missed optimization ought to be fixed, what's the value of
-Wstringop-* warning on an impossible range, i.e. when low > high? Shouldn't
it just bail out silently if it detects an impossible range?
[Bug d/105942] [12/13 Regression] d: internal compiler error: in visit, at d/expr.cc:945
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105942

Iain Buclaw changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://github.com/dlang/dmd/pull/14210

--- Comment #1 from Iain Buclaw ---
Fix landed in upstream.
[Bug target/105960] Crash in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105960

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64                      |i?86-*-*
   Last reconfirmed|                            |2022-06-14
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |hjl.tools at gmail dot com

--- Comment #5 from Richard Biener ---
Confirmed. Something is wrong with either ld.so or GCC. We end up with

        .globl  exp_ref
        .type   exp_ref, @function
exp_ref:
.LFB1:
        .cfi_startproc
        pushl   %ebx
        .cfi_def_cfa_offset 8
        .cfi_offset 3, -8
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        jmp     expfull_ref@PLT

^^^ this crashes

        .type   expfull_ref, @gnu_indirect_function
        .set    expfull_ref,expfull_ref.resolver
        .type   expfull_ref.resolver, @function
expfull_ref.resolver:
.LFB4:
        .cfi_startproc
        pushl   %ebx

but expfull_ref isn't .globl!?

#define TARGET_CLONES __attribute__((target_clones("default","fma")))

TARGET_CLONES
static inline double
expfull_ref(double x)
{
  return __builtin_pow(x, 0.1234);
}

double
exp_ref(double x)
{
  return expfull_ref(x);
}
[Bug target/105960] [12/13 Regression] Crash in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105960

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
           Keywords|                            |needs-bisection, wrong-code
            Summary|Crash in 32-bit mode        |[12/13 Regression] Crash in
                   |                            |32-bit mode
      Known to work|                            |11.3.0
      Known to fail|                            |12.1.0
   Target Milestone|---                         |12.2
[Bug other/105819] GCC 12.1.0 Make failed - Compiled with GCC 4.9.4 and under Mac OS X lion - I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105819

--- Comment #10 from bug-reports.delphin at laposte dot net ---
Hi :

Ah, OK, maybe a mistyping of my own.

I will look at this.

Kind regards !

PS Please note that my spectacles were too old, and I have new ones since last
Friday. Progressive lenses, and not easy to see the computer screen...
Adaptation period is needed.
[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

Hongtao.liu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #1 from Hongtao.liu ---
What's interesting is extending slp vectorizer to handle non-pow2p elements
with vector mask.
[Bug target/105965] [10/11/12/13 Regression] x86: single-element vectors don't have scalar FMA insns used anymore
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
   Target Milestone|---                         |10.4
           Priority|P3                          |P2
             Status|UNCONFIRMED                 |ASSIGNED
            Summary|x86: single-element vectors |[10/11/12/13 Regression]
                   |don't have scalar FMA insns |x86: single-element vectors
                   |used anymore                |don't have scalar FMA insns
                   |                            |used anymore
   Last reconfirmed|                            |2022-06-14
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization

--- Comment #2 from Richard Biener ---
The widen-mul pass now sees

  [local count: 1073741824]:
  _8 = VIEW_CONVERT_EXPR(a_3(D));
  _9 = VIEW_CONVERT_EXPR(b_4(D));
  _10 = _8 * _9;
  _1 = {_10};
  _11 = VIEW_CONVERT_EXPR(_1);
  _12 = VIEW_CONVERT_EXPR(c_5(D));
  _13 = _11 + _12;
  BIT_FIELD_REF <, 32, 0> = _13;
  return ;

which confuses it. The above is the result from vector lowering which
presumably sees that V1SFmode isn't supported. In GCC 8 the above is instead

  [local count: 1073741825]:
  _8 = BIT_FIELD_REF ;
  _9 = BIT_FIELD_REF ;
  _10 = _8 * _9;
  _11 = BIT_FIELD_REF ;
  _12 = _10 + _11;
  _2 = {_12};
  = _2;

that means we are at least missing a match.pd pattern to simplify

  _1 = {_10};
  _11 = VIEW_CONVERT_EXPR(_1);
[Bug tree-optimization/105940] suggested_unroll_factor applying place looks wrong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105940

--- Comment #4 from Kewen Lin ---
(In reply to Richard Biener from comment #2)
> (In reply to Kewen Lin from comment #1)
> > Created attachment 53126 [details]
> > move_applying
>
> LGTM (maybe the suggested unroll factor should be only applied if the
> suggestion was from a matching with/without SLP analysis, or in fact
> vect_analyze_loop_1 should communicate that down - disabling SLP when
> the one suggesting unrolling did the re-analysis).

Oops, just noticed the nice suggestion. Will make a follow up patch for this.

It would look like:
  - when working out the suggested unroll factor, save the slp decision into
    one variable passed down from vect_analyze_loop_1.
  - when applying the suggested unroll factor, if the saved slp decision is
    false, directly ignore slp handlings; otherwise, go the normal slp path
    but don't start over with slp off.
[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #16 from Jakub Jelinek ---
Though, ix86_rot{l,r}di3_doubleword define_insn_and_split patterns were split
only after reload both before and after Roger's change, so somehow whether we
emit it as SImode from the beginning or only split it before reload affects
the RA decisions.

unsigned long long
foo (unsigned long long x, int y, unsigned long long z)
{
  x ^= z;
  return (x << 24) | (x >> (-24 & 63));
}

is too simplified, the difference with that is just that we used to emit
setting of the DImode pseudo to 0 before setting its halves with xor, while
now we don't, so it must be something else.

I believe as post-reload splitters the doubleword rotates have been introduced
already in PR17886.

Rewriting those into pre-reload splitters from post-reload splitters would be
certainly possible, I will try that, the question is whether it would cure
this and what effects it would have on other code.
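To show what "splitting" the doubleword rotate means at the source level, here is a small C sketch of the rotate from foo written once on a 64-bit value and once on two 32-bit halves. The function names are hypothetical, and this is only an illustration of the arithmetic, not of what the i386.md splitters emit.

```c
#include <assert.h>
#include <stdint.h>

/* The doubleword rotate from the testcase: rotate left by 24.
   Note that (-24 & 63) evaluates to 40. */
static uint64_t rotl24_wide(uint64_t x)
{
    return (x << 24) | (x >> (-24 & 63));
}

/* The same rotate on two 32-bit halves, roughly what a 32-bit target
   has to compute. This form is valid for rotate counts 1..31. */
static uint64_t rotl24_halves(uint64_t x)
{
    const unsigned n = 24;
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    /* Each new half takes its top bits from the matching old half and
       its bottom bits from the other old half. */
    uint32_t new_hi = (hi << n) | (lo >> (32 - n));
    uint32_t new_lo = (lo << n) | (hi >> (32 - n));

    return ((uint64_t)new_hi << 32) | new_lo;
}

static void self_check(void)
{
    const uint64_t x = 0x0123456789ABCDEFULL;

    assert(rotl24_wide(x) == 0x6789ABCDEF012345ULL);
    assert(rotl24_halves(x) == rotl24_wide(x));
}
```

Both new halves depend on both old halves, so all four 32-bit values are live at once.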
[Bug c++/105968] New: GCC vectorizes but reports that it did not vectorize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105968

            Bug ID: 105968
           Summary: GCC vectorizes but reports that it did not vectorize
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: steveire at gmail dot com
  Target Milestone: ---

I'm looking for a way to know if GCC autovectorizes some code.

Starting with this testcase which I picked up somewhere:

```
#define N 1
#define NTIMES 10

double a[N] __attribute__ ((aligned (16)));
double b[N] __attribute__ ((aligned (16)));
double c[N] __attribute__ ((aligned (16)));
double r[N] __attribute__ ((aligned (16)));

int muladd (void) {
  int i, times;

  for (times = 0; times < NTIMES; times++) {
#if 1
    // count up
    for (i = 0; i < N; ++i)
      r[i] = (a[i] + b[i]) * c[i];
#else
    // count down (old gcc won't auto-vectorize)
    for (i = N-1; i >= 0; --i)
      r[i] = (a[i] + b[i]) * c[i];
#endif
  }

  return 0;
}
```

the command

```
g++ -O2 -ftree-vectorize -fno-verbose-asm -mavx2 -fopt-info-vec-all -c test.cpp
```

reports

```
test.cpp:9:5: note: vectorized 1 loops in function.
```

However, with -O3, GCC reports that it did not vectorize:

```
g++ -O3 -ftree-vectorize -fno-verbose-asm -mavx2 -fopt-info-vec-all -c test.cpp
```

output:

```
test.cpp:9:5: note: vectorized 0 loops in function.
```

even though vector instructions are generated.

Demo https://godbolt.org/z/3o41r7jWc
[Bug c++/105968] GCC vectorizes but reports that it did not vectorize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105968

Hongtao.liu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #1 from Hongtao.liu ---
> even though vector instructions are generated.
>

No it's scalar instructions, but the issue here is why vectorizer is ok for
-O2 -O2 -ftree-vectorize -mavx2 but not for -O3 -ftree-vectorize -mavx2

muladd():
        xor     eax, eax
.L2:
        vmovsd  xmm0, QWORD PTR a[rax]
        vaddsd  xmm0, xmm0, QWORD PTR b[rax]
        add     rax, 8
        vmulsd  xmm0, xmm0, QWORD PTR c[rax-8]
        vmovsd  QWORD PTR r[rax-8], xmm0
        cmp     rax, 8
        jne     .L2
        xor     eax, eax
        ret
[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

--- Comment #2 from jbeulich at suse dot com ---
(In reply to Hongtao.liu from comment #1)
> What's interesting is extending slp vectorizer to handle non-pow2p elements
> with vector mask.

Well, for starters I think proper pow2 element counts (and especially "native"
vector widths like 128- or 256-bit ones) want dealing with efficiently. But I
agree the principle can be extended to non-pow2 ones.
[Bug c++/105946] [12/13 Regression] ICE in maybe_warn_pass_by_reference, at tree-ssa-uninit.cc:843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105946

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-06-14
           Priority|P3                          |P2
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1
           Keywords|                            |needs-reduction
   Target Milestone|---                         |12.2
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #1 from Richard Biener ---
Confirmed.

(gdb) p debug_gimple_stmt (stmt)
# .MEM_8 = VDEF <.MEM_7(D)>
_2 = std::__new_allocator >::allocate (__n_1(D), 0B);

842       tree arg = gimple_call_arg (stmt, argno - 1);
(gdb) p argno
$2 = 3
(gdb) p debug_generic_expr (fntype)
struct vector * __new_allocator:: (struct __new_allocator *, size_type, const void *)

so the number of actual arguments does not match the function type of the
call. I have a simple patch.
[Bug c/105923] unsupported return type ‘complex double’ for simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105923

--- Comment #4 from Hongtao.liu ---
Hmm, it's in i386.cc

/* Set CLONEI->vecsize_mangle, CLONEI->mask_mode, CLONEI->vecsize_int,
   CLONEI->vecsize_float and if CLONEI->simdlen is 0, also
   CLONEI->simdlen.  Return 0 if SIMD clones shouldn't be emitted,
   or number of vecsize_mangle variants that should be emitted.  */

static int
ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
                                             struct cgraph_simd_clone *clonei,
                                             tree base_type, int num)
...
        case E_QImode:
        case E_HImode:
        case E_SImode:
        case E_DImode:
        case E_SFmode:
        case E_DFmode:
        /* case E_SCmode: */
        /* case E_DCmode: */
          if (!AGGREGATE_TYPE_P (arg_type))
            break;
          /* FALLTHRU */
        default:
          if (clonei->args[i].arg_type == SIMD_CLONE_ARG_TYPE_UNIFORM)
            break;
          warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
                      "unsupported argument type %qT for simd", arg_type);
          return 0;
        }
    }
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #10 from Jonathan Wakely ---
(In reply to John Kanapes from comment #8)
> I hope, I have a couple of days before closing this ticket:)

Yes, we usually let a bug sit in WAITING status for a couple of months before
closing it, so you have plenty of time.
[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |88670

--- Comment #3 from Richard Biener ---
Is not having AVX512VL relevant in the real world? Some operations (division)
require different handling than zero-extending, masking might be a way out but
that might turn out to be (way?) more expensive.

I agree that it might be interesting to support SLP with not power-of-two
or generally not fully populated lanes. The load/store side requires
masking support for this (and the missed optimization would be that we do
not define the contents of the masked elements for loads).

Likewise vector lowering could avoid splitting vector ops into scalars when
there's a wider supported vector mode by means of zero-extending. It might
need a cost model for this. I think the reporter is aiming at this.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670
[Bug 88670] [meta-bug] generic vector extension issues
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #11 from John Kanapes ---
(In reply to Richard Biener from comment #9)
> Note that GCC 9 is no longer supported. Note one common error resulting in
> SIGILL is when you fall through to an unreachable place which could be
> padding
> (like when there's a missing return in a function).

Hmmm.
gcc 9.40 is the distro gcc for Ubuntu 20.04, which is LTS and still supported.
Does this mean that no action will be taken upon resolving this ticket?

I am trying to recreate this bug in a smaller, more concise context.
It is not an obvious bug. This is valid code, and it takes a large chain of
previous steps to get it wrong at runtime. It used to work with previous gccs,
but it now seems broken:(
It will remain broken in future releases unless we stop it here:(
I hope you reconsider:)

If that was a problem of a missing return in a function, it would have to be an
internal function. The spot it happens is in main initialization, before it had
a chance to call any of my functions.
[Bug libstdc++/105957] __n * sizeof(_Tp) might overflow under consteval context for std::allocator
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105957

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |accepts-invalid
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-06-14

--- Comment #1 from Jonathan Wakely ---
Testcase:

#include

constexpr auto
f()
{
  std::allocator a;
  auto n = std::size_t(-1) / (sizeof(long long) - 1);
  auto p = a.allocate(n);
  a.deallocate(p, n);
  return n;
}
static_assert( f() );

In practice if the arithmetic wraps around and a smaller buffer is allocated,
any attempt to write beyond the allocated size would be detected in constant
evaluation anyway. So you'd still get a compilation error in most cases.
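The wraparound in `__n * sizeof(_Tp)` can be illustrated outside of libstdc++ with a small C sketch of a checked size computation. The helper names below are hypothetical and this is not how std::allocator is implemented; it only shows the usual guard of comparing the element count against SIZE_MAX divided by the element size before multiplying.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 and stores n * size in *bytes when the product fits in
   size_t; returns 0 when the multiplication would wrap around. */
static int checked_array_bytes(size_t n, size_t size, size_t *bytes)
{
    if (size != 0 && n > SIZE_MAX / size)
        return 0;
    *bytes = n * size;
    return 1;
}

/* Hypothetical allocation helper: refuses any request whose byte count
   does not fit in size_t instead of allocating a too-small buffer. */
static void *allocate_array(size_t n, size_t size)
{
    size_t bytes;

    if (!checked_array_bytes(n, size, &bytes))
        return NULL;
    return malloc(bytes);
}

static void self_check(void)
{
    /* The element count from the testcase: large enough that
       n * sizeof(long long) no longer fits in size_t. */
    const size_t n = (size_t)-1 / (sizeof(long long) - 1);
    size_t bytes = 0;
    void *p;

    assert(!checked_array_bytes(n, sizeof(long long), &bytes));
    assert(allocate_array(n, sizeof(long long)) == NULL);

    assert(checked_array_bytes(4, sizeof(long long), &bytes));
    assert(bytes == 4 * sizeof(long long));

    p = allocate_array(4, sizeof(long long));
    assert(p != NULL);
    free(p);
}
```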
[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

--- Comment #4 from jbeulich at suse dot com ---
(In reply to Richard Biener from comment #3)
> Is not having AVX512VL relevant in the real world?

Wasn't the Xeon-Phi line of processors lacking VL? I have no idea how
widespread their use (still) is, though.

> Some operations (division) require different handling than zero-extending,
> masking might be a way out but that might turn out to be (way?) more
> expensive.

By expensive, you mean in terms of compiler changes? I wouldn't expect
execution to be severely affected by using masking, especially when it's
zeroing-masking. Or if it is, then likely because there was not enough pressure
to make this mode work efficiently (after all there were various performance
quirks when AVX and AVX512F were first introduced).
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #12 from Jakub Jelinek ---
(In reply to John Kanapes from comment #11)
> (In reply to Richard Biener from comment #9)
> > Note that GCC 9 is no longer supported. Note one common error resulting in
> > SIGILL is when you fall through to an unreachable place which could be
> > padding
> > (like when there's a missing return in a function).
>
> Hmmm.
> gcc 9.40 is the distro gcc for Ubuntu 20.04, which is LTS and still
> supported.

In this case it is Ubuntu that supports it, so you'd need to ask Ubuntu to fix
it (if it is a compiler bug of course), because upstream GCC 9.5 was the last
release and there won't be any changes for the GCC 9 series. If it reproduces
with a newer compiler, it can be fixed upstream in the still supported
releases and perhaps Ubuntu could backport it if you ask them to.

> Does this mean that no action will be taken upon resolving this ticket?

Depends on if it is reproducible with a supported compiler.

> I am trying to recreate this bug in a smaller, more concise context.
> It is not an obvious bug. This is valid code, and it takes a large chain of
> previous steps to get it wrong at runtime. It used to work with previous
> gccs, but it now seems broken:(

Claiming it is valid code until it is analyzed is premature. It can very well
be undefined behavior in the code.
[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |

--- Comment #17 from Jakub Jelinek ---
So, I've tried:
--- gcc/config/i386/i386.md.jj  2022-06-13 10:53:26.739290704 +0200
+++ gcc/config/i386/i386.md     2022-06-14 11:09:24.467024047 +0200
@@ -13734,14 +13734,13 @@
 ;; shift instructions and a scratch register.

 (define_insn_and_split "ix86_rotl3_doubleword"
- [(set (match_operand: 0 "register_operand" "=r")
-       (rotate: (match_operand: 1 "register_operand" "0")
-                (match_operand:QI 2 "" "")))
-  (clobber (reg:CC FLAGS_REG))
-  (clobber (match_scratch:DWIH 3 "=&r"))]
- ""
+ [(set (match_operand: 0 "register_operand")
+       (rotate: (match_operand: 1 "register_operand")
+                (match_operand:QI 2 "")))
+  (clobber (reg:CC FLAGS_REG))]
+ "ix86_pre_reload_split ()"
  "#"
- "reload_completed"
+ "&& 1"
  [(set (match_dup 3) (match_dup 4))
   (parallel
    [(set (match_dup 4)
@@ -13764,6 +13763,7 @@
 (match_dup 6 0)))
      (clobber (reg:CC FLAGS_REG))])]
 {
+  operands[3] = gen_reg_rtx (mode);
   operands[6] = GEN_INT (GET_MODE_BITSIZE (mode) - 1);
   operands[7] = GEN_INT (GET_MODE_BITSIZE (mode));

@@ -13771,14 +13771,13 @@
 })

 (define_insn_and_split "ix86_rotr3_doubleword"
- [(set (match_operand: 0 "register_operand" "=r")
-       (rotatert: (match_operand: 1 "register_operand" "0")
-                  (match_operand:QI 2 "" "")))
-  (clobber (reg:CC FLAGS_REG))
-  (clobber (match_scratch:DWIH 3 "=&r"))]
- ""
+ [(set (match_operand: 0 "register_operand")
+       (rotatert: (match_operand: 1 "register_operand")
+                  (match_operand:QI 2 "")))
+  (clobber (reg:CC FLAGS_REG))]
+ "ix86_pre_reload_split ()"
  "#"
- "reload_completed"
+ "&& 1"
  [(set (match_dup 3) (match_dup 4))
   (parallel
    [(set (match_dup 4)
@@ -13801,6 +13800,7 @@
 (match_dup 6)))) 0)))
      (clobber (reg:CC FLAGS_REG))])]
 {
+  operands[3] = gen_reg_rtx (mode);
   operands[6] = GEN_INT (GET_MODE_BITSIZE (mode) - 1);
   operands[7] = GEN_INT (GET_MODE_BITSIZE (mode));


On the #c0 test with -O2 -m32 -mno-mmx -mno-sse it makes some difference, but
not as much as one would hope for:
Numbers from gcc 11.3.1 20220614, 11.3.1 20220614 with the patch, 13.0.0
20220610, 13.0.0 20220614 with the patch:
sub on %esp       428    2556    2620    2556
fn size in B    21657   23186   28413   23534
.s lines         6199    3942    7260    4198

So, trunk patched with the above patch results in significantly fewer
instructions, but larger (more of them use 32-bit immediates, mostly in form
of whatever(%esp) memory source operand).
And the stack usage is high.

I think the patch is still a good idea, it gives the RA more options, but we
should investigate why it consumes so much more stack and results in larger
code.
[Bug target/105930] [12/13 Regression] Excessive stack spill generation on 32-bit x86
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930

--- Comment #18 from Jakub Jelinek ---
Of course, size comparisons of -O2 code aren't the most important, for -O2 it
is more important how fast the code is.
When comparing -Os -m32 -mno-mmx -mno-sse, the numbers are
sub on %esp       412    2564    2620    2564
fn size in B    27535   20508   35036   20416
.s lines         5816    3590    7251    3544

So in the -Os case, the patched functions are both smaller and have fewer
instructions (significantly so), but compared to gcc 11 they still have
significantly higher stack usage.
[Bug target/105966] x86: operations on certain few-element vectors yield very inefficient code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105966

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-06-14
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #5 from Richard Biener ---
So "lowering" would turn

  _1 = _2 * _3;

into

  _2' = { _2, {}, ... }; // vector-of-vector CTOR with zero filling
  _3' = { _3, {}, ... };
  _1' = _2' * _3';
  _1 = BIT_FIELD_REF <_1', 0, bitsizeof(_1)>; // lowpart

little/big-endian needs some thoughts here. We currently require all elements
explicitly specified for vector-of-vector CTORs, for scalar element CTORs
we allow automatic zero-filling which would be convenient here as well.

For division we'd use a vector of ones.

Since lowering is on a per-stmt base we have to optimize the glues away, thus

  _2 = BIT_FIELD_REF <_3, 0, bitsizeof(_3)>;
  _1 = { _2, {}, ... };

should ideally become just _3 but then we have to know _3 is zero-filled
or decide we can also have arbitrary values in the upper halves (signed integer
overflow issues, FP with NaNs might be slow, etc.).

The vector lowering process lacks something like a lattice so it doesn't
re-use previously lowered intermediate results (boo).
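The widening idea sketched in the comment can be mimicked at the source level with GNU C vector extensions. The following is only an illustration under that assumption: the type and function names are hypothetical, and the real transformation would happen on GIMPLE inside vector lowering, not in user code. Multiplication pads both operands with zeros, while division pads the divisor with ones so the unused lanes compute 0/1 rather than 0/0.

```c
#include <assert.h>

/* GNU C vector types: a 2-lane "small" vector and a 4-lane wide one. */
typedef float v2sf __attribute__((vector_size(8)));
typedef float v4sf __attribute__((vector_size(16)));

/* Widen a 2-lane vector to 4 lanes, filling the extra lanes with fill. */
static v4sf widen(v2sf v, float fill)
{
    v4sf w = { v[0], v[1], fill, fill };
    return w;
}

/* Extract the low two lanes again (the "lowpart"). */
static v2sf lowpart(v4sf w)
{
    v2sf v = { w[0], w[1] };
    return v;
}

/* Multiply on the wide type with zero-filled upper lanes. */
static v2sf mul_via_wide(v2sf a, v2sf b)
{
    return lowpart(widen(a, 0.0f) * widen(b, 0.0f));
}

/* Divide on the wide type; the divisor is padded with ones so the
   upper lanes never divide by zero. */
static v2sf div_via_wide(v2sf a, v2sf b)
{
    return lowpart(widen(a, 0.0f) / widen(b, 1.0f));
}

static void self_check(void)
{
    v2sf a = { 6.0f, 9.0f };
    v2sf b = { 2.0f, 3.0f };
    v2sf m = mul_via_wide(a, b);
    v2sf d = div_via_wide(a, b);

    assert(m[0] == 12.0f && m[1] == 27.0f);
    assert(d[0] == 3.0f && d[1] == 3.0f);
}
```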
[Bug other/12081] Gcc can't be compiled with -mregparm=3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12081 --- Comment #35 from oyvind.harboe at zylin dot com --- SPEC 2017 added SPEC_GCC_VARIADIC_FUNCTIONS_MISMATCH_WORKAROUND to cope with this error.
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #13 from John Kanapes ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to John Kanapes from comment #11)
> > (In reply to Richard Biener from comment #9)
> > > I am trying to recreate this bug in a smaller, more concise context.
> > It is not an obvious bug. This is valid code, and it takes a large chain of
> > previous steps to get it wrong at runtime. It used to work with previous
> > gccs, but it now seems broken:(
>
> Claiming it is valid code until it is analyzed is premature. It can very
> well be undefined behavior in the code.

True. Except that I have already analyzed it with my own tools. That means that the offending code, as reported by gdb, compiles and runs fine with -O6 optimization with a simpler code. I am not claiming anything, just stressing that this is not an obvious issue as reported by gdb, and requires a lot of previous steps to reproduce:(
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #14 from John Kanapes ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to John Kanapes from comment #11)
> > (In reply to Richard Biener from comment #9)
> > > Note that GCC 9 is no longer supported. Note one common error resulting
> > > in SIGILL is when you fall through to an unreachable place which could
> > > be padding (like when there's a missing return in a function).
> >
> > Hmmm.
> > gcc 9.40 is the distro gcc for Ubuntu 20.04, which is LTS and still
> > supported.
>
> In this case it is Ubuntu that supports it, so you'd need to ask Ubuntu to
> fix it (if it is a compiler bug of course), because upstream GCC 9.5 was the
> last release and there won't be any changes for the GCC 9 series. If it
> reproduces with a newer compiler, it can be fixed upstream in the still
> supported releases and perhaps Ubuntu could backport it if you ask them to.
>
> > Does this mean that no action will be taken upon resolving this ticket?
>
> Depends on if it is reproducible with a supported compiler.

That works for me. Both places could use my sources. My work won't be in vain:)
[Bug c++/105968] GCC vectorizes but reports that it did not vectorize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105968

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #2 from Richard Biener ---
> ./cc1 -quiet t.c -O3 -mavx2 -fopt-info
t.c:11:25: optimized: loops interchanged in loop nest
> ./cc1 -quiet t.c -O2 -mavx2 -fopt-info
t.c:14:19: optimized: loop vectorized using 32 byte vectors

so we interchange the loop to

  for (i = 0; i < N; ++i)
    for (times = 0; times < NTIMES; times++)
      r[i] = (a[i] + b[i]) * c[i];

which is indeed good for memory locality (now, we should then eliminate the inner loop completely but we have no such facility - only unrolling and DSE/DCE would do this but nothing on the high-level loop form).

"Benchmark" issue. The outer loop should have a memory clobber.

Oh, and we should in theory be able to vectorize the outer loop if N is a multiple of the vector element count. But:

t.c:11:25: note: === vect_analyze_data_ref_accesses ===
t.c:11:25: note: zero step in inner loop of nest
t.c:11:25: missed: not vectorized: complicated access pattern.
t.c:15:14: missed: not vectorized: complicated access pattern.
t.c:11:25: missed: bad data access.

so we don't handle this exact issue (maybe the offending check can simply be elided - assuming dependence checking handles zero steps correctly). Putting

  __asm__ volatile ("" : : : "memory");

at the end of the outer loop vectorizes with -O3 as well (but doesn't interchange).

Not a bug I think unless you want to make it a bug about not vectorizing the outer loop after interchange.
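
For readers who want to try the memory-clobber suggestion, here is a minimal sketch of what such a benchmark loop might look like. It is not the reporter's testcase (which is not quoted in this comment): the function name `run_benchmark` and the values of `N` and `NTIMES` are assumptions made for illustration, and only the loop body and the `asm` statement come from the comment above.

```cpp
#include <cstddef>

// Hypothetical sizes; the original testcase is not shown in the thread.
constexpr std::size_t N = 1024;
constexpr int NTIMES = 1000;

float a[N], b[N], c[N], r[N];

// Repeats the kernel NTIMES times, as a micro-benchmark would.
void run_benchmark()
{
  for (int times = 0; times < NTIMES; times++)
    {
      for (std::size_t i = 0; i < N; ++i)
        r[i] = (a[i] + b[i]) * c[i];

      // Compiler-level barrier (GNU extension): emits no instructions, but
      // the compiler must assume memory was read and modified here, so the
      // stores to r[] in each repetition cannot be treated as redundant.
      asm volatile ("" : : : "memory");
    }
}
```

With the clobber at the end of the outer loop, each repetition has to be treated as observable, which is usually what a timing loop intends. Whether the loop is then interchanged or vectorized still depends on the GCC version and options.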
[Bug c/105969] New: [12/13 Regression] ICE in Floating point exception
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105969

            Bug ID: 105969
           Summary: [12/13 Regression] ICE in Floating point exception
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gs...@t-online.de
  Target Milestone: ---

Started between 20220522 and 20220529 :

$ cat z1.c
#include <stdio.h>
struct A {
  char a[0][0][0];
};
extern struct A b[][2];
void f (void)
{
  sprintf (b[0][0].a[1][0], "%s", b[0][0].a[1][0]);
}

$ gcc-13-20220612 -c z1.c -Wall
z1.c: In function 'f':
during GIMPLE pass: warn-printf
z1.c:9:1: internal compiler error: Floating point exception
    9 | }
      | ^
0xc2a33f crash_signal
        ../../gcc/toplev.cc:322
0x184c71e get_origin_and_offset_r
        ../../gcc/gimple-ssa-sprintf.cc:2322
0x184c749 get_origin_and_offset_r
        ../../gcc/gimple-ssa-sprintf.cc:2385
0x185267f get_origin_and_offset
        ../../gcc/gimple-ssa-sprintf.cc:2447
0x185267f handle_printf_call(gimple_stmt_iterator*, pointer_query&)
        ../../gcc/gimple-ssa-sprintf.cc:4714
0xdfd21d strlen_pass::check_and_optimize_call(bool*)
        ../../gcc/tree-ssa-strlen.cc:5461
0xdfdbe1 strlen_pass::check_and_optimize_stmt(bool*)
        ../../gcc/tree-ssa-strlen.cc:5665
0xdfdfb4 strlen_pass::before_dom_children(basic_block_def*)
        ../../gcc/tree-ssa-strlen.cc:5849
0x17e9284 dom_walker::walk(basic_block_def*)
        ../../gcc/domwalk.cc:309
0xdfe420 printf_strlen_execute
        ../../gcc/tree-ssa-strlen.cc:5908
[Bug c/105970] New: ICE in ix86_function_arg, at config/i386/i386.cc:3351
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105970 Bug ID: 105970 Summary: ICE in ix86_function_arg, at config/i386/i386.cc:3351 Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: gs...@t-online.de Target Milestone: --- Affects versions down to r7, with files gcc.dg/torture/pr68037-*.c : $ gcc-13-20220612 -c pr68037-1.c -mx32 -mgeneral-regs-only $ $ gcc-13-20220612 -c pr68037-1.c -mx32 -mgeneral-regs-only -maddress-mode=long during RTL pass: expand pr68037-1.c: In function 'fn': pr68037-1.c:32:1: internal compiler error: in ix86_function_arg, at config/i386/i386.cc:3351 32 | fn (struct interrupt_frame *frame, uword_t error) | ^~ 0xf2c309 ix86_function_arg ../../gcc/config/i386/i386.cc:3351 0x9313c8 assign_parm_find_entry_rtl ../../gcc/function.cc:2535 0x9313c8 assign_parms ../../gcc/function.cc:3673 0x933607 expand_function_start(tree_node*) ../../gcc/function.cc:5161 0x7d7c21 execute ../../gcc/cfgexpand.cc:6695
[Bug c/105971] New: [12/13 Regression] ICE in bitmap_check_index, at sbitmap.h:104
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105971 Bug ID: 105971 Summary: [12/13 Regression] ICE in bitmap_check_index, at sbitmap.h:104 Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: gs...@t-online.de Target Milestone: --- Started between 20211121 and 20211128, at -O1+ : (gcc configured with --enable-checking=yes) $ cat z1.c void a() { int b; int c; int d = a; _Complex float *e = a; for (;;) { (*e += d) / b ?: 0; } } $ gcc-13-20220612 -c z1.c -O2 z1.c: In function 'a': z1.c:5:11: warning: initialization of 'int' from 'void (*)()' makes integer from pointer without a cast [-Wint-conversion] 5 | int d = a; | ^ z1.c:6:23: warning: initialization of '_Complex float *' from incompatible pointer type 'void (*)()' [-Wincompatible-pointer-types] 6 | _Complex float *e = a; | ^ during GIMPLE pass: dse z1.c:10:1: internal compiler error: in bitmap_check_index, at sbitmap.h:104 10 | } | ^ 0x1e915d1 bitmap_check_index ../../gcc/sbitmap.h:104 0x1e915d1 bitmap_bit_in_range_p(simple_bitmap_def const*, unsigned int, unsigned int) ../../gcc/sbitmap.cc:336 0xf7f37c live_bytes_read ../../gcc/tree-ssa-dse.cc:786 0xf7f37c dse_classify_store(ao_ref*, gimple*, bool, simple_bitmap_def*, bool*, tree_node*) ../../gcc/tree-ssa-dse.cc:1007 0xf827f8 dse_optimize_stmt ../../gcc/tree-ssa-dse.cc:1421 0xf827f8 execute ../../gcc/tree-ssa-dse.cc:1527
[Bug c/105972] New: [12/13 Regression] ICE in lower_stmt, at gimple-low.cc:312
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105972 Bug ID: 105972 Summary: [12/13 Regression] ICE in lower_stmt, at gimple-low.cc:312 Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: gs...@t-online.de Target Milestone: --- Started between 20211017 and 20211024 : (gcc configured with --enable-checking=yes) $ cat z1.c __attribute__((optimize(0))) int f () { int g () } $ gcc-13-20220612 -c z1.c -g -O2 z1.c: In function 'g': z1.c:5:1: error: expected declaration specifiers before '}' token 5 | } | ^ z1.c:6: error: expected '{' at end of input z1.c: In function 'f': z1.c:5:1: error: expected declaration or statement at end of input 5 | } | ^ during GIMPLE pass: lower z1.c:2:5: internal compiler error: in lower_stmt, at gimple-low.cc:312 2 | int f () | ^ 0x1c2901d lower_stmt ../../gcc/gimple-low.cc:312 0x1c2901d lower_sequence ../../gcc/gimple-low.cc:217 0x1c27d79 lower_gimple_bind ../../gcc/gimple-low.cc:475 0x1c291a8 lower_function_body ../../gcc/gimple-low.cc:110 0x1c291a8 execute ../../gcc/gimple-low.cc:195
[Bug target/105965] [10/11/12/13 Regression] x86: single-element vectors don't have scalar FMA insns used anymore
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965 --- Comment #3 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:90467f0ad649d0817f9e034596a0fb85605b55af commit r13-1085-g90467f0ad649d0817f9e034596a0fb85605b55af Author: Richard Biener Date: Tue Jun 14 10:59:49 2022 +0200 middle-end/105965 - add missing v_c_e <{ el }> simplification When we got the simplification of bit-field-ref to view-convert we lost the ability to detect FMAs since we cannot look through _1 = {_10}; _11 = VIEW_CONVERT_EXPR(_1); the following amends the (view_convert CONSTRUCTOR) pattern to handle this case. 2022-06-14 Richard Biener PR middle-end/105965 * match.pd (view_convert CONSTRUCTOR): Handle single-element CTOR case. * gcc.target/i386/pr105965.c: New testcase.
[Bug c++/105946] [12/13 Regression] ICE in maybe_warn_pass_by_reference, at tree-ssa-uninit.cc:843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105946 --- Comment #2 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:e07a876c07601e1f3a27420f7d055d20193c362c commit r13-1086-ge07a876c07601e1f3a27420f7d055d20193c362c Author: Richard Biener Date: Tue Jun 14 11:10:13 2022 +0200 tree-optimization/105946 - avoid accessing excess args from uninit diag uninit diagnostics uses passing via reference and access attributes but that iterates over function type arguments which can in some cases appearantly outrun the actual arguments leading to ICEs. The following simply ignores not present arguments. 2022-06-14 Richard Biener PR tree-optimization/105946 * tree-ssa-uninit.cc (maybe_warn_pass_by_reference): Do not look at arguments not specified in the function call.
[Bug target/105965] [10/11/12 Regression] x86: single-element vectors don't have scalar FMA insns used anymore
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105965

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[10/11/12/13 Regression]    |[10/11/12 Regression] x86:
                   |x86: single-element vectors |single-element vectors
                   |don't have scalar FMA insns |don't have scalar FMA insns
                   |used anymore                |used anymore
      Known to work|                            |13.0

--- Comment #4 from Richard Biener ---
Fixed on trunk so far.
[Bug c++/105946] [12 Regression] ICE in maybe_warn_pass_by_reference, at tree-ssa-uninit.cc:843
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105946

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12/13 Regression] ICE in   |[12 Regression] ICE in
                   |maybe_warn_pass_by_referenc |maybe_warn_pass_by_referenc
                   |e, at                       |e, at
                   |tree-ssa-uninit.cc:843      |tree-ssa-uninit.cc:843
      Known to work|                            |13.0
      Known to fail|                            |12.1.0

--- Comment #3 from Richard Biener ---
Fixed on trunk so far.
[Bug tree-optimization/105832] [13 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 12.1.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105832 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #2 from Richard Biener --- Investigating.
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950 --- Comment #15 from Jonathan Wakely --- Just running in GDB doesn't find bugs (and there is no -O6 level, -O3 is the highest). Did you try it with -fsanitize=undefined yet?
[Bug tree-optimization/105973] New: Wrong branch prediction for if (COND) { if(x) noreturn1(); else noreturn2(); }
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105973

            Bug ID: 105973
           Summary: Wrong branch prediction for if (COND) { if(x) noreturn1(); else noreturn2(); }
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

Given this code:

__attribute__((noreturn)) void throw1();
__attribute__((noreturn)) void throw2();
typedef decltype(sizeof(0)) size_t;

#if defined LIKELY
# define PREDICT(C) __builtin_expect(C,1)
#elif defined UNLIKELY
# define PREDICT(C) __builtin_expect(C,0)
#else
# define PREDICT(C) (C)
#endif

template<typename T>
T* allocate(size_t n)
{
  if (PREDICT(n > (__PTRDIFF_MAX__ / sizeof(T))))
  {
    if (n > (__SIZE_MAX__ / sizeof(T)))
      throw1();
    throw2();
  }
  return (T*) ::operator new(n * sizeof(T));
}

int* alloc_int(size_t n)
{
  return allocate<int>(n);
}

The condition decorated with PREDICT is compiled to different code with -DLIKELY and -DUNLIKELY, as expected. However, with neither macro defined, the result is the same as -DLIKELY (for any optimization level > -O0), i.e. the calls to throw1 and throw2 come first and the return statement requires a branch:

_Z9alloc_intm:
.LFB1:
        .cfi_startproc
        movq    %rdi, %rax
        shrq    $61, %rax
        je      .L2
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        shrq    $62, %rdi
        je      .L3
        call    _Z6throw1v
        .p2align 4,,10
        .p2align 3
.L3:
        call    _Z6throw2v
        .p2align 4,,10
        .p2align 3
.L2:
        .cfi_def_cfa_offset 8
        salq    $2, %rdi
        jmp     _Znwm
        .cfi_endproc

Surely this is wrong? If calling a noreturn function is considered unlikely, then surely entering a block that always calls a noreturn function should also be unlikely?

Clang gets this right, generating the same code as UNLIKELY by default, and only requiring a branch for the return value when LIKELY is defined.

This code is reduced from std::allocator in libstdc++ and I thought I should be able to remove a redundant __builtin_expect, but it's needed due to this.
[Bug tree-optimization/105973] Wrong branch prediction for if (COND) { if(x) noreturn1(); else noreturn2(); }
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105973

--- Comment #1 from Jonathan Wakely ---
In fact we get it wrong even if both branches call the same noreturn function:

  if (PREDICT(n > (__PTRDIFF_MAX__ / sizeof(T))))
  {
    if (n > (__SIZE_MAX__ / sizeof(T)))
      throw1();
    throw1();
  }

This is not compiled to the same code as:

  if (PREDICT(n > (__PTRDIFF_MAX__ / sizeof(T))))
  {
    throw1();
  }

even though it has identical effects.
[Bug tree-optimization/105832] [13 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 12.1.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105832 Richard Biener changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org --- Comment #3 from Richard Biener --- So the difference boils down to GCC 12 ending up with if (iftmp.0_9 == 1) { if (iftmp.1_10 != 0) { loop with call to foo (); } } while the new unswitching code swaps these and ends up with if (iftmp.1_10 != 0) { if (iftmp.0_9 == 1) { loop with call to foo (); } } the old code also created one pointless unreachable loop copy. GCC 12 manages to elide the loop calling foo() in thread2 after fre5. There's nothing wrong with unswitching here I think - we're at most unlucky with the order of unswitchings (but that might change from current 'random' to a cost based order).
[Bug tree-optimization/105973] Wrong branch prediction for if (COND) { if(x) noreturn1(); else noreturn2(); }
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105973 --- Comment #2 from Jonathan Wakely --- https://godbolt.org/z/asecWe6KK
[Bug c/105970] ICE in ix86_function_arg, at config/i386/i386.cc:3351
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105970 Uroš Bizjak changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2022-06-14 CC||hjl.tools at gmail dot com Ever confirmed|0 |1 --- Comment #1 from Uroš Bizjak --- Probably something like: diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 3d189e124e4..f158cc3aaea 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -3348,7 +3348,7 @@ ix86_function_arg (cumulative_args_t cum_v, const function_arg_info &arg) if (POINTER_TYPE_P (arg.type)) { /* This is the pointer argument. */ - gcc_assert (TYPE_MODE (arg.type) == Pmode); + gcc_assert (TYPE_MODE (arg.type) == ptr_mode); /* It is at -WORD(AP) in the current frame in interrupt and exception handlers. */ reg = plus_constant (Pmode, arg_pointer_rtx, -UNITS_PER_WORD); Pointer mode and Pmode can be distinct for x32 target. However, I have no idea what goes into interrupt frame for x32. Let's ask HJ.
[Bug c++/105838] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838 --- Comment #3 from Richard Biener --- Created attachment 53133 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53133&action=edit unincluded, and reduced This "reduced" testcase peaks at 3.8GB memory. > /usr/bin/time /space/rguenther/install/gcc-12.1/bin/g++ -S -O /tmp/t.C 8.68user 1.13system 0:10.03elapsed 97%CPU (0avgtext+0avgdata 3813480maxresident)k 17328inputs+2104outputs (28major+961476minor)pagefaults 0swaps simply doubling the initializer grows it to 14.8GB > /usr/bin/time /space/rguenther/install/gcc-12.1/bin/g++ -S -O /tmp/t.C 43.02user 4.49system 0:47.51elapsed 99%CPU (0avgtext+0avgdata 14861052maxresident)k 0inputs+4088outputs (0major+3727738minor)pagefaults 0swaps
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #16 from John Kanapes ---
Good to know (O3). I have posted my -fsanitize=undefined output. It doesn't compile with it, and I need help to fix that, because I don't know what it means:(
[Bug tree-optimization/105739] [10 Regression] Miscompilation of Linux kernel update.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105739 --- Comment #11 from CVS Commits --- The master branch has been updated by Jan Hubicka : https://gcc.gnu.org/g:8f6c317b3a16350698f3c9e0accb43a9b4acb4ae commit r13-1089-g8f6c317b3a16350698f3c9e0accb43a9b4acb4ae Author: Jan Hubicka Date: Tue Jun 14 14:05:53 2022 +0200 Fix ipa-cp wrt volatile loads Check for volatile flag to ipa_load_from_parm_agg. gcc/ChangeLog: 2022-06-10 Jan Hubicka PR ipa/105739 * ipa-prop.cc (ipa_load_from_parm_agg): Punt on volatile loads. gcc/testsuite/ChangeLog: 2022-06-10 Jan Hubicka * gcc.dg/ipa/pr105739.c: New test.
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950 --- Comment #17 from Jakub Jelinek --- If you mean https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950#c2 , no, you have just posted what is a user error in using the sanitizers and we've told you how to fix that. The -fsanitize=undefined option can't be just added to gcc command line where you compile object files (e.g. if you add it to CFLAGS or CXXFLAGS vars), but also when you link the program (or shared library), so e.g. in LDFLAGS, because when linking it takes care of adding -lubsan to the linker command line.
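
To illustrate the point about compile and link steps, here is a small sketch. The file and program names in the comment (`demo.cc`, `demo`) and both function names are hypothetical; only the `-fsanitize=undefined` option itself comes from this thread. The first function has undefined behavior for one input, which a sanitized build would be expected to report at run time.

```cpp
#include <climits>

// Build sketch (file and program names are hypothetical). The option is
// passed both when compiling and when linking, so that the driver adds
// the sanitizer runtime library itself:
//   g++ -O2 -fsanitize=undefined -c demo.cc -o demo.o
//   g++ -fsanitize=undefined demo.o -o demo

// Signed overflow is undefined behavior when x == INT_MAX; a build with
// -fsanitize=undefined reports that case at run time.
int next_value(int x)
{
  return x + 1;
}

// Well-defined variant: saturates at INT_MAX instead of overflowing.
int next_value_saturating(int x)
{
  return x == INT_MAX ? INT_MAX : x + 1;
}
```

In a Makefile-style build this typically means adding the option to CFLAGS or CXXFLAGS and to LDFLAGS, as described above, rather than adding -lubsan by hand.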
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950 --- Comment #18 from Jonathan Wakely --- Two of us have already explained that (comment 3 and comment 6, and now comment 17).
[Bug tree-optimization/105739] [10 Regression] Miscompilation of Linux kernel update.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105739 --- Comment #12 from Jakub Jelinek --- Thanks, I have verified that on the #c0 testcase on 10 branch it makes both __builtin_unreachable calls go away.
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950 --- Comment #19 from John Kanapes --- Aaaah. So it's different than the other gcc flags... I just linked libubsan... No compilation errors. At runtime it SIGILLS at the same gdb point as before... Same as the rest of the recommended flags. BTW since -O3 is the highest gcc optimization, gcc could print a warning: Warning -Ox is deprecated. Downgrading to -O3;-) Otherwise in a few years you will find code compiled with -O20 and then it is the sky. It just takes 1 coder to use it in open source, and since gcc seems to take it, all the other coders will copy it:(
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #20 from John Kanapes ---
(In reply to Jonathan Wakely from comment #18)
> Two of us have already explained that (comment 3 and comment 6, and now
> comment 17).

I couldn't understand what you were talking about. It is listed with the other -f gcc flags:( To avoid confusion, you could state in your description that this flag is special, needs to be linked with -lubsan, and does that...
[Bug gcov-profile/101487] [GCOV] Wrong coverage of "switch" inside "while" loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101487 Yang Wang changed: What|Removed |Added Resolution|INVALID |FIXED
[Bug gcov-profile/101487] [GCOV] Wrong coverage of "switch" inside "while" loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101487 Yang Wang changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|FIXED |--- --- Comment #2 from Yang Wang --- it still exists in the latest version
[Bug gcov-profile/100980] [GCOV]The assignment statement in the “for” structure caused the wrong coverage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100980 Yang Wang changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #2 from Yang Wang --- fixed in later version
[Bug libstdc++/105934] [10/11/12/13 Regression] C++11 pointer versions of atomic_fetch_add missing because of P0558
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105934 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |WONTFIX --- Comment #5 from Jonathan Wakely --- LWG consensus was that the breakage is OK for C++17, and there was no desire to support this code even for C++11 and C++14 modes. So I'm closing this as WONTFIX.
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950 --- Comment #21 from Jonathan Wakely --- What we said is to use -fsanitize=undefined when linking, not add -lubsan manually. I don't know how I could have said that more clearly than comment 6. This is not different to other flags, there are plenty of other flags that are needed both when compiling and linking.
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #22 from John Kanapes ---
OK. Removed -lubsan. Added -fsanitize=undefined to linking. Same result as all the other flags. It took you 4 posts to explain to me what to do. It took me 4 posts to understand what you were talking about. You should explain better.
[Bug gcov-profile/101618] [GCOV] Wrong coverage caused by call site in a "for" statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101618 Yang Wang changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #1 from Yang Wang --- fixed in later version
[Bug c++/105838] [10/11/12/13 Regression] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838 Richard Biener changed: What|Removed |Added Summary|g++ 12.1.0 runs out of |[10/11/12/13 Regression] |memory or time when |g++ 12.1.0 runs out of |building const std::vector |memory or time when |of std::strings |building const std::vector ||of std::strings Blocks||93199 Target Milestone|--- |10.4 Priority|P3 |P2 CC||ebotcazou at gcc dot gnu.org, ||rguenth at gcc dot gnu.org --- Comment #4 from Richard Biener --- Memory usage is from cleanup_empty_eh_merge_phis which deals with a very large number of incoming edges, recording the edge/var mappings. This likely runs into /* The post-order traversal may lead to quadraticness in the redirection of incoming EH edges from inner LPs, so first try to walk the region tree from inner to outer LPs in order to eliminate these edges. */ where we end up re-directing more and more edges again and again. Still the peak memory use is odd, but it might be simply GC garbage piling up in the CFG manipulation odyssee. It's removal of MNT regions - with just 3 elements we go in ehcleanup1 from Before removal of unreachable regions: Eh tree: 25 must_not_throw 1 cleanup land:{12,} 24 cleanup 23 must_not_throw 2 cleanup land:{11,} 22 must_not_throw 3 cleanup land:{10,} 21 must_not_throw 4 cleanup land:{9,} 20 must_not_throw 5 cleanup land:{1,} 19 must_not_throw 6 cleanup land:{8,} 18 must_not_throw 7 cleanup land:{2,} 17 must_not_throw 8 cleanup land:{7,} 16 must_not_throw 9 cleanup land:{3,} 15 must_not_throw 10 cleanup land:{6,} 14 must_not_throw 11 cleanup land:{5,} 13 must_not_throw 12 cleanup land:{4,} to After removal of unreachable regions: Eh tree: 1 cleanup land:{12,} 2 cleanup land:{11,} 3 cleanup land:{10,} 4 cleanup land:{9,} 5 cleanup land:{1,} 6 cleanup land:{8,} 7 cleanup land:{2,} 8 cleanup land:{7,} 9 cleanup land:{3,} 10 cleanup land:{6,} 11 cleanup land:{5,} 12 cleanup land:{4,} but we do this in a sub-optimal order. 
Axing the first walk: for (i = vec_safe_length (cfun->eh->lp_array) - 1; i >= 1; --i) { lp = (*cfun->eh->lp_array)[i]; if (lp) changed |= cleanup_empty_eh (lp); } fixes this but it will go against the PR93199 fix in r10-5868-g5eaf0c498f718f, which the followup r11-3234-gaab6194d0898f5 preserved. I fear the optimal order is different for the clobber optimizations and the edge redirection overhead. In any case a fix should be evaluated against the PR93199 testcase as well. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93199 [Bug 93199] [9 Regression] Compile time hog in sink_clobbers
[Bug c++/105838] [10/11/12/13 Regression] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838 Richard Biener changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #5 from Richard Biener --- btw, the unincluded testcase ended up too small, not matching the posted numbers (I had to hit reload and cut it further at that point ...).
[Bug c++/105838] [10/11/12/13 Regression] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org,
                   |                            |redi at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek ---
Note, for say

#include <string>
#include <vector>

void foo (const std::vector<std::string> &);

int
main ()
{
  const std::vector<std::string> lst = {
    "aahing", "aaliis", "aarrgh", "abacas", "abacus", "abakas",
    "abamps", "abands", "abased", "abaser", "abases", "abasia" };
  foo (lst);
}

one gets terrible code from both g++ and clang++; in both cases it is serial code calling many std::string ctors with the string literal arguments that perhaps later on are inlined. Over 21000 times in a row. That also means over 21000 memory allocations etc.

For your game, the obvious first question would be whether you really need std::vector of std::string in this case and whether a normal array of const char * strings wouldn't be better, since that can be initialized at compile time. Or, if you really need std::vector, whether it wouldn't be better to use an array of const char * and build the vector from it (sizeof (arr) / sizeof (arr[0]) to reserve that many elts in the vector, then a loop that will construct the std::string objects and move them into the list).

On the compiler side, a question is whether we shouldn't detect this kind of initializer and, if it has more than some param-determined number of elements of the same type / kind (or at least a large sequence of such), not emit

  std::allocator<char>::allocator (&D.37541);
  try
    {
      std::__cxx11::basic_string<char>::basic_string<> (_4, "aahing", &D.37541);
      D.37581 = D.37581 + 32;
      D.37582 = D.37582 + -1;
      _5 = D.37581;
      try
        {
          std::allocator<char>::allocator (&D.37543);
          try
            {
              std::__cxx11::basic_string<char>::basic_string<> (_5, "aaliis", &D.37543);
              D.37581 = D.37581 + 32;
              D.37582 = D.37582 + -1;
              _6 = D.37581;
              try
                {
                  ...

but a loop. It doesn't have to be just for the STL types; if we have

  struct S { S (int); ... };
  const S s[] = { 1, 3, 22, 42, 132, -12, 18, 19, 32, 0, 25, ... };

then again there should be some upper limit over which we'd just emit:

  const S s[count];
  static const int stemp[count] = { 1, 3, 22, 42, 132, -12, 18, 19, 32, 0, 25, ... };
  for (size_t x = 0; x < count; ++x)
    S (&s[x], stemp[x]);

or so (of course, with destruction possibility if some ctor may throw).
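
On the user side, the suggestion to keep the words in a plain array of const char * and build the vector from it could look roughly like the sketch below. The names `words` and `make_word_list` are made up for illustration, and the twelve strings are just the ones from the example in this comment, not the reporter's full list.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Plain array of string literals: initialized at compile time, no
// constructor calls are emitted for it.
static const char *const words[] = {
  "aahing", "aaliis", "aarrgh", "abacas", "abacus", "abakas",
  "abamps", "abands", "abased", "abaser", "abases", "abasia"
};

// Builds the vector at run time with one reserve and one small loop,
// instead of one inlined std::string constructor call per element.
std::vector<std::string> make_word_list()
{
  const std::size_t count = sizeof (words) / sizeof (words[0]);
  std::vector<std::string> lst;
  lst.reserve (count);
  for (std::size_t i = 0; i < count; ++i)
    lst.emplace_back (words[i]);
  return lst;
}
```

The run-time work (one allocation for the vector plus the string constructions) is the same in spirit, but the compiler only has to process a single loop rather than thousands of consecutive constructor calls with their EH cleanup regions.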
[Bug target/105920] __builtin_cpu_supports ("f16c") should check AVX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105920 H.J. Lu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED Target Milestone|--- |11.4 --- Comment #2 from H.J. Lu --- Fixed for GCC 13 by https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=751f306688508b08842d0ab967dee8e6c3b91351 Fixed for GCC 12.2 by: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4b06b7304066fb1016e017d15e189f2e745dceae Fixed for GCC 11.4 by https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=30c1cde3adec938606cd49b1b4a262590b496719
[Bug middle-end/105638] Redundant stores aren't removed by DSE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105638 H.J. Lu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED Target Milestone|--- |13.0 --- Comment #3 from H.J. Lu --- Fixed.
[Bug target/105960] [12/13 Regression] Crash in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105960 --- Comment #6 from H.J. Lu --- This is caused by r12-5771.
[Bug target/105974] New: [13 Regression] ICE: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in arm_bfi_1_p, at config/arm/arm.cc:10214
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105974

Bug ID: 105974
Summary: [13 Regression] ICE: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in arm_bfi_1_p, at config/arm/arm.cc:10214
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Keywords: build, ice-on-valid-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: armv7a-hardfloat-linux-gnueabi

Created attachment 53134
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53134&action=edit
reduced testcase

This currently breaks build with RTL checking enabled.

Compiler output:
$ /repo/build-gcc-trunk-armv7a-hardfloat/./gcc/cc1 -O2 -march=armv7-a+vfpv4 testcase.c
 __gnu_fractqquda
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> {heap 932k} {heap 932k} {heap 932k} {heap 1212k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k}Streaming LTO
 {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k} {heap 1684k}Assembling functions:
 __gnu_fractqquda
during RTL pass: combine
testcase.c: In function '__gnu_fractqquda':
testcase.c:9:1: internal compiler error: RTL check: expected elt 0 type 'i' or 'n', have 'w' (rtx const_int) in arm_bfi_1_p, at config/arm/arm.cc:10214
    9 | }
      | ^
0x71d11e rtl_check_failed_type2(rtx_def const*, int, int, int, char const*, int, char const*)
        /repo/gcc-trunk/gcc/rtl.cc:907
0x7d01d3 arm_bfi_1_p
        /repo/gcc-trunk/gcc/config/arm/arm.cc:10214
0x14406d6 arm_bfi_p
        /repo/gcc-trunk/gcc/config/arm/arm.cc:10255
0x14406d6 arm_rtx_costs_internal
        /repo/gcc-trunk/gcc/config/arm/arm.cc:11027
0x14406d6 arm_rtx_costs
        /repo/gcc-trunk/gcc/config/arm/arm.cc:12058
0x102c33e rtx_cost(rtx_def*, machine_mode, rtx_code, int, bool)
        /repo/gcc-trunk/gcc/rtlanal.cc:4629
0x1b69e98 set_src_cost
        /repo/gcc-trunk/gcc/rtl.h:2943
0x1b69e98 distribute_and_simplify_rtx
        /repo/gcc-trunk/gcc/combine.cc:10013
0x1b77941 simplify_logical
        /repo/gcc-trunk/gcc/combine.cc:7103
0x1b77941 combine_simplify_rtx
        /repo/gcc-trunk/gcc/combine.cc:6330
0x1b79d19 subst
        /repo/gcc-trunk/gcc/combine.cc:5605
0x1b7d3d7 try_combine
        /repo/gcc-trunk/gcc/combine.cc:3288
0x1b85dd5 combine_instructions
        /repo/gcc-trunk/gcc/combine.cc:1266
0x1b85dd5 rest_of_handle_combine
        /repo/gcc-trunk/gcc/combine.cc:14976
0x1b85dd5 execute
        /repo/gcc-trunk/gcc/combine.cc:15021
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ /repo/build-gcc-trunk-armv7a-hardfloat/./gcc/xgcc -v
Using built-in specs.
COLLECT_GCC=/repo/build-gcc-trunk-armv7a-hardfloat/./gcc/xgcc
Target: armv7a-hardfloat-linux-gnueabi
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --with-float=hard --with-fpu=vfpv4 --with-arch=armv7-a --with-sysroot=/usr/armv7a-hardfloat-linux-gnueabi --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=armv7a-hardfloat-linux-gnueabi --with-ld=/usr/bin/armv7a-hardfloat-linux-gnueabi-ld --with-as=/usr/bin/armv7a-hardfloat-linux-gnueabi-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r13-1089-20220614140553-g8f6c317b3a1-checking-yes-rtl-df-extra-armv7a-hardfloat
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.0 20220614 (experimental) (GCC)
[Bug target/105975] New: OpenMP/nvptx offloading: 'internal compiler error: in maybe_legitimize_operand, at optabs.cc:7785'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105975

Bug ID: 105975
Summary: OpenMP/nvptx offloading: 'internal compiler error: in maybe_legitimize_operand, at optabs.cc:7785'
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: openmp
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tschwinge at gcc dot gnu.org
CC: jakub at gcc dot gnu.org, rsandifo at gcc dot gnu.org, vries at gcc dot gnu.org
Target Milestone: ---
Target: nvptx

The recent commit r13-1068-g1d205dbac1e1754c01c22a31bd1688126545401e "Factor out common internal-fn idiom" causes a class of ICEs in OpenMP/nvptx offloading compilation: 'during RTL pass: expand', 'internal compiler error: in maybe_legitimize_operand, at optabs.cc:7785', seen for a lot of libgomp OpenMP/nvptx offloading test cases (with '-O1' and higher).

0xb1b0b3 maybe_legitimize_operand
        [...]/source-gcc/gcc/optabs.cc:7785
0xb1b0b3 maybe_legitimize_operands(insn_code, unsigned int, unsigned int, expand_operand*)
        [...]/source-gcc/gcc/optabs.cc:7936
0xb1b139 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
        [...]/source-gcc/gcc/optabs.cc:7955
0xb1a8b8 maybe_expand_insn(insn_code, unsigned int, expand_operand*)
        [...]/source-gcc/gcc/optabs.cc:7998
0xb1a8b8 expand_insn(insn_code, unsigned int, expand_operand*)
        [...]/source-gcc/gcc/optabs.cc:8029
0x95dcb3 expand_fn_using_insn
        [...]/source-gcc/gcc/internal-fn.cc:193
0x6d3ee7 expand_call_stmt
        [...]/source-gcc/gcc/cfgexpand.cc:2737
0x6d3ee7 expand_gimple_stmt_1
        [...]/source-gcc/gcc/cfgexpand.cc:3869

For extra entertainment: when running with '-wrapper "$GDB",-q,--args', we get '[Inferior 1 (process [...]) exited normally]'... (Maybe Valgrind could help? Unless someone directly pinpoints the issue, of course.)

I've not yet determined whether it's a latent problem just exposed by this commit, or whether the commit itself has an issue.
It's not magically fixed by the related subsequent commit r13-1069-gf8baf4004ef965ce7a9edf6d2f5eb99adb15803a "Add a general mapping from internal fns to target insns".

'gcc/internal-fn.cc':

193 expand_insn (icode, opno, ops);

'gcc/optabs.cc':

8026 expand_insn (enum insn_code icode, unsigned int nops,
8027              class expand_operand *ops)
8028 {
8029   if (!maybe_expand_insn (icode, nops, ops))

7995 maybe_expand_insn (enum insn_code icode, unsigned int nops,
7996                    class expand_operand *ops)
7997 {
7998   rtx_insn *pat = maybe_gen_insn (icode, nops, ops);

7951 maybe_gen_insn (enum insn_code icode, unsigned int nops,
7952                 class expand_operand *ops)
7953 {
7954   gcc_assert (nops == (unsigned int) insn_data[(int) icode].n_generator_args);
7955   if (!maybe_legitimize_operands (icode, 0, nops, ops))

7935   /* Otherwise try legitimizing the operand on its own. */
7936   if (j == i && !maybe_legitimize_operand (icode, opno + i, &ops[i]))

7784 case EXPAND_OUTPUT:
7785   gcc_assert (mode != VOIDmode);
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950 --- Comment #23 from John Kanapes --- Hi, I have not been able to recreate the issue with simpler programs that use the same resources. I will need to upload my sources. Is it OK to upload a tar.gz archive with a test directory with the sources and a makefile? What do you do with the sources after the ticket? TIA
[Bug libstdc++/62187] std::string==const char* could compare sizes first
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62187

Jonathan Wakely changed:

What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org
Status|NEW |ASSIGNED

--- Comment #7 from Jonathan Wakely ---
(In reply to Jonathan Wakely from comment #5)
> I've also created an LWG issue about this,

Rather than a new issue, this was added to https://wg21.link/lwg2852

The resolution was to confirm that operator== doesn't need to call compare if it can determine the result another way. That means we can do the length check unconditionally.
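To illustrate the length-check idea, here is a minimal sketch of comparing a std::string against a C string by size first. The helper name `equals_length_first` is hypothetical and this is not the libstdc++ implementation; it only assumes that an unequal length already decides the result, so the byte comparison can be skipped in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper for illustration only; not the libstdc++ operator==.
// Compares sizes first and only looks at the bytes when the sizes match.
bool equals_length_first(const std::string& lhs, const char* rhs)
{
    assert(rhs != nullptr);  // precondition: rhs is a null-terminated string

    const std::size_t rhs_len = std::strlen(rhs);
    if (lhs.size() != rhs_len)
        return false;  // different lengths can never compare equal

    // Same length: a single bytewise comparison decides the result.
    return std::memcmp(lhs.data(), rhs, rhs_len) == 0;
}
```

Note that the sketch still pays for one strlen on the C string; the saving is that strings of different length are rejected without comparing any characters.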
[Bug middle-end/101836] __builtin_object_size(P->M, 1) where M is an array and the last member of a struct fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836

--- Comment #25 from qinzhao at gcc dot gnu.org ---
So, based on all the discussion so far, how about the following: add a new GCC option, -fstrict-flex-arrays=[0|1|2|3], where:

-fstrict-flex-arrays=0: treat all trailing arrays as flexible arrays (the default behavior);
-fstrict-flex-arrays=1: treat only [], [0], and [1] as flexible arrays;
-fstrict-flex-arrays=2: treat only [] and [0] as flexible arrays;
-fstrict-flex-arrays=3: treat only [] as a flexible array (the strictest level).

Any comments?
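The four proposed levels can be written down as a small decision function. This is only a model of the proposal in the comment above, not compiler code: the enum `TrailingForm` and the function `treated_as_flexible` are hypothetical names introduced for illustration, and returning false for an out-of-range level is an assumption of the sketch.

```cpp
#include <cassert>

// Hypothetical classification of how a trailing array member is declared.
enum class TrailingForm {
    Unbounded,  // T a[];
    Zero,       // T a[0];
    One,        // T a[1];
    Other       // T a[N]; with N > 1
};

// Models the proposed -fstrict-flex-arrays=LEVEL semantics: returns true when
// a trailing array of the given form would be treated as a flexible array.
// Levels outside 0..3 are not part of the proposal; this sketch returns false.
constexpr bool treated_as_flexible(int level, TrailingForm form)
{
    return level == 0 ? true
         : level == 1 ? form != TrailingForm::Other
         : level == 2 ? (form == TrailingForm::Unbounded || form == TrailingForm::Zero)
         : level == 3 ? form == TrailingForm::Unbounded
         : false;
}

// Compile-time spot checks against the table in the proposal.
static_assert(treated_as_flexible(0, TrailingForm::Other), "level 0 accepts any trailing array");
static_assert(!treated_as_flexible(3, TrailingForm::Zero), "level 3 accepts only []");
```

Each level accepts a subset of the forms accepted by the level below it, which is why the proposal can describe level 3 as the strictest.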
[Bug target/105960] [12/13 Regression] Crash in 32-bit mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105960 H.J. Lu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |hjl.tools at gmail dot com --- Comment #7 from H.J. Lu --- Created attachment 53135 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53135&action=edit A patch Try this.
[Bug c/105970] ICE in ix86_function_arg, at config/i386/i386.cc:3351
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105970

--- Comment #2 from H.J. Lu ---
(In reply to Uroš Bizjak from comment #1)
> Probably something like:
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 3d189e124e4..f158cc3aaea 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -3348,7 +3348,7 @@ ix86_function_arg (cumulative_args_t cum_v, const
> function_arg_info &arg)
>    if (POINTER_TYPE_P (arg.type))
>      {
>        /* This is the pointer argument. */
> -      gcc_assert (TYPE_MODE (arg.type) == Pmode);
> +      gcc_assert (TYPE_MODE (arg.type) == ptr_mode);

This looks reasonable since pointer mode should be ptr_mode.

>        /* It is at -WORD(AP) in the current frame in interrupt and
>           exception handlers. */
>        reg = plus_constant (Pmode, arg_pointer_rtx, -UNITS_PER_WORD);
>
> Pointer mode and Pmode can be distinct for x32 target. However, I have no
> idea what goes into interrupt frame for x32. Let's ask HJ.
[Bug libstdc++/59048] operator== between std::string and const char* slower than strcmp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59048 Jonathan Wakely changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

Sam James changed:

What|Removed |Added
CC||sam at gentoo dot org

--- Comment #24 from Sam James ---
Please be polite on these bugs. There's a lot of documentation online about how to use UBSan.

It's not ideal to upload a tarball with all of the bits, but if it's what's needed, then I guess so be it.

Some build systems, like Meson, make it easier to enable sanitizers. GCC's bug tracker isn't for general support on how to use build systems and flags.

The bug tracker is public and I don't think one can delete their own attachments.

Are you saying that when you use -fsanitize=undefined and run your program, it gets SIGILL'd?
[Bug c/105950] > O2 optimization causes runtime (SIGILL) during main initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105950

--- Comment #25 from Jonathan Wakely ---
(In reply to John Kanapes from comment #22)
> It took you 4 posts to explain me what to do.
> It took me 4 posts to understand what you were talking about.
> You should explain better.

You should read better. Comment 3 is perfectly clear.

"For UBSan, you can't just compile with -fsanitize=undefined, you need to link with that flag as well."

(In reply to John Kanapes from comment #23)
> What do you do with the sources after the ticket?

They will stay attached here. If you don't want them to be public, you need to reduce it to something smaller that still shows the bug (which you've said you can't) or put them somewhere online and persuade somebody here to download them and try to reproduce and reduce it for you.