[Bug c++/111286] [12/13/14 Regression] ICE on functional cast empty brace-init-list to const array reference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111286

Gayathri Gottumukkala changed:

What|Removed |Added
CC||gayathri.gottumukkala.27@gmail.com

--- Comment #3 from Gayathri Gottumukkala ---
I think this issue is related to the attempt to create a temporary array of references, which is not allowed in C++. In C++11 and later, you can find this information in section 8.3.2, paragraph 4:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf

According to the above document, "There shall be no references to references, no arrays of references, and no pointers to references."

To address the compilation error without modifying the code structure, we can make use of a temporary array of const A objects.

struct A {
  A() noexcept {}
};

void foo() {
  using T = const A[1];
  T{};
}
[Bug libstdc++/111390] 'make check-compile' target is not useful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111390 --- Comment #5 from Richard Biener --- Just to add a compile-only "override" would be useful to do bare testing of cross compilers where no (or an incomplete) runtime is available to reduce the amount of noise produced (you still get complaints about missing headers of course). Not sure if easily doable across all testsuites though. I agree not so useful for libstdc++
[Bug tree-optimization/111393] ICE: Segmentation fault src/gcc/toplev.cc:314
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111393 Richard Biener changed: What|Removed |Added Version|unknown |13.1.0 --- Comment #7 from Richard Biener --- You might also want to update the compiler, GCC 13.1.0 is old, 13.2.0 exists for quite a while.
[Bug libstdc++/111390] libstdc++-v3/scripts/check_compile script is not useful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111390 Jonathan Wakely changed: What|Removed |Added Summary|'make check-compile' target |libstdc++-v3/scripts/check_ |is not useful |compile script is not ||useful --- Comment #6 from Jonathan Wakely --- I've updated the summary to be clear that this is about the current incarnation of the libstdc++ feature, which has been broken since the default changed from -std=gnu++98 to -std=gnu++14 many years ago, or even earlier.
[Bug tree-optimization/111397] Spurious warning "'({anonymous})' is used uninitialized" when calling a __returns_twice__ function (-Wuninitialized -O2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111397 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Last reconfirmed||2023-09-13 Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 --- Comment #2 from Richard Biener --- (In reply to Andrew Pinski from comment #1) > Looks loop copy header change which allowed the warning not to happen. > > The warning is about the argument of test_setjmpex. Because GCC does not > realize __builtin_frame_address cannot jump to the test_setjmpex ... > > In the case of GCC 12-13, the copy of the loop header happens during > thread-full rather than earlier and inserts: > _4(ab) = _11(D); > > Which is what is warned about. > _11(D) does not get proped into the phi ... We can't propagate because /* Similarly if DEST flows in from an abnormal edge then the copy cannot be propagated. If we know we do not propagate into a PHI argument this does not apply. */ else if (!dest_not_phi_arg_p && TREE_CODE (dest) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (dest)) return false; that's still not fine-grained enough - the case we cannot propagate is when we propagate into a PHI argument for an abnormal edge. The diagnostic doesn't happen on trunk, I still have a patch doing the propagation.
[Bug target/111320] RISC-V: Failed combine extend + vfwredosum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111320 --- Comment #1 from JuzheZhong ---
This affects not only in-order reductions but also unordered reductions:
https://godbolt.org/z/sn5jbWPbd

#include <stdint.h>

int foo (int16_t * __restrict a, int n, int * __restrict cond)
{
  int r = 0;
  for (int i = 0; i < 8; i++)
    if (cond[i])
      r += a[i];
  return r;
}

int foo2 (int16_t * __restrict a, int n, int * __restrict cond)
{
  int r = 0;
  for (int i = 0; i < 8; i++)
    r += a[i];
  return r;
}

Neither loop is combined into a widening reduction.
[Bug c/111398] GCC should warn if a struct with flexible array member is declared static or onstack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111398 Richard Biener changed: What|Removed |Added Keywords||diagnostic Version|unknown |14.0 Severity|normal |enhancement --- Comment #1 from Richard Biener --- The C standard explicitly makes these cases valid. I agree it's probably unintended in most cases.
[Bug c++/111399] New: Sanitizer code generation smarter than warnings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111399

Bug ID: 111399
Summary: Sanitizer code generation smarter than warnings
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: david at westcontrol dot com
Target Milestone: ---

Given this code:

int sign(int x) {
    if (x < 0) return -1;
    if (x == 0) return 0;
    if (x > 0) return 1;
}

and compiled with "-O2 -Wall", gcc is unable to see that all possible cases for "x" are covered, so it generates a "control reaches end of non-void function [-Wreturn-type]" warning. It would be nice if gcc could see this is a false positive, but analysis and warnings can't be perfect.

However, if I add the flag "-fsanitize=undefined", the compiler is smart enough to see that all cases are covered, and there is no call to __ubsan_handle_missing_return generated.

If the sanitizer code generation can see that all cases are covered, why can't the -Wreturn-type warning detection? I'm guessing it comes down to the ordering of compiler passes and therefore the level of program analysis information available at that point. But perhaps the -Wreturn-type pass could be done later when the information is available?
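Independent of any change to GCC, a common way to sidestep the warning is to make the last case the unconditional fall-through, so that every path ends in a return statement. The following is only a sketch of that workaround; `sign_total` and `check_sign_total` are hypothetical names that do not appear in the report.

```c
#include <assert.h>
#include <limits.h>

/* Same results as sign() in the report, but the last case is the
   fall-through, so no path reaches the closing brace. */
int sign_total(int x)
{
    if (x < 0)
        return -1;
    if (x == 0)
        return 0;
    return 1;   /* x > 0 is the only remaining possibility */
}

/* Small self-check (hypothetical helper). */
void check_sign_total(void)
{
    assert(sign_total(INT_MIN) == -1);
    assert(sign_total(-5) == -1);
    assert(sign_total(0) == 0);
    assert(sign_total(INT_MAX) == 1);
}
```

Calling `check_sign_total()` from a test driver exercises the three cases, including both extremes of `int`.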
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #31 from Richard Biener --- On GIMPLE an "undefined" operand representation would be the default definition of an SSA name with the appropriate type. That's a somewhat "heavy" representation and it also doesn't fit the target hook return value nicely, but we could handle a NULL_TREE return value from the target hook in the way to create such SSA name.
[Bug ada/110488] [13/14 regression] legal deferred constant rejected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110488 Eric Botcazou changed: What|Removed |Added Summary|Legal deferred constant |[13/14 regression] legal |rejected|deferred constant rejected Target Milestone|--- |13.3 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2023-09-13 CC||ebotcazou at gcc dot gnu.org --- Comment #1 from Eric Botcazou --- This compiles up to GCC 12.
[Bug ada/110488] [13/14 regression] legal deferred constant rejected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110488 Eric Botcazou changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ebotcazou at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #2 from Eric Botcazou --- Investigating.
[Bug c/111400] New: Missing return sanitization only works in C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111400

Bug ID: 111400
Summary: Missing return sanitization only works in C++
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: david at westcontrol dot com
Target Milestone: ---

With C++ and -fsanitize=return, the code:

int foo(void) { }

generates a call to __ubsan_handle_missing_return.

For C, there is no sanitizer call - just a simple "ret" instruction. This is, of course, because in C (unlike C++), falling off the end of a non-void function is legal and defined behaviour, as long as caller code does not try to use the non-existent return value. But just like in C++, it is almost certainly an error in the C code if control flow ever falls off the end of a non-void function.

Could -fsanitize=return be added to C? It should not be included by -fsanitize=undefined in C, since the behaviour is actually allowed, but it would still be a useful option that could be enabled individually.
[Bug tree-optimization/111397] Spurious warning "'({anonymous})' is used uninitialized" when calling a __returns_twice__ function (-Wuninitialized -O2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111397 --- Comment #3 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:92ea12ea99fce546772a40b7bbc2ea850db9b1be commit r14-3916-g92ea12ea99fce546772a40b7bbc2ea850db9b1be Author: Richard Biener Date: Wed Sep 13 09:28:34 2023 +0200 tree-optimization/111397 - missed copy propagation involving abnormal dest The following extends the previous enhancement to copy propagation involving abnormals. We can easily replace abnormal uses by not abnormal uses and only need to preserve the abnormals in PHI arguments flowing in from abnormal edges. This changes the may_propagate_copy argument indicating we are not propagating into a PHI node to indicate whether we know we are not propagating into a PHI argument from an abnormal PHI instead. PR tree-optimization/111397 * tree-ssa-propagate.cc (may_propagate_copy): Change optional argument to specify whether the PHI destination doesn't flow in from an abnormal PHI. (propagate_value): Adjust. * tree-ssa-forwprop.cc (pass_forwprop::execute): Indicate abnormal PHI dest. * tree-ssa-sccvn.cc (eliminate_dom_walker::before_dom_children): Likewise. (process_bb): Likewise. * gcc.dg/uninit-pr111397.c: New testcase.
[Bug tree-optimization/111397] [12/13 Regression] Spurious warning "'({anonymous})' is used uninitialized" when calling a __returns_twice__ function (-Wuninitialized -O2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111397 Richard Biener changed: What|Removed |Added Known to fail||12.3.1, 13.2.1 Summary|Spurious warning|[12/13 Regression] Spurious |"'({anonymous})' is used|warning "'({anonymous})' is |uninitialized" when calling |used uninitialized" when |a __returns_twice__ |calling a __returns_twice__ |function (-Wuninitialized |function (-Wuninitialized |-O2)|-O2) Priority|P3 |P2 Target Milestone|--- |12.4 Blocks||24639 Known to work||11.4.1, 14.0 --- Comment #4 from Richard Biener --- Fixed on trunk, I know it works on the 13 branch, eventually will backport. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639 [Bug 24639] [meta-bug] bug to track all Wuninitialized issues
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #32 from JuzheZhong --- (In reply to Richard Biener from comment #31)
> On GIMPLE an "undefined" operand representation would be the default
> definition of an SSA name with the appropriate type. That's a somewhat
> "heavy" representation and it also doesn't fit the target hook return value
> nicely,
> but we could handle a NULL_TREE return value from the target hook in the
> way to create such SSA name.

Thanks Richi. How is this special SSA name represented in RTX, and how can I recognize it as an "undefined" value at the "expand" stage?

I am wondering whether my approach is suitable: passing a scalar 0 as the ELSE value, which is easily recognized in the RTL backend at the "expand" stage. As you can see, there is otherwise one more move instruction inside the loop, which hurts vector performance a lot, so I want to find a quick way to fix it for now.
[Bug c++/111379] comparison between unequal pointers to void should be illegal during constant evaluation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111379 Xi Ruoyao changed: What|Removed |Added CC||xry111 at gcc dot gnu.org --- Comment #2 from Xi Ruoyao --- (In reply to Jiang An from comment #1) > There's (or will be) a new DR CWG2749 which tentatively ready now. > https://cplusplus.github.io/CWG/issues/2749.html > > It seems that the old resolution in CWG2526 was wrong, and the comparison > should be constexpr-friendly. > > BTW I don't think there was anything specifying that "the comparison would > have *undefined* behaviour" before CWG2526. It is (or was) unspecified, not undefined. And the standard explicitly disallows "a relational operator where the result is unspecified" in [expr.const].
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #33 from JuzheZhong ---
Would it be reasonable to do it this way?

ELSE VALUE = make_temp_ssa_name (vectype, NULL, "undefine_");

Then in the later "expand" stage:

define_expand "cond_len_xxx"
...
if (REG_EXPR (operand) == "undefine")
  {
    gen rvv insns with no else value
  }

Thanks.
[Bug c++/111379] comparison between unequal pointers to void should be illegal during constant evaluation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111379 --- Comment #3 from Xi Ruoyao --- If CWG 2749 is accepted we should just close this as WONTFIX.
[Bug middle-end/111324] More optimization about "(X * Y) / Y"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111324 --- Comment #4 from Jiu Fu Guo --- (In reply to Jiu Fu Guo from comment #3) > A patch is posted: > https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629534.html It is not for this PR. Sorry for typo.
[Bug middle-end/111324] More optimization about "(X * Y) / Y"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111324 --- Comment #5 from Jiu Fu Guo --- (In reply to Andrew Pinski from comment #2) > Confirmed. > > So using the local range in this case is ok. There might be only a few times > we don't want to use it though in match. Agree, "get_range_query" would be more useful for most cases. Through a quick look at match.pd, there are another two patterns that use "get_global_range_query". Some concerns about those patterns, so those patterns may not need to be updated. * (T)(A)+cst -->(T)(A+cst): I'm wondering if this transformation is really in favor of PPC. e.g. "return (long) x1 + 40;" could save one "extend-insn" less than "return (long)(x1 + 40);" * For pattern "((x * cst) + cst1) * cst2": it seems this pattern does not affect any cases. I mean this optimization is done by other parts (before match.pd).
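As background for why range information matters to this PR: `(X * Y) / Y` can only be folded to `X` when the multiplication is known not to wrap. The sketch below is not taken from the PR; it assumes a 32-bit `unsigned int` and uses the hypothetical names `mul_div` and `check_mul_div` to show one case of each kind.

```c
#include <assert.h>
#include <stdint.h>

/* Computes (x * y) / y in 32-bit unsigned arithmetic; y must be nonzero. */
uint32_t mul_div(uint32_t x, uint32_t y)
{
    return (x * y) / y;
}

/* Small self-check (hypothetical helper). */
void check_mul_div(void)
{
    /* No wraparound: the expression is just x. */
    assert(mul_div(7u, 3u) == 7u);
    /* 0x80000000 * 2 wraps to 0, so the result is 0, not x. */
    assert(mul_div(0x80000000u, 2u) == 0u);
}
```

A range query that proves the product fits in the type is what rules out the second case and makes the fold valid.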
[Bug tree-optimization/111387] ICE on valid code at -O2 and -O3: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111387 --- Comment #2 from Richard Biener --- The main issue is that when BB SLP splits the function when we already entered a cycle we have to go back and split at the containing cycles (recursively) because otherwise we can end up seeing "externals" from backedges. This particular testcase can be fixed by iterating over blocks in a better way but the general issue remains that the defensive code in vect_get_and_check_slp_defs relies on dominance info which doesn't work well for irreducible regions we are faced with here. The testcase also shows the recent honoring of ->dont_vectorize pessimizes the cases where we applied if-conversion for loop vectorization since that sets ->dont_vectorize on the original loop and that never gets cleared.
[Bug c/111400] Missing return sanitization only works in C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111400 Richard Biener changed: What|Removed |Added Last reconfirmed||2023-09-13 Ever confirmed|0 |1 CC||jsm28 at gcc dot gnu.org Status|UNCONFIRMED |NEW Version|unknown |14.0 Severity|normal |enhancement --- Comment #1 from Richard Biener --- Confirmed. Note C17 disallows a return without an expression for a function that returns a value, not sure if that means falling off the function without a return (value) is still OK, it at least feels inconsistent.
[Bug c++/111399] Bogus -Wreturn-type diagnostic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111399 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Keywords||diagnostic Summary|Sanitizer code generation |Bogus -Wreturn-type |smarter than warnings |diagnostic Last reconfirmed||2023-09-13 Version|unknown |14.0 --- Comment #1 from Richard Biener --- We do instrument the missed return but it gets later optimized away.
[Bug c/111401] New: Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401

Bug ID: 111401
Summary: Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: juzhe.zhong at rivai dot ai
Target Milestone: ---

Here is a case where I think the loop vectorizer misses an optimization:
https://godbolt.org/z/x5sjdenhM

double foo2 (double *__restrict a, double init, int *__restrict cond, int n)
{
  for (int i = 0; i < n; i++)
    if (cond[i])
      init += a[i];
  return init;
}

It generates the GIMPLE IR as follows:

_60 = .SELECT_VL (ivtmp_58, 4);
...
vect__ifc__35.14_56 = .VCOND_MASK (mask__23.10_50, vect__8.13_54, { 0.0, 0.0, 0.0, 0.0 });
_36 = .MASK_LEN_FOLD_LEFT_PLUS (init_20, vect__ifc__35.14_56, { -1, -1, -1, -1 }, _60, 0);

The mask of MASK_LEN_FOLD_LEFT_PLUS is the dummy mask {-1, -1, ..., -1}. I think we should forward the mask of VCOND_MASK into MASK_LEN_FOLD_LEFT_PLUS; then we can eliminate the VCOND_MASK.

I don't know where the best place to do this optimization is. Should it be match.pd, or the loop vectorizer code?

Thanks.
[Bug c/111400] Missing return sanitization only works in C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111400 --- Comment #2 from David Brown --- (In reply to Richard Biener from comment #1)
> Confirmed. Note C17 disallows a return without an expression for a function
> that returns a value, not sure if that means falling off the function
> without a return (value) is still OK, it at least feels inconsistent.

This has all remained unchanged from C99 to C23 (draft), I believe, which makes things easier!

As far as I can tell, the relevant point in the standards is 6.9.1p12, "Function definitions", which says "Unless otherwise specified, if the } that terminates a function is reached, and the value of the function call is used by the caller, the behaviour is undefined". So while a non-void function cannot have a return statement without an expression (6.8.6.4p1), control flow /can/ run off the terminating }.

I think this is perhaps a concession to older pre-void C code, when a function that does not have a return value would still be declared to return "int". Thus I think gcc's lack of a sanitizer here is technically accurate - but not helpful, unless you are working with 35 year old code!
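The distinction drawn from 6.9.1p12 can be illustrated with a small C sketch. It is an illustration only: `bump`, `calls` and `check_bump` are hypothetical names, and the code must be compiled as C, not C++, where reaching the closing brace would itself be undefined. Compilers may emit a -Wreturn-type warning for `bump`, which is expected here.

```c
#include <assert.h>

static int calls;

/* Old-style "procedure": declared to return int but never returns a value. */
int bump(void)
{
    calls++;
}   /* reaching this brace is permitted in C */

/* Small self-check (hypothetical helper). */
void check_bump(void)
{
    bump();     /* result not used: defined behaviour in C */
    bump();
    assert(calls == 2);
    /* Writing "int v = bump();" instead would use the missing value,
       which is the undefined case described in 6.9.1p12. */
}
```

A C version of -fsanitize=return, as requested in this PR, would flag the fall-through in `bump` at run time even though both calls are technically valid.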
[Bug target/111354] [7/10/12 regression] The instructions of the DPDK demo program are different and run time increases.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111354 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #5 from Hongtao.liu ---

void rte_mov128blocks(uint8_t *dst, const uint8_t *src, size_t n)
{
    __m256i ymm0, ymm1, ymm2, ymm3;

    while (n >= 128) {
        ymm0 = _mm256_loadu_si256((const __m256i *)(const void *)
                                  ((const uint8_t *)src + 0 * 32));
        n -= 128;
        ymm1 = _mm256_loadu_si256((const __m256i *)(const void *)
                                  ((const uint8_t *)src + 1 * 32));
        ymm2 = _mm256_loadu_si256((const __m256i *)(const void *)
                                  ((const uint8_t *)src + 2 * 32));
        ymm3 = _mm256_loadu_si256((const __m256i *)(const void *)
                                  ((const uint8_t *)src + 3 * 32));
        src = (const uint8_t *)src + 128;
        _mm256_storeu_si256((__m256i *)(void *)
                            ((uint8_t *)dst + 0 * 32), ymm0);
        _mm256_storeu_si256((__m256i *)(void *)
                            ((uint8_t *)dst + 1 * 32), ymm1);
        _mm256_storeu_si256((__m256i *)(void *)
                            ((uint8_t *)dst + 2 * 32), ymm2);
        _mm256_storeu_si256((__m256i *)(void *)
                            ((uint8_t *)dst + 3 * 32), ymm3);
        dst = (uint8_t *)dst + 128;
    }
}

I'm curious whether we can distribute the above as a memmove (of course, the compiler needs to know that the 2 arrays don't alias each other).
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #34 from Richard Biener ---
The ELSE value of type TYPE would be constructed like

  tree var = create_tmp_var (type);
  tree else_val = get_or_create_ssa_default_def (cfun, var);

I'm not sure const0_rtx is a good representation on RTL - how would you distinguish that from a conditional operation on an integer vector with else value zero? Say for an integer division?

  for (i)
    if (f[i])
      y[i] = x[i] / z[i];
    else
      y[i] = 0;

we don't have a separate "else" value for elements cut off via 'len' vs. elements cut off via 'mask'.

On RTL there are "special" RTXen used for this kind of stuff, like (use:mode ..) or (clobber const0_rtx), but I'm the wrong person to ask which one would be most appropriate for a general operand in an otherwise generic instruction. Maybe Richard has a guess.
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #35 from Richard Biener --- (In reply to Richard Biener from comment #34)
> The ELSE value of type TYPE would be constructed like
>
> tree var = create_tmp_var (type);
> tree else_val = get_or_create_ssa_default_def (cfun, var);

Oh, and you recognize that at expansion by

  TREE_CODE (else_val) == SSA_NAME
  && SSA_NAME_IS_DEFAULT_DEF (else_val)
  && VAR_P (SSA_NAME_VAR (else_val))
[Bug c/111400] Missing return sanitization only works in C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111400 --- Comment #3 from Andreas Schwab --- You already have -W[error=]return-type.
[Bug c/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2023-09-13 --- Comment #1 from Richard Biener ---
The vectorizer sees if-converted code like

  [local count: 955630224]:
  # init_20 = PHI <_36(8), init_12(D)(18)>
  # i_22 = PHI
  _1 = (long unsigned int) i_22;
  _2 = _1 * 4;
  _3 = cond_15(D) + _2;
  _4 = *_3;
  _23 = _4 != 0;
  _6 = _1 * 8;
  _38 = _37 + _6;
  _7 = (double *) _38;
  _8 = .MASK_LOAD (_7, 64B, _23);
  _ifc__35 = _23 ? _8 : 0.0;
  _36 = init_20 + _ifc__35;
  i_18 = i_22 + 1;
  if (n_13(D) > i_18)

so what it produces matches up here. There's the possibility to modify the if-conversion handling to use a COND_ADD instead of the COND_EXPR plus ADD, I think that would be the best thing here. See tree-if-conv.cc:is_cond_scalar_reduction/convert_scalar_cond_reduction

I think this is also wrong code when signed zeros are involved.
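The signed-zero concern can be reproduced in scalar C, assuming IEEE 754 arithmetic in the default rounding mode and no -ffast-math. The sketch below is not from the PR and the function names are hypothetical: `sum_branch` mirrors the original loop, `sum_select` mirrors the if-converted form that adds 0.0 for unselected elements.

```c
#include <assert.h>
#include <math.h>

/* Original form: skipped elements do not touch the accumulator. */
double sum_branch(const double *a, const int *cond, int n, double init)
{
    for (int i = 0; i < n; i++)
        if (cond[i])
            init += a[i];
    return init;
}

/* If-converted form: skipped elements add +0.0 instead. */
double sum_select(const double *a, const int *cond, int n, double init)
{
    for (int i = 0; i < n; i++)
        init += cond[i] ? a[i] : 0.0;
    return init;
}

/* Small self-check (hypothetical helper). */
void check_signed_zero(void)
{
    const double a[2] = { 1.0, 2.0 };
    const int cond[2] = { 0, 0 };
    /* Nothing selected: the branchy loop returns init unchanged (-0.0). */
    assert(signbit(sum_branch(a, cond, 2, -0.0)) != 0);
    /* -0.0 + 0.0 is +0.0, so the select form loses the sign. */
    assert(signbit(sum_select(a, cond, 2, -0.0)) == 0);
}
```

A conditional add that leaves the accumulator untouched for masked-off elements, as the COND_ADD suggestion implies, would not have this difference.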
[Bug middle-end/111402] New: Loop distribution fail to optimize memmove for multiple consecutive moves within a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111402

Bug ID: 111402
Summary: Loop distribution fail to optimize memmove for multiple consecutive moves within a loop
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: crazylht at gmail dot com
Target Milestone: ---

cat test.c

typedef long long v4di __attribute__((vector_size(32)));

void
foo (v4di* __restrict a, v4di *b, int n)
{
  for (int i = 0; i != n; i++)
    a[i] = b[i];
}

void
foo1 (v4di* __restrict a, v4di *b, int n)
{
  for (int i = 0; i != n; i+=2)
    {
      a[i] = b[i];
      a[i+1] = b[i+1];
    }
}

gcc -O2 -S test.c

GCC can optimize loop in foo to memmove, but not for loop in foo1.

This is from PR111354
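To make the expected transformation concrete, here is a simplified sketch. It assumes plain `long long` elements instead of the `v4di` vector type, non-overlapping arrays, and an even `n`; `copy_pairs`, `copy_block` and `check_copy` are hypothetical names. It only shows that a loop with two consecutive copies per iteration moves the same bytes as a single block copy.

```c
#include <assert.h>
#include <string.h>

/* Two consecutive element copies per iteration, as in foo1; n must be even. */
void copy_pairs(long long *restrict a, const long long *restrict b, int n)
{
    for (int i = 0; i != n; i += 2) {
        a[i] = b[i];
        a[i + 1] = b[i + 1];
    }
}

/* What the loop amounts to when the arrays do not overlap. */
void copy_block(long long *restrict a, const long long *restrict b, int n)
{
    memcpy(a, b, (size_t)n * sizeof *a);
}

/* Small self-check (hypothetical helper). */
void check_copy(void)
{
    long long src[8], d1[8], d2[8];
    for (int i = 0; i < 8; i++) {
        src[i] = 3 * i + 1;
        d1[i] = d2[i] = -1;
    }
    copy_pairs(d1, src, 8);
    copy_block(d2, src, 8);
    assert(memcmp(d1, d2, sizeof d1) == 0);
}
```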
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #36 from JuzheZhong --- (In reply to Richard Biener from comment #34)
> The ELSE value of type TYPE would be constructed like
>
> tree var = create_tmp_var (type);
> tree else_val = get_or_create_ssa_default_def (cfun, var);
>
> I'm not sure const0_rtx is a good representation on RTL - how would
> you distinguish that from a conditional operation on an integer vector
> with else value zero? Say for an integer division?

My current approach is to pass a scalar 0 as the ELSE value.

I relax the operand predicate of the cond_len else operand so that it can be either a register_operand with a vector mode or a const_int 0 (note that it can't be a CONST_VECTOR).

That lets me distinguish the two cases: if the else operand is a scalar const_int 0, it is undefined. Otherwise, it is always a register operand with a vector mode.
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #37 from JuzheZhong --- (In reply to Richard Biener from comment #35)
> (In reply to Richard Biener from comment #34)
> > The ELSE value of type TYPE would be constructed like
> >
> > tree var = create_tmp_var (type);
> > tree else_val = get_or_create_ssa_default_def (cfun, var);
>
> Oh, and you recognize that at expansion by
>
> TREE_CODE (else_val) == SSA_NAME
> && SSA_NAME_IS_DEFAULT_DEF (else_val)
> && VAR_P (SSA_NAME_VAR (else_val))

Oh. Sounds good. I will have a try.
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 --- Comment #38 from rguenther at suse dot de --- On Wed, 13 Sep 2023, juzhe.zhong at rivai dot ai wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 > > --- Comment #36 from JuzheZhong --- > (In reply to Richard Biener from comment #34) > > The ELSE value of type TYPE would be constructed like > > > > tree var = create_tmp_var (type); > > tree else_val = get_or_create_ssa_default_def (cfun, var); > > > > I'm not sure const0_rtx is a good representation on RTL - how would > > you distinguish that from a conditional operation on an integer vector > > with else value zero? Say for an integer division? > > My current approach is that I passed scalar 0 to the ELSE VALUE. > > So in the I relax the operand predicate of the cond_len else operand: > > it can be either a register_operand has VECTOR_MODE or a const_int 0 (Note > that > it > can't be the CONST_VECTOR). I see.
[Bug c/111400] Missing return sanitization only works in C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111400 --- Comment #4 from David Brown --- (In reply to Andreas Schwab from comment #3)
> You already have -W[error=]return-type.

Yes, and that is what I normally use - I am a big fan of gcc's static warnings.

Sometimes, however, there are false positives, or perhaps other reasons why the programmer thinks it is safe to ignore the warning in a particular case. Then sanitizers can be a useful run-time fault-finding aid. There's certainly a lot of overlap in the kinds of mistakes that can be found with -Wreturn-type and with -fsanitize=return, but there are still benefits in having both. (You have both in C++, just not in C.)
[Bug target/111403] New: LoongArch: Wrong code with -O -mlasx -fopenmp-simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111403

Bug ID: 111403
Summary: LoongArch: Wrong code with -O -mlasx -fopenmp-simd
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: xry111 at gcc dot gnu.org
Target Milestone: ---

Testcase:

struct S
{
  int s;
  S () : s (0) {}
  ~S () {}
  S (const S &x) { s = x.s; }
  S &
  operator= (const S &x)
  {
    s = x.s;
    return *this;
  }
};

S r, a[1024], b[1024];

#pragma omp declare reduction(+ : S : omp_out.s += omp_in.s)

__attribute__ ((noipa)) void
foo (S *a, S *b)
{
#pragma omp simd reduction(inscan, + : r)
  for (int i = 0; i < 1024; i++)
    {
      r.s += a[i].s;
#pragma omp scan inclusive(r)
      b[i] = r;
    }
}

int
main ()
{
  S s;
  for (int i = 0; i < 1024; ++i)
    {
      a[i].s = i;
      b[i].s = -1;
    }
  foo (a, b);
  if (r.s != 1024 * 1023 / 2)
    __builtin_abort ();
  return 0;
}

$ g++ t.cc -O -mlasx -fopenmp-simd
$ ./a.out
Aborted (core dumped)
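For reference, the scalar result that the testcase expects from the inscan reduction can be written without OpenMP. The following plain C sketch is not part of the report; it uses `int` in place of `struct S`, and `inclusive_scan` and `check_inclusive_scan` are hypothetical names.

```c
#include <assert.h>

/* Inclusive prefix sum: b[i] = a[0] + ... + a[i]; *r accumulates the total. */
void inclusive_scan(const int *a, int *b, int n, int *r)
{
    for (int i = 0; i < n; i++) {
        *r += a[i];
        b[i] = *r;
    }
}

/* Small self-check (hypothetical helper) using the testcase's input. */
void check_inclusive_scan(void)
{
    static int a[1024], b[1024];
    int r = 0;
    for (int i = 0; i < 1024; i++) {
        a[i] = i;
        b[i] = -1;
    }
    inclusive_scan(a, b, 1024, &r);
    assert(r == 1024 * 1023 / 2);   /* the value main() checks in the testcase */
    assert(b[0] == 0 && b[3] == 6 && b[1023] == r);
}
```

A correctly vectorized version of `foo` has to produce the same `r` and the same `b` array as this sequential loop.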
[Bug target/111403] LoongArch: Wrong code with -O -mlasx -fopenmp-simd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111403 Xi Ruoyao changed: What|Removed |Added Keywords||wrong-code CC||chenglulu at loongson dot cn, ||chenxiaolong at loongson dot cn Target||loongarch*-*-* --- Comment #1 from Xi Ruoyao --- FWIW the test case is reduced from g++.dg/vect/simd-2.cc. And interestingly if we remove the definition S::operator= the issue no longer happens.
[Bug middle-end/111402] Loop distribution fail to optimize memmove for multiple consecutive moves within a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111402 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-09-13 Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- I think we have a duplicate bugreport for this. Confirmed.
[Bug tree-optimization/111387] ICE on valid code at -O2 and -O3: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111387 --- Comment #3 from CVS Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:04238615bba435f0b0ca7b263ad2c6bdb596e865 commit r14-3920-g04238615bba435f0b0ca7b263ad2c6bdb596e865 Author: Richard Biener Date: Wed Sep 13 11:04:31 2023 +0200 tree-optimization/111387 - BB SLP and irreducible regions When we split an irreducible region for BB vectorization analysis the defensive handling of external backedge defs in vect_get_and_check_slp_defs doesn't work since that relies on dominance info to identify a backedge. The testcase also shows we are iterating over the function in a sub-optimal way which is why we split the irreducible region in the first place. The fix is to mark backedges and use EDGE_DFS_BACK to identify them and to use the region RPO compute which can produce a RPO order keeping cycles in a better order (and as side effect marks backedges). PR tree-optimization/111387 * tree-vect-slp.cc (vect_get_and_check_slp_defs): Check EDGE_DFS_BACK when doing BB vectorization. (vect_slp_function): Use rev_post_order_and_mark_dfs_back_seme to compute RPO and mark backedges. * gcc.dg/torture/pr111387.c: New testcase.
[Bug tree-optimization/111387] ICE on valid code at -O2 and -O3: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111387 Richard Biener changed: What|Removed |Added Known to work||14.0 --- Comment #4 from Richard Biener --- Fixed for trunk. The issue is latent but more difficult to trigger on the branch(es), a change less likely to change code generation would be to call mark_dfs_back_edges () and not change the iteration order.
[Bug tree-optimization/111345] `X % Y is smaller than Y.` pattern could be simpified
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111345 --- Comment #2 from CVS Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:635a34e2be67d709088c31573732dfdf733e4cec commit r14-3921-g635a34e2be67d709088c31573732dfdf733e4cec Author: Andrew Pinski Date: Tue Sep 12 10:43:23 2023 -0700 MATCH: Simplify `(X % Y) < Y` pattern. This merges the two patterns to catch `(X % Y) < Y` and `Y > (X % Y)` into one by using :c on the comparison operator. It does not change any code generation nor anything else. It is more to allow for better maintainability of this pattern. OK? Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: PR tree-optimization/111345 * match.pd (`Y > (X % Y)`): Merge into ... (`(X % Y) < Y`): Pattern by adding `:c` on the comparison.
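The two spellings that the commit merges are the same comparison with its operands swapped. The sketch below assumes unsigned operands and a nonzero divisor, which is my understanding of the case this pattern covers rather than something stated in the commit message; the function names are hypothetical.

```c
#include <assert.h>

/* For unsigned operands and y != 0, x % y is always less than y. */
int rem_lt(unsigned x, unsigned y)
{
    return (x % y) < y;
}

/* The same comparison with the operands swapped. */
int rem_gt_swapped(unsigned x, unsigned y)
{
    return y > (x % y);
}

/* Small self-check (hypothetical helper). */
void check_rem(void)
{
    for (unsigned x = 0; x < 50; x++)
        for (unsigned y = 1; y < 50; y++) {
            assert(rem_lt(x, y) == 1);
            assert(rem_gt_swapped(x, y) == 1);
        }
}
```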
[Bug tree-optimization/111345] `X % Y is smaller than Y.` pattern could be simpified
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111345 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |14.0 Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #3 from Andrew Pinski --- Fixed.
[Bug tree-optimization/111364] `MAX_EXPR <= a` is not optimized to `a >= b`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111364 --- Comment #6 from CVS Commits --- The trunk branch has been updated by Andrew Pinski : https://gcc.gnu.org/g:06bedc3860d3e61857d72ffe699f79ed5c92855f commit r14-3922-g06bedc3860d3e61857d72ffe699f79ed5c92855f Author: Andrew Pinski Date: Tue Sep 12 05:16:06 2023 + MATCH: [PR111364] Add some more minmax cmp operand simplifications This adds a few more minmax cmp operand simplifications which were missed before. `MIN(a,b) < a` -> `a > b` `MIN(a,b) >= a` -> `a <= b` `MAX(a,b) > a` -> `a < b` `MAX(a,b) <= a` -> `a >= b` OK? Bootstrapped and tested on x86_64-linux-gnu. Note gcc.dg/pr96708-negative.c needed to updated to remove the check for MIN/MAX as they have been optimized (correctly) away. PR tree-optimization/111364 gcc/ChangeLog: * match.pd (`MIN (X, Y) == X`): Extend to min/lt, min/ge, max/gt, max/le. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/minmaxcmp-1.c: New test. * gcc.dg/tree-ssa/minmaxcmp-2.c: New test. * gcc.dg/pr96708-negative.c: Update testcase. * gcc.dg/pr96708-positive.c: Add comment about `return 0`.
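The four simplifications listed in the commit message can be checked by brute force over a small range of integers. This is a sketch for illustration, not the committed testcase; `min_i`, `max_i`, `minmax_identities_hold` and `check_minmax` are hypothetical names.

```c
#include <assert.h>

static int min_i(int a, int b) { return a < b ? a : b; }
static int max_i(int a, int b) { return a > b ? a : b; }

/* Returns 1 if all four identities from the commit hold for (a, b). */
int minmax_identities_hold(int a, int b)
{
    return ((min_i(a, b) <  a) == (a >  b))
        && ((min_i(a, b) >= a) == (a <= b))
        && ((max_i(a, b) >  a) == (a <  b))
        && ((max_i(a, b) <= a) == (a >= b));
}

/* Small self-check (hypothetical helper). */
void check_minmax(void)
{
    for (int a = -4; a <= 4; a++)
        for (int b = -4; b <= 4; b++)
            assert(minmax_identities_hold(a, b));
}
```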
[Bug tree-optimization/111364] `MAX_EXPR <= a` is not optimized to `a >= b`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111364 Andrew Pinski changed: What|Removed |Added Status|ASSIGNED|RESOLVED Target Milestone|--- |14.0 Resolution|--- |FIXED --- Comment #7 from Andrew Pinski --- Fixed.
[Bug c/111153] RISC-V: Incorrect Vector cost model for reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111153 --- Comment #2 from Robin Dapp --- With the current trunk we don't spill anymore:

(VLS)
.L4:
        vle32.v v2,0(a5)
        vadd.vv v1,v1,v2
        addi    a5,a5,16
        bne     a5,a4,.L4

Considering just that loop I'd say costing works as designed. Even though the epilog and boilerplate code seems "crude" the main loop is as short as it can be and is IMHO preferable.

.L3:
        vsetvli a5,a1,e32,m1,tu,ma
        slli    a4,a5,2
        sub     a1,a1,a5
        vle32.v v2,0(a0)
        add     a0,a0,a4
        vadd.vv v1,v2,v1
        bne     a1,zero,.L3

This has 6 instructions (disregarding the jump) and can't be faster than the 3 instructions for the VLS loop. Provided we iterate often enough the VLS loop should always be a win.

Regarding "looking slow" - I think ideally we would have the VLS loop followed directly by the VLA loop for the residual iterations and next to no additional statements. That would require changes in the vectorizer, though.

In total: I think the current behavior is reasonable.
[Bug c/111153] RISC-V: Incorrect Vector cost model for reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111153 --- Comment #3 from JuzheZhong --- (In reply to Robin Dapp from comment #2)
> With the current trunk we don't spill anymore:
>
> (VLS)
> .L4:
>         vle32.v v2,0(a5)
>         vadd.vv v1,v1,v2
>         addi    a5,a5,16
>         bne     a5,a4,.L4
>
> Considering just that loop I'd say costing works as designed. Even though
> the epilog and boilerplate code seems "crude" the main loop is as short as
> it can be and is IMHO preferable.
>
> .L3:
>         vsetvli a5,a1,e32,m1,tu,ma
>         slli    a4,a5,2
>         sub     a1,a1,a5
>         vle32.v v2,0(a0)
>         add     a0,a0,a4
>         vadd.vv v1,v2,v1
>         bne     a1,zero,.L3
>
> This has 6 instructions (disregarding the jump) and can't be faster than the
> 3 instructions for the VLS loop. Provided we iterate often enough the VLS
> loop should always be a win.
>
> Regarding "looking slow" - I think ideally we would have the VLS loop
> followed directly by the VLA loop for the residual iterations and next to no
> additional statements. That would require changes in the vectorizer, though.
>
> In total: I think the current behavior is reasonable.

Oh. I see. I just checked it now.

.L4:
        vle32.v v2,0(a5)
        addi    a5,a5,16
        vadd.vv v1,v1,v2
        bne     a5,a4,.L4
        lui     a4,%hi(.LC0)
        lui     a5,%hi(.LC1)
        addi    a4,a4,%lo(.LC0)
        vlm.v   v0,0(a4)
        addi    a5,a5,%lo(.LC1)
        andi    a1,a1,-4
        vmv1r.v v2,v3
        vlm.v   v4,0(a5)
        vcompress.vm v2,v1,v0
        vmv1r.v v0,v4
        vadd.vv v1,v2,v1
        vcompress.vm v3,v1,v0
        vadd.vv v3,v3,v1
        vmv.x.s a0,v3
        sext.w  a0,a0
        beq     a3,a1,.L12

It seems that the codegen will be even better if we support VLS mode reduction. I agree that we should first take the VLS reduction choice and then move to the VLA reduction choice. But I wonder why ARM SVE doesn't use this approach, since they also have a VLS mode (NEON/ADVSIMD).
[Bug c/111153] RISC-V: Incorrect Vector cost model for reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111153 --- Comment #4 from Robin Dapp --- Yes, with VLS reduction this will improve. On aarch64 + sve I see

loop inside costs: 2

This is similar to our VLS costs. And their loop is indeed short:

        ld1w    z30.s, p7/z, [x0, x2, lsl 2]
        add     x2, x2, x3
        add     z31.s, p7/m, z31.s, z30.s
        whilelo p7.s, w2, w1
        b.any   .L3

Not much to be squeezed out with a VLS approach. I guess that's why.
[Bug target/105928] [AArch64] 64-bit constants with same high/low halves can use ADD lsl 32 (-Os at least)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105928 Wilco changed: What|Removed |Added CC||wilco at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |wilco at gcc dot gnu.org --- Comment #2 from Wilco --- Shifted logical operations are single cycle on all recent cores.
[Bug tree-optimization/111294] [14 Regression] Missed Dead Code Elimination since r14-573-g69f1a8af45d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111294 --- Comment #6 from Richard Biener --- So the issue is that forwprop & folding has a hard time in cleaning up dead code afterwards but it would also benefit from doing that more aggressively (and early) because of single_use () and friends. I'm thinking of hooking into update_stmt to discover candidates for simple-dce-from-worklist (likely not early and aggressive enough though).
[Bug jit/111396] Segfault when using -flto with libgccjit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111396 David Malcolm changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-09-13 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from David Malcolm --- Thanks; I can reproduce the ICE with trunk, both with and without the patch you reference. Taking a look...
[Bug libstdc++/111390] libstdc++-v3/scripts/check_compile script is not useful
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111390 --- Comment #7 from joseph at codesourcery dot com --- Stubbing out execution of tests can be done with a suitable board file (a board file to stub out linking as well is a bit more complicated). https://gcc.gnu.org/pipermail/gcc/2017-September/224422.html
[Bug target/111404] New: [AArch64] 128-bit __sync_val_compare_and_swap is not atomic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111404 Bug ID: 111404 Summary: [AArch64] 128-bit __sync_val_compare_and_swap is not atomic Product: gcc Version: 8.5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: wilco at gcc dot gnu.org Target Milestone: ---

This compiles

__int128 f(__int128 *p, __int128 *q, __int128 x)
{
  return __sync_val_compare_and_swap (p, *q, x);
}

into:

f:
        ldp     x6, x7, [x1]
        mov     x4, x0
.L3:
        ldxp    x0, x1, [x4]
        cmp     x0, x6
        ccmp    x1, x7, 0, eq
        bne     .L4
        stlxp   w5, x2, x3, [x4]
        cbnz    w5, .L3
.L4:
        dmb     ish
        ret

This means if the compare fails, we return the value loaded via LDXP. However unless the STXP succeeds, this returned value is not single-copy atomic. So on failure we still need to execute STLXP.
[Bug c/111405] New: Problem with incorrect optimizion for "constexpr" function with possible overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111405 Bug ID: 111405 Summary: Problem with incorrect optimizion for "constexpr" function with possible overflow Product: gcc Version: 13.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: 3180104919 at zju dot edu.cn Target Milestone: ---

Created attachment 55891 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55891&action=edit A demo file contains a function that will be wrongly optimized using -O2

I happened to find this problem when I did the CSAPP lab.

int isTmax(int x) {
  // make it all of 1
  // it's quite strange that the results of x + 1 + x and x + x + 1 are different
  int c = x + x + 1;
  // check if it's all of 1
  int flag_all_ones = !(~c);
  // avoid -1
  int flag_not_neg1 = !!(x + 1);
  return flag_all_ones & flag_not_neg1;
}

This function will be incorrectly optimized to return zero only with the "-O2" compiler flag. But in fact isTmax(0x7fff) should return 1. Here's the disassembly code using coredump:

12ac :
// check if it's all of 1
int flag_all_ones = !(~c);
// avoid -1
int flag_not_neg1 = !!(x + 1);
return flag_all_ones & flag_not_neg1;
}
12ac: b8 00 00 00 00    mov $0x0,%eax
12b1: c3                ret

This function can be correctly compiled with no compiler optimization (-O0). And this behaviour always occurs using the latest 2 version gcc compiler (from 11.0 to 12.0). But using clang or msvc, everything works well. Thank you for your time.
[Bug other/111406] New: libiberty build produces errors with CC=clang, unsupported option '-print-multi-os-directory'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111406 Bug ID: 111406 Summary: libiberty build produces errors with CC=clang, unsupported option '-print-multi-os-directory' Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: dilfridge at gentoo dot org Target Milestone: ---

This is a clone of https://bugs.gentoo.org/913750

With CC=clang, the build of libiberty (as part of gnu binutils) produces errors

clang-16: error: unsupported option '-print-multi-os-directory'
clang-16: error: no input files

However, the build continues apparently fine...

This stems from libiberty/Makefile.am:

385 # This is tricky. Even though CC in the Makefile contains
386 # multilib-specific flags, it's overridden by FLAGS_TO_PASS from the
387 # default multilib, so we have to take CFLAGS into account as well,
388 # since it will be passed the multilib flags.
389 MULTIOSDIR = `$(CC) $(CFLAGS) -print-multi-os-directory`
[Bug c/111405] Problem with incorrect optimizion for "constexpr" function with possible overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111405 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Andrew Pinski --- Signed integer overflow is undefined behavior. Use -fwrapv or unsigned to do the addition to get the behavior you want.
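A minimal sketch of the "use unsigned" suggestion, assuming the reporter's intent was a two's-complement wraparound check: the additions are done in `unsigned`, where overflow wraps by definition, so the optimizer cannot assume it never happens. The name `isTmax_unsigned` and the self-check are invented for this example.

```c
#include <assert.h>
#include <limits.h>

/* Same test as the reported isTmax, but with the arithmetic done in
   unsigned so that wraparound is well defined. */
int isTmax_unsigned(int x)
{
    unsigned ux = (unsigned)x;
    unsigned c = ux + ux + 1u;             /* wraps instead of overflowing */
    int flag_all_ones = !(~c);             /* 1 only if every bit of c is set */
    int flag_not_neg1 = !!(ux + 1u);       /* rules out x == -1 */
    return flag_all_ones & flag_not_neg1;
}

/* Hypothetical self-check. */
void check_isTmax_unsigned(void)
{
    assert(isTmax_unsigned(INT_MAX) == 1);
    assert(isTmax_unsigned(-1) == 0);
    assert(isTmax_unsigned(0) == 0);
}
```

The other option mentioned, `-fwrapv`, keeps the original source and instead asks the compiler to treat signed overflow as wrapping.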
[Bug other/111406] libiberty build produces errors with CC=clang, unsupported option '-print-multi-os-directory'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111406 --- Comment #1 from Andrew Pinski --- >This stems from libiberty/Makefile.am: You mean Makefile.in, libiberty does not use automake.
[Bug c/111405] Problem with incorrect optimizion for "constexpr" function with possible overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111405 --- Comment #2 from Wang Chenyu <3180104919 at zju dot edu.cn> --- (In reply to Andrew Pinski from comment #1) > Signed integer overflow is undefined behavior. > > Use -fwrapv or unsigned to do the addition to get the behavior you want. I see.. Thank you for your explanation
[Bug other/111406] libiberty build produces errors with CC=clang, unsupported option '-print-multi-os-directory'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111406 --- Comment #2 from Andreas K. Huettel --- Indeed, sorry https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=libiberty/Makefile.in#l388
[Bug tree-optimization/111294] [14 Regression] Missed Dead Code Elimination since r14-573-g69f1a8af45d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111294 --- Comment #7 from Andrew Pinski --- Created attachment 55892 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55892&action=edit version of using simple_dce_from_worklist in forwprop This is a version of using simple_dce_from_worklist in forwprop that I had tried at one point, but I don't remember why I did not finish up this patch.
[Bug tree-optimization/111393] ICE: Segmentation fault src/gcc/toplev.cc:314
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111393 --- Comment #8 from AK --- > this does seem like a HW issue. Are you sure you have a decent RISCV machine > without any memory issues? > I suspect ninja is building with all of the cores which pushes the memory > usage high. possible. I have the https://sipeed.com/licheepi4a (licheepi 4a board) > Maybe lower the clock speed of the CPU you are using. will do. thanks
[Bug c/111400] Missing return sanitization only works in C++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111400 --- Comment #5 from Andrew Pinski --- To be able to detect this, an ABI change would be needed as you need to pass back if the function fell through or not. Now for (non-address taken) static functions that should be ok. The check should happen on the caller side rather than the callee side as it is only undefined if the caller uses the value ...
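To make the caller-side point concrete, here is a small C sketch. In C (unlike C++), reaching the closing brace of a non-void function is only undefined if the caller uses the returned value, so the first call below is fine while the same call with its result consumed would not be. The function names are made up for illustration, and compilers will typically emit a -Wreturn-type warning for `maybe_answer`.

```c
#include <assert.h>

/* Returns a value only when `have` is nonzero; otherwise control reaches
   the closing brace without executing a return statement. */
int maybe_answer(int have)
{
    if (have)
        return 42;
}

int demo(void)
{
    maybe_answer(0);           /* falls through, result discarded: defined in C */
    int v = maybe_answer(1);   /* takes the return path: defined */
    assert(v == 42);
    return v;
}
```

Writing `int bad = maybe_answer(0);` and then reading `bad` is the case a sanitizer would want to flag, and whether the callee fell through is only known inside the callee, which is why the comment talks about passing that fact back.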
[Bug tree-optimization/111407] New: ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 Bug ID: 111407 Summary: ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: qinzhao at gcc dot gnu.org Target Milestone: ---

This bug was originally reported against GCC8.5 with profiling feedback. There were multiple similar failures due to this issue for our large application. Although we reduced the testing case to a very small size and changed the variable names, the failure can only be repeated with -fprofile-use and the .gcda files. As a result, we cannot expose the testing case.

With the small testing case, and debugging into GCC8, I finally located the issue: this is a bug in tree-ssa-math-opts.cc. When applying the widening mul optimization, the compiler needs to check whether the operand is in an ABNORMAL PHI; if YES, we should avoid the transformation.

The following patch against GCC8 can fix the failure very well:

diff -u -r -N -p gcc-8.5.0-20210514-org/gcc/tree-ssa-math-opts.c gcc-8.5.0-20210514/gcc/tree-ssa-math-opts.c
--- gcc-8.5.0-20210514-org/gcc/tree-ssa-math-opts.c 2023-09-11 21:04:17.891403319 +
+++ gcc-8.5.0-20210514/gcc/tree-ssa-math-opts.c 2023-09-13 15:35:44.962336530 +
@@ -2346,6 +2346,14 @@ convert_mult_to_widen (gimple *stmt, gim
   if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
     return false;
 
+  /* if any one of rhs1 and rhs2 is subject to abnormal coalescing
+   * avoid the transform.  */
+  if ((TREE_CODE (rhs1) == SSA_NAME
+       && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1))
+      || (TREE_CODE (rhs2) == SSA_NAME
+          && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs2)))
+    return false;
+
   to_mode = SCALAR_INT_TYPE_MODE (type);
   from_mode = SCALAR_INT_TYPE_MODE (type1);
   if (to_mode == from_mode)

I checked the latest upstream GCC14, and found that the function "convert_mult_to_widen" has the same issue and needs to be patched as well.
[Bug middle-end/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401 Robin Dapp changed: What|Removed |Added CC||rdapp at gcc dot gnu.org --- Comment #2 from Robin Dapp --- I played around with this a bit. Emitting a COND_LEN in if-convert is easy: _ifc__35 = .COND_ADD (_23, init_20, _8, init_20); However, during reduction handling we rely on the reduction being a gimple assign and binary operation, though so I needed to fix some places and indices as well as the proper mask. What complicates things a bit is that we assume that "init_20" (i.e. the reduction def) occurs once when we have it twice in the COND_ADD. I just special cased that for now. Is this the proper thing to do? diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 23c6e8259e7..e99add3cf16 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -3672,7 +3672,7 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared) static bool fold_left_reduction_fn (code_helper code, internal_fn *reduc_fn) { - if (code == PLUS_EXPR) + if (code == PLUS_EXPR || code == IFN_COND_ADD) { *reduc_fn = IFN_FOLD_LEFT_PLUS; return true; @@ -4106,8 +4106,11 @@ vect_is_simple_reduction (loop_vec_info loop_info, stmt_vec_info phi_info, return NULL; } - nphi_def_loop_uses++; - phi_use_stmt = use_stmt; + if (use_stmt != phi_use_stmt) + { + nphi_def_loop_uses++; + phi_use_stmt = use_stmt; + } @@ -7440,6 +7457,9 @@ vectorizable_reduction (loop_vec_info loop_vinfo, if (i == STMT_VINFO_REDUC_IDX (stmt_info)) continue; + if (op.ops[i] == op.ops[STMT_VINFO_REDUC_IDX (stmt_info)]) + continue; + Apart from that I think what's mainly missing is making the added code nicer. Going to attach a tentative patch later.
[Bug c/111398] GCC should warn if a struct with flexible array member is declared static or onstack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111398 --- Comment #2 from Thorsten Glaser --- Right, which is why I suggested a -Wextra level option to warn about these. Thanks!
[Bug tree-optimization/111407] ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-09-13 Status|UNCONFIRMED |WAITING --- Comment #1 from Andrew Pinski --- Do you have a testcase?
[Bug tree-optimization/111407] ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 --- Comment #2 from Andrew Pinski --- Also what target is this for? I suspect aarch64 since x86_64 does not have widening multiply ...
[Bug tree-optimization/111407] ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 --- Comment #3 from qinzhao at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #1) > Do you have a testcase? I have, but I cannot expose it to public. I can provide the Bad IR and Good IR if you think it's okay.
[Bug tree-optimization/111407] ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 --- Comment #4 from qinzhao at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #2) > Also what target is this for? > I suspect aarch64 since x86_64 does not have widening multiply ... you are right, it's aarch64.
[Bug tree-optimization/111407] ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 Andrew Pinski changed: What|Removed |Added Status|WAITING |NEW --- Comment #5 from Andrew Pinski --- Testcase:
```
enum { SEND_TOFILE } __sigsetjmp();
void fclose();
void foldergets();
void sendpart_stats(int *p1, int a1, int b1) {
  int *a = p1;
  fclose();
  p1 = 0;
  long t = b1;
  if (__sigsetjmp()) {
    {
      long t1 = a1;
      a1+=1;
      fclose(a1*(long)t1);
    }
  }
  if (p1)
    fclose();
}
```
[Bug tree-optimization/111407] ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 --- Comment #6 from Andrew Pinski --- (In reply to Andrew Pinski from comment #5) > Testcase: The way I figured out this testcase was trial and error and starting with the testcase from PR 69167 .
[Bug tree-optimization/94589] Optimize (i<=>0)>0 to i>0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94589 --- Comment #26 from Andrew Pinski --- Created attachment 55893 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55893&action=edit Here is my idea around the patch for prototype for doing the constant prop idea This is like the one in comment #13 but handles it in phiopt rather than forwprop.
[Bug tree-optimization/111407] ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 --- Comment #7 from qinzhao at gcc dot gnu.org --- (In reply to Andrew Pinski from comment #5) > Testcase: Thanks a lot for the test case. GCC8 fails with it; disabling tree-widening_mul fixes the failure, and my patch for GCC8 fixes it as well. I will test GCC14 too.
[Bug tree-optimization/111407] ICE: SSA corruption due to widening_mul opt on conflict across an abnormal edge
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111407 --- Comment #8 from qinzhao at gcc dot gnu.org --- the latest GCC14 has the same issue. with the patch proposed in comment #1, the failure has been fixed.
[Bug driver/86030] specs file processing does not create response files for input directories
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86030 --- Comment #14 from John Soo ---
> Here though it seems that you are dealing with another sort of limit which is
> much larger (I have seen 128K being mentioned on the GH page). If this
> somehow corrupts the command line, it wouldn't help if that command line went
> into a response file because it would still be wrong. To my knowledge,
> Linux-based systems don't have a command line length limitation, so I can't
> see how a response file approach would be useful at the point where the
> subprocess is spawned. Whether something similar can be used at an earlier
> point to save it from the 128K limit, whatever it is, is unknown to me.

It is a much larger limit (ARG_MAX resulting in E2BIG), but it is fundamentally the same problem. I think we should assume that the command line is correct and still respect ARG_MAX on linux/unix systems, too. It seems to me that the temporary response file is the best way to do this.
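As a rough sketch of what "respecting ARG_MAX" could look like before spawning a subprocess, the code below estimates the size of an argument vector and compares it with `sysconf(_SC_ARG_MAX)`. This is not how the GCC driver is implemented; `argv_bytes`, `needs_response_file` and the 2048-byte headroom are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Bytes the argument vector occupies: each string with its NUL terminator,
   one pointer per entry, and the terminating NULL pointer. */
size_t argv_bytes(char *const argv[])
{
    size_t total = sizeof(char *);
    for (size_t i = 0; argv[i] != NULL; i++)
        total += strlen(argv[i]) + 1 + sizeof(char *);
    return total;
}

/* Hypothetical policy: returns 1 when the arguments (plus the size of the
   environment, supplied by the caller) would not fit under ARG_MAX, in
   which case the caller would write them to a temporary @file instead. */
int needs_response_file(char *const argv[], size_t env_bytes)
{
    long arg_max = sysconf(_SC_ARG_MAX);
    if (arg_max <= 0)
        return 0;                       /* limit unknown or indeterminate */
    const size_t headroom = 2048;       /* assumed safety margin */
    return argv_bytes(argv) + env_bytes + headroom > (size_t)arg_max;
}

/* Hypothetical self-check with a short command line. */
void check_small_command(void)
{
    char *small[] = { "gcc", "-c", "a.c", NULL };
    assert(argv_bytes(small) == 11 + 4 * sizeof(char *));
    assert(needs_response_file(small, 0) == 0);
}
```

A check like this only decides when to fall back; the fallback itself would still be the temporary response file the comment proposes.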
[Bug tree-optimization/78512] [7 Regression] r242674 miscompiles Linux kernel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78512 Nick Desaulniers changed: What|Removed |Added CC||ndesaulniers at google dot com --- Comment #10 from Nick Desaulniers --- I'm not super happy that GCC has false-negatives when %p is encountered. Bugs do exist outside of the Linux kernel with the usage of %p that could be flagged. Clang-18 has recently added -Wno-format-overflow-non-kprintf and -Wformat-truncation-non-kprintf to emulate this behavior in https://github.com/llvm/llvm-project/pull/65969, which we will use in the kernel https://github.com/ClangBuiltLinux/linux/issues/1923#issuecomment-1718144462. At the least, I think this behavior wrt. %p should either be documented, or -Wno-format-overflow-non-kprintf and -Wformat-truncation-non-kprintf implemented in GCC. That said, this diagnostic catches real bugs! Linus turned them off, but we will work through fixing the instances identified towards the goal of getting them re-enabled for the Linux kernel. https://github.com/KSPP/linux/issues/343
[Bug c++/59526] [C++11] Defaulted special member functions don't accept noexcept if a member has a non-trivial noexcept operator in the corresponding special member function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59526 --- Comment #4 from CVS Commits --- The master branch has been updated by Francois Dumont : https://gcc.gnu.org/g:92456291849fe88303bbcab366f41dcd4a885ad5 commit r14-3926-g92456291849fe88303bbcab366f41dcd4a885ad5 Author: François Dumont Date: Wed Aug 23 19:15:43 2023 +0200 libstdc++: [_GLIBCXX_INLINE_VERSION] Fix friend declaration GCC do not consider the inline namespace in friend function declarations. This is PR c++/59526, we need to explicit this namespace. libstdc++-v3/ChangeLog: * include/std/format (std::__format::_Arg_store): Explicit version namespace on make_format_args friend declaration.
[Bug tree-optimization/111408] New: [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-2866-ge68a31549d9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111408 Bug ID: 111408 Summary: [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-2866-ge68a31549d9 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: shaohua.li at inf dot ethz.ch Target Milestone: ---

gcc at -O2/s produced the wrong code. Bisected to r14-2866-ge68a31549d9

Compiler explorer: https://godbolt.org/z/secjqP8ao

$ cat a.c
int printf(const char *, ...);
int a, b, c, d;
short e;
int f() {
  c = a % (sizeof(int) * 8);
  if (b & 1 << c)
    return -1;
  return 0;
}
int main() {
  for (; e != 1; e++) {
    int g = f();
    if (g + d - 9 + d)
      continue;
    for (;;)
      __builtin_abort();
  }
}
$
$ gcc -O0 a.c && ./a.out
$
$ gcc -O2 a.c && ./a.out
[2] 1281121 abort ./a.out
$
[Bug middle-end/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401 --- Comment #3 from Robin Dapp --- Several other things came up, so I'm just going to post the latest status here without having revised or tested it. Going to try fixing it and testing tomorrow. --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -3672,7 +3672,7 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared) static bool fold_left_reduction_fn (code_helper code, internal_fn *reduc_fn) { - if (code == PLUS_EXPR) + if (code == PLUS_EXPR || code == IFN_COND_ADD) { *reduc_fn = IFN_FOLD_LEFT_PLUS; return true; @@ -4106,8 +4106,13 @@ vect_is_simple_reduction (loop_vec_info loop_info, stmt_vec_info phi_info, return NULL; } - nphi_def_loop_uses++; - phi_use_stmt = use_stmt; + /* We might have two uses in the same instruction, only count them as +one. */ + if (use_stmt != phi_use_stmt) + { + nphi_def_loop_uses++; + phi_use_stmt = use_stmt; + } } tree latch_def = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop)); @@ -6861,7 +6866,7 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo, gimple **vec_stmt, slp_tree slp_node, gimple *reduc_def_stmt, tree_code code, internal_fn reduc_fn, - tree ops[3], tree vectype_in, + tree *ops, int num_ops, tree vectype_in, int reduc_index, vec_loop_masks *masks, vec_loop_lens *lens) { @@ -6883,11 +6888,24 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo, gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (vectype_out), TYPE_VECTOR_SUBPARTS (vectype_in))); - tree op0 = ops[1 - reduc_index]; + /* The operands either come from a binary operation or a COND_ADD operation. + The former is a gimple assign and the latter is a gimple call with four + arguments. 
*/ + gcc_assert (num_ops == 2 || num_ops == 4); + bool is_cond_add = num_ops == 4; + tree op0, opmask; + if (!is_cond_add) +op0 = ops[1 - reduc_index]; + else +{ + op0 = ops[2]; + opmask = ops[0]; + gcc_assert (!slp_node); +} int group_size = 1; stmt_vec_info scalar_dest_def_info; - auto_vec vec_oprnds0; + auto_vec vec_oprnds0, vec_opmask; if (slp_node) { auto_vec > vec_defs (2); @@ -6903,9 +6921,18 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo, vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, 1, op0, &vec_oprnds0); scalar_dest_def_info = stmt_info; + if (is_cond_add) + { + vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, 1, +opmask, &vec_opmask); + gcc_assert (vec_opmask.length() == 1); + } } - tree scalar_dest = gimple_assign_lhs (scalar_dest_def_info->stmt); + gimple *sdef = scalar_dest_def_info->stmt; + tree scalar_dest = is_gimple_call (sdef) + ? gimple_call_lhs (sdef) + : gimple_assign_lhs (scalar_dest_def_info->stmt); tree scalar_type = TREE_TYPE (scalar_dest); tree reduc_var = gimple_phi_result (reduc_def_stmt); @@ -6945,7 +6972,11 @@ vectorize_fold_left_reduction (loop_vec_info loop_vinfo, i, 1); signed char biasval = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo); bias = build_int_cst (intQI_type_node, biasval); - mask = build_minus_one_cst (truth_type_for (vectype_in)); + /* If we have a COND_ADD take its mask. Otherwise use {-1, ...}. */ + if (is_cond_add) + mask = vec_opmask[0]; + else + mask = build_minus_one_cst (truth_type_for (vectype_in)); } /* Handle MINUS by adding the negative. */ @@ -7440,6 +7471,9 @@ vectorizable_reduction (loop_vec_info loop_vinfo, if (i == STMT_VINFO_REDUC_IDX (stmt_info)) continue; + if (op.ops[i] == op.ops[STMT_VINFO_REDUC_IDX (stmt_info)]) + continue; + /* There should be only one cycle def in the stmt, the one leading to reduc_def. 
*/ if (VECTORIZABLE_CYCLE_DEF (dt)) @@ -8211,8 +8245,21 @@ vect_transform_reduction (loop_vec_info loop_vinfo, vec_num = 1; } - code_helper code = canonicalize_code (op.code, op.type); - internal_fn cond_fn = get_conditional_internal_fn (code, op.type); + code_helper code (op.code); + internal_fn cond_fn; + + if (code.is_internal_fn ()) +{ + internal_fn ifn = internal_fn (op.code); + code = canonicalize_code (conditional_internal_fn_code (ifn), op.type); + cond_fn = ifn; +} + else +{ + code = canonicalize_code (op.code, op.type); + cond_fn = get_conditional_internal_fn (code, op.type); +} + vec_loop_masks *masks = &LOOP_
[Bug debug/111409] New: Invalid .debug_macro.dwo macro information for split DWARF
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111409 Bug ID: 111409 Summary: Invalid .debug_macro.dwo macro information for split DWARF Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: debug Assignee: unassigned at gcc dot gnu.org Reporter: osandov at osandov dot com Target Milestone: --- When using -g3 -gsplit-dwarf, the generated macro information has a couple of issues. I'm using the following trivial source file: $ cat test.c #define ZERO 0 int main(void) { return ZERO; } First, GCC emits DW_MACRO_import entries, but they always have an offset of 0: $ gcc -g3 -gsplit-dwarf test.c $ readelf --debug-dump=macro a-test.dwo | head -n 14 Contents of the .debug_macro.dwo section: Offset: 0x0 Version: 5 Offset size: 4 Offset into .debug_line: 0x0 DW_MACRO_import - offset : 0x0 DW_MACRO_start_file - lineno: 0 filenum: 1 DW_MACRO_start_file - lineno: 0 filenum: 2 DW_MACRO_import - offset : 0x0 DW_MACRO_end_file DW_MACRO_define_strx lineno : 1 macro : DW_MACRO_end_file Second, each macro unit is in its own .debug_macro.dwo section: $ readelf -S -W a-test.dwo | grep -F .debug_macro [ 3] .debug_macro.dwo PROGBITS b2 1e 00 E 0 0 1 [ 4] .debug_macro.dwo PROGBITS d0 00059e 00 0 0 1 [ 5] .debug_macro.dwo PROGBITS 00066e 1b 00 0 0 1 As far as I can tell, the DWARF specification doesn't allow this, and tools seem to only use either the first or last section (gdb only finds the first one, and dwp only copies the last one into the .dwp file). These seem to have the same underlying cause: when not using split DWARF, the linker deduplicates units and relocates the import offsets appropriately, but this is not possible with split DWARF. I imagine that the fix would be to not use imports for split DWARF and only generate one macro unit per .dwo file containing everything. (P.S., -g3 -gdwarf-4 -gstrict-dwarf -gsplit-dwarf generates a valid .debug_macinfo.dwo because it doesn't have a notion of imports.)
[Bug target/111408] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-2866-ge68a31549d9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111408 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |14.0 Target||x86_64-linux-gnu Keywords||wrong-code Component|tree-optimization |target
[Bug target/111408] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-2866-ge68a31549d9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111408 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2023-09-13 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Confirmed.
[Bug target/111408] [14 Regression] Wrong code at -O2/3 on x86_64-linux-gnu since r14-2866-ge68a31549d9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111408 --- Comment #2 from Andrew Pinski --- GCC 13.2:

        sarl    %cl, %eax
        movl    d(%rip), %ecx
        andl    $1, %eax
        andl    $31, %edx
        leal    -9(%rcx,%rcx), %ecx
        cmpl    %eax, %ecx

While the trunk:

        movl    a(%rip), %eax
        movl    b(%rip), %ecx
        movl    %eax, %edx
        andl    $31, %edx
        btl     %eax, %ecx

The trunk somehow missed the whole 2*d - 9 part ...
[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751 rsandifo at gcc dot gnu.org changed: What|Removed |Added CC|richard.sandiford at arm dot com | --- Comment #39 from rsandifo at gcc dot gnu.org --- (In reply to Richard Biener from comment #34)
> On RTL there are "special" RTXen used for this kind of stuff, like
> (use:mode ..) or (clobber const0_rtx), but I'm the wrong person to
> ask which one would be most appropriate for a general operand in
> an otherwise generic instruction. Maybe Richard has a guess.

I think the best bet with existing RTL is (scratch:). It's not an exact fit for current usage (or for the documentation), but it's similar in spirit to the scratch in (mem:BLK (scratch:P)) (which also isn't an exact match for the documentation).

I don't expect this to work out of the box. Some changes to target-independent code will be needed. But if we restrict this use to expanders for now, the changes should be relatively small. I think the main thing would be to make maybe_legitimize_operand turn a scratch rtx into a fresh pseudo if the predicate doesn't accept a scratch. We'd then be restoring the semantics of an uninitialised SSA_NAME.

If we did that, I think we could convert uninitialised SSA_NAMEs into SCRATCHes for everything that goes through expand_fn_using_insn. There should be no need to restrict it to COND_* functions.
[Bug tree-optimization/106164] (a > b) & (a >= b) does not get optimized until reassoc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106164 --- Comment #14 from Andrew Pinski --- Created attachment 55894 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55894&action=edit Third patch to support the constants that are off by one This patch adds what I mentioned was missing in comment #9.
[Bug tree-optimization/106164] (a > b) & (a >= b) does not get optimized until reassoc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106164 --- Comment #15 from Andrew Pinski --- Created attachment 55895 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55895&action=edit testcases for the constants off by one
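The simplification named in the bug title can be checked in isolation. The sketch below is only an illustration of the identity (a > b) & (a >= b) == (a > b) for plain ints; the function names are hypothetical and it is not taken from the attached patches or testcases.

```c
#include <stdbool.h>

/* Form from the bug title: two comparisons joined by a bitwise AND. */
static bool both_compares(int a, int b)
{
    return (a > b) & (a >= b);
}

/* Simplified form: a > b already implies a >= b, so the second test is redundant. */
static bool single_compare(int a, int b)
{
    return a > b;
}

/* Compare the two forms for every pair in [lo, hi]; returns 1 if they always agree. */
static int forms_agree(int lo, int hi)
{
    for (int a = lo; a <= hi; a++)
        for (int b = lo; b <= hi; b++)
            if (both_compares(a, b) != single_compare(a, b))
                return 0;
    return 1;
}
```

Calling forms_agree over a small range exercises the a < b, a == b and a > b cases, which are the only ones that matter for the identity.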
[Bug target/111334] [14 regression] ICE is reported during the combine pass optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111334
--- Comment #20 from CVS Commits ---
The master branch has been updated by LuluCheng:
https://gcc.gnu.org/g:9a033b9feffc9d97d5acfe8ca3cd16359f4b714b
commit r14-3974-g9a033b9feffc9d97d5acfe8ca3cd16359f4b714b
Author: Lulu Cheng
Date: Mon Sep 11 16:20:29 2023 +0800
    LoongArch: Fix bug of 'di3_fake'.
    PR target/111334
    gcc/ChangeLog:
    * config/loongarch/loongarch.md: Fix bug of 'di3_fake'.
    gcc/testsuite/ChangeLog:
    * gcc.target/loongarch/pr111334.c: New test.
[Bug c++/111410] New: Bogus Wdangling-reference warning with ranges pipe expression in for loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111410
Bug ID: 111410
Summary: Bogus Wdangling-reference warning with ranges pipe expression in for loop
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: hewillk at gmail dot com
Target Milestone: ---

#include <iostream>
#include <ranges>
#include <span>
#include <vector>

int main() {
  std::vector v{1, 2, 3, 4, 5};
  for (auto i : std::span{v} | std::views::take(1))
    std::cout << i << '\n';
}

GCC-trunk reports the following warning when the -Wall flag is used:

:8:51: warning: possibly dangling reference to a temporary [-Wdangling-reference]
    8 |   for ( auto i : std::span{v} | std::views::take(1))
      |                                                   ^

https://godbolt.org/z/5jhnTTej9
[Bug tree-optimization/111402] Loop distribution fail to optimize memmove for multiple consecutive moves within a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111402
--- Comment #2 from Hongtao.liu ---
After adjusting the code in foo1 to use < n instead of != n, the issue remains.

void
foo1 (v4di* __restrict a, v4di *b, int n)
{
  for (int i = 0; i < n; i+=2)
    {
      a[i] = b[i];
      a[i+1] = b[i+1];
    }
}
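For reference, here is a sketch of what the loop amounts to when written as one bulk copy. It assumes v4di is a 32-byte vector of four 64-bit integers declared with the GCC vector extension (the typedef is not shown in the comment), and foo1_bulk is a hypothetical name, not something from the bug report.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: v4di is a vector of four 64-bit integers (GCC/Clang vector extension). */
typedef long long v4di __attribute__((vector_size(32)));

/* The loop from the comment: two consecutive element moves per iteration. */
void foo1(v4di *__restrict a, v4di *b, int n)
{
    for (int i = 0; i < n; i += 2) {
        a[i] = b[i];
        a[i + 1] = b[i + 1];
    }
}

/* Hypothetical bulk form: the loop writes elements 0 .. count-1,
   where count is n rounded up to the next even number. */
void foo1_bulk(v4di *__restrict a, v4di *b, int n)
{
    if (n <= 0)
        return;
    size_t count = ((size_t)n + 1) / 2 * 2;
    memmove(a, b, count * sizeof(v4di));
}
```

For even n both functions copy exactly n elements; for odd n both also touch element n, which is why the bulk form rounds the count up.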
[Bug target/111334] [14 regression] ICE is reported during the combine pass optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111334
chenglulu changed:
           What    |Removed                     |Added
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
--- Comment #21 from chenglulu ---
fixed
[Bug target/111411] New: [14 regression] ICE when building opus-1.4 (celt_decoder.c:1182:1: internal compiler error: in extract_insn, at recog.cc:2791)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111411
Bug ID: 111411
Summary: [14 regression] ICE when building opus-1.4 (celt_decoder.c:1182:1: internal compiler error: in extract_insn, at recog.cc:2791)
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: sjames at gcc dot gnu.org
Target Milestone: ---

Created attachment 55896
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55896&action=edit
celt_decoder.c.i

```
gcc (Gentoo 14.0.0 p, commit d0b55776a4e1d2f293db5ba0e4a04aefed055ec4) 14.0.0 20230913 (experimental) 3af2af15798cb6243e2643f98f62c9270b1ca5d2
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```

```
FAILED: celt/libopus-celt.a.p/celt_decoder.c.o
aarch64-unknown-linux-gnu-gcc -Icelt/libopus-celt.a.p -Icelt -I../opus-1.4/celt -I. -I../opus-1.4 -Iinclude -I../opus-1.4/include -Isilk -I../opus-1.4/silk -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=gnu99 -DOPUS_BUILD -DHAVE_CONFIG_H -fvisibility=hidden -Wcast-align -Wnested-externs -Wshadow -Wstrict-prototypes -fstack-protector-strong -O3 -pipe -mcpu=native -fdiagnostics-color=always -ggdb3 -fPIC -MD -MQ celt/libopus-celt.a.p/celt_decoder.c.o -MF celt/libopus-celt.a.p/celt_decoder.c.o.d -o celt/libopus-celt.a.p/celt_decoder.c.o -c ../opus-1.4/celt/celt_decoder.c
../opus-1.4/celt/celt_decoder.c: In function ‘celt_decode_with_ec’:
../opus-1.4/celt/celt_decoder.c:1182:1: error: unrecognizable insn:
 1182 | }
      | ^
(insn 5312 42 41 42 (parallel [
            (set (mem/c:SI (plus:DI (reg/f:DI 29 x29)
                        (const_int -260 [0xfefc])) [18 %sfp+-260 S4 A32])
                (const_int 0 [0]))
            (set (mem/c:SI (plus:DI (reg/f:DI 29 x29)
                        (const_int -256 [0xff00])) [18 %sfp+-256 S4 A32])
                (const_int 0 [0]))
        ]) "../opus-1.4/celt/celt_decoder.c":978:22 -1
     (nil))
during RTL pass: cprop_hardreg
../opus-1.4/celt/celt_decoder.c:1182:1: internal compiler error: in extract_insn, at recog.cc:2791
0xe5b09473 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/rtl-error.cc:108
0xe5b0951b _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/rtl-error.cc:116
0xe626d693 extract_insn(rtx_insn*)
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/recog.cc:2791
0xe6273fb3 extract_constrain_insn(rtx_insn*)
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/recog.cc:2690
0xe6278d5b copyprop_hardreg_forward_1
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/regcprop.cc:836
0xe627a7e3 execute
        /usr/src/debug/sys-devel/gcc-14.0.0./gcc-14.0.0./gcc/regcprop.cc:1423
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
```

gcc -c celt_decoder.c.i -O3 is enough to repro.
[Bug target/111411] [14 regression] ICE when building opus-1.4 (celt_decoder.c:1182:1: internal compiler error: in extract_insn, at recog.cc:2791)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111411
Andrew Pinski changed:
           What    |Removed                     |Added
   Target Milestone|---                         |14.0
           Keywords|                            |ice-on-valid-code
[Bug target/111411] [14 regression] ICE when building opus-1.4, unrecognizable insn with -fstack-protector-strong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111411
Andrew Pinski changed:
           What    |Removed                     |Added
            Summary|[14 regression] ICE when    |[14 regression] ICE when
                   |building opus-1.4           |building opus-1.4,
                   |(celt_decoder.c:1182:1:     |unrecognizable insn with
                   |internal compiler error: in |-fstack-protector-strong
                   |extract_insn, at            |
                   |recog.cc:2791)              |
--- Comment #1 from Andrew Pinski ---
I am 99% sure this was caused by the patch set that Richard S. did a few days ago.
[Bug target/111411] [14 regression] ICE when building opus-1.4, unrecognizable insn with -fstack-protector-strong
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111411 --- Comment #2 from Sam James --- Created attachment 55897 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55897&action=edit reduced.i I've attached a reduced version but the memcpy bit could do with cleaning up for it to be a bit more sensible.
[Bug tree-optimization/106164] (a > b) & (a >= b) does not get optimized until reassoc1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106164 --- Comment #16 from Andrew Pinski --- Next patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630241.html
[Bug target/111372] libgcc: RISCV C++ exception handling stack usage grew in 13.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111372
--- Comment #5 from Kito Cheng ---
> Ok, but it's better to have configure option or something else just
> for toolchains that definitely do not use vector extension
I can understand that there would be such a demand in the embedded world, but it's not a critical issue, so it won't be a high priority for most RISC-V GCC developers. It would be appreciated if you could send a patch for that.
[Bug middle-end/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401
--- Comment #4 from rguenther at suse dot de ---
On Wed, 13 Sep 2023, rdapp at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401
>
> Robin Dapp changed:
>
>            What    |Removed                     |Added
>                  CC|                            |rdapp at gcc dot gnu.org
>
> --- Comment #2 from Robin Dapp ---
> I played around with this a bit.  Emitting a COND_LEN in if-convert is easy:
>
> _ifc__35 = .COND_ADD (_23, init_20, _8, init_20);
>
> However, during reduction handling we rely on the reduction being a gimple
> assign and binary operation, though so I needed to fix some places and indices
> as well as the proper mask.
>
> What complicates things a bit is that we assume that "init_20" (i.e. the
> reduction def) occurs once when we have it twice in the COND_ADD.  I just
> special cased that for now.  Is this the proper thing to do?
I think so - we should ignore a use in the else value when the other use is in that same stmt.
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 23c6e8259e7..e99add3cf16 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -3672,7 +3672,7 @@ vect_analyze_loop (class loop *loop, vec_info_shared
> *shared)
>  static bool
>  fold_left_reduction_fn (code_helper code, internal_fn *reduc_fn)
>  {
> -  if (code == PLUS_EXPR)
> +  if (code == PLUS_EXPR || code == IFN_COND_ADD)
>      {
>        *reduc_fn = IFN_FOLD_LEFT_PLUS;
>        return true;
> @@ -4106,8 +4106,11 @@ vect_is_simple_reduction (loop_vec_info loop_info,
> stmt_vec_info phi_info,
>            return NULL;
>          }
>
> -      nphi_def_loop_uses++;
> -      phi_use_stmt = use_stmt;
> +      if (use_stmt != phi_use_stmt)
> +        {
> +          nphi_def_loop_uses++;
> +          phi_use_stmt = use_stmt;
> +        }
>
> @@ -7440,6 +7457,9 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
>        if (i == STMT_VINFO_REDUC_IDX (stmt_info))
>          continue;
>
> +      if (op.ops[i] == op.ops[STMT_VINFO_REDUC_IDX (stmt_info)])
> +        continue;
> +
>
> Apart from that I think what's mainly missing is making the added code nicer.
> Going to attach a tentative patch later.
[Bug middle-end/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401
--- Comment #5 from rguenther at suse dot de ---
On Wed, 13 Sep 2023, rdapp at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401
>
> --- Comment #3 from Robin Dapp ---
> Several other things came up, so I'm just going to post the latest status here
> without having revised or tested it.  Going to try fixing it and testing
> tomorrow.
I think what's important to do is make sure targets without masking are still getting the cond-reduction code generation (but with the signed-zero issue fixed). Using a cond_add is probably better than the vec_cond + add even for the not fold-left reduction case.
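To make the cond_add versus vec_cond + add distinction concrete, here is a scalar sketch. It assumes that .COND_ADD (mask, a, b, else) yields a + b where the mask is set and the else value otherwise, matching the operand order in the _ifc__35 statement quoted in comment #2, and it assumes the signed-zero issue is the usual one where adding +0.0 for inactive elements turns a -0.0 accumulator into +0.0. All function names are hypothetical; this is not GCC code.

```c
#include <math.h>
#include <stddef.h>

/* Scalar model of .COND_ADD (mask, a, b, else_val). */
static double cond_add(int mask, double a, double b, double else_val)
{
    return mask ? a + b : else_val;
}

/* Select + add form: inactive elements contribute +0.0 to the sum. */
static double reduce_select_add(const double *x, const int *mask, size_t n, double init)
{
    double sum = init;
    for (size_t i = 0; i < n; i++)
        sum = sum + (mask[i] ? x[i] : 0.0);
    return sum;
}

/* cond_add form: inactive elements leave the accumulator unchanged,
   because the else value is the accumulator itself. */
static double reduce_cond_add(const double *x, const int *mask, size_t n, double init)
{
    double sum = init;
    for (size_t i = 0; i < n; i++)
        sum = cond_add(mask[i], sum, x[i], sum);
    return sum;
}
```

With an initial value of -0.0 and every mask bit clear, the select + add form returns +0.0 (since -0.0 + 0.0 is +0.0 under the default rounding mode), while the cond_add form keeps -0.0; signbit() distinguishes the two. This assumes the code is compiled without -ffast-math or -fno-signed-zeros.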
[Bug debug/111409] Invalid .debug_macro.dwo macro information for split DWARF
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111409 --- Comment #1 from Omar Sandoval --- Patch sent: https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630242.html