[Bug target/118945] RISC-V: VSETL pass: Don't promote Vectors ops from Tail agnostic to Tail Undisturbed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118945

Jeffrey A. Law changed:

           What    |Removed     |Added
     Ever confirmed|0           |1
             Status|UNCONFIRMED |NEW
   Last reconfirmed|            |2025-02-20
[Bug jit/117047] [15 regression] Segfault in libgccjit garbage collection when compiling GNU Emacs with Native Compilation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117047

--- Comment #15 from Sam James ---
(In reply to David Malcolm from comment #14)
> FWIW I tried again building emacs (from git) with gcc trunk with
> --with-native-compilation=aot on x86_64 and, annoyingly, "make" completed
> successfully; I see lots of
>   ./native-lisp/31.0.50-677d9325/*.eln
> which are "ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),
> statically linked, not stripped"
>

I'll get back to trying to find how to configure GCC s.t. it happens. It seems like in the right environment, it always happens, I just don't know what the condition is yet.

> How clean is Emacs under valgrind normally?

It's clean "enough" if you...
a) pass -DUSE_VALGRIND in CFLAGS or CPPFLAGS when building, and
b) use a suppression file (like https://lists.gnu.org/archive/html/bug-gnu-emacs/2022-01/txtaJC0QpICF7.txt, which isn't perfect, but it made the output mostly clean for me)

When I ran the crasher under Valgrind, the only output I saw besides GC noise at the beginning was the invalid access on the null deref. I didn't see anything that looked useful or around the time of the crash, and the bit I did see seemed like the usual innocent GC noise for Emacs.
[Bug target/118945] RISC-V: VSETL pass: Don't promote Vectors ops from Tail agnostic to Tail Undisturbed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118945

--- Comment #9 from JuzheZhong ---
(In reply to Andrew Waterman from comment #8)
> > In fact, I'd be rather surprised to see anything preferring tail
> > undisturbed.
>
> Right.  To be precise, microarchitectures without register renaming
> absolutely do prefer to leave the tail undisturbed.  But that's why the ISA
> defines the agnostic mode in such a way that undisturbed is a valid
> implementation of agnostic.  (The in-order microarchitectures I've worked on
> simply ignore the tail-/mask-agnostic setting; the state bits that control
> the mode are essentially vestigial.)
>
> Since no plausible implementation will benefit from being in undisturbed
> mode, we don't need to consider that aspect of the problem, but...
>
> > I prefer fewer "vsetvli" (which allows more fusion) by default.
>
> ...but here's the rub.  Implementations that don't benefit from the agnostic
> setting would definitely prefer to avoid the extra setvl instructions, not
> because they're expensive, but because they're not free.
>
> > Some designs aren't sensitive to the number of vsetvls and I would expect
> > that over time that's where high performance designs will land over time.
>
> Low-performance ones, too.  (Making vset[i]vli fast is more of an
> engineering cost than a silicon cost.)  But the instructions still have to
> be fetched and decoded, and registers have to be read and written, so the
> perf cost will converge on that of, say, an ADDI instruction, which is to
> say cheap but not zero.  For narrow-issue machines, this does matter.
>
> > Obviously for your design you'll want to set the knob which says "minimize
> > vsetvls" as opposed to "avoid false dependencies by preferring tail
> > agnostic".  That's easily handled by putting the data in the tuning
> > structure for each design.
>
> And so this is the right answer :)

In my uarch, "vsetvli" is cheap but not zero-cost, which is pretty similar to ADDI.

As Andrew said, for an in-order microarchitecture you can't ignore the cost of "vsetvli". That's why I prefer to keep the original "vsetvli" strategy (which is fusing as many "vsetvli"s as possible) by default.

For example, you should test on the K1 banana which is better: "keep agnostic but more vsetvli" vs. "allow aggressive fusion into a single undisturbed vsetvli".

Also, the example shown in the PR is not appropriate for making a decision here, since it just produces 1 vsetvli when you disable aggressive fusion into undisturbed, which does not seem to be very costly.

I think we should consider many more different situations and consider them carefully. Like:

vsetvli ... e8,mf8 ta ma (demand ratio)
...
vsetvli zero zero e32 mf2 tu ma (demand ratio)
...
vsetvli zero zero e64 m1 ta ma (demand SEW and LMUL)
...
vsetvli zero zero e64 m1 ta mu (demand ratio)
...
vsetvli zero zero e16 mf4 tu mu (demand ratio)
...
vsetvli zero zero e32 mf2 ta ma (demand ratio)
...
vsetvli zero zero e8 mf8 ta ma (demand ratio)

In the current strategy, the 7 "vsetvli"s will be fused into 1 single "vsetvli":

vsetvli ... e64 m1 tu mu

However, if you just keep agnostic and do not allow fusing it, you will end up with 6 more "vsetvli"s. I don't think this codegen can be better on any microarchitecture design.
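
The "demand ratio" annotations in the sequence above can be checked with a little arithmetic. The following is only a sketch under stated assumptions: `VType`, `sew_lmul_ratio`, `all_same_ratio` and `example_sequence` are hypothetical names made up for illustration and are not part of GCC's VSETVL pass. It shows that all seven settings share the same SEW/LMUL ratio (64), which is the property a ratio-only demand cares about, since VLMAX = LMUL * VLEN / SEW. It does not model the tail/mask policy side of the fusion decision.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy description of a vtype setting: SEW in bits and LMUL
// as a fraction (mf8 = 1/8, mf4 = 1/4, mf2 = 1/2, m1 = 1/1).
struct VType {
  int sew;
  int lmul_num;
  int lmul_den;
};

// SEW/LMUL ratio. Two settings with the same ratio give the same VLMAX
// for a given VLEN, because VLMAX = LMUL * VLEN / SEW.
constexpr int sew_lmul_ratio(VType v) {
  return v.sew * v.lmul_den / v.lmul_num;
}

// True if every setting in the sequence has the same SEW/LMUL ratio.
inline bool all_same_ratio(const std::vector<VType>& seq) {
  for (const VType& v : seq)
    if (sew_lmul_ratio(v) != sew_lmul_ratio(seq.front()))
      return false;
  return true;
}

// The seven SEW/LMUL pairs from the example sequence in the comment.
inline std::vector<VType> example_sequence() {
  return {{8, 1, 8},  {32, 1, 2}, {64, 1, 1}, {64, 1, 1},
          {16, 1, 4}, {32, 1, 2}, {8, 1, 8}};
}
```

Because every ratio is 64, the one instruction that demands SEW and LMUL (e64, m1) is compatible with all the ratio-only demands, which is consistent with the fused result being `e64 m1`.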
[Bug rtl-optimization/118946] Missed optimization: GCC reserves stack space for optimized-out variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118946

--- Comment #4 from Andrew Pinski ---
(In reply to Jeffrey A. Law from comment #2)
> Marking as a duplicate of one I happen to know about.  I suspect there are
> others.
>
> *** This bug has been marked as a duplicate of bug 94713 ***

I think this is unrelated. The issue here is the tree level can't optimize away the memcpy into a memset.
This is a dup of bug 117634 really.

*** This bug has been marked as a duplicate of bug 117634 ***
[Bug target/118934] [15 Regression] RISC-V: ICE: output_operand: invalid expression as operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118934

Jeffrey A. Law changed:

           What    |Removed     |Added
   Last reconfirmed|            |2025-02-20
             Status|UNCONFIRMED |WAITING
     Ever confirmed|0           |1

--- Comment #1 from Jeffrey A. Law ---
We need a testcase. If you use cvise to reduce the testcase I bet the final result will be trivial in nature and probably trivial to obfuscate if you want.

Note that WRF sources are generally available (http://www.wrf-model.org), so obfuscation may not be strictly necessary.
[Bug tree-optimization/118947] Missed optimization: GCC forgets stack buffer contents across function call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118947

--- Comment #4 from Andrew Pinski ---
Created attachment 60551
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60551&action=edit
Fixes bbb's stack usage

This patch fixes the stack usage of the bbb function in comment #0.
[Bug tree-optimization/118963] [13/14/15 regression] Miscompile at -O2/3 since r13-6945-g429a7a88438cc8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118963

Andrew Pinski changed:

           What    |Removed     |Added
             Status|UNCONFIRMED |RESOLVED
         Resolution|---         |DUPLICATE

--- Comment #1 from Andrew Pinski ---
Dup.

*** This bug has been marked as a duplicate of bug 118922 ***
[Bug tree-optimization/118922] [13/14/15 regression] Miscompile at -O2/3 since r13-6945-g429a7a88438cc8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118922 --- Comment #9 from Andrew Pinski --- *** Bug 118963 has been marked as a duplicate of this bug. ***
[Bug preprocessor/118860] [15 Regression] ICE Segfault with --param=file-cache-files= since r15-7431-g66af77cbed6c5b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118860

Heiko Eißfeldt changed:

           What    |Removed |Added
                 CC|        |heiko at hexco dot de

--- Comment #2 from Heiko Eißfeldt ---
$ g++ pr31078.C --param=file-cache-files=16 -Wunused
works
$ g++ pr31078.C --param=file-cache-files=17 -Wunused
does not

Looks like file_cache::tune() needs to reallocate when num_file_slots_ is greater than the default

size_t file_cache::num_file_slots = 16;

used by the constructor

file_cache::file_cache ()
: m_file_slots (new file_cache_slot[num_file_slots])
{
  initialize_input_context (nullptr, false);
}
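
The failure mode described above (an array sized once in the constructor, then a tuning call that raises the slot count without resizing it) can be sketched in isolation. This is a minimal illustration only: `toy_slot` and `toy_cache` are hypothetical names, not GCC's `file_cache` or `file_cache_slot`, and the real fix may look quite different. The point is simply that a `tune()` which changes the slot count has to replace the allocation as well.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a cache slot; not GCC's file_cache_slot.
struct toy_slot {
  int use_count = 0;
};

// Hypothetical cache that owns a slot array allocated in the constructor,
// mirroring the pattern described in the comment above.
class toy_cache {
public:
  static constexpr std::size_t default_slots = 16;

  toy_cache()
      : m_num_slots(default_slots), m_slots(new toy_slot[default_slots]) {}

  // Change the number of slots. The array is reallocated whenever the
  // count changes, so later indexing never runs past the allocation.
  void tune(std::size_t num_slots) {
    if (num_slots == 0 || num_slots == m_num_slots)
      return;
    m_slots.reset(new toy_slot[num_slots]);  // previous contents are dropped
    m_num_slots = num_slots;
  }

  std::size_t size() const { return m_num_slots; }

  // Bounds-checked access to a slot.
  toy_slot& at(std::size_t i) {
    assert(i < m_num_slots);
    return m_slots[i];
  }

private:
  std::size_t m_num_slots;
  std::unique_ptr<toy_slot[]> m_slots;
};
```

With this shape, asking for 17 slots and then touching index 16 stays inside the allocation, which corresponds to the `--param=file-cache-files=17` case reported as failing.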
[Bug rtl-optimization/118947] Missed optimization: GCC forgets stack buffer contents across function call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118947

Andrew Pinski changed:

           What    |Removed                       |Added
             Status|UNCONFIRMED                   |ASSIGNED
           See Also|                              |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117634
           Severity|normal                        |enhancement
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
   Last reconfirmed|                              |2025-02-20
     Ever confirmed|0                             |1

--- Comment #2 from Andrew Pinski ---
Related to PR 117634.

Currently optimize_memcpy_to_memset does not skip over vdefs that can't clobber `buf` as it is supposed to be a simple analysis. But I suspect we could extend it to skip over vdefs that don't clobber the buf.
[Bug tree-optimization/118963] [13/14/15 regression] Miscompile at -O2/3 since r13-6945-g429a7a88438cc8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118963

Sam James changed:

           What    |Removed             |Added
   Target Milestone|---                 |13.4
            Summary|Miscompile at -O2/3 |[13/14/15 regression] Miscompile at -O2/3 since r13-6945-g429a7a88438cc8
           Keywords|                    |wrong-code
           See Also|                    |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118922
          Component|c                   |tree-optimization
[Bug target/80878] -mcx16 (enable 128 bit CAS) on x86_64 seems not to work on 7.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80878

--- Comment #47 from LIU Hao ---
(In reply to Luke Dalessandro from comment #46)
> But if 104688 isn't related to this issue, and thus Jakub's comment was in
> error, I definitely don't understand the underlying problem and why clang is
> fine doing it.

The issue here is that if an atomic load is implemented with a call to libatomic routines, then it's incorrect to implement CAS without a call.
[Bug target/118949] [15 regression] RISC-V: Extra FRM writes since GCC-14.2 since r15-5943-gdc0dea98c96e02
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118949

--- Comment #5 from Li Pan ---
Thanks Vineet. Here is another case, with an explicit conversion. It is unrelated to the global_reg change.

#define T float

void func(const T * restrict a, const T * restrict b,
          T * restrict c)
{
  for (long i = 0; i < 1024; ++i) {
    double a_d = (double)a[i];
    double b_d = (double)b[i];

    long a_l = __builtin_lround(a_d);
    long b_l = __builtin_lround(b_d);

    c[i] = (T)(a_l + b_l);
  }
}

The difference mostly appears after the vect pass.

from:
  vect__4.9_36 = .MASK_LEN_LOAD (vectp_a.7_38, 32B, { -1, ... }, _11, 0);
  vect__6.12_32 = .MASK_LEN_LOAD (vectp_b.10_34, 32B, { -1, ... }, _11, 0);
  vect_a_l_15.13_31 = .LROUND (vect__4.9_36);
  vect_b_l_16.14_30 = .LROUND (vect__6.12_32);
  vect__7.15_29 = vect_a_l_15.13_31 + vect_b_l_16.14_30;
  vect__9.16_28 = (vector([2,2]) float) vect__7.15_29;
  .MASK_LEN_STORE (vectp_c.17_26, 32B, { -1, ... }, _11, 0, vect__9.16_28);

to:
  vect__4.9_43 = .MASK_LEN_LOAD (vectp_a.7_46, 32B, { -1, ... }, _44(D), _23, 0);
  vect_a_d_14.10_42 = (vector([2,2]) double) vect__4.9_43;  // Only in GCC-15
  vect_a_l_17.11_41 = .LROUND (vect_a_d_14.10_42);
  vect__6.14_36 = .MASK_LEN_LOAD (vectp_b.12_39, 32B, { -1, ... }, _37(D), _23, 0);
  vect_b_d_16.15_35 = (vector([2,2]) double) vect__6.14_36;  // Only in GCC-15
  vect_b_l_18.16_34 = .LROUND (vect_b_d_16.15_35);
  vect__7.17_33 = vect_a_l_17.11_41 + vect_b_l_18.16_34;
  vect__9.18_32 = (vector([2,2]) float) vect__7.17_33;
  .MASK_LEN_STORE (vectp_c.19_30, 32B, { -1, ... }, _23, 0, vect__9.18_32);

It looks like there are extra conversions after the loads...
[Bug tree-optimization/107263] Memcpy not elided when initializing struct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107263

--- Comment #4 from Andrew Pinski ---
(In reply to AK from comment #3)
> Seems like a duplicate of #59863 ?

No, different issue. There we have an array which is constant all the way, but here we have a non-constant part.
[Bug middle-end/23782] SRA pessimizes passing structures by value at -Os (+22% code size)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23782

Andrew Pinski changed:

           What    |Removed                       |Added
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
          Component|target                        |middle-end

--- Comment #8 from Andrew Pinski ---
.
[Bug target/118955] Fortran uses vector math functions without -ffast-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118955 --- Comment #9 from Richard Biener --- I have also always wondered about that glibc guard, esp. it being the kitchen-sink fast-math guard rather than sth more specific (yep, we don't have anything for -funsafe-math-optimizations). That is, I suppose glibc does not set FP exception flags "correctly" either. Is there documentation on what you can expect from the glibc vector math functions with regard to IEEE conformance?
[Bug libfortran/118935] Segmentation fault in 'libgomp.fortran/rwlock_1.f90' when compiling libgfortran with '-O0'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118935

--- Comment #10 from Thomas Koenig ---
What does the OpenMP standard say about I/O in parallel execution?
[Bug target/118952] AArch64 get_fpcr and set_fpcr builtins don't block reordering of operations past them
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118952

Richard Sandiford changed:

           What    |Removed |Added
           See Also|        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34678
                 CC|        |rsandifo at gcc dot gnu.org

--- Comment #1 from Richard Sandiford ---
I think this is essentially the same problem as PR34678.
[Bug libstdc++/118559] [15 Regression] __array_rank is broken for clang so need workaround in libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118559

Jonathan Wakely changed:

           What    |Removed  |Added
             Status|ASSIGNED |RESOLVED
         Resolution|---      |FIXED

--- Comment #2 from Jonathan Wakely ---
Fixed
[Bug libstdc++/118559] [15 Regression] __array_rank is broken for clang so need workaround in libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118559

--- Comment #1 from GCC Commits ---
The master branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:c0e865f73ddee2e7247a23a7d57ad80261861d35

commit r15-7650-gc0e865f73ddee2e7247a23a7d57ad80261861d35
Author: Jonathan Wakely
Date: Wed Feb 19 14:46:32 2025 +

    libstdc++: Workaround Clang bug with __array_rank built-in [PR118559]

    We started using the __array_rank built-in with r15-1252-g6f0dfa6f1acdf7 but that built-in is buggy in versions of Clang up to and including 19.

    libstdc++-v3/ChangeLog:

        PR libstdc++/118559
        * include/std/type_traits (rank, rank_v): Do not use __array_rank for Clang 19 and older.
[Bug libstdc++/118855] Simplify when __builtin_*g builtins are available
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118855

--- Comment #4 from GCC Commits ---
The master branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:e8ad697a75b0870a833366daf687668a57cabb6e

commit r15-7648-ge8ad697a75b0870a833366daf687668a57cabb6e
Author: Jonathan Wakely
Date: Wed Feb 19 14:48:04 2025 +

    libstdc++: Use new type-generic built-ins in <bit> [PR118855]

    This makes several functions in <bit> faster to compile, with fewer expressions to parse and fewer instantiations of __numeric_traits required.

    libstdc++-v3/ChangeLog:

        PR libstdc++/118855
        * include/std/bit (__count_lzero, __count_rzero, __popcount): Use type-generic built-ins when available.
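
As a rough illustration of "use type-generic built-ins when available", the sketch below counts leading zeros with `__builtin_clzg` when the compiler reports it through `__has_builtin`, and falls back to a plain loop otherwise. It is not the libstdc++ implementation: `toy_count_lzero` and `TOY_HAVE_CLZG` are hypothetical names. The one assumption about the built-in is that its optional second argument is the value returned for a zero input.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Detect the type-generic built-in; TOY_HAVE_CLZG is a made-up macro name.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_clzg)
#    define TOY_HAVE_CLZG 1
#  endif
#endif

// Count leading zero bits of an unsigned value; returns the bit width of T
// when x is zero. Hypothetical helper, not libstdc++'s __count_lzero.
template <typename T>
inline int toy_count_lzero(T x) noexcept {
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
  constexpr int digits = std::numeric_limits<T>::digits;
#ifdef TOY_HAVE_CLZG
  // One call covers every unsigned width, so no per-type dispatch is needed.
  return __builtin_clzg(x, digits);
#else
  // Portable fallback: scan from the most significant bit downwards.
  int n = 0;
  for (T bit = static_cast<T>(T(1) << (digits - 1));
       bit != 0 && (x & bit) == 0; bit = static_cast<T>(bit >> 1))
    ++n;
  return n;
#endif
}
```

The single generic call is what removes the need to pick between the `int`, `long` and `long long` variants by hand, which is the kind of simplification the commit message describes.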
[Bug libstdc++/118855] Simplify when __builtin_*g builtins are available
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118855

Jonathan Wakely changed:

           What    |Removed |Added
   Target Milestone|---     |15.0
             Status|NEW     |RESOLVED
         Resolution|---     |FIXED

--- Comment #5 from Jonathan Wakely ---
Done
[Bug libstdc++/104928] std::counting_semaphore on Linux can sleep forever
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928

Jonathan Wakely changed:

           What    |Removed |Added
   Target Milestone|15.0    |16.0
[Bug c++/118951] New: __FILE__ inserts the filename as array, __builtin_FILE as pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118951

            Bug ID: 118951
           Summary: __FILE__ inserts the filename as array, __builtin_FILE as pointer
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fabian_kessler at gmx dot de
  Target Milestone: ---

__builtin_FILE should return the FILE_PATH as an array, not as a pointer. This wouldn't have any drawbacks, since an array can decay to a pointer, but it is currently not possible to go the other way round for 0-terminated strings.

Returning __builtin_FILE as an array would allow code like the following:

```
template struct strong_alias<>{/*...*/};
```

In the current situation, it is only possible to do this via macros.

Also add __builtin_FILE_NAME, which is supported by clang.
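
The difference the report describes can be observed directly with `decltype`. The snippet below is a small sketch, assuming a GCC- or Clang-compatible compiler that provides `__builtin_FILE`. The alias and constant names (`file_macro_type`, `file_builtin_type`, `file_macro_length`) are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// __FILE__ expands to a string literal, so its type is an array of
// const char whose bound is known at compile time.
using file_macro_type = std::remove_reference<decltype(__FILE__)>::type;
static_assert(std::is_array<file_macro_type>::value,
              "__FILE__ has array type");

// __builtin_FILE() is a call expression that yields a plain pointer,
// so the length is no longer part of the type.
using file_builtin_type = decltype(__builtin_FILE());
static_assert(std::is_same<file_builtin_type, const char*>::value,
              "__builtin_FILE() has pointer type");

// Only the array form gives the length as a constant expression
// (sizeof counts the terminating NUL, hence the -1).
constexpr std::size_t file_macro_length = sizeof(__FILE__) - 1;
```

Code that needs the characters of the path at compile time, for example to build a distinct type per file, can therefore start from `__FILE__` but not from `__builtin_FILE()`, which matches the report's remark that this is currently only possible via macros.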
[Bug target/118952] New: AArch64 get_fpcr and set_fpcr builtins don't block reordering of operations past them
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118952

            Bug ID: 118952
           Summary: AArch64 get_fpcr and set_fpcr builtins don't block reordering of operations past them
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

The __builtin_aarch64_set_fpcr and __builtin_aarch64_get_fpcr builtins are not as useful in practice as we'd like.
For the input code:

#include <stdint.h>
#include <string.h>

uint64_t foo (uint32_t *in_fpcr, uint32_t src) {
    uint64_t dst;
    uint32_t saved_fpcr;
    float fsrc;

    saved_fpcr = __builtin_aarch64_get_fpcr();
    __builtin_aarch64_set_fpcr(*in_fpcr);

    memcpy(&fsrc, &src, 4);
    double d = (double) fsrc;
    memcpy(&dst, &d, 8);

    *in_fpcr = __builtin_aarch64_get_fpcr();
    __builtin_aarch64_set_fpcr(saved_fpcr);
    return dst;
}

at -O2 we get:

foo:
        fmov    s31, w1
        mrs     x1, fpcr
        ldr     w2, [x0]
        msr     fpcr, x2
        mrs     x2, fpcr
        str     w2, [x0]
        msr     fpcr, x1
        fcvt    d31, s31
        fmov    x0, d31
        ret

The problem is that the fcvt is moved outside the region that has a modified FPCR, defeating the purpose of the builtins.
I initially thought this was the RTL insn scheduler moving the operations but the RTL patterns for the builtins do use unspec_volatile that is supposed to prevent such movement. But the problem seems to be at expand-time.
The GIMPLE looks correct:

  saved_fpcr_6 = __builtin_aarch64_get_fpcr ();
  _1 = *in_fpcr_7(D);
  __builtin_aarch64_set_fpcr (_1);
  _14 = VIEW_CONVERT_EXPR(src_9(D));
  _2 = (double) _14;
  _10 = VIEW_CONVERT_EXPR(_2);
  _3 = __builtin_aarch64_get_fpcr ();
  *in_fpcr_7(D) = _3;
  __builtin_aarch64_set_fpcr (saved_fpcr_6);
  return _10;

but the RTL generation is:

(insn 2 5 3 2 (set (reg/v/f:DI 107 [ in_fpcr ])
        (reg:DI 0 x0 [ in_fpcr ])) "fpcr.c":4:48 -1
     (nil))
(insn 3 2 4 2 (set (reg/v:SI 108 [ src ])
        (reg:SI 1 x1 [ src ])) "fpcr.c":4:48 -1
     (nil))
(note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 4 8 2 (set (reg/v:SI 104 [ saved_fpcr ])
        (unspec_volatile:SI [
                (const_int 0 [0])
            ] UNSPECV_GET_FPCR)) "fpcr.c":9:18 -1
     (nil))
(insn 8 7 9 2 (set (reg:SI 109)
        (mem:SI (reg/v/f:DI 107 [ in_fpcr ]) [1 *in_fpcr_7(D)+0 S4 A32])) "fpcr.c":10:5 -1
     (nil))
(insn 9 8 10 2 (unspec_volatile [
            (reg:SI 109)
        ] UNSPECV_SET_FPCR) "fpcr.c":10:5 -1
     (nil))
(insn 10 9 11 2 (set (reg:SI 103 [ _3 ])
        (unspec_volatile:SI [
                (const_int 0 [0])
            ] UNSPECV_GET_FPCR)) "fpcr.c":16:16 -1
     (nil))
(insn 11 10 12 2 (set (mem:SI (reg/v/f:DI 107 [ in_fpcr ]) [1 *in_fpcr_7(D)+0 S4 A32])
        (reg:SI 103 [ _3 ])) "fpcr.c":16:14 discrim 1 -1
     (nil))
(insn 12 11 13 2 (unspec_volatile [
            (reg/v:SI 104 [ saved_fpcr ])
        ] UNSPECV_SET_FPCR) "fpcr.c":17:5 -1
     (nil))
(insn 13 12 14 2 (set (reg:DF 111 [ _2 ])
        (float_extend:DF (subreg:SF (reg/v:SI 108 [ src ]) 0))) "fpcr.c":13:16 -1
     (nil))
(insn 14 13 18 2 (set (reg:DI 106 [ ])
        (subreg:DI (reg:DF 111 [ _2 ]) 0)) "fpcr.c":18:12 -1
     (nil))

insn 13 has been moved past the GET_FPCR and SET_FPCR builtins.
Is that something the out-of-ssa code is doing?
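
The shape of the problem (a floating-point operation that must stay between a "set control state" call and a "restore control state" call) also exists in portable code that uses `<cfenv>`. The sketch below is an analogue only, not the AArch64 builtins and not a fix proposed in this PR. `divide_with_rounding` is a hypothetical helper, and routing the operands through `volatile` objects is one commonly used way, assumed here, to keep the operation inside the region.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: divide num by den under the given rounding mode
// (for example FE_UPWARD), then restore the caller's rounding mode.
inline double divide_with_rounding(double num, double den, int mode) {
  const int saved = std::fegetround();
  std::fesetround(mode);
  // The volatile accesses are ordered with respect to the fesetround calls,
  // and the division depends on the volatile loads and feeds a volatile
  // store, so it has to execute while the requested mode is in effect.
  volatile double n = num;
  volatile double d = den;
  volatile double q = n / d;
  std::fesetround(saved);
  return q;
}
```

Without some dependency of this kind the compiler sees no data flow between the control-register update and the arithmetic, which is the same gap that lets the `fcvt` drift out of the region in the report.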
[Bug tree-optimization/118521] [15 regression] std::vector Wstringop-overflow false positive since r15-4473
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118521

Richard Biener changed:

           What    |Removed |Added
           See Also|        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118817

--- Comment #9 from Richard Biener ---
Looks similar to PR118817 btw. Like that we're diagnosing from the strlen pass which has a somewhat unfortunate pass position.

  [local count: 131235111]:
  _53 = operator new (2);

  [local count: 131235111]:
  MEM [(char * {ref-all})_53] = MEM [(char * {ref-all})&C.0];
  __result_46 = _53 + 2;
  _150 = operator new (4);
  goto ; [100.00%]

  [local count: 131235112]:
  _97 = _150 + 2;
  __builtin_memset (_97, 0, 2);
  MEM [(char * {ref-all})_150] = 513;
  __result_274 = _150 + 1;
  __new_finish_106 = __result_274 + 3;
  operator delete (_53, 2);
  _115 = _150 + 4;
  if (__new_finish_106 != _115)
    goto ; [82.57%]
  else
    goto ; [17.43%]

  [local count: 108360832]:
  MEM[(char *)_97 + 2B] = 1;

Like in the other PR we are missing the power of forwprop which would have accumulated the constant adjustments, eliding the BB6 enter condition. As you say it's SCCP exposing the opportunity.

Neither FRE nor DOM have the ability to prove equivalence on larger expressions like this, aka (_150 + 4) == ((_150 + 1) + 3) but they instead rely on instruction combinations. Now, FRE does "fold" each stmt, but tries to simplify it down to a constant/copy and if that's not possible goes with the original stmt for further processing rather than using the simplified expression. That's wasteful. It also folds at elimination time, so this early folding is supposedly redundant iff we think the IL should be always fully folded (which it isn't, obviously).

For PR118817 I've addressed this case in PRE. For the more general VN case it's a bit more difficult to do cleanly and definitely out-of-scope for stage4.

I'll see what the fallout is when moving forwprop4 earlier (the late passes are oddly ordered IMO).
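
The equivalence mentioned above, (_150 + 4) == ((_150 + 1) + 3), becomes trivial once constant offsets on the same base are accumulated. The following toy model is only meant to make that step concrete: `ptr_plus`, `add_const` and `same_address` are hypothetical names and have nothing to do with how forwprop or value numbering are implemented in GCC.

```cpp
#include <cassert>

// Hypothetical toy value: an SSA base name (by number) plus a constant
// byte offset, i.e. a normalized "base + constant" expression.
struct ptr_plus {
  int base;
  long offset;
};

// Adding a constant to an already normalized value just accumulates the
// offset instead of building a nested expression.
constexpr ptr_plus add_const(ptr_plus p, long c) {
  return {p.base, p.offset + c};
}

// Two normalized values denote the same address when base and accumulated
// offset agree.
constexpr bool same_address(ptr_plus a, ptr_plus b) {
  return a.base == b.base && a.offset == b.offset;
}
```

In this model `__result_274 + 3` with `__result_274 = _150 + 1` normalizes to `_150 + 4`, the same value as `_115`, so the `!=` test guarding the flagged store is known to be false.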

There's also the pragmatic way of dealing with this in VN which is replacing the simplification attempt with in-place folding, but that's only OK when not iterating (or we're first-time visiting a stmt, but I'd rather not go there). There's unfortunately a difference between what fold_stmt and gimple_fold_stmt_to_constant do ... but maybe it does not matter ... turns out it does.

All of the attempts have testsuite fallout, of course.

Before r5-1495-g24314386b32b93 strlen was even earlier, but it was specifically placed before VRP. forwprop is currently specifically after DSE/DCE because the single-use gates benefit from DCEd IL. strlen OTOH is a source of constants and pruned memory ops so placing it before DSE/DCE makes sense. At r5-1495-g24314386b32b93 there wasn't a CCP after VRP, so it might be possible to move strlen a bit later (but then it will be after another jump threading...).

Moving forwprop between pass_thread_jumps and pass_dominator does have quite some diagnostic fallout. Doing

diff --git a/gcc/passes.def b/gcc/passes.def
index 9fd85a35a63..c02fd0e186d 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -346,9 +346,10 @@ along with GCC; see the file COPYING3. If not see
         form if possible. */
       NEXT_PASS (pass_thread_jumps, /*first=*/false);
       NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
-      NEXT_PASS (pass_strlen);
       NEXT_PASS (pass_thread_jumps_full, /*first=*/false);
       NEXT_PASS (pass_vrp, true /* final_p */);
+      NEXT_PASS (pass_forwprop, /*last=*/true);
+      NEXT_PASS (pass_strlen);
       /* Run CCP to compute alignment and nonzero bits. */
       NEXT_PASS (pass_ccp, true /* nonzero_p */);
       NEXT_PASS (pass_warn_restrict);
@@ -356,7 +357,6 @@ along with GCC; see the file COPYING3. If not see
       NEXT_PASS (pass_dce, true /* update_address_taken_p */, true /* remove_unused_locals */);
       /* After late DCE we rewrite no longer addressed locals into SSA
         form if possible. */
-      NEXT_PASS (pass_forwprop, /*last=*/true);
       NEXT_PASS (pass_sink_code, true /* unsplit edges */);
       NEXT_PASS (pass_phiopt, false /* early_p */);
       NEXT_PASS (pass_fold_builtins);

An even more pragmatic approach is a single level of folding uses (for changed defs) from SCCP. For full effect it would use a worklist and re-fold uses of defs of folded uses as well. Similar to simple_dce_from_worklist (which could also re-fold uses of defs that become single-use for example).

diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index 0ba85917d41..a0d1c2f3d86 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -284,6 +284,7 @@ along with GCC; see the file COPYING3. If not see
 #include "tree-into-ssa.h"
 #include "builtins.h"
 #include "case-cfn-macros.h"
+#include "tree-eh.h"

 static tree analyze_sc
[Bug target/109780] [12/13/14/15 Regression] csmith: runtime crash with -O2 -march=znver1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
Bug 109780 depends on bug 118936, which changed state.

Bug 118936 Summary: [15 Regression] ICE in ix86_finalize_stack_frame_flags, at config/i386/i386.cc:8683
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936

           What    |Removed |Added
             Status|NEW     |RESOLVED
         Resolution|---     |FIXED
[Bug target/118936] [15 Regression] ICE in ix86_finalize_stack_frame_flags, at config/i386/i386.cc:8683
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936

Sam James changed:

           What    |Removed |Added
                 CC|        |sjames at gcc dot gnu.org

--- Comment #15 from Sam James ---
Patch was reverted in r15-7634-g0312d11be3f666 and r15-7635-g6921c93d205203 to try again in GCC 15. The revert fixes this PR. Testcase was added, so fixed.
[Bug target/109093] [15 regression] csmith: a February runtime bug ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109093
Bug 109093 depends on bug 118936, which changed state.

Bug 118936 Summary: [15 Regression] ICE in ix86_finalize_stack_frame_flags, at config/i386/i386.cc:8683
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936

           What    |Removed |Added
             Status|NEW     |RESOLVED
         Resolution|---     |FIXED
[Bug fortran/107635] [Coarray] Allocatable components of types defined in module's interface are not handled correctly when used in coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107635

--- Comment #9 from GCC Commits ---
The master branch has been updated by Andre Vehreschild:

https://gcc.gnu.org/g:69eb02682b80b84dd0f562f19821c8c8c37ad243

commit r15-7642-g69eb02682b80b84dd0f562f19821c8c8c37ad243
Author: Andre Vehreschild
Date: Wed Jan 29 12:42:18 2025 +0100

    Fortran: Add send_to_remote [PR107635]

    Refactor to use send_to_remote instead of the slow send_by_ref.

    gcc/fortran/ChangeLog:

        PR fortran/107635
        * coarray.cc (move_coarray_ref): Move the coarray reference out of the given one. Especially when there is a regular array ref.
        (fixup_comp_refs): Move components refs to a derived type where the codim has been removed, aka a new type.
        (split_expr_at_caf_ref): Correctly split the reference chain.
        (remove_caf_ref): Simplify.
        (create_get_callback): Fix some deficiencies.
        (create_allocated_callback): Adapt to new signature of split.
        (create_send_callback): New function.
        (rewrite_caf_send): Rewrite a call to caf_send to caf_send_to_remote.
        (coindexed_code_callback): Treat caf_send and caf_sendget correctly.
        * gfortran.h (enum gfc_isym_id): Add SENDGET-isym.
        * gfortran.texi: Add documentation for send_to_remote.
        * resolve.cc (gfc_resolve_code): No longer generate send_by_ref when allocatable coarray (component) is on the lhs.
        * trans-decl.cc (gfc_build_builtin_function_decls): Add caf_send_to_remote decl.
        * trans-intrinsic.cc (conv_caf_func_index): Ensure the static variables created are not in a block-scope.
        (conv_caf_send_to_remote): Translate caf_send_to_remote calls.
        (conv_caf_send): Renamed to conv_caf_sendget.
        (conv_caf_sendget): Renamed from conv_caf_send.
        (gfc_conv_intrinsic_subroutine): Branch correctly for conv_caf_send and sendget.
        * trans.h: Correct decl.

    libgfortran/ChangeLog:

        * caf/libcaf.h: Add/Correct prototypes for caf_get_from_remote, caf_send_to_remote.
        * caf/single.c (struct accessor_hash_t): Rename accessor_t to getter_t.
        (_gfortran_caf_register_accessor): Use new name of getter_t.
        (_gfortran_caf_send_to_remote): New function for sending data to coarray on a remote image.

    gcc/testsuite/ChangeLog:

        * gfortran.dg/coarray/send_char_array_1.f90: Extend test to catch more cases.
        * gfortran.dg/coarray_42.f90: Invert tests use, because no longer a send is needed when local memory in a coarray is allocated.
[Bug fortran/107635] [Coarray] Allocatable components of types defined in module's interface are not handled correctly when used in coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107635

--- Comment #11 from GCC Commits ---
The master branch has been updated by Andre Vehreschild:

https://gcc.gnu.org/g:d3244675441faf9c2d3949821f7deee34705e9c8

commit r15-7644-gd3244675441faf9c2d3949821f7deee34705e9c8
Author: Andre Vehreschild
Date: Fri Feb 7 12:09:53 2025 +0100

    Fortran: Remove deprecated coarray routines [PR107635]

    gcc/fortran/ChangeLog:

        PR fortran/107635
        * gfortran.texi: Remove deprecated functions from documentation.
        * trans-decl.cc (gfc_build_builtin_function_decls): Remove decprecated function decls.
        * trans-intrinsic.cc (gfc_conv_intrinsic_exponent): Remove deprecated/no longer needed routines.
        * trans.h: Remove unused decls.

    libgfortran/ChangeLog:

        * caf/libcaf.h (_gfortran_caf_get): Removed because deprecated.
        (_gfortran_caf_send): Same.
        (_gfortran_caf_sendget): Same.
        (_gfortran_caf_send_by_ref): Same.
        * caf/single.c (assign_char4_from_char1): Same.
        (assign_char1_from_char4): Same.
        (convert_type): Same.
        (defined): Same.
        (_gfortran_caf_get): Same.
        (_gfortran_caf_send): Same.
        (_gfortran_caf_sendget): Same.
        (copy_data): Same.
        (get_for_ref): Same.
        (_gfortran_caf_get_by_ref): Same.
        (send_by_ref): Same.
        (_gfortran_caf_send_by_ref): Same.
        (_gfortran_caf_sendget_by_ref): Same.
[Bug fortran/107635] [Coarray] Allocatable components of types defined in module's interface are not handled correctly when used in coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107635

--- Comment #8 from GCC Commits ---
The master branch has been updated by Andre Vehreschild:

https://gcc.gnu.org/g:15847252648ede9d2ad9eea398b7b870f62a2b30

commit r15-7641-g15847252648ede9d2ad9eea398b7b870f62a2b30
Author: Andre Vehreschild
Date: Wed Jan 22 15:12:29 2025 +0100

    Fortran: Add caf_is_present_on_remote. [PR107635]

    Replace caf_is_present by caf_is_present_on_remote which is using a dedicated callback for each object to test on the remote image.

    gcc/fortran/ChangeLog:

        PR fortran/107635
        * coarray.cc (create_allocated_callback): Add creating remote side procedure for checking allocation status of coarray.
        (rewrite_caf_allocated): Rewrite ALLOCATED on coarray to use caf routine.
        (coindexed_expr_callback): Exempt caf_is_present_on_remote from being rewritten again.
        * gfortran.h (enum gfc_isym_id): Add caf_is_present_on_remote id.
        * gfortran.texi: Add documentation for caf_is_present_on_remote.
        * intrinsic.cc (add_functions): Add caf_is_present_on_remote symbol.
        * trans-decl.cc (gfc_build_builtin_function_decls): Define interface of caf_is_present_on_remote.
        * trans-intrinsic.cc (gfc_conv_intrinsic_caf_is_present_remote): Translate caf_is_present_on_remote.
        (trans_caf_is_present): Remove.
        (caf_this_image_ref): Remove.
        (gfc_conv_allocated): Take out coarray treatment, because that is rewritten to caf_is_present_on_remote now.
        (gfc_conv_intrinsic_function): Handle caf_is_present_on_remote calls.
        * trans.h: Add symbol for caf_is_present_on_remote and remove old one.

    libgfortran/ChangeLog:

        * caf/libcaf.h (_gfortran_caf_is_present_on_remote): Add new function.
        (_gfortran_caf_is_present): Remove deprecated one.
        * caf/single.c (struct accessor_hash_t): Add function ptr access for remote side call.
        (_gfortran_caf_is_present_on_remote): Added.
        (_gfortran_caf_is_present): Removed.

    gcc/testsuite/ChangeLog:

        * gfortran.dg/coarray/coarray_allocated.f90: Adapt to new method of checking on remote image.
        * gfortran.dg/coarray_lib_alloc_4.f90: Same.
[Bug fortran/107635] [Coarray] Allocatable components of types defined in module's interface are not handled correctly when used in coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107635

--- Comment #5 from GCC Commits ---
The master branch has been updated by Andre Vehreschild:

https://gcc.gnu.org/g:90ba8291c31f2cfb6a8c7bf0c0d6a9d93bbbacc9

commit r15-7638-g90ba8291c31f2cfb6a8c7bf0c0d6a9d93bbbacc9
Author: Andre Vehreschild
Date: Wed Jan 8 12:33:27 2025 +0100

    Fortran: Move caf_get-rewrite to coarray.cc [PR107635]

    Add a rewriter to keep all expression tree that is not optimization together. At the moment this is just a move from resolve.cc, but will be extended to handle more cases where rewriting the expression tree may be easier. The first use case is to extract accessors for coarray remote image data access.

    gcc/fortran/ChangeLog:

        PR fortran/107635
        * Make-lang.in: Add coarray.cc.
        * coarray.cc: New file.
        * gfortran.h (gfc_coarray_rewrite): New procedure.
        * parse.cc (rewrite_expr_tree): Add entrypoint for rewriting expression trees.
        * resolve.cc (gfc_resolve_ref): Remove caf_lhs handling.
        (get_arrayspec_from_expr): Moved to rewrite.cc.
        (remove_coarray_from_derived_type): Same.
        (convert_coarray_class_to_derived_type): Same.
        (split_expr_at_caf_ref): Same.
        (check_add_new_component): Same.
        (create_get_parameter_type): Same.
        (create_get_callback): Same.
        (add_caf_get_intrinsic): Same.
        (resolve_variable): Remove caf_lhs handling.

    libgfortran/ChangeLog:

        * caf/single.c (_gfortran_caf_finalize): Free memory preventing leaks.
        (_gfortran_caf_get_by_ct): Fix constness.
        * caf/libcaf.h (_gfortran_caf_register_accessor): Fix constness.
[Bug fortran/107635] [Coarray] Allocatable components of types defined in module's interface are not handled correctly when used in coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107635 --- Comment #7 from GCC Commits --- The master branch has been updated by Andre Vehreschild : https://gcc.gnu.org/g:abbfeb2ecbb5e90aa5d68e489ac283348ee6b8d5 commit r15-7640-gabbfeb2ecbb5e90aa5d68e489ac283348ee6b8d5 Author: Andre Vehreschild Date: Wed Jan 22 13:36:21 2025 +0100 Fortran: Allow to use non-pure/non-elemental functions in coarray indexes [PR107635] Extract calls to non-pure or non-elemental functions from index expressions on a coarray. gcc/fortran/ChangeLog: PR fortran/107635 * coarray.cc (get_arrayspec_from_expr): Treat array result of function calls correctly. (remove_coarray_from_derived_type): Prevent memory loss. (add_caf_get_from_remote): Correct locus. (find_comp): New function to find or create a new component in a derived type. (check_add_new_comp_handle_array): Handle allocatable arrays or non-pure/non-elemental functions in indexes of coarrays. (check_add_new_component): Use above function. (create_get_parameter_type): Rename to create_caf_add_data_parameter_type. (create_caf_add_data_parameter_type): Renaming of variable and make the additional data a coarray. (remove_caf_ref): Factor out to reuse in other caf-functions. (create_get_callback): Use function factored out, set locus correctly and ensure a kind is set for parameters. (add_caf_get_intrinsic): Rename to add_caf_get_from_remote and rename some variables. (coindexed_expr_callback): Skip over function created by the rewriter. (coindexed_code_callback): Filter some intrinsics not to process. (gfc_coarray_rewrite): Rewrite also contained functions. * trans-intrinsic.cc (gfc_conv_intrinsic_caf_get): Reflect changed order on caf_get_from_remote (). libgfortran/ChangeLog: * caf/libcaf.h (_gfortran_caf_register_accessor): Reflect changed parameter order. * caf/single.c (struct accessor_hash_t): Same. (_gfortran_caf_register_accessor): Call accessor using a token for accessing arrays with a descriptor on the source side. 
gcc/testsuite/ChangeLog: * gfortran.dg/coarray_lib_comm_1.f90: Adapt scan expression. * gfortran.dg/coarray/get_with_fn_parameter.f90: New test. * gfortran.dg/coarray/get_with_scalar_fn.f90: New test.
[Bug fortran/107635] [Coarray] Allocatable components of types defined in module's interface are not handled correctly when used in coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107635 --- Comment #10 from GCC Commits --- The master branch has been updated by Andre Vehreschild : https://gcc.gnu.org/g:8bf0ee8d62b8a08e808344d31354ab713157e15d commit r15-7643-g8bf0ee8d62b8a08e808344d31354ab713157e15d Author: Andre Vehreschild Date: Fri Feb 7 11:25:31 2025 +0100 Fortran: Add transfer_between_remotes [PR107635] Add the last missing coarray data manipulation routine using remote accessors. gcc/fortran/ChangeLog: PR fortran/107635 * coarray.cc (rewrite_caf_send): Rewrite to transfer_between_remotes when both sides of the assignment have a coarray. (coindexed_code_callback): Prevent duplicate rewrite. * gfortran.texi: Add documentation for transfer_between_remotes. * intrinsic.cc (add_subroutines): Add intrinsic symbol for caf_sendget to allow easy rewrite to transfer_between_remotes. * trans-decl.cc (gfc_build_builtin_function_decls): Add prototype for transfer_between_remotes. * trans-intrinsic.cc (conv_caf_vector_subscript_elem): Mark as deprecated. (conv_caf_vector_subscript): Same. (compute_component_offset): Same. (conv_expr_ref_to_caf_ref): Same. (conv_stat_and_team): Extract stat and team from expr. (gfc_conv_intrinsic_caf_get): Use conv_stat_and_team. (conv_caf_send_to_remote): Same. (has_ref_after_cafref): Mark as deprecated. (conv_caf_sendget): Translate to transfer_between_remotes. * trans.h: Add prototype for transfer_between_remotes. libgfortran/ChangeLog: * caf/libcaf.h: Add prototype for transfer_between_remotes. * caf/single.c: Implement transfer_between_remotes. gcc/testsuite/ChangeLog: * gfortran.dg/coarray_lib_comm_1.f90: Fix up scan_trees.
[Bug bootstrap/118802] [15 regression] Bootstrap comparison failure on libphobos/libdruntime/core/internal/gc/impl/conservative/gc.o
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118802 Sam James changed: What|Removed |Added Component|other |bootstrap --- Comment #15 from Sam James --- Reproduced manually on another machine (phew). Reducing the options needed first in the script. Will try your trick next to debug.
[Bug fortran/107635] [Coarray] Allocatable components of types defined in module's interface are not handled correctly when used in coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107635 --- Comment #6 from GCC Commits --- The master branch has been updated by Andre Vehreschild : https://gcc.gnu.org/g:b114312bbaae51567bc0436d07990c4fbaa3c81d commit r15-7639-gb114312bbaae51567bc0436d07990c4fbaa3c81d Author: Andre Vehreschild Date: Wed Jan 8 12:33:36 2025 +0100 Fortran: Prepare for more caf-rework. [PR107635] Factor out generation of code to get remote function index and to create the additional data structure. Rename caf_get_by_ct to caf_get_from_remote. gcc/fortran/ChangeLog: PR fortran/107635 * gfortran.texi: Rename caf_get_by_ct to caf_get_from_remote. * trans-decl.cc (gfc_build_builtin_function_decls): Rename intrinsic. * trans-intrinsic.cc (conv_caf_func_index): Factor out functionality to be reused by other caf-functions. (conv_caf_add_call_data): Same. (gfc_conv_intrinsic_caf_get): Use functions factored out. * trans.h: Rename intrinsic symbol. libgfortran/ChangeLog: * caf/libcaf.h (_gfortran_caf_get_by_ref): Remove from ABI. This function is replaced by caf_get_from_remote (). (_gfortran_caf_get_remote_function_index): Use better name. * caf/single.c (_gfortran_caf_finalize): Free internal data. (_gfortran_caf_get_by_ref): Remove from public interface, but keep it, because it is still used by sendget (). gcc/testsuite/ChangeLog: * gfortran.dg/coarray_lib_comm_1.f90: Adapt to renamed ABI function. * gfortran.dg/coarray_stat_function.f90: Same. * gfortran.dg/coindexed_1.f90: Same.
[Bug ipa/118318] [15 regression] ICE when building firefox-134.0 with PGO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118318 --- Comment #15 from Jan Hubicka --- > Breakpoint 5.2, profile_count::operator+= (this=0x76e7e888, other=...) at > /usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/profile-count.h:932 > 932 gcc_checking_assert (compatible_p (other)); > (gdb) p other > $38 = (const profile_count &) @0x7fff72a0: { > static n_bits = 61, > static max_count = 2305843009213693950, > static uninitialized_count = 2305843009213693951, > m_val = 3694, > m_quality = ADJUSTED > } > (gdb) p *this > $39 = { > static n_bits = 61, > static max_count = 2305843009213693950, > static uninitialized_count = 2305843009213693951, > m_val = 14776, > m_quality = GUESSED_GLOBAL0 > } > (gdb) Thanks a lot! So what happened is that we cloned a function and concluded it executes 0 times, while we want to make a call inside that function execute 3694 times. I don't think we should clone a function we think will be executed 0 times. What probably can happen is that we are updating the count of the non-specialized (original) function and this comes out as a roundoff error... Honza
[Bug target/109093] [15 regression] csmith: a February runtime bug ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109093 H.J. Lu changed: What|Removed |Added Attachment #60462|0 |1 is obsolete|| --- Comment #42 from H.J. Lu --- Created attachment 60539 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60539&action=edit A patch for GCC 16
[Bug target/109780] [12/13/14/15 Regression] csmith: runtime crash with -O2 -march=znver1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780 Sam James changed: What|Removed |Added Known to work|15.0| Summary|[12/13/14 Regression] |[12/13/14/15 Regression] |csmith: runtime crash with |csmith: runtime crash with |-O2 -march=znver1 |-O2 -march=znver1 Depends on||118936 --- Comment #43 from Sam James --- Patch was reverted in r15-7634-g0312d11be3f666 and r15-7635-g6921c93d205203 to try again in GCC 15. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936 [Bug 118936] [15 Regression] ICE in ix86_finalize_stack_frame_flags, at config/i386/i386.cc:8683
[Bug target/109093] [15 regression] csmith: a February runtime bug ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109093 Sam James changed: What|Removed |Added Status|RESOLVED|NEW Depends on||118936 Resolution|FIXED |--- Keywords|needs-bisection | --- Comment #40 from Sam James --- Patch was reverted in r15-7634-g0312d11be3f666 and r15-7635-g6921c93d205203 to try again in GCC 15. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936 [Bug 118936] [15 Regression] ICE in ix86_finalize_stack_frame_flags, at config/i386/i386.cc:8683
[Bug target/109093] [15 regression] csmith: a February runtime bug ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109093 Sam James changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |hjl.tools at gmail dot com
[Bug sarif-replay/96032] RFE: add a way to use output from --fdiagnostics-format=json or sarif as input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96032 --- Comment #12 from Kamil Dudka --- I confirm that sarif-replay is available on f42+ and it seems to work as expected. Thanks!
[Bug target/118936] [15 Regression] ICE in ix86_finalize_stack_frame_flags, at config/i386/i386.cc:8683
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118936 Sam James changed: What|Removed |Added Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |hjl.tools at gmail dot com Status|NEW |RESOLVED --- Comment #14 from Sam James --- .
[Bug target/109093] [15 regression] csmith: a February runtime bug ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109093 Sam James changed: What|Removed |Added Priority|P1 |P2 --- Comment #41 from Sam James --- I think we want to make it P2 instead for now. It's the same issue that goes back way further than trunk, just this testcase is 15-only.
[Bug libstdc++/98749] No precondition checks in , and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98749 Jonathan Wakely changed: What|Removed |Added Target Milestone|15.0|16.0
[Bug libstdc++/118395] Constructor of std::barrier is not constexpr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118395 Jonathan Wakely changed: What|Removed |Added Target Milestone|--- |16.0
[Bug target/94173] Superfluous stackpointer manipulation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94173 Sam James changed: What|Removed |Added CC||blubban at gmail dot com --- Comment #6 from Sam James --- *** Bug 118946 has been marked as a duplicate of this bug. ***
[Bug rtl-optimization/118946] Missed optimization: GCC reserves stack space for optimized-out variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118946 --- Comment #3 from Sam James --- *** This bug has been marked as a duplicate of bug 94173 ***
[Bug target/118540] RISC-V: ICE for unsupported target attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118540 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Jeffrey A. Law --- Fixed on the trunk.
[Bug target/118057] RISC-V: Can't vectorize load and store with zvl128b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118057 --- Comment #8 from Jeffrey A. Law --- This is really a costing issue. On some designs (such as Ventana's) strided access can be very profitable, particularly for a relatively small stride. On others it may be considerably worse. The point is that someone will have to build a cost model for each design to describe the costs in a sane manner.
[Bug tree-optimization/117634] memset/struct copy transformation into memset/memset is not done if not address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117634 Andrew Pinski changed: What|Removed |Added CC||blubban at gmail dot com --- Comment #4 from Andrew Pinski --- *** Bug 118946 has been marked as a duplicate of this bug. ***
[Bug target/116662] The value of __GCC_DESTRUCTIVE_SIZE for riscv64 could be improved
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116662 Jeffrey A. Law changed: What|Removed |Added Last reconfirmed||2025-02-21 Status|UNCONFIRMED |WAITING Ever confirmed|0 |1
[Bug target/118734] RISC-V: Vector broadcast via strided load.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118734 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2025-02-20 Ever confirmed|0 |1
[Bug tree-optimization/118954] [15 regression] Miscompile at -O3 since r15-1757-g4d24159a1fcb15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118954 Sam James changed: What|Removed |Added Summary|[15 regression] Miscompile |[15 regression] Miscompile |at -O3 |at -O3 since ||r15-1757-g4d24159a1fcb15 Keywords|needs-bisection | --- Comment #10 from Sam James --- OK, it is r15-1757-g4d24159a1fcb15 with dumps too.
[Bug bootstrap/118802] [15 regression] Bootstrap comparison failure on libphobos/libdruntime/core/internal/gc/impl/conservative/gc.o since r15-7400-gd3ff498c478ace
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118802 --- Comment #22 from Hongtao Liu --- (In reply to Sam James from comment #16) > Bisected to r15-7400-gd3ff498c478ace (not CCing anyone yet as not enough > useful information). There's a new patch in [1] which will revert the commit and may fix it (or make it latent again). [1] https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675714.html
[Bug c/118963] New: Miscompile at -O2/3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118963 Bug ID: 118963 Summary: Miscompile at -O2/3 Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: yunboni at smail dot nju.edu.cn Target Milestone: --- This code times out at -O2/3 and prints 0 at -O0/1/s: ```c int printf(const char *, ...); int a = -104, b, c, e; void g(int h) { int f = 0; while (!f + a - -104) { f = h == 0; if (f) h = 1; } } int main() { int d = 8; for (; e;) d = 0; c = d; g(81 - 81); printf("%X\n", b); } ``` Compiler Explorer: https://godbolt.org/z/q3WajnYPj Bisected to https://github.com/gcc-mirror/gcc/commit/429a7a88438cc80e7c58d9f63d44838089899b12
[Bug bootstrap/118802] [15 regression] Bootstrap comparison failure on libphobos/libdruntime/core/internal/gc/impl/conservative/gc.o since r15-7400-gd3ff498c478ace
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118802 --- Comment #21 from Sam James --- I understand, thanks. I'll keep whittling it down.
[Bug target/117544] Lack of vsetvli after function call for whole register move
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117544 --- Comment #4 from Jeffrey A. Law --- Fixed on the trunk.
[Bug target/118934] [15 Regression] RISC-V: ICE: output_operand: invalid expression as operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118934 --- Comment #2 from Anton Blanchard --- This reproduces the issue. Build without optimisation to avoid all the code disappearing: #define INSNS_1 x = x + 1; #define INSNS_2 INSNS_1 INSNS_1 #define INSNS_4 INSNS_2 INSNS_2 #define INSNS_8 INSNS_4 INSNS_4 #define INSNS_16 INSNS_8 INSNS_8 #define INSNS_32 INSNS_16 INSNS_16 #define INSNS_64 INSNS_32 INSNS_32 #define INSNS_128 INSNS_64 INSNS_64 #define INSNS_256 INSNS_128 INSNS_128 #define INSNS_512 INSNS_256 INSNS_256 #define INSNS_1024 INSNS_512 INSNS_512 #define INSNS_2048 INSNS_1024 INSNS_1024 #define INSNS_4096 INSNS_2048 INSNS_2048 #define INSNS_8192 INSNS_4096 INSNS_4096 #define INSNS_16384 INSNS_8192 INSNS_8192 #define INSNS_32768 INSNS_16384 INSNS_16384 #define INSNS_65536 INSNS_32768 INSNS_32768 #define INSNS_131072 INSNS_65536 INSNS_65536 #define INSNS_262144 INSNS_131072 INSNS_131072 #define INSNS_524288 INSNS_262144 INSNS_262144 #define INSNS_1048576 INSNS_524288 INSNS_524288 int foo(int x) { if (x) goto out; // > 1MB of code INSNS_524288 out: return x; } # riscv64-unknown-linux-gnu-gcc -c large-function.c during RTL pass: final large-function.c: In function 'foo': large-function.c:33:1: internal compiler error: output_operand: invalid expression as operand
[Bug target/80878] -mcx16 (enable 128 bit CAS) on x86_64 seems not to work on 7.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80878 --- Comment #49 from LIU Hao --- (In reply to Luke Dalessandro from comment #48) > So my understanding is that 104688 basically determined that it's correct to > implement atomic load with movdqa for aligned addresses on architectures > with AVX support. And hence gcc could inline that in the same way clang > does, and inline cmpxchg16b for > compare_exchange/__atomic_compare_exchange{_n} as well. And thus there no > longer has to be a libatomic call for any of these. Yes. However I suspect it might be an ABI break. > I can support the fact that -mcx16 is maybe the wrong flag to use to force > inlining here given its cmpxchg-style name, but it really feels like a > sophisticated user that's willing to live in implementation-defined land > should be able to get the same performance for lock-free code out of gcc > that it does out of clang in this situation. May I remind you about https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80878#c42 ? First, CMPXCHG16B can be much slower than CMPXCHG: https://quick-bench.com/q/MZioNHkbBn0soH_KSDyYcKmrrxU Second, not all x86-64 processors support CMPXCHG16B, so `-mcx16` is required, like `-mavx`.
[Bug target/117544] Lack of vsetvli after function call for whole register move
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117544 Jeffrey A. Law changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from Jeffrey A. Law --- Forgot to change state to closed...
[Bug fortran/118932] Testcase gfortran.dg/binding_label_tests_34.f90 needs standard checking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118932 --- Comment #4 from Thomas Koenig --- Hm, maybe I am misunderstanding the standard here, or it says something that was not intentional... We accept program memain interface subroutine lower () bind(c,name="foo") end subroutine lower subroutine upper () bind(c,name="FOO") end subroutine upper end interface call lower call upper end program memain but probably due to error rather than design, as -fdump-fortran-global shows: name=FOO name=foo, sym_name=upper, binding_label=FOO name=memain
[Bug target/94173] Superfluous stackpointer manipulation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94173 Jeffrey A. Law changed: What|Removed |Added Summary|[RISCV] Superfluous |Superfluous stackpointer |stackpointer manipulation |manipulation Target|riscv | --- Comment #5 from Jeffrey A. Law --- Making this more generic as it's not specific to any target.
[Bug tree-optimization/118947] Missed optimization: GCC forgets stack buffer contents across function call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118947 Andrew Pinski changed: What|Removed |Added Component|rtl-optimization|tree-optimization --- Comment #3 from Andrew Pinski --- Oh optimize_memcpy_to_memset does not understand "" either. I will add that support too. Here is a testcase to show the vdef issue and not related to the "" issue: ``` void* aaa(); void* bbb() { aaa(); static int buf2[32]; int buf[32] = {}; __builtin_memcpy(buf2, buf, sizeof(buf2)); return buf2; } void* ccc() { int buf[32] = {}; aaa(); static int buf2[32]; __builtin_memcpy(buf2, buf, sizeof(buf2)); return buf2; } ```
[Bug analyzer/94713] Analyzer is buggy on uninitialized pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94713 Jeffrey A. Law changed: What|Removed |Added CC||blubban at gmail dot com --- Comment #5 from Jeffrey A. Law --- *** Bug 118946 has been marked as a duplicate of this bug. ***
[Bug rtl-optimization/118946] Missed optimization: GCC reserves stack space for optimized-out variable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118946 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #2 from Jeffrey A. Law --- Marking as a duplicate of one I happen to know about. I suspect there are others. *** This bug has been marked as a duplicate of bug 94713 ***
[Bug target/118950] [14/15 regression] RISC-V: rv64gcv runtime mismatch at -O3 since r14-4038-gb975c0dc3be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950 Jeffrey A. Law changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2025-02-20 Status|UNCONFIRMED |NEW
[Bug target/118945] RISC-V: VSETL pass: Don't promote Vectors ops from Tail agnostic to Tail Undisturbed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118945 --- Comment #10 from Vineet Gupta --- (In reply to JuzheZhong from comment #9) > > I think we should consider many more different situations and consider them > carefully. Like: > > vsetvli ... e8,mf8 ta ma (demand ratio) > ... > vsetvli zero zero e32 mf2 tu ma (demand ratio) > ... > vsetvli zero zero e64 m1 ta ma (demand SEW and LMUL) > ... > vsetvli zero zero e64 m1 ta mu (demand ratio) > ... > vsetvli zero zero e16 mf4 tu mu (demand ratio) > ... > vsetvli zero zero e32 mf2 ta ma (demand ratio) > ... > vsetvli zero zero e8 mf8 ta ma (demand ratio) > > In the current strategy, 7 "vsetvli" will be fused into 1 single "vsetvli": > > vsetvli ... e64 m1 tu mu > > However, if you just keep agnostic and do not allow fusing it, you will end up > with 6 more "vsetvli"s. I don't think this codegen can be better in any > micro-architecture design. While the orig test was too simple and contrived, this is too complex and contrived :-) I'd argue that if there's such toggling of tail and mask policies then yeah, it's fine to have so many vsetvls. We all agree this will be a cpu tune to retain the existing behavior while providing new behavior as opt-in for uarches that deem fit.
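A toy model of the fusion argument quoted above, for readers who want to check the arithmetic. This is only a sketch and is not how GCC's vsetvl pass is implemented: the struct, the function names and the "OR the undisturbed bits" rule are assumptions made for illustration. It relies on two points from the thread: every configuration in the quoted sequence has the same SEW/LMUL ratio, and an agnostic request can be satisfied by an undisturbed setting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one vsetvli configuration (not a GCC type).
   lmul_log2 encodes LMUL as a power of two: mf8 = -3, mf4 = -2,
   mf2 = -1, m1 = 0. */
struct vconfig {
  int sew;
  int lmul_log2;
  bool tail_undisturbed; /* false means "ta" */
  bool mask_undisturbed; /* false means "ma" */
};

/* SEW/LMUL ratio, e.g. e8/mf8 = 8 / (1/8) = 64. */
static int sew_lmul_ratio(struct vconfig c)
{
  return c.lmul_log2 >= 0 ? c.sew >> c.lmul_log2 : c.sew << -c.lmul_log2;
}

/* The seven configurations from the quoted comment. */
static const struct vconfig example_seq[7] = {
  {  8, -3, false, false }, /* e8  mf8 ta ma */
  { 32, -1, true,  false }, /* e32 mf2 tu ma */
  { 64,  0, false, false }, /* e64 m1  ta ma (demands SEW and LMUL) */
  { 64,  0, false, true  }, /* e64 m1  ta mu */
  { 16, -2, true,  true  }, /* e16 mf4 tu mu */
  { 32, -1, false, false }, /* e32 mf2 ta ma */
  {  8, -3, false, false }, /* e8  mf8 ta ma */
};

/* Fuse a sequence into one configuration. SEW and LMUL come from `base`
   (the entry that demands them); the policies become undisturbed as soon
   as any entry asks for undisturbed. Returns false if any entry has a
   different SEW/LMUL ratio, in which case this toy model gives up. */
static bool fuse_all(const struct vconfig *seq, int n,
                     struct vconfig base, struct vconfig *out)
{
  assert(n > 0);
  *out = base;
  for (int i = 0; i < n; i++) {
    if (sew_lmul_ratio(seq[i]) != sew_lmul_ratio(base))
      return false;
    if (seq[i].tail_undisturbed)
      out->tail_undisturbed = true;
    if (seq[i].mask_undisturbed)
      out->mask_undisturbed = true;
  }
  return true;
}
```

Calling `fuse_all(example_seq, 7, example_seq[2], &out)` in this model yields e64, m1, tu, mu, which matches the single fused "vsetvli" described in the quoted comment. Keeping agnostic and undisturbed requests separate would break the sequence at every policy change instead.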
[Bug target/117955] GCC generate illegal riscv instruction `vsetvli zero,zero,e64,mf4,ta,ma` with -O2 and -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117955 --- Comment #6 from Jeffrey A. Law --- As I feared, this has just gone latent. If you revert: bdbbe5d4b6d495ac06ee762540a1277498f2a7a0 7bef3482f27ce13ba7e6c4f43943f28a49e63a40 This can be triggered again on the trunk. Given the sensitivity to scheduling changes I suspect it's ultimately a vsetvl optimization issue.
[Bug tree-optimization/14295] [tree-ssa] copy propagation for aggregates
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14295 Andrew Pinski changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #11 from Andrew Pinski --- .
[Bug tree-optimization/14295] [tree-ssa] copy propagation for aggregates
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14295 --- Comment #12 from Andrew Pinski --- optimize_memcpy_to_memset does some simple copy prop but with zeroing. A similar method could be used for the non-zeroing case and I am going to try that.
[Bug jit/117047] [15 regression] Segfault in libgccjit garbage collection when compiling GNU Emacs with Native Compilation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117047 --- Comment #16 from Sam James --- Created attachment 60552 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60552&action=edit emacs.log.xz So far, not got anywhere with attempting to copy our packaging into a script. I've attached a build log from building Emacs from git (just ./autogen.sh && ./configure && make V=1 -j$(nproc) -l$(nproc)) using Gentoo's GCC in case you can spot some difference with your own. I'm going to see if I can reproduce in a Docker container using Gentoo's GCC and go from there.
[Bug target/115763] RISC-V: Use wrong SEW for vfmv.v.f when -march only has zvfhmin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115763 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #11 from Jeffrey A. Law --- Per c#4, c#8, c#10.
[Bug rtl-optimization/115523] [avr] Remove SFmode insns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115523 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #7 from Jeffrey A. Law --- Per c#6.
[Bug target/115795] RISC-V: vsetvl step causes wrong codegen after fusing info
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115795 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #9 from Jeffrey A. Law --- Per c#8.
[Bug target/114809] [RISC-V RVV] Counting elements might be simpler
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114809 --- Comment #4 from Jeffrey A. Law --- I fixed the missed peephole a while back. But the question about cpop vs other strategies remains.
[Bug target/118931] [15 Regression] RISC-V: rv64gcv miscompile at -O[23] since r15-3228-g771256bcb9d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118931 --- Comment #2 from Li Pan --- int main () { vector(16) unsigned char vect__3.5; unsigned char a_lsm.2; long long int _5; vector(16) unsigned char _13; unsigned char _29; [local count: 71618576]: a_lsm.2_20 = a; _13 = {a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20, a_lsm.2_20}; vect__3.5_25 = _13 * { 151, 17, 7, 33, 119, 49, 231, 65, 87, 81, 199, 97, 55, 113, 167, 129 }; _29 = .VEC_EXTRACT (vect__3.5_25, 13); a = _29; _5 = (long long int) _29; __builtin_printf ("%llu\n", _5); return 0; } This is correct in the tree-optimized dump: (unsigned char)(109 * 113) = 29 is what we expect. It should be a backend issue.
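The expected value in the comment above can be checked with a few lines of C. This is a sketch under two assumptions: that `a` holds 109 when the multiplication runs (taken from the "109 * 113" in the comment, since the initializer is not shown here), and that `unsigned char` is 8 bits wide. The helper name `expected_lane` is made up for illustration.

```c
#include <assert.h>

/* The constant vector from the tree-optimized dump. */
static const unsigned char multipliers[16] = {
  151, 17, 7, 33, 119, 49, 231, 65, 87, 81, 199, 97, 55, 113, 167, 129
};

/* Scalar equivalent of "_13 * {...}" followed by .VEC_EXTRACT (..., lane):
   multiply in int, then truncate to unsigned char (modulo 256). */
static unsigned char expected_lane(unsigned char a, int lane)
{
  assert(lane >= 0 && lane < 16);
  return (unsigned char)(a * multipliers[lane]);
}
```

With `a = 109` and lane 13 the product is 109 * 113 = 12317, and 12317 - 48 * 256 = 29, so `expected_lane(109, 13)` gives 29, in agreement with the comment.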
[Bug target/80878] -mcx16 (enable 128 bit CAS) on x86_64 seems not to work on 7.1.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80878 --- Comment #48 from Luke Dalessandro --- (In reply to LIU Hao from comment #47) > (In reply to Luke Dalessandro from comment #46) > > But if 104688 isn't related to this issue, and thus Jakub's comment was in > > error, I definitely don't understand the underlying problem and why clang is > > fine doing it. > > Issue here is that if atomic load is implemented with a call to libatomic > routines then it's incorrect to implement CAS without a call. So my understanding is that 104688 basically determined that it's correct to implement atomic load with movdqa for aligned addresses on architectures with AVX support. And hence gcc could inline that in the same way clang does, and inline cmpxchg16b for compare_exchange/__atomic_compare_exchange{_n} as well. And thus there no longer has to be a libatomic call for any of these. I can support the fact that -mcx16 is maybe the wrong flag to use to force inlining here given its cmpxchg-style name, but it really feels like a sophisticated user that's willing to live in implementation-defined land should be able to get the same performance for lock-free code out of gcc that it does out of clang in this situation.
[Bug target/118931] [15 Regression] RISC-V: rv64gcv miscompile at -O[23] since r15-3228-g771256bcb9d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118931 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #1 from Li Pan --- Reproduced from upstream with "-mrvv-vector-bits=zvl", will take a look.
[Bug target/113715] RISC-V: If the Zcmp is enabled, the a0 register operates abnormally when the program returns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113715 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #9 from Jeffrey A. Law --- Fixed on the trunk. No plans for further backports.
[Bug middle-end/23782] SRA pessimizes passing structures by value at -Os (+22% code size)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23782 --- Comment #9 from Andrew Pinski --- I have a patch which builds on top of PR 14295 and improves the situation here. It has a few testcase regressions, but those are testcase issues which I will fix up later on.
[Bug target/118950] [14/15 regression] RISC-V: rv64gcv runtime mismatch at -O3 since r14-4038-gb975c0dc3be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118950 --- Comment #5 from Robin Dapp --- Yeah, the original statement is recognized as a mask conversion pattern: pr118950.c:9:21: note: vect_recog_mask_conversion_pattern: detected: _152 = .MASK_LOAD (_230, 8B, _229, 0); pr118950.c:9:21: note: mask_conversion pattern recognized: patt_355 = .MASK_LOAD (_230, 8B, patt_54, 0); but also as a scatter/gather: pr118950.c:9:21: note: gather/scatter pattern: detected: _152 = .MASK_LOAD (_230, 8B, _229, 0); pr118950.c:9:21: note: gather_scatter pattern recognized: patt_375 = .MASK_LEN_GATHER_LOAD ((sizetype) _215 + 20, _85, 1, 0, _229, 0); The type of _152 is _Bool but patt_375's type is unsigned char. With unsigned char the presence of padding bits is not obvious and we should have looked at _152's type.
[Bug tree-optimization/118954] [15 regression] Miscompile at -O3 since r15-1757-g4d24159a1fcb15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118954 Richard Biener changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #11 from Richard Biener --- I will have a look.
[Bug c/118953] New: Miscompile at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118953 Bug ID: 118953 Summary: Miscompile at -O3 Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: yunboni at smail dot nju.edu.cn Target Milestone: --- This code prints 45 at -O3 and 0 at -O0/1/2/s: int printf(const char *, ...); int a, d; long b, c; int e(int f, int g, unsigned long h, long j) { unsigned long i = 0; if (g) switch (f) { case 8: i = b; break; case 6: i = c; } else switch (f) { case 8: i = h; break; case 24: case 32: i = j; } return i; } int main() { int k = a * (409628 - 28); d = e(k - 1048524, 0, k - 1048487, (unsigned long)k - 1048531); printf("%d\n", d); } Compiler Explorer: https://godbolt.org/z/YTWjbWe3s Bisected to https://github.com/gcc-mirror/gcc/commit/602e824eec30a7c6792b8b27d61c40f1c1a2714c
[Bug libstdc++/118494] std::counting_semaphore should work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118494 Jonathan Wakely changed: What|Removed |Added Target Milestone|--- |16.0
[Bug libstdc++/99552] FAIL: 29_atomics/atomic/wait_notify/bool.cc (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99552 Jonathan Wakely changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=115955 Target Milestone|--- |16.0
[Bug libstdc++/110854] constructor of std::counting_semaphore is not constexpr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110854 Jonathan Wakely changed: What|Removed |Added Target Milestone|15.0|16.0
[Bug tree-optimization/118954] [15 regression] Miscompile at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118954 Richard Biener changed: What|Removed |Added CC||rguenth at gcc dot gnu.org --- Comment #2 from Richard Biener --- I can't reproduce, not even with exactly the godbolt revision.
[Bug tree-optimization/114999] A few missing optimizations due to `a - b` and `b - a` not being detected as negatives of each other
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114999 --- Comment #13 from Jennifer Schmitz --- Created attachment 60540 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60540&action=edit Patch for improving codegen of absolute differences of unsigned integers in aarch64 This patch builds on top of the previous one, improving codegen for the same test cases for unsigned integers (32-bit and 64-bit) for aarch64. The patch adds a new define_insn_and_split pattern in the aarch64 backend.
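For context, the source pattern usually meant by "absolute difference of unsigned integers" looks like the sketch below. The attachment itself is not reproduced here, so this is an assumption about the general shape of the test cases rather than a copy of them. The function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute difference without wraparound: subtract the smaller operand
   from the larger one. The two arms, a - b and b - a, are negatives of
   each other, which is the relationship named in the bug title. */
static uint32_t abs_diff_u32(uint32_t a, uint32_t b)
{
  return a > b ? a - b : b - a;
}

static uint64_t abs_diff_u64(uint64_t a, uint64_t b)
{
  return a > b ? a - b : b - a;
}

/* Small self-check: the result is symmetric in its arguments. */
static void check_abs_diff(void)
{
  assert(abs_diff_u32(3, 5) == abs_diff_u32(5, 3));
  assert(abs_diff_u64(3, 5) == abs_diff_u64(5, 3));
}
```

Whether a given compiler version turns this into a short branch-free sequence depends on the target and on the patch under discussion. The sketch only fixes the semantics that any such codegen has to preserve.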
[Bug target/118844] Link failure caused by crtbeginS.o
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118844 --- Comment #3 from GCC Commits --- The releases/gcc-14 branch has been updated by LuluCheng : https://gcc.gnu.org/g:9ffecde121af883b60bbe60d00425036bc873048 commit r14-11321-g9ffecde121af883b60bbe60d00425036bc873048 Author: Lulu Cheng Date: Wed Feb 12 14:29:58 2025 +0800 LoongArch: Fix the issue of function jump out of range caused by crtbeginS.o [PR118844]. Due to the presence of R_LARCH_B26 in /usr/lib/gcc/loongarch64-linux-gnu/14/crtbeginS.o, its addressing range is [PC-128MiB, PC+128MiB-4]. This means that when the code segment size exceeds 128MB, linking with lld will definitely fail (ld will not fail because the order of the two is different). The linking order: lld: crtbeginS.o + .text + .plt ld : .plt + crtbeginS.o + .text To solve this issue, add '-mcmodel=extreme' when compiling crtbeginS.o. PR target/118844 libgcc/ChangeLog: * config/loongarch/t-crtstuff: Add '-mcmodel=extreme' to CRTSTUFF_T_CFLAGS_S. (cherry picked from commit ae14d7d04da8c6cb542269722638071f999f94d8)
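The range [PC-128MiB, PC+128MiB-4] quoted in the commit message follows from a signed 26-bit immediate that is scaled by the 4-byte instruction size. That encoding detail is this sketch's assumption, since the message states only the resulting range. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
  B26_IMM_BITS = 26, /* width of the branch immediate (assumed) */
  B26_SHIFT = 2      /* immediate counts 4-byte instructions (assumed) */
};

/* Most negative reachable byte offset: -2^27 = -128 MiB. */
static int64_t b26_min_offset(void)
{
  return -((int64_t)1 << (B26_IMM_BITS + B26_SHIFT - 1));
}

/* Most positive reachable byte offset: 2^27 - 4 = 128 MiB - 4. */
static int64_t b26_max_offset(void)
{
  return ((int64_t)1 << (B26_IMM_BITS + B26_SHIFT - 1))
         - ((int64_t)1 << B26_SHIFT);
}

/* True if a branch at `pc` can reach `target` under this model. */
static bool b26_reachable(int64_t pc, int64_t target)
{
  int64_t off = target - pc;
  assert(pc % 4 == 0);
  return off % 4 == 0 && off >= b26_min_offset() && off <= b26_max_offset();
}
```

Under this model a call from crtbeginS.o that has to cross more than 128 MiB of .text cannot be encoded, which is consistent with the lld layout (crtbeginS.o + .text + .plt) failing once the code segment exceeds that size.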
[Bug tree-optimization/118953] [14/15 regression] Miscompile at -O2 since r14-2473-g602e824eec30a7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118953 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Priority|P3 |P2 Last reconfirmed||2025-02-20 Status|UNCONFIRMED |NEW --- Comment #3 from Richard Biener --- Confirmed. # RANGE [irange] int [-INF, +INF] MASK 0xc000 VALUE 0x34 _7 = k_11 + -1048524; switch (_7) [33.33%], case 8: [33.33%], case 24: [33.33%], case 32: [33.33%]> (k_11 is zero at runtime). EVRP then makes Global Exported: _7 = [irange] int [-INF, 2146435123] MASK 0xc000 VALUE 0x34 Global Exported: i_20 = [irange] long unsigned int [45, 45] MASK 0xc07d VALUE 0x0 # RANGE [irange] int [-INF, 2146435123] MASK 0xc000 VALUE 0x34 _7 = k_11 + -1048524; d = 45; out of this.
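The numbers in the dump can be cross-checked by hand. The sketch below does this under two assumptions: `int` is 32 bits wide, and negative values use two's complement. The helper names are hypothetical, and the code only reproduces the arithmetic; it says nothing about how EVRP derives its ranges.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define ADDEND (-1048524)  /* constant from "_7 = k_11 + -1048524" */

/* Value fed to the switch; no signed overflow for k >= INT_MIN + 1048524. */
static inline int switch_operand(int k)
{
    return k + ADDEND;
}

/* Case labels that are still visible in the quoted dump. */
static inline bool matches_listed_case(int v)
{
    return v == 8 || v == 24 || v == 32;
}

static inline void range_self_check(void)
{
    /* k_11 is zero at runtime, so none of the listed cases is hit. */
    assert(switch_operand(0) == -1048524);
    assert(!matches_listed_case(switch_operand(0)));

    /* The exported upper bound equals INT_MAX plus the addend (32-bit int). */
    assert(switch_operand(INT_MAX) == 2146435123);

    /* The low byte of the addend agrees with VALUE 0x34 in the dump. */
    assert((ADDEND & 0xff) == 0x34);
}
```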
[Bug c++/118951] __FILE__ inserts the filename as array, __builtin_FILE as pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118951 --- Comment #1 from Richard Biener --- We can't change the signature of builtins. Also there's nothing like an array return value for functions in C or C++?
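The difference named in the bug title is observable with `sizeof`. A minimal sketch, with hypothetical helper names: `__FILE__` expands to a string literal, which has array type, while `__builtin_FILE ()` is a call whose result has type `const char *`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* __FILE__ is a string literal, so sizeof yields the array size:
   the number of characters plus the terminating NUL. */
static inline size_t size_of_file_macro(void)
{
    return sizeof(__FILE__);
}

/* __builtin_FILE () returns const char *, so sizeof yields the size of a
   pointer, regardless of how long the file name is. */
static inline size_t size_of_builtin_file(void)
{
    return sizeof(__builtin_FILE());
}

static inline void file_self_check(void)
{
    assert(size_of_file_macro() == strlen(__FILE__) + 1);
    assert(size_of_builtin_file() == sizeof(const char *));
}
```

Code that needs the array form, for example to compute the file name length at compile time, therefore cannot swap `__FILE__` for the builtin directly.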
[Bug tree-optimization/118954] [15 regression] Miscompile at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118954 --- Comment #3 from Sam James --- I can (with -fno-ssp), so bisecting.
[Bug tree-optimization/118953] [14/15 regression] Miscompile at -O2 since r14-2473-g602e824eec30a7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118953 --- Comment #2 from Yunbo Ni --- (In reply to Sam James from comment #1) > I get '45' at -O2 and -O3 locally and on godbolt, but -O1 shows 0. Yes, you're right. I mistakenly wrote the result from the case before it was reduced. Sorry about that.
[Bug tree-optimization/118953] [14/15 regression] Miscompile at -O2 since r14-2473-g602e824eec30a7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118953 Sam James changed: What|Removed |Added Summary|[14/15 regression] |[14/15 regression] |Miscompile at -O3 since |Miscompile at -O2 since |r14-2473-g602e824eec30a7|r14-2473-g602e824eec30a7 Target Milestone|--- |14.3
[Bug tree-optimization/118953] [14/15 regression] Miscompile at -O3 since r14-2473-g602e824eec30a7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118953 Sam James changed: What|Removed |Added CC||aldyh at gcc dot gnu.org, ||amacleod at redhat dot com Summary|Miscompile at -O3 |[14/15 regression] ||Miscompile at -O3 since ||r14-2473-g602e824eec30a7 Keywords||wrong-code Component|c |tree-optimization --- Comment #1 from Sam James --- I get '45' at -O2 and -O3 locally and on godbolt, but -O1 shows 0. > Bisected to > https://github.com/gcc-mirror/gcc/commit/ > 602e824eec30a7c6792b8b27d61c40f1c1a2714c r14-2473-g602e824eec30a7
[Bug target/118949] RISC-V: Extra FRM writes since GCC-14.2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118949 Li Pan changed: What|Removed |Added CC||pan2.li at intel dot com --- Comment #2 from Li Pan --- There are minor changes to FRM handling in mode-switch from gcc-14 to gcc-15. One related change is adding FRM to global_reg, because the late-combine pass introduced in gcc-15 would otherwise delete a necessary FRM back insn, as FRM isn't live from the entry. As for LLVM, AFAIK when rounding autovec support was added to GCC, LLVM may not have supported all cases, unless it has gained some optimization recently. I will check whether this is related to FRM or whether there is something we can do here, and keep you posted.
[Bug tree-optimization/118521] [15 regression] std::vector Wstringop-overflow false positive since r15-4473
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118521 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #10 from Richard Biener --- (In reply to Richard Biener from comment #9) [...]
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 9fd85a35a63..c02fd0e186d 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -346,9 +346,10 @@ along with GCC; see the file COPYING3. If not see
> form if possible. */
>NEXT_PASS (pass_thread_jumps, /*first=*/false);
>NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
> - NEXT_PASS (pass_strlen);
>NEXT_PASS (pass_thread_jumps_full, /*first=*/false);
>NEXT_PASS (pass_vrp, true /* final_p */);
> + NEXT_PASS (pass_forwprop, /*last=*/true);
> + NEXT_PASS (pass_strlen);
>/* Run CCP to compute alignment and nonzero bits. */
>NEXT_PASS (pass_ccp, true /* nonzero_p */);
>NEXT_PASS (pass_warn_restrict);
> @@ -356,7 +357,6 @@ along with GCC; see the file COPYING3. If not see
>NEXT_PASS (pass_dce, true /* update_address_taken_p */, true /*
> remove_unused_locals */);
>/* After late DCE we rewrite no longer addressed locals into SSA
> form if possible. */
> - NEXT_PASS (pass_forwprop, /*last=*/true);
>NEXT_PASS (pass_sink_code, true /* unsplit edges */);
>NEXT_PASS (pass_phiopt, false /* early_p */);
>NEXT_PASS (pass_fold_builtins);
>
Causes
+FAIL: c-c++-common/Wstringop-overflow.c -std=gnu++17 (test for warnings, line 93)
+FAIL: c-c++-common/Wstringop-overflow.c -std=gnu++17 (test for warnings, line 94)
...
+FAIL: gcc.dg/strlenopt-3.c scan-tree-dump-times optimized "return 0" 3
+FAIL: gcc.dg/strlenopt-45.c (test for excess errors)
+FAIL: gcc.dg/strlenopt-45.c scan-tree-dump-times optimized "call_in_true_branch_not_eliminated_" 0
+FAIL: gcc.dg/strlenopt-70.c scan-tree-dump-times optimized "_not_eliminated_" 0
+FAIL: gcc.dg/strlenopt-70.c scan-tree-dump-times optimized "strlen" 0
+FAIL: gcc.dg/strlenopt-73.c scan-tree-dump-times optimized "_not_eliminated_" 0
+FAIL: gcc.dg/strlenopt-73.c scan-tree-dump-times optimized "strlen" 0
+FAIL: gcc.dg/strlenopt-77.c scan-tree-dump-times optimized "call_in_true_branch_not_eliminated_" 0
+FAIL: gcc.dg/strlenopt-80.c (test for excess errors)
+FAIL: gcc.dg/strlenopt-80.c scan-tree-dump-times optimized "failure_on_line (" 0
+FAIL: gcc.dg/strlenopt-91.c scan-tree-dump-not optimized "abort"
+FAIL: gcc.dg/tree-ssa/builtin-snprintf-3.c scan-tree-dump-not optimized "failure_range"
+FAIL: gcc.dg/tree-ssa/builtin-snprintf-7.c scan-tree-dump-times optimized "_not_eliminated" 0
+FAIL: gcc.dg/tree-ssa/builtin-snprintf-8.c scan-tree-dump-not optimized "abort"
+FAIL: gcc.dg/tree-ssa/builtin-snprintf-9.c scan-tree-dump-not optimized "abort"
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-4.c scan-tree-dump-not optimized "failure_on_line"
+FAIL: gcc.dg/tree-ssa/builtin-sprintf-9.c scan-tree-dump-times optimized "call_in_true_branch_not_eliminated_" 0
+FAIL: gcc.dg/tree-ssa/builtin-sprintf.c (test for excess errors)
+UNRESOLVED: gcc.dg/tree-ssa/builtin-sprintf.c compilation failed to produce executable
+FAIL: gcc.dg/tree-ssa/pr79327-2.c scan-tree-dump-not optimized "failure_on_line"
+FAIL: gcc.dg/tree-ssa/pr83198.c scan-tree-dump-not optimized "link_error ();"
> diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> index 0ba85917d41..a0d1c2f3d86 100644
> --- a/gcc/tree-scalar-evolution.cc
> +++ b/gcc/tree-scalar-evolution.cc
> @@ -284,6 +284,7 @@ along with GCC; see the file COPYING3. If not see
> #include "tree-into-ssa.h"
> #include "builtins.h"
> #include "case-cfn-macros.h"
> +#include "tree-eh.h"
>
> static tree analyze_scalar_evolution_1 (class loop *, tree);
> static tree analyze_scalar_evolution_for_address_of (class loop *loop,
> @@ -3947,6 +3948,19 @@ final_value_replacement_loop (class loop *loop)
> print_gimple_stmt (dump_file, SSA_NAME_DEF_STMT (rslt), 0);
> fprintf (dump_file, "\n");
> }
> +
> + if (! SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rslt))
> + {
> + gimple *use_stmt;
> + imm_use_iterator imm_iter;
> + FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, rslt)
> + {
> + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
> + if (!stmt_can_throw_internal (cfun, use_stmt)
> + && fold_stmt (&gsi, follow_all_ssa_edges))
> + update_stmt (gsi_stmt (gsi));
> + }
> + }
> }
>
>return any;
>
> this should have the least chance of regressing things. I'll report results.
This OTOH works fine, so posted for review.
[Bug jit/117047] [15 regression] Segfault in libgccjit garbage collection when compiling GNU Emacs with Native Compilation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117047 --- Comment #11 from Sam James --- (In reply to Richard Biener from comment #10) > So how does one go to try reproducing this? Does it show up when building > emacs itself? Yes. If you build Emacs with ./configure --with-native-compilation, it should happen (it may need --with-native-compilation=aot in order to pre-compile more) just on `make`. No need to run Emacs manually or install it.