[Bug middle-end/119600] HOST_WIDEST_FAST_INT should be used instead of long for BITMAP_WORD in bitmap.h

2025-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119600

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2025-04-03

--- Comment #2 from Richard Biener  ---
Confirmed.

[Bug tree-optimization/119605] change the code fixup_cfg for __builtin_unreachable to be a verifier

2025-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119605

--- Comment #2 from Richard Biener  ---
I thought we verify this already ...

[Bug libstdc++/119606] [15 regression] Commit 'Optimize string constructor' causes regression in Snappy workload for -mcpu=neoverse-v2 with LTO

2025-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119606

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |15.0

[Bug libstdc++/119606] [15 regression] Commit 'Optimize string constructor' causes regression in Snappy workload for -mcpu=neoverse-v2 with LTO

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119606

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86590
   Target Milestone|--- |15.0

--- Comment #4 from Andrew Pinski  ---
Most likely just a dup of bug 86590.

[Bug rtl-optimization/119607] New: [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

Bug ID: 119607
   Summary: [15 regression] glib miscompiled since
r15-7895-gb191e8bdecf881
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: needs-source, wrong-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 60970
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60970&action=edit
meson.i686-pc-linux-gnu.x86.ini

All the failing tests involve threads, which is a pain. It also only fails with
-m32, so I can't use Valgrind, and the tests pass with sanitizers enabled.

I'm going to do the basic bits (comparing objects, getting dumps) but will
likely need some help to finish it after that, as I'm not yet comfortable
debugging threaded applications when the threads are relevant to the problem.
I'll say when I'm at that point.

--

```
$ mkdir ~/bugs/glib && cd ~/bugs/glib
$ wget https://download.gnome.org/sources/glib/2.84/glib-2.84.0.tar.xz
$ tar xvf glib-2.84.0.tar.xz
$ mkdir build && cd build
$ meson setup --native-file ~/scripts/meson/meson.i686-pc-linux-gnu.x86.ini
~/bugs/glib/glib-2.84.0
$ ninja test
[...]
Summary of Failures:

165/384 glib:gobject / signals-refcount4       ERROR   1.22s   killed by signal 6 SIGABRT
194/384 glib:gobject / properties-refcount1    ERROR   1.63s   killed by signal 11 SIGSEGV
175/384 glib:gobject / signals-refcount2       ERROR   1.67s   killed by signal 5 SIGTRAP
210/384 glib:gobject+slow / closure-refcount   ERROR   3.78s   killed by signal 6 SIGABRT
321/384 glib:gio / gsettings                   ERROR   0.56s   killed by signal 5 SIGTRAP
300/384 glib:gio+slow / actions                ERROR   3.59s   killed by signal 5 SIGTRAP

Ok:      372
Fail:    6
Skipped: 6
```

An example failure:
```
$ MESON_TEST_ITERATION=1 MALLOC_PERTURB_=117
G_TEST_BUILDDIR=/home/sam/bugs/glib/build/gobject/tests MALLOC_CHECK_=2
LD_LIBRARY_PATH=/home/sam/bugs/glib/build/gobject:/home/sam/bugs/glib/build/glib
G_DEBUG=gc-friendly G_TEST_SRCDIR=/home/sam/bugs/glib/glib-2.84.0/gobject/tests
G_ENABLE_DIAGNOSTIC=1
/home/sam/bugs/glib/build/gobject/tests/properties-refcount1
TAP version 14
# random seed: R02S95fc5335f798bde46c993a52cc355a49
1..1
# Start of gobject tests
# Start of refcount tests
# .b
# .e
# .c
not ok /gobject/refcount/properties-1 - GLib-GObject-FATAL-CRITICAL:
g_closure_ref: assertion 'closure->ref_count > 0' failed
Bail out!
Segmentation fault (core dumped) MESON_TEST_ITERATION=1
MALLOC_PERTURB_=117 G_TEST_BUILDDIR=/home/sam/bugs/glib/build/gobject/tests
MALLOC_CHECK_=2
LD_LIBRARY_PATH=/home/sam/bugs/glib/build/gobject:/home/sam/bugs/glib/build/glib
G_DEBUG=gc-friendly G_TEST_SRCDIR=/home/sam/bugs/glib/glib-2.84.0/gobject/tests
G_ENABLE_DIAGNOSTIC=1
/home/sam/bugs/glib/build/gobject/tests/properties-refcount1
```


The failures aren't stable (sometimes segfault, sometimes trap, rarely passes
too). The reliance on threading means it can't be reproduced under rr record
either.

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #1 from Sam James  ---
Needs -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition. I
suspect the -O3/-fno-semantic-interposition is just because of inlining.

[Bug tree-optimization/119606] New: [15 regression] Commit 'Optimize string constructor' causes regression in Snappy workload for -mcpu=neoverse-v2 with LTO

2025-04-03 Thread jschmitz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119606

Bug ID: 119606
   Summary: [15 regression] Commit 'Optimize string constructor'
causes regression in Snappy workload for
-mcpu=neoverse-v2 with LTO
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jschmitz at gcc dot gnu.org
CC: hubicka at ucw dot cz
  Target Milestone: ---
Target: aarch64

Created attachment 60969
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60969&action=edit
Script to reproduce snappy regression

The commit that optimizes string constructors
(https://gcc.gnu.org/g:9c5505a35d9d71705464f9254f55407192d31ec3) causes changes
in performance for the Snappy workload for -mcpu=neoverse-v2, including some
regressions.

In the attachment is a script to reproduce the regressions. It builds GCC from
commits 37f35ebc and 9c5505a3 and runs Snappy with -O3 -Wl,-z,muldefs -lm
-flto=auto -Wl,--sort-section=name -mcpu=neoverse-v2.

Use it like this:
parentdir= ./snappy_script.sh

As of today, we observed the following runtime changes (values are percentages;
positive values mean that running Snappy from commit 9c5505a3 has longer
runtime than from commit 37f35ebc):

BM_UFlat/4/2 2.92308
BM_UValidate/5/2 -2.9106
BM_UValidate/7/1 2.29277
BM_UValidate/11/1 5.47945
BM_UIOVecSource/0/1 4.00891
BM_UIOVecSource/0/2 6.37636
BM_UIOVecSource/2/1 -3.59375
BM_UIOVecSource/2/2 2.8754
BM_UIOVecSource/4/2 4.42478
BM_UIOVecSource/5/2 2.42424
BM_UIOVecSource/10/2 8.71985
BM_UIOVecSink/3 3.1746
BM_UFlatSink/10/2 2.41935
BM_ZFlat/0/1 3.24826
BM_ZFlat/0/2 6.54952
BM_ZFlat/1/2 2.00501
BM_ZFlat/2/2 4.46735
BM_ZFlat/4/2 4.5045
BM_ZFlat/5/2 2.47678
BM_ZFlat/10/2 9.17782

In the past, we have also seen regressions in other tests, such as UFlat/6/1
and UFlat/6/2.
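
For readers comparing these numbers, here is a minimal sketch of how such a percentage could be computed. The report does not state its formula, so the conventional relative change is an assumption here, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Relative runtime change in percent.
// Assumption: the report does not give its formula; this is the usual
// (after - before) / before definition, so a positive result means the
// build from the newer commit runs longer.
double runtime_change_percent(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

// Hypothetical helper: flag changes whose magnitude exceeds a threshold,
// regardless of sign.
bool exceeds_threshold(double change_percent, double threshold_percent) {
    return std::fabs(change_percent) > threshold_percent;
}
```

With this definition, a benchmark going from 200 to 210 time units is a +5% change, matching the sign convention described above.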

[Bug libstdc++/119606] [15 regression] Commit 'Optimize string constructor' causes regression in Snappy workload for -mcpu=neoverse-v2 with LTO

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119606

--- Comment #2 from Andrew Pinski  ---
Is it really using std::string here?

[Bug libstdc++/119606] [15 regression] Commit 'Optimize string constructor' causes regression in Snappy workload for -mcpu=neoverse-v2 with LTO

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119606

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|15.0|---
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93008

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #2 from Sam James  ---
```
not ok /gobject/refcount/properties-1 - GLib-GObject-FATAL-CRITICAL:
g_closure_ref: assertion 'closure->ref_count > 0' failed
Bail out!

Thread 5 "properties-refc" received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 0xf5dfeb40 (LWP 3283743)]
_g_log_abort (breakpoint=) at
../glib-2.84.0/glib/gmessages.c:431
431 G_BREAKPOINT ();
(gdb) bt
#0  _g_log_abort (breakpoint=) at
../glib-2.84.0/glib/gmessages.c:431
#1  g_logv (log_domain=0xf7dc676f "GLib-GObject", log_level=,
format=0xf7ef22e9 "%s: assertion '%s' failed", args=0xf5dfde1c
"\300\347\334\367\023m\334\367\001")
at ../glib-2.84.0/glib/gmessages.c:1287
#2  0xf7e56dd4 in g_log (log_domain=0xf7dc676f "GLib-GObject",
log_level=G_LOG_LEVEL_CRITICAL, format=0xf7ef22e9 "%s: assertion '%s' failed",
format=0xf7ef22e9 "%s: assertion '%s' failed",
log_level=G_LOG_LEVEL_CRITICAL) at ../glib-2.84.0/glib/gmessages.c:1329
#3  0xf7e59697 in g_return_if_fail_warning (log_domain=0xf7dc676f
"GLib-GObject", pretty_function=0xf7dce7c0 <__func__.16> "g_closure_ref",
expression=0xf7dc6d13 "closure->ref_count > 0") at
../glib-2.84.0/glib/gmessages.c:3074
#4  0xf7d7ced1 in g_closure_ref (closure=0x5656c0e0) at
../glib-2.84.0/gobject/gclosure.c:556
#5  g_closure_invoke (closure=0x5656c0e0, return_value=0x0, n_param_values=2,
param_values=0xf5dfdfa0, invocation_hint=0xf5dfdf34) at
../glib-2.84.0/gobject/gclosure.c:811
#6  0xf7d96ebd in signal_emit_unlocked_R (node=node@entry=0xf5dfe074,
detail=detail@entry=59, instance=instance@entry=0x56564148,
emission_return=,
instance_and_params=) at
../glib-2.84.0/gobject/gsignal.c:3735
#7  0xf7d996c5 in signal_emit_valist_unlocked
(instance=instance@entry=0x56564148, signal_id=signal_id@entry=1,
detail=detail@entry=59, var_args=)
at ../glib-2.84.0/gobject/gsignal.c:3534
#8  0xf7da36b5 in g_signal_emit_valist (instance=0x56564148, signal_id=1,
detail=59, var_args=0xf5dfe15c "\310>VV\310>VV") at
../glib-2.84.0/gobject/gsignal.c:3277
#9  g_signal_emit (instance=0x56564148, signal_id=1, detail=59) at
../glib-2.84.0/gobject/gsignal.c:3597
#10 0xf7d81c5f in g_object_dispatch_properties_changed (object=0x56564148,
n_pspecs=1, pspecs=0xf5dfe1bc) at ../glib-2.84.0/gobject/gobject.c:1827
#11 0xf7d8481c in g_object_notify_queue_thaw (object=object@entry=0x56564148,
nqueue=, nqueue@entry=0xf5400610, take_ref=take_ref@entry=0)
at ../glib-2.84.0/gobject/gobject.c:761
#12 0xf7d895ff in g_object_set_valist (object=,
first_property_name=, var_args=) at
../glib-2.84.0/gobject/gobject.c:3161
#13 0xf7d8a286 in g_object_set (_object=0x56564148,
first_property_name=0x5655665d "dummy") at
../glib-2.84.0/gobject/gobject.c:3325
#14 0x56df in my_test_do_property (test=0x56564148) at
../glib-2.84.0/gobject/tests/properties-refcount1.c:172
#15 run_thread (test=0x56564148) at
../glib-2.84.0/gobject/tests/properties-refcount1.c:181
#16 0xf7e8fb4f in g_thread_proxy (data=0x56564590) at
../glib-2.84.0/glib/gthread.c:890
#17 0xf7b3d37d in start_thread (arg=) at pthread_create.c:448
#18 0xf7bff628 in __GI___clone3 () at
../sysdeps/unix/sysv/linux/i386/clone3.S:111
```

And:
```
# random seed: R02S6e89c19e14c81f615263ad8d91eba505
1..1
# Start of gobject tests
# Start of refcount tests
[New Thread 0xf79dfb40 (LWP 3620390)]
[New Thread 0xf6fffb40 (LWP 3620391)]
[New Thread 0xf67feb40 (LWP 3620392)]
[New Thread 0xf5dffb40 (LWP 3620393)]
[New Thread 0xf53ffb40 (LWP 3620395)]
# .e
# .b
# .f
# .d
# .c
# .e
# .f
# .b
# .d
# .c

Thread 5 "properties-refc" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xf5dffb40 (LWP 3620393)]
0xf7d81c83 in g_object_notify_queue_free (data=0x39) at
../glib-2.84.0/gobject/gobject.c:665
665   g_slist_free (nqueue->pspecs);
(gdb) bt
#0  0xf7d81c83 in g_object_notify_queue_free (data=0x39) at
../glib-2.84.0/gobject/gobject.c:665
#1  0xf4a00610 in ?? ()
(gdb) x/5i $pc
=> 0xf7d81c83 :  push   DWORD PTR [eax]
   0xf7d81c85 :  call   0xf7d72300

   0xf7d81c8a :  pop    eax
   0xf7d81c8b :  pop    edx
   0xf7d81c8c :  push   0x8
```

[Bug ipa/119604] expand_call_inline could use an RAII for input_location

2025-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119604

--- Comment #4 from Richard Biener  ---
We should get rid of input_location uses in the middle-end instead ;)
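
For context, the RAII idea in the bug title could look roughly like the sketch below. It is not GCC code: `location_t` and `input_location` are simplified stand-ins for the real declarations, and `scoped_input_location` and `location_during_inline` are hypothetical names.

```cpp
#include <cassert>

// Simplified stand-ins for GCC's location type and global (assumption).
using location_t = unsigned int;
location_t input_location = 0;

// Saves the global on construction and restores it on destruction, so every
// exit path from the enclosing scope puts the old value back.
class scoped_input_location {
public:
    explicit scoped_input_location(location_t loc) : saved_(input_location) {
        input_location = loc;
    }
    ~scoped_input_location() { input_location = saved_; }

    scoped_input_location(const scoped_input_location&) = delete;
    scoped_input_location& operator=(const scoped_input_location&) = delete;

private:
    location_t saved_;
};

// Hypothetical caller: works with a temporary location and returns it.
location_t location_during_inline(location_t call_location) {
    scoped_input_location guard(call_location);
    assert(input_location == call_location);
    return input_location;  // the guard restores the previous value on return
}
```

The guard only hides the save/restore bookkeeping. It does not remove the dependence on the global, which is what the comment above argues for instead.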

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #5 from Sam James  ---
noipa on g_closure_ref is enough to fix it
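
As a rough illustration of what marking a function noipa means: GCC's `noipa` function attribute makes the compiler treat the function as opaque for interprocedural optimization. The type and function below are toy stand-ins, not glib's `GClosure` or `g_closure_ref`, and the macro name is hypothetical.

```cpp
#include <cassert>

// Apply the attribute only where it is known to exist (GCC); elsewhere the
// macro expands to nothing so the sketch still compiles.
#if defined(__GNUC__) && !defined(__clang__)
#define TOY_NOIPA __attribute__((noipa))
#else
#define TOY_NOIPA
#endif

// Toy stand-in for a reference-counted closure (not glib's layout).
struct toy_closure {
    unsigned ref_count;
};

// With noipa, callers are optimized as if this body were unavailable, and
// the body is optimized without knowledge of its callers.
TOY_NOIPA toy_closure* toy_closure_ref(toy_closure* closure) {
    assert(closure != nullptr);
    assert(closure->ref_count > 0);
    ++closure->ref_count;
    return closure;
}
```

If adding the attribute hides a miscompile, that points at an optimization crossing the function boundary rather than at the function body in isolation.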

[Bug libstdc++/119593] Format width is not correctly handled for wide string/characters

2025-04-03 Thread tkaminsk at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119593

Tomasz Kamiński  changed:

   What|Removed |Added

Summary|Format width is not correctly handled for unicode characters|Format width is not correctly handled for wide string/characters

--- Comment #1 from Tomasz Kamiński  ---
The

[Bug libstdc++/119593] Format width is not correctly handled for wide string/characters

2025-04-03 Thread tkaminsk at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119593

--- Comment #3 from Tomasz Kamiński  ---
Two separate problems compound in this case:
 * UTF-32LE and UTF-32BE, used for wchar_t, are not recognized as Unicode encodings
 * character width is always assumed to be 1

[Bug rtl-optimization/119594] [15 regression] wrong code at -Os with "-fno-dce -fno-tree-dce -fno-tree-dse" on x86_64-linux-gnu since r15-1575

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119594

--- Comment #6 from Jakub Jelinek  ---
First of all, REG_UNUSED/REG_DEAD notes are only officially meaningful in
passes which call df_note_add_problem before df_analyze, which is cse1 and
regcprop but not fwprop.
But I actually don't see anything incorrect in the REG_UNUSED note even if that
weren't the case. We have
(insn 7 2 8 2 (set (reg/v:DI 101 [ g ])
(const_int -1 [0xffffffffffffffff])) "pr119594.c":8:10 95
{*movdi_internal}
 (nil))
at the start of the function, then one loop and afterwards no more looping. 
And in there
(insn 26 24 27 7 (set (reg:DI 104 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8
175 {*zero_extendsidi2}
 (expr_list:REG_DEAD (reg/v:DI 101 [ g ])
(nil)))
(insn 27 26 28 7 (set (reg/v:DI 101 [ g ])
(reg:DI 104 [ g ])) "pr119594.c":11:8 95 {*movdi_internal}
 (expr_list:REG_DEAD (reg:DI 104 [ g ])
(expr_list:REG_UNUSED (reg/v:DI 101 [ g ])
(nil))))
(insn 28 27 29 7 (set (reg:DI 105)
(const_int 4294967295 [0xffffffff])) "pr119594.c":12:3 95
{*movdi_internal}
 (nil))
(insn 29 28 30 7 (set (reg:DI 5 di)
(reg:DI 105)) "pr119594.c":12:3 95 {*movdi_internal}
 (expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_DEAD (reg:DI 105)
(nil))))
That IMHO correctly reflects the state: there is MEM  [(int
*)&g] = 18446744073709551615; first and then later g[1] = 0; which effectively
changes it to
MEM  [(int *)&g] = 0xffffffff;
And g is otherwise unused.
Now, fwprop1 turns that into
(insn 27 24 28 7 (set (reg/v:DI 101 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8
175 {*zero_extendsidi2}
 (expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_UNUSED (reg/v:DI 101 [ g ])
(nil))))
(insn 28 27 29 7 (set (reg:DI 105)
(const_int 4294967295 [0xffffffff])) "pr119594.c":12:3 95
{*movdi_internal}
 (nil))
(insn 29 28 30 7 (set (reg:DI 5 di)
(reg:DI 105)) "pr119594.c":12:3 95 {*movdi_internal}
 (expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_DEAD (reg:DI 105)
(nil))))
(that part is reasonable) but at the same time it removed insn 7, which is not
correct because its SET_DEST is used in insn 27.
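
The arithmetic behind the REG_EQUAL note can be checked with a small sketch. It only models the values involved, not the RTL, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// What a zero_extend from SImode to DImode computes: keep the low 32 bits
// of the input and clear the high 32 bits.
constexpr std::uint64_t zero_extend_low32(std::uint64_t v) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(v));
}

// Walks through the values from the dump.
void check_values() {
    const std::uint64_t g = ~std::uint64_t{0};      // insn 7: all 64 bits set (-1)
    assert(g == 18446744073709551615ULL);
    assert(zero_extend_low32(g) == 4294967295ULL);  // value in the REG_EQUAL note
    assert(zero_extend_low32(g) == 0xffffffffULL);
}
```

The result is only 4294967295 because the input still holds the all-ones value from insn 7, which is why insn 27 depends on that initialization.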

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #3 from Sam James  ---
Created attachment 60971
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60971&action=edit
gclosure.c.i.xz

gclosure.o seems to be the victim.

It's built with:
```
x86_64-pc-linux-gnu-gcc -m32 -Igobject/libgobject-2.0.so.0.8400.0.p -Igobject
-I../glib-2.84.0/gobject -I. -I../glib-2.84.0 -Iglib -I../glib-2.84.0/glib
-I/usr/lib/libffi/include -fvisibility=hidden -fdiagnostics-color=always
-D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Wpedantic -std=gnu99 -O2 -g
-D_GNU_SOURCE -fno-strict-aliasing -DG_ENABLE_DEBUG -Wduplicated-branches
-Wfloat-conversion -Wimplicit-fallthrough -Wmisleading-indentation
-Wmissing-field-initializers -Wnonnull -Wnull-dereference -Wunused
-Wno-unused-parameter -Wno-cast-function-type -Wno-pedantic
-Wno-format-zero-length -Wno-variadic-macros -Werror=format=2 -Werror=init-self
-Werror=missing-include-dirs -Werror=pointer-arith -Werror=unused-result
-Wstrict-prototypes -Wno-bad-function-cast
-Werror=implicit-function-declaration -Werror=missing-prototypes
-Werror=pointer-sign -O3 -march=x86-64 -mtune=znver2
-fno-semantic-interposition -ggdb3 -fPIC '-DG_LOG_DOMAIN="GLib-GObject"'
-DGOBJECT_COMPILATION -MD -MQ gobject/libgobject-2.0.so.0.8400.0.p/gclosure.c.o
-MF gobject/libgobject-2.0.so.0.8400.0.p/gclosure.c.o.d -o
gobject/libgobject-2.0.so.0.8400.0.p/gclosure.c.o -c
../glib-2.84.0/gobject/gclosure.c -save-temps
```

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

Sam James  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119594
 CC||jakub at gcc dot gnu.org, rsandifo at gcc dot gnu.org

--- Comment #4 from Sam James  ---
The first diff is in fwprop1? 

LHS = -mtune=znver2 ...
RHS = -mtune=generic ...

```
@@ -5060,7 +5060,11 @@ change not profitablerescanning insn with uid = 315.
 verify found no changes in insn with uid = 315.
 change not profitablechange not profitablechange not profitablechange not
profitablechange not profitablechange not profitablechange not profitablechange
not profitablerescanning insn with uid = 372.
 verify found no changes in insn with uid = 372.
-change not profitablechange not profitablechange not profitablechange not
profitablechange not profitablechange not profitable
+change not profitablechange not profitablechange not profitablechange not
profitablerescanning insn with uid = 404.
+verify found no changes in insn with uid = 404.
+change not profitablerescanning insn with uid = 411.
+verify found no changes in insn with uid = 411.
+change not profitable

 try_optimize_cfg iteration 1

@@ -5089,7 +5093,7 @@ deleting insn with uid = 216.
 deleting insn with uid = 215.
 Deleted 23 trivially dead insns

-Number of successful forward propagations: 5
```

```
@@ -6643,12 +6647,11 @@ Dataflow summary:
  (expr_list:REG_DEAD (reg:SI 347 [ i_144 ])
 (expr_list:REG_UNUSED (reg:CC 17 flags)
 (nil
-(insn # # # 24 (set (mem/f:SI (plus:SI (reg/v/f:SI 191 [ __p ])
-(reg:SI 174 [ _97 ])) [3 _98->data+0 S4 A32])
+(insn # # # 24 (set (mem/f:SI (plus:SI (mult:SI (reg:SI 347 [ i_144 ])
+(const_int 8 [0x8]))
+(reg/v/f:SI 191 [ __p ])) [3 _98->data+0 S4 A32])
 (reg/v/f:SI 215 [ pre_marshal_data ]))
"../glib-2.84.0/gobject/gclosure.c":426:30# {*movsi_internal}
- (expr_list:REG_DEAD (reg/v/f:SI 215 [ pre_marshal_data ])
-(expr_list:REG_DEAD (reg/v/f:SI 191 [ __p ])
-(nil))))
+ (nil))
 (debug_insn # # # 24 (debug_marker) "../glib-2.84.0/gobject/gclosure.c":427:3#
  (nil))
 (insn # # # 24 (set (reg/f:SI 348 [ closure_122(D)->notifiers ])
@@ -6677,12 +6680,11 @@ Dataflow summary:
 (mem/f:SI (plus:SI (reg/v/f:SI 214 [ closure ])
 (const_int 12 [0xc])) [4 closure_122(D)->notifiers+0 S4 A32]))
"../glib-2.84.0/gobject/gclosure.c":428:34# {*movsi_internal}
  (nil))
-(insn # # # 24 (set (mem/f:SI (plus:SI (reg/f:SI 349 [
closure_122(D)->notifiers ])
-(reg:SI 179 [ _103 ])) [3 _104->data+0 S4 A32])
+(insn # # # 24 (set (mem/f:SI (plus:SI (plus:SI (reg/f:SI 349 [
closure_122(D)->notifiers ])
+(reg:SI 174 [ _97 ]))
+(const_int 8 [0x8])) [3 _104->data+0 S4 A32])
 (reg/v/f:SI 217 [ post_marshal_data ]))
"../glib-2.84.0/gobject/gclosure.c":428:34# {*movsi_internal}
- (expr_list:REG_DEAD (reg/f:SI 349 [ closure_122(D)->notifiers ])
-(expr_list:REG_DEAD (reg/v/f:SI 217 [ post_marshal_data ])
-(nil))))
+ (nil))
 (debug_insn # # # 24 (debug
[...]
```

I need help at this point. jakub, would you be able to look?

[Bug c++/119608] New: ICE compiling module interface including boost.json in GMF and exporting one entity

2025-04-03 Thread cjangus at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119608

Bug ID: 119608
   Summary: ICE compiling module interface including boost.json in
GMF and exporting one entity
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cjangus at gmail dot com
  Target Milestone: ---

Created attachment 60972
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60972&action=edit
Archive containing preprocessed module interface.

Single file preprocessed repro attached.

The original source code is just:

---
module;

#include 

export module repro;

export namespace boost::json
{
using boost::json::visit;
}
---

but with a minimally modified boost to work around TU exposure issues in
boost.system.

Command line:
g++ -fPIC -Og -g -Wall -fmodules -std=c++2b -finput-charset=UTF-8 -o
out/boost_json_repro2.a.gcm.o -c -x c++ -fpreprocessed -fdirectives-only
boost_json_repro2.a.gcm.ii

Output:
boost_json_repro2.a.gcm.ii:215201:8: internal compiler error: in
write_location, at cp/module.cc:17560
215201 | export module repro;

System/version:

Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/opt/gcc-latest --enable-languages=c,c++
--enable-libstdcxx-debug --enable-libstdcxx-backtrace --disable-bootstrap
--disable-multilib --disable-libvtv --disable-libssp --disable-libffi
--with-system-zlib --without-isl --with-arch_64=x86-64-v2
--with-bugurl=https://gcc.gnu.org/bugzilla
gcc version 15.0.1 20250330 (experimental) (GCC)

[Bug rtl-optimization/119594] [15 regression] wrong code at -Os with "-fno-dce -fno-tree-dce -fno-tree-dse" on x86_64-linux-gnu since r15-1575

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119594

--- Comment #5 from Jakub Jelinek  ---
Note, with the PR115910 patch this is latent again, the difference is one extra
(expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
note.
(insn 28 27 29 (set (reg:DI 105)
(const_int 4294967295 [0xffffffff])) "pr119594.c":12:3 -1
 (nil))

(insn 29 28 30 (set (reg:DI 5 di)
(reg:DI 105)) "pr119594.c":12:3 -1
 (nil))
vs.
(insn 28 27 29 (set (reg:DI 105)
(const_int 4294967295 [0xffffffff])) "pr119594.c":12:3 -1
 (nil))

(insn 29 28 30 (set (reg:DI 5 di)
(reg:DI 105)) "pr119594.c":12:3 -1
 (expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(nil)))
from expansion and with many further changes during fwprop1, starting with
-original cost = 1, replacement cost = 5; rejecting replacement
-change not profitable Setting REG_EQUAL note
+original cost = 4, replacement cost = 4; keeping replacement
+rescanning insn with uid = 26.
+updating insn 26 in-place
+verify found no changes in insn with uid = 26.

 propagating insn 26 into insn 27, replacing:
 (set (reg/v:DI 101 [ g ])
 (reg:DI 104 [ g ]))
-successfully matched this instruction to *zero_extendsidi2:
+successfully matched this instruction to *movdi_internal:
 (set (reg/v:DI 101 [ g ])
-(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0)))
-original cost = 4, replacement cost = 1; keeping replacement
+(const_int 4294967295 [0xffffffff]))
+original cost = 4, replacement cost = 4; keeping replacement
 rescanning insn with uid = 27.
 updating insn 27 in-place
 verify found no changes in insn with uid = 27.

[Bug tree-optimization/113281] [11 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590

2025-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #38 from GCC Commits  ---
The master branch has been updated by Alexandre Oliva :

https://gcc.gnu.org/g:6f72af0c2e389e9252b6994643155e51ef68821b

commit r15-9169-g6f72af0c2e389e9252b6994643155e51ef68821b
Author: Alexandre Oliva 
Date:   Thu Apr 3 03:06:44 2025 -0300

[testsuite] [riscv] xfail some [PR113281] tests

Some of the tests regressed with a fix for the vectorization of
shifts.  The riscv cost models need to be adjusted to avoid the
unprofitable optimization.  The failure of these tests has been known
since 2024-03-13, without a forthcoming fix, so I suggest we consider
it expected by now.  Adjust the tests to reflect that expectation.


for  gcc/testsuite/ChangeLog

PR tree-optimization/113281
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c: XFAIL.
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-2.c: Likewise.
* gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c: Likewise.

[Bug libstdc++/119606] [15 regression] Commit 'Optimize string constructor' causes regression in Snappy workload for -mcpu=neoverse-v2 with LTO

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119606

Andrew Pinski  changed:

   What|Removed |Added

  Component|tree-optimization   |libstdc++
   Keywords||missed-optimization

--- Comment #1 from Andrew Pinski  ---
Is Snappy written using C++17 or C++20?

[Bug cobol/119597] SEGV on Cobol "hello world" on Power

2025-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119597

--- Comment #3 from Richard Biener  ---
Possibly issues of the manual layout of cobol FE <-> libgcobol interoperability
structs?

[Bug libstdc++/119606] [15 regression] Commit 'Optimize string constructor' causes regression in Snappy workload for -mcpu=neoverse-v2 with LTO

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119606

--- Comment #3 from Andrew Pinski  ---
Is this benchmarking the whole benchmark program running or a function of the
benchmark? If the former I am not sure this benchmark is a good one ...

[Bug middle-end/119482] slow compilation on ladybird interpreter

2025-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119482

--- Comment #15 from Richard Biener  ---
(In reply to ak from comment #13)
> This patch gives another 23% speedup due to reducing time handling the
> linked lists for lazy bitmaps. Probably there is more tuning potential in
> bitmaps (most of the top 10 hot functions are bitmap related)
> 
> The patch actually saves memory too:
> 
> $ /usr/bin/time --format 'time %E maxrss %MK' ../obj-fast/gcc/cc1plus-bitmap
> -std=gnu++20 -O2 pr119482.cc  -quiet -w
> time 0:43.19 maxrss 854920K
> $ /usr/bin/time --format 'time %E maxrss %MK'
> ../obj-fast/gcc/cc1plus-large-bitmap -std=gnu++20 -O2 pr119482.cc  -quiet -w
> time 0:35.96 maxrss 788424K
> 
> -8% memory savings. I guess this means the access patterns are actually not
> that sparse.

Note this will be bad for sparse access patterns which is what these bitmaps
are for.  Depending on how the bitmaps in question are used there's the
possibility to use a tree instead of a linked list of elements
via bitmap_tree_view () (but no bulk/iteration possible then).

> diff --git a/gcc/bitmap.h b/gcc/bitmap.h
> index 4a73ccdba794..a0a5098a25da 100644
> --- a/gcc/bitmap.h
> +++ b/gcc/bitmap.h
> @@ -283,7 +283,7 @@ typedef unsigned long BITMAP_WORD;
>  /* Number of words to use for each element in the linked list.  */
> 
>  #ifndef BITMAP_ELEMENT_WORDS
> -#define BITMAP_ELEMENT_WORDS ((128 + BITMAP_WORD_BITS - 1) / BITMAP_WORD_BITS)
> +#define BITMAP_ELEMENT_WORDS ((512 + BITMAP_WORD_BITS - 1) / BITMAP_WORD_BITS)
>  #endif
> 
>  /* Number of bits in each actual element of a bitmap.  */
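
To make the trade-off concrete, the sketch below redoes the macro's rounding for both settings. It assumes a 64-bit BITMAP_WORD (unsigned long on LP64 hosts) and ignores each element's link pointers and index; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Same rounding as the quoted BITMAP_ELEMENT_WORDS macro: the number of
// words needed to hold at least `target_bits` bits.
constexpr std::size_t element_words(std::size_t target_bits, std::size_t word_bits) {
    return (target_bits + word_bits - 1) / word_bits;
}

// Bytes one element spends on bit storage alone (link pointers and the
// element index are not modelled here).
constexpr std::size_t element_payload_bytes(std::size_t target_bits, std::size_t word_bits) {
    return element_words(target_bits, word_bits) * word_bits / 8;
}

// Hypothetical worst case for a sparse set: every set bit lands in a
// different element, so each one pays for a whole element.
std::size_t sparse_payload_bytes(std::size_t set_bits, std::size_t target_bits,
                                 std::size_t word_bits) {
    assert(word_bits > 0 && word_bits % 8 == 0);
    return set_bits * element_payload_bytes(target_bits, word_bits);
}
```

Under these assumptions an element grows from 2 words (16 bytes of bits) to 8 words (64 bytes), so a set whose bits are far apart pays four times as much per bit, while a dense set needs a quarter as many list elements.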

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

Sam James  changed:

   What|Removed |Added

Summary|[15 regression] glib miscompiled since r15-7895-gb191e8bdecf881|[15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition
   Target Milestone|--- |15.0

[Bug libstdc++/119593] Format width is not correctly handled for wide string/characters

2025-04-03 Thread tkaminsk at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119593

--- Comment #2 from Tomasz Kamiński  ---
The problem is not limited to wide characters, and also appears for wide
strings:
std::format(L"{:+<3}", L"\U0001f921"); // two '+' of paddings
// https://godbolt.org/z/o4s7qTEz9
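
A sketch of the padding one would expect for this call, assuming the field width is measured in estimated display columns as for narrow strings. This is not libstdc++'s implementation: the range table is only an illustrative subset of the wide ranges in the C++20 wording, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Estimated display width of one code point: 2 for "wide" ranges, else 1.
// Assumption: only a few of the standard's ranges are listed, enough to
// cover U+1F921.
int estimated_width(char32_t c) {
    struct range { char32_t first, last; };
    static constexpr range wide[] = {
        {0x1100, 0x115F}, {0x1F300, 0x1F64F}, {0x1F900, 0x1F9FF}, {0x20000, 0x2FFFD},
    };
    for (const range& r : wide)
        if (c >= r.first && c <= r.last) return 2;
    return 1;
}

// Left-aligns `s` in a field of `width` columns, like "{:+<3}" with fill '+'.
std::u32string pad_left_aligned(const std::u32string& s, std::size_t width, char32_t fill) {
    assert(estimated_width(fill) == 1);
    std::size_t used = 0;
    for (char32_t c : s) used += static_cast<std::size_t>(estimated_width(c));
    std::u32string out = s;
    if (used < width) out.append(width - used, fill);
    return out;
}
```

Here U+1F921 counts as two columns, so a width of 3 leaves room for a single '+'. Treating it as one column yields the two '+' reported above.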

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #6 from Richard Biener  ---
Does -fno-ipa-ra fix it?  The bisection is odd if fwprop1 already differs.

[Bug rtl-optimization/119594] [15 regression] wrong code at -Os with "-fno-dce -fno-tree-dce -fno-tree-dse" on x86_64-linux-gnu since r15-1575

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119594

--- Comment #7 from Jakub Jelinek  ---
The steps are in particular that the fwprop pass proper optimizes
(insn 26 24 27 7 (set (reg:DI 104 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8
175 {*zero_extendsidi2}
 (expr_list:REG_DEAD (reg/v:DI 101 [ g ])
(nil)))
(insn 27 26 28 7 (set (reg/v:DI 101 [ g ])
(reg:DI 104 [ g ])) "pr119594.c":11:8 95 {*movdi_internal}
 (expr_list:REG_DEAD (reg:DI 104 [ g ])
(expr_list:REG_UNUSED (reg/v:DI 101 [ g ])
(nil))))
into
(insn 26 24 27 7 (set (reg:DI 104 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8
175 {*zero_extendsidi2}
 (expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_DEAD (reg/v:DI 101 [ g ])
(nil))))
(insn 27 26 28 7 (set (reg/v:DI 101 [ g ])
(zero_extend:DI (subreg:SI (reg/v:DI 101 [ g ]) 0))) "pr119594.c":11:8
175 {*zero_extendsidi2}
 (expr_list:REG_EQUAL (const_int 4294967295 [0xffffffff])
(expr_list:REG_UNUSED (reg/v:DI 101 [ g ])
(nil))))
(i.e. replaces (reg:DI 104 [ g ]) in insn 27 with the SET_SRC of insn 26).
REG_DEAD/REG_UNUSED notes aren't updated; that generally isn't the
responsibility of passes. Passes that want to use those notes should call
df_note_add_problem and df_analyze.

But then it calls delete_trivially_dead_insns and that subpass deletes insn 26
(that is ok) and insn 7 (not ok).  I don't see REG_DEAD note uses in there
though.
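
Why deleting insn 26 is fine while deleting insn 7 is not can be shown with a toy model that reduces each insn to the registers it reads and writes. This is not how delete_trivially_dead_insns is implemented; the types and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One instruction, reduced to its register reads and writes.
struct toy_insn {
    int uid;
    int dest;               // register written
    std::vector<int> srcs;  // registers read
};

// True if some insn other than `def` reads the register that `def` writes.
bool dest_read_elsewhere(const std::vector<toy_insn>& insns, const toy_insn& def) {
    return std::any_of(insns.begin(), insns.end(), [&](const toy_insn& i) {
        return i.uid != def.uid &&
               std::find(i.srcs.begin(), i.srcs.end(), def.dest) != i.srcs.end();
    });
}

// The sequence from the dumps after the propagation into insn 27.
std::vector<toy_insn> insns_after_fwprop() {
    return {
        {7, 101, {}},      // reg 101 = -1
        {26, 104, {101}},  // reg 104 = zero_extend (low half of reg 101)
        {27, 101, {101}},  // reg 101 = zero_extend (low half of reg 101)
        {28, 105, {}},     // reg 105 = 4294967295
        {29, 5, {105}},    // di = reg 105
    };
}

// Walks through the case from the comment.
void check_example() {
    std::vector<toy_insn> insns = insns_after_fwprop();
    assert(!dest_read_elsewhere(insns, insns[1]));  // insn 26: nothing reads reg 104
    insns.erase(insns.begin() + 1);                 // drop insn 26
    assert(dest_read_elsewhere(insns, insns[0]));   // insn 7: insn 27 reads reg 101
}
```

In this model insn 7 stays live as long as insn 27 is kept, which matches the observation that removing insn 7 alone is wrong.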

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #7 from Sam James  ---
It doesn't, but neither does -fdisable-rtl-fwprop{1,2}, so let me check again.

[Bug c++/119564] ICE using module including boost headers

2025-04-03 Thread cjangus at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119564

--- Comment #11 from Cameron Angus  ---
Okay, updated attachments reproduce bug on gcc version 15.0.1 20250330, from a
few days ago. The change added a bit more to the GMF, and also required
exporting something. Very slightly different output, but ICE points to the
exact same line in GCC source as before.

In file included from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-system-1.85.0/include/boost/system/detail/error_code.hpp:14,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-system-1.85.0/include/boost/system/errc.hpp:14,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-system-1.85.0/include/boost/system/result.hpp:8,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-json-1.85.0/include/boost/json/result_for.hpp:17,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-json-1.85.0/include/boost/json/system_error.hpp:17,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-json-1.85.0/include/boost/json/error.hpp:14,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-json-1.85.0/include/boost/json/detail/except.hpp:13,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-json-1.85.0/include/boost/json/string.hpp:19,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-json-1.85.0/include/boost/json/value.hpp:21,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-json-1.85.0/include/boost/json/array.hpp:1762,
 from
/home/cjangus/kantan/out/gcc-latest-modules/libboost-json-1.85.0/include/boost/json.hpp:15,
 from
/home/cjangus/kantan/root/kdeps/kdeps/kdeps/include/kdeps/gcc_repro.hpp:2,
 from
/home/cjangus/kantan/root/kdeps/kdeps/kdeps/include/kdeps/gcc_repro_a.ipp:4,
of module gcc_repro_a, imported at
/home/cjangus/kantan/root/kdeps/kdeps/kdeps/include/kdeps/gcc_repro.cpp:4:
/home/cjangus/kantan/out/gcc-latest-modules/libboost-system-1.85.0/include/boost/system/detail/error_category.hpp:
In member function ‘virtual bool boost::system::error_category::failed(int)
const’:
/home/cjangus/kantan/out/gcc-latest-modules/libboost-system-1.85.0/include/boost/system/detail/error_category.hpp:106:18:
internal compiler error: in tree_node_structure_for_code, at tree.cc:603
  106 | virtual bool failed( int ev ) const noexcept
  |  ^~

[Bug tree-optimization/119619] New: fdump-passes says musttail pass is off when a function with musttail exists

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119619

Bug ID: 119619
   Summary: fdump-passes says musttail pass is off when a function
with musttail exists
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
int f() ;
struct A{ ~A(); };
int g()
{
  A a;
  [[clang::musttail]]  return f();
}

```

Compile with `-O0 -fdump-passes` and grep for musttail and you get:
   tree-musttail   :  OFF

This confuses Compiler Explorer, which then does not offer an option for musttail.

[Bug libstdc++/119620] New: flat_set::emplace is constrained

2025-04-03 Thread hewillk at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119620

Bug ID: 119620
   Summary: flat_set::emplace is constrained
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hewillk at gmail dot com
  Target Milestone: ---

This might be seen as an *asking for trouble* bug report:

Currently, in libstdc++, flat_set::emplace requires
is_constructible_v, which is not required by the
standard.

https://godbolt.org/z/4q8hYssao

#include 

template
concept can_emplace_str = requires (C& c) {
  c.emplace(" ");
};
static_assert(can_emplace_str>);

It is worth noting that flat_multiset::emplace is constrained, though.

I don't know why the standard sometimes constrains emplace(), sometimes only
constrains insert(), and sometimes constrains neither.

It seems to me that we should always require is_constructible_v or is_constructible_v for both functions instead of
just Preconditions?

Hard errors like these are always annoying, as pointed out in LWG 4121.
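
To make the observable difference concrete, here is a minimal sketch using the C++17 detection idiom rather than a concept. `Key`, `ConstrainedSet` and `UnconstrainedSet` are hypothetical stand-ins, not libstdc++ types; the point is only that a constrained emplace makes the check fail cleanly, while an unconstrained one passes the check and would fail later with a hard error inside the body.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical key type that cannot be constructed from a string literal.
struct Key { explicit Key(int) {} };

// emplace is constrained: it drops out of overload resolution when
// Key is not constructible from the arguments.
struct ConstrainedSet {
    template <class... Args,
              class = std::enable_if_t<std::is_constructible_v<Key, Args...>>>
    void emplace(Args&&...) {}
};

// emplace is unconstrained: the call expression is always well-formed,
// and a bad argument list only fails once the body is instantiated.
struct UnconstrainedSet {
    template <class... Args>
    void emplace(Args&&... args) { Key k(std::forward<Args>(args)...); (void)k; }
};

// Detection idiom: true when c.emplace(" ") is a valid expression.
template <class C, class = void>
struct can_emplace_str : std::false_type {};

template <class C>
struct can_emplace_str<C, std::void_t<decltype(std::declval<C&>().emplace(" "))>>
    : std::true_type {};

static_assert(!can_emplace_str<ConstrainedSet>::value, "constrained: rejected");
static_assert(can_emplace_str<UnconstrainedSet>::value, "unconstrained: accepted");
```

Actually calling `UnconstrainedSet{}.emplace(" ")` would not compile, which is the kind of hard error the report refers to.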

[Bug tree-optimization/119616] Firefox fails to build with PGO (error: cannot tail-call: other reasons)

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119616

--- Comment #13 from Andrew Pinski  ---
Created attachment 60991
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60991&action=edit
Reduced as far as I can reduce it

[Bug tree-optimization/119616] mixing musttail with normal returns with taking the address of an argument

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119616

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2025-04-04
Summary|Firefox fails to build with |mixing musttail with normal
   |PGO (error: cannot  |returns with taking the
   |tail-call: other reasons)   |address of an argument
 Status|UNCONFIRMED |NEW

--- Comment #14 from Andrew Pinski  ---
Another reduced testcase:
```
int ReadPackedFixed(int*);
int MiniParse(int ptr);
int PackedFixed(int ptr) {
  if (!ptr)
    [[clang::musttail]] return MiniParse(ptr);
  ReadPackedFixed(&ptr);
  return  1;
}
```

Basically it is all about taking the address of the argument and having a
normal return without a musttail.

[Bug tree-optimization/119616] mixing musttail with normal returns with taking the address of an argument

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119616

--- Comment #15 from Andrew Pinski  ---
Why it fails for FF with PGO is because we decide not to inline a few things
and things just go downhill. Why it works at -O1 vs -O2 is because the
musttail pass skips over non-musttail edges.

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread ak at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

--- Comment #11 from ak at gcc dot gnu.org ---
#define m_CORE_AVX512 (m_SKYLAKE_AVX512 | m_CANNONLAKE \
   | m_ICELAKE_CLIENT | m_ICELAKE_SERVER | m_CASCADELAKE \
   | m_TIGERLAKE | m_COOPERLAKE | m_SAPPHIRERAPIDS \
   | m_ROCKETLAKE | m_GRANITERAPIDS | m_GRANITERAPIDS_D \
   | m_DIAMONDRAPIDS)

so if you use -mtune=sapphirerapids it should work.

Based on that I think this is invalid.

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread ak at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

ak at gcc dot gnu.org changed:

   What|Removed |Added

 Status|RESOLVED|NEW
 Resolution|DUPLICATE   |---

--- Comment #10 from ak at gcc dot gnu.org ---
Also I don't think this is really a duplicate.

[Bug cobol/119414] cobol driver unconditionally adds platform-specific command line options.

2025-04-03 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119414

Iain Sandoe  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |iains at gcc dot gnu.org

--- Comment #4 from Iain Sandoe  ---
I'll do a patch to remove the un-needed options and post it.

[Bug target/119308] Cobol ICE on "hello world" on POWER in rs6000_output_function_epilogue

2025-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119308

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Peter Bergner :

https://gcc.gnu.org/g:c669ab0a866697577fec0c8c2e662640c4be4c94

commit r15-9188-gc669ab0a866697577fec0c8c2e662640c4be4c94
Author: Peter Bergner 
Date:   Thu Apr 3 10:52:29 2025 -0500

rs6000: Add Cobol support to traceback table [PR119308]

The AIX traceback table documentation states the tbtab "lang" field for
Cobol should be set to 7.  Use it.

2025-04-03  Peter Bergner  

gcc/
PR target/119308
* config/rs6000/rs6000-logue.cc (rs6000_output_function_epilogue):
Handle GCC COBOL for the tbtab lang field.

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #12 from Jakub Jelinek  ---
I don't see any IL differences in that function between r15-7894 and r15-7895
before the ira pass.
There are significant differences in the IRA pass but that is to be expected.

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread ak at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

ak at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ak at gcc dot gnu.org

--- Comment #9 from ak at gcc dot gnu.org ---
The problem here seems to be that REP MOVSQ is generated.

The Intel CPUs have optimizations for short strings (enumerated by the "fast
short strings CPUID"), but they only work with MOVSB, not MOVSQ.

Most likely your regression would go away with MOVSB.

gcc has this:

/* X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB: Enable use of REP MOVSB/STOSB to
   move/set sequences of bytes with known size.  */
DEF_TUNE (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB,
  "prefer_known_rep_movsb_stosb",
  m_SKYLAKE | m_CORE_HYBRID | m_CORE_ATOM | m_TREMONT | m_CORE_AVX512
  | m_ZHAOXIN)


Likely this needs to be enabled for SPR too.

[Bug target/119610] aarch64: Wrong unwind info with -fstack-clash-protection -fstack-protector-strong since r14-3900-g3e4afea3b192c2

2025-04-03 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119610

Alex Coplan  changed:

   What|Removed |Added

Summary|aarch64: Wrong unwind info  |aarch64: Wrong unwind info
   |with|with
   |-fstack-clash-protection|-fstack-clash-protection
   |-fstack-protector-strong|-fstack-protector-strong
   ||since
   ||r14-3900-g3e4afea3b192c2

--- Comment #2 from Alex Coplan  ---
More specifically started with
r14-3900-g3e4afea3b192c205c9a9da99f4cac65c68087eaf, FWIW.

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #17 from Jakub Jelinek  ---
I don't have cycles to test this nor push upstream, so if you could do that, it
would be great.

[Bug tree-optimization/119614] [15 regression] protobuf-29.4 fails to build with -O2 (error: cannot tail-call: call and return value are different)

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119614

--- Comment #1 from Jakub Jelinek  ---
Reduced testcase:

volatile int v;

[[gnu::noinline]] const char *
foo (int x)
{
  v += x;
  return 0;
}

const char *
bar (int x)
{
  if (x == 42)
    [[gnu::musttail]] return foo (42);
  [[gnu::musttail]] return foo (32);
}

const char *
baz (int x)
{
  if (x == 5)
    return foo (42);
  return foo (32);
}

Guess I need to extend my hack to work also on pointer singletons and maybe
floating point values as well.

[Bug target/119533] RISC-V: libgo build failures (ICE) with Vector enabled

2025-04-03 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119533

--- Comment #9 from Vineet Gupta  ---
(In reply to Andrew Pinski from comment #3)
> I suspect if you run the testsuite with -fnon-call-exceptions you might find
> a reduced C (or C++) testcase for the same issue.

No joy. 

With the toggle forced, there are a bunch of additional "excess error" failures and a
few more ICEs (in gimple), but nothing that hits a similar issue in the VSETVL pass or
other places.

[Bug tree-optimization/119614] [15 regression] protobuf-29.4 fails to build with -O2 (error: cannot tail-call: call and return value are different)

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119614

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
volatile int v;

[[gnu::noinline]] double
foo (int x)
{
  v += x;
  return 0.5;
}

double
bar (int x)
{
  if (x == 42)
    [[gnu::musttail]] return foo (42);
  [[gnu::musttail]] return foo (32);
}

double
baz (int x)
{
  if (x == 5)
    return foo (42);
  return foo (32);
}

shows that we handle floating point right already, but
volatile int v;

[[gnu::noinline]] const char *
foo (int x)
{
  v += x;
  return (const char *) -42;
}

const char *
bar (int x)
{
  if (x == 42)
    [[gnu::musttail]] return foo (42);
  [[gnu::musttail]] return foo (32);
}

const char *
baz (int x)
{
  if (x == 5)
    return foo (42);
  return foo (32);
}

shows that for pointers we can't handle just NULL and need to handle other
singletons.

[Bug tree-optimization/119614] [15 regression] protobuf-29.4 fails to build with -O2 (error: cannot tail-call: call and return value are different)

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119614

Jakub Jelinek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2025-04-03
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

[Bug libstdc++/112934] excessive code for std::map::erase(key)

2025-04-03 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112934

Jonathan Wakely  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org
   Target Milestone|--- |16.0
 Status|NEW |ASSIGNED

--- Comment #4 from Jonathan Wakely  ---
Soon after GCC 15 branches from master, which should be in a few weeks.

[Bug libstdc++/117983] [12/13/14 Regression] -Wstringop-overflow false positive for __builtin_memmove from vector::insert

2025-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117983

--- Comment #8 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:4366711d2d66ea9a2d4fe9dd112795ef0c6f785c

commit r14-11508-g4366711d2d66ea9a2d4fe9dd112795ef0c6f785c
Author: Jonathan Wakely 
Date:   Fri Mar 28 22:00:38 2025 +

libstdc++: Fix bogus -Wstringop-overflow in std::vector::insert [PR117983]

This was fixed on trunk by r15-4473-g3abe751ea86e34, but that isn't
suitable for backporting. Instead, just add another unreachable
condition in std::vector::_M_range_insert so the compiler knows this
memcpy doesn't use a length originating from a negative ptrdiff_t
converted to a very positive size_t.

libstdc++-v3/ChangeLog:

PR libstdc++/117983
* include/bits/vector.tcc (vector::_M_range_insert): Add
unreachable condition to tell the compiler begin() <= end().
* testsuite/23_containers/vector/modifiers/insert/117983.cc: New
test.

Reviewed-by: Tomasz Kamiński

(cherry picked from commit 878812b6f6905774ab37cb78903e3e11bf1c508c)
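
As an illustration of the technique the commit message describes, here is a minimal sketch. `copy_ints` is a hypothetical helper, not the code in `vector::_M_range_insert`; it only shows how an unreachable branch can tell the optimizer that a pointer difference is non-negative before it is converted to `size_t`. `__builtin_unreachable` is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies the ints in [first, last) to out and
// returns the number of elements copied.
std::size_t copy_ints(const int* first, const int* last, int* out) {
    assert(out != nullptr);
    // Promise the optimizer that the range is not reversed, so the
    // ptrdiff_t below is known to be non-negative before the cast.
    if (last < first)
        __builtin_unreachable();
    const std::ptrdiff_t n = last - first;
    std::memcpy(out, first, static_cast<std::size_t>(n) * sizeof(int));
    return static_cast<std::size_t>(n);
}
```

Whether a given GCC version actually drops the -Wstringop-overflow diagnostic for code shaped like this depends on the surrounding inlining and optimization level, so treat it as a sketch of the idea only.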

[Bug target/102294] memset expansion is sometimes slow for small sizes

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102294

--- Comment #15 from Andrew Pinski  ---
*** Bug 119596 has been marked as a duplicate of this bug. ***

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread mjguzik at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

--- Comment #13 from Mateusz Guzik  ---
I see there is a significant disconnect here between what I meant with this
problem report and your perspective, so I'm going to be more explicit.

Of course for best performance on a given uarch you would want to -mtune for
that uarch, but that's not the goal here.

Rather, with the Linux kernel as an example, assume the code has to be compiled
with a generic x86_64 chip in mind. Then I claim the asm emitted for small
inline memcpy and memset calls loses performance.

Last I had a serious look at string ops optimization was around 2018 or 2019
and at that time all CPUs (AMD included) were struggling with short ops vs rep
stosq/movsq.

It makes sense to request benchmarks from other CPUs today.

To that end I'm asking what kind of standard is expected here in terms of tests
to run. As for AMD uarchs, I can get my hands on 2: EPYC Genoa (4th gen) and
EPYC 7571.
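
For reference, the kind of code under discussion looks roughly like the sketch below. `Record` and the two helpers are hypothetical; the sizes are compile-time constants, so the compiler may expand both calls inline. Which instruction sequence it picks depends on the GCC version and the `-mtune=` setting, so compare the output of `-O2 -S` under different tunings rather than assuming a particular result.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical 64-byte record with a size known at compile time.
struct Record { unsigned char bytes[64]; };

// Fixed-size clear: a candidate for inline memset expansion.
void clear_record(Record* r) {
    assert(r != nullptr);
    std::memset(r, 0, sizeof *r);
}

// Fixed-size copy: a candidate for inline memcpy expansion.
void copy_record(Record* dst, const Record* src) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, sizeof *dst);
}
```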

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

Sam James  changed:

   What|Removed |Added

 CC||bugzilla at tecnocode dot co.uk

--- Comment #15 from Sam James  ---
Thanks. That so far looks promising (testsuite is clean).

CC'd a glib maintainer but are you planning to submit that, or shall I with
your co-authored-by and attribution, if testing in more places succeeds?

(I think your change is sufficient as g_closure_ref uses ATOMIC_INC_ASSIGN ->
ATOMIC_CHANGE_FIELD?)

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #16 from Sam James  ---
https://gitlab.gnome.org/GNOME/glib/-/issues/1672 references the assertion not
being atomic at least:
> Unsynchronized read of ref_count in g_closure_ref / g_closure_unref from 
> assertion.

Looks like gvariant also had similar problems before:
https://gitlab.gnome.org/GNOME/glib/-/commit/6d108587a4896357be3ca42a43d22c825af62e91.
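
A minimal sketch of the pattern at issue, assuming a simplified `Closure` type with a plain atomic counter. This is not GLib's code (the real g_closure_ref goes through its own macros); it only shows a sanity check that reads the count with an atomic load instead of a plain, unsynchronized read.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical closure type with an atomic reference count.
struct Closure {
    std::atomic<unsigned> ref_count{1};
};

Closure* closure_ref(Closure* c) {
    // The assertion itself uses an atomic load, so it does not race
    // with concurrent ref/unref calls on other threads.
    assert(c->ref_count.load(std::memory_order_relaxed) > 0);
    c->ref_count.fetch_add(1, std::memory_order_relaxed);
    return c;
}

// Returns true when the last reference was dropped.
bool closure_unref(Closure* c) {
    assert(c->ref_count.load(std::memory_order_relaxed) > 0);
    return c->ref_count.fetch_sub(1, std::memory_order_acq_rel) == 1;
}
```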

[Bug target/118538] throw not caught causing an seg fault rather than a `terminate called after throwing an instance of 'int'` message

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118538

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #23 from Andrew Pinski  ---
PR 119610 has more analysis of what is going wrong. So marking this as a dup.

*** This bug has been marked as a duplicate of bug 119610 ***

[Bug target/119610] [12/13/14/15 regression] aarch64: Wrong unwind info with -fstack-clash-protection -fstack-protector-strong since r14-3900-g3e4afea3b192c2

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119610

Andrew Pinski  changed:

   What|Removed |Added

 CC||disservin.social at gmail dot 
com

--- Comment #3 from Andrew Pinski  ---
*** Bug 118538 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

Sam James  changed:

   What|Removed |Added

 Resolution|--- |MOVED
 Status|UNCONFIRMED |RESOLVED
   See Also||https://gitlab.gnome.org/GN
   ||OME/glib/-/merge_requests/4
   ||575

--- Comment #18 from Sam James  ---
Many thanks Jakub. Sent
https://gitlab.gnome.org/GNOME/glib/-/merge_requests/4575.

[Bug tree-optimization/119605] change the code fixup_cfg for __builtin_unreachable to be a verifier

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119605

--- Comment #3 from Andrew Pinski  ---
(In reply to Richard Biener from comment #2)
> I thought we verify this already ...

We don't. Even Jan thought we verified this already too, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117892#c4 .

[Bug target/119610] [12/13/14/15 regression] aarch64: Wrong unwind info with -fstack-clash-protection -fstack-protector-strong since r14-3900-g3e4afea3b192c2

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119610

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-04-03

--- Comment #4 from Andrew Pinski  ---
.

[Bug tree-optimization/119613] New: [15 regression] ICE when building protobuf-29.4 with -O0 (purge_dead_edges, at cfgrtl.cc:3356)

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119613

Bug ID: 119613
   Summary: [15 regression] ICE when building protobuf-29.4 with
-O0 (purge_dead_edges, at cfgrtl.cc:3356)
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

I have another bug coming (with explicit musttail error), but hit this along
the way. I'll reduce this one when I have the other done.

```
$ g++ -c generated_message_tctable_lite.cc.ii
during RTL pass: expand
/var/tmp/portage/dev-libs/protobuf-29.4/work/protobuf-29.4/src/google/protobuf/generated_message_tctable_lite.cc:
In static member function ‘static const char*
google::protobuf::internal::TcParser::MiniParse(google::protobuf::MessageLite*,
const char*, google::protobuf::internal::ParseContext*,
google::protobuf::internal::TcFieldData, const
google::protobuf::internal::TcParseTableBase*, uint64_t)’:
/var/tmp/portage/dev-libs/protobuf-29.4/work/protobuf-29.4/src/google/protobuf/generated_message_tctable_lite.cc:324:1:
internal compiler error: in purge_dead_edges, at cfgrtl.cc:3356
  324 | }
  | ^
0x5bc08a159a0c internal_error(char const*, ...)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/diagnostic-global-context.cc:517
0x5bc08a159ba7 fancy_abort(char const*, int, char const*)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/diagnostic.cc:1749
0x5bc088ca0ac9 purge_dead_edges(basic_block_def*)
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/cfgrtl.cc:3356
0x5bc08a793200 find_bb_boundaries
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/cfgbuild.cc:635
0x5bc08a7922ab find_many_sub_basic_blocks(simple_bitmap_def*)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/cfgbuild.cc:755
0x5bc08a69a391 execute
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/cfgexpand.cc:7278
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
```

---

```

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/15/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-15.0./work/gcc-15.0./configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/15
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/15/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/15
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/15/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/15/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/15/python
--enable-libphobos --enable-objc-gc
--enable-languages=c,c++,d,go,objc,obj-c++,fortran,ada,cobol,m2,rust
--enable-obsolete --enable-secureplt --disable-werror --with-system-zlib
--enable-nls --without-included-gettext --disable-libunwind-exceptions
--enable-checking=yes,extra,rtl --with-bugurl=https://bugs.gentoo.org/
--with-pkgversion='Gentoo Hardened 15.0. p, commit
43e87541519c3e496094d7febd6b772ce0fb33b9' --with-gcc-major-version-only
--enable-libstdcxx-time --enable-lto --disable-libstdcxx-pch --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
--enable-multilib --with-multilib-list=m32,m64 --disable-fixed-point
--enable-targets=all --enable-offload-defaulted
--enable-offload-targets=nvptx-none --enable-libgomp --disable-libssp
--enable-libada --disable-cet --disable-systemtap --enable-valgrind-annotations
--disable-vtable-verify --disable-libvtv --with-zstd --with-isl
--disable-isl-version-check --enable-default-pie --enable-host-pie
--enable-host-bind-now --enable-default-ssp --disable-fixincludes
--with-gxx-libcxx-include-dir=/usr/include/c++/v1
--with-build-config='bootstrap-O3 bootstrap-lto'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.1 20250403 (experimental)
92ca72b41a74aef53978cadbda33dd38b69d3ed3 (Gentoo Hardened 15.0. p, commit
43e87541519c3e496094d7febd6b772ce0fb33b9)
```

[Bug tree-optimization/119614] New: [15 regression] protobuf-29.4 fails to build with -O2 (error: cannot tail-call: call and return value are different)

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
--with-gxx-libcxx-include-dir=/usr/include/c++/v1
--with-build-config='bootstrap-O3 bootstrap-lto'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.1 20250403 (experimental)
92ca72b41a74aef53978cadbda33dd38b69d3ed3 (Gentoo Hardened 15.0. p, commit
43e87541519c3e496094d7febd6b772ce0fb33b9)
```

[Bug c/119612] [15 regression] gcc.dg/pr106465.c newly re-broken

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119612

Sam James  changed:

   What|Removed |Added

   Keywords||testsuite-fail
 CC||uecker at gcc dot gnu.org
   Target Milestone|--- |15.0
Summary|gcc.dg/pr106465.c newly |[15 regression]
   |re-broken   |gcc.dg/pr106465.c newly
   ||re-broken

[Bug tree-optimization/119613] [15 regression] ICE when building protobuf-29.4 with -O0 (purge_dead_edges, at cfgrtl.cc:3356)

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119613

--- Comment #1 from Sam James  ---
Created attachment 60976
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60976&action=edit
generated_message_tctable_lite.cc.ii.xz

[Bug target/119610] [12/13/14/15 regression] aarch64: Wrong unwind info with -fstack-clash-protection -fstack-protector-strong since r14-3900-g3e4afea3b192c2

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119610

Sam James  changed:

   What|Removed |Added

   Target Milestone|--- |12.5
Summary|aarch64: Wrong unwind info  |[12/13/14/15 regression]
   |with|aarch64: Wrong unwind info
   |-fstack-clash-protection|with
   |-fstack-protector-strong|-fstack-clash-protection
   |since   |-fstack-protector-strong
   |r14-3900-g3e4afea3b192c2|since
   ||r14-3900-g3e4afea3b192c2

[Bug middle-end/119611] New: Function call substitution cause confusing warning messages

2025-04-03 Thread siddhesh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119611

Bug ID: 119611
   Summary: Function call substitution cause confusing warning
messages
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: siddhesh at gcc dot gnu.org
  Target Milestone: ---

Consider the following:

```
#include 
void f (void*);

void h (void)
{
  char a[8];
  stpcpy (stpcpy (a, "12345678"), "abcdefgh");
  f (a);
}

$ gcc -D_FORTIFY_SOURCE=2 -O2 -S f.c 
f.c: In function ‘h’:
f.c:7:3: warning: ‘__builtin___stpcpy_chk’ writing 9 bytes into a region of
size 8 [-Wstringop-overflow=]
7 |   stpcpy (stpcpy (a, "12345678"), "abcdefgh");
  |   ^
f.c:6:8: note: destination object ‘a’ of size 8
6 |   char a[8];
  |^
f.c:7:3: warning: ‘__builtin_memcpy’ writing 9 bytes into a region of size 8
[-Wstringop-overflow=]
7 |   stpcpy (stpcpy (a, "12345678"), "abcdefgh");
  |   ^
f.c:6:8: note: at offset [0, 7] into destination object ‘a’ of size 8
6 |   char a[8];
  |^
```

Here, instead of referring to stpcpy (or its _chk variant), the warning refers
to __builtin_memcpy, which does not appear anywhere in the program.  It's not
exactly a serious issue, but it can cause some confusion for users.

When doing call substitution, there should be a way to somehow retain a
reference to the original function so that the warning message can refer to it
instead.

[Bug target/119308] Cobol ICE on "hello world" on POWER in rs6000_output_function_epilogue

2025-04-03 Thread bergner at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119308

Peter Bergner  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |15.0

--- Comment #11 from Peter Bergner  ---
Fixed in trunk.

[Bug ipa/119147] 525.x264_r is approx. 10% slower with LTO+PGO than without (at -Ofast -march-native)

2025-04-03 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119147

--- Comment #4 from Jan Hubicka  ---
Re-benchmarked current trunk -flto -Ofast -march=native (base) and  -flto
-Ofast -march=native + PGO (peak) on znver3
                                    Estimated                        Estimated
                   Base       Base       Base       Peak       Peak       Peak
Benchmarks       Copies   Run Time       Rate     Copies   Run Time       Rate
--------------- -------  ---------  ---------    -------  ---------  ---------
525.x264_r            1       87.1       20.1  *       1        101       17.3

-flto -Ofast profile is:
   7.67%  x264_r_base.tru  [.] x264_pixel_satd_8x4.lto_priv.0
   4.80%  x264_r_base.tru  [.] get_ref.lto_priv.0
   4.08%  x264_r_base.tru  [.] mc_chroma.lto_priv.0
   1.58%  x264_r_base.tru  [.] x264_me_search_ref
   1.41%  x264_r_base.tru  [.] pixel_hadamard_ac
   1.31%  x264_r_base.tru  [.] x264_pixel_satd_4x4.lto_priv.0
   1.17%  x264_r_base.tru  [.] sub4x4_dct.lto_priv.0
   1.11%  x264_r_base.tru  [.] refine_subpel.lto_priv.0
   1.10%  x264_r_base.tru  [.] quant_4x4.lto_priv.0
   0.98%  x264_r_base.tru  [.] quant_trellis_cabac.lto_priv.0
   0.77%  x264_r_base.tru  [.] hpel_filter.lto_priv.0
   0.68%  x264_r_base.tru  [.] x264_pixel_sad_x4_8x8.lto_priv.0
   0.56%  x264_r_base.tru  [.] frame_init_lowres_core.lto_priv.0
   0.55%  x264_r_base.tru  [.] x264_pixel_sad_x4_16x16.lto_priv.0
   0.54%  x264_r_base.tru  [.] x264_pixel_sad_16x16.lto_priv.0

While with PGO
   5.04%  x264_r_peak.tru  [.] refine_subpel.lto_priv.0
   4.42%  x264_r_peak.tru  [.] x264_pixel_satd_8x8.constprop.1
   3.66%  x264_r_peak.tru  [.] mc_chroma.constprop.1
   3.45%  x264_r_peak.tru  [.] x264_pixel_satd_16x16.lto_priv.0
   2.78%  x264_r_peak.tru  [.] x264_me_search_ref
   2.13%  x264_r_peak.tru  [.] x264_mb_analyse_intra.lto_priv.0
   2.06%  x264_r_peak.tru  [.] x264_macroblock_encode
   1.43%  x264_r_peak.tru  [.] x264_slicetype_mb_cost
   1.38%  x264_r_peak.tru  [.] mc_chroma.lto_priv.0
   1.22%  x264_r_peak.tru  [.] x264_pixel_hadamard_ac_16x16.constprop.0
   0.99%  x264_r_peak.tru  [.] x264_mb_encode_8x8_chroma
   0.96%  x264_r_peak.tru  [.] quant_trellis_cabac.lto_priv.0
   0.92%  x264_r_peak.tru  [.] x264_pixel_sad_x4_8x8.lto_priv.0
   0.77%  x264_r_peak.tru  [.] hpel_filter.lto_priv.0
   0.77%  x264_r_peak.tru  [.] x264_mb_mc_0xywh
   0.73%  x264_r_peak.tru  [.] x264_pixel_satd_4x4.constprop.1

We speculatively inline get_ref into refine_subpel (which is called indirectly,
but the pointer is always the same).  Similarly we constant-propagate stride to
mc_chroma. This seems good, but the sum of time spent in the mc_chroma clones grows.
Inlining decisions on pixel_satd differ but seem fine.

The next problem is that the vectorizer turns itself off when the trip count is low.
The following hack:

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 9413dcef702..8882a5dea11 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -2483,14 +2483,16 @@ vect_analyze_loop_costing (loop_vec_info loop_vinfo,
   if (estimated_niter == -1)
estimated_niter = likely_max_stmt_executions_int (loop);
 }
-  if (estimated_niter != -1
+  if (estimated_niter != -1 && 0
   && ((unsigned HOST_WIDE_INT) estimated_niter
  < MAX (th, (unsigned) min_profitable_estimate)))
 {
   if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"not vectorized: estimated iteration count too "
-"small.\n");
+"not vectorized: estimated iteration count %li smaller "
+"than threshold %li.\n",
+(long) estimated_niter,
+(long) MAX (th, (unsigned) min_profitable_estimate));
   if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
 "not vectorized: estimated iteration count smaller "

improves PGO score to 18.1 (96.6 runtime).
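
For readers not familiar with the check being disabled, the condition in the hunk above can be modelled in isolation roughly as follows. This is only a sketch: rejected_as_too_small and max_ul are hypothetical names, and the real code works on HOST_WIDE_INT values inside vect_analyze_loop_costing.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned long max_ul(unsigned long a, unsigned long b)
{
  return a > b ? a : b;
}

/* Hypothetical standalone model of the test in the diff: a loop is
   rejected when an iteration estimate exists (!= -1) and it is below
   the larger of the two thresholds.  */
static bool rejected_as_too_small(long estimated_niter, unsigned th,
                                  int min_profitable_estimate)
{
  if (estimated_niter == -1)
    return false;   /* no estimate available, the check does not fire */
  return (unsigned long) estimated_niter
         < max_ul(th, (unsigned) min_profitable_estimate);
}
```

With the "&& 0" from the hack the condition never fires, so low-trip-count loops such as the one in mc_chroma are vectorized regardless of the profile estimate.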

This speeds up mc_chroma.constprop.1 by about 50%. Unvectorized:

        │for( int x = 0; x < i_width; x++ )
        │dst[x] = ( cA*src[x]  + cB*src[x+1] + cC*srcp[x]
   0.00 │a0:┌─ movzbl (%rcx,%rax,1),%edx
   1.69 │   │  movzbl 0x1(%rcx,%rax,1),%r14d
   0.15 │   │  imul   %ebx,%edx
   2.57 │   │  imul   %r10d,%r14d
   1.95 │   │  add    %r14d,%edx

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #13 from Sam James  ---
Manually inlining g_closure_ref into g_closure_invoke means things work.

[Bug target/119547] RISC-V: VSETVL mistakenly modified other data

2025-04-03 Thread zhijin.zeng at spacemit dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547

--- Comment #11 from 曾治金  ---
(In reply to Robin Dapp from comment #10)
> > 4. run
> > ```
> > export LD_LIBRARY_PATH=//lib
> > ./opencv_test_core 
> > --gtest_filter="Core_ConvertScale/ElemWiseTest.accuracy/0"
> > ```
> 
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from Core_ConvertScale/ElemWiseTest
> [ RUN  ] Core_ConvertScale/ElemWiseTest.accuracy/0, where GetParam() =
> 16-byte object <90-CA 91-00 00-00 00-00 30-CC 91-00 00-00 00-00>
> [   OK ] Core_ConvertScale/ElemWiseTest.accuracy/0 (14348 ms)
> [--] 1 test from Core_ConvertScale/ElemWiseTest (14349 ms total)
> 
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (14352 ms total)
> [  PASSED  ] 1 test.
> 
> It seems to pass for me with the current GCC 15 and --param
> logical-op-non-short-circuit=0.  I followed your instructions to build but
> needed to work around the ICE I mentioned.
> 
> Anything else I can try to get a runnable test?

I recompiled the OpenCV application with current GCC (commit b6aafe9a5b), and
it still reproduces this bug. Did you apply the patch from step 3, which enables
the vector implementation of the cvt_64f function?

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #14 from Jakub Jelinek  ---
To me this looks like just not thread safe code in glib2.
The important part of the function is just trying to atomically increment the
closure->ref_count bitfield.
In *.optimized dump this is
   [local count: 756815742]:
  old_int_61 = MEM[(union ClosureInt *)closure_16(D)].vint;
  tmp.vint = old_int_61;
  _62 = tmp.closure.ref_count;
  _63 = (short unsigned int) _62;
  _64 = _63 + 1;
  _65 = () _64;
  tmp.closure.ref_count = _65;
  new_int_68 = tmp.vint;
  new_int.12_69 = (unsigned int) new_int_68;
  _70 = &MEM[(union ClosureInt *)closure_16(D)].vint;
  _71 = (unsigned int) old_int_61;
  _72 = .ATOMIC_COMPARE_EXCHANGE (_70, _71, new_int.12_69, 4, 5, 5);
  _73 = IMAGPART_EXPR <_72>;
  tmp ={v} {CLOBBER(eos)};
  if (_73 == 0)
goto ; [0.04%]
  else
goto ; [99.96%]
Normal thread-safe code would do __atomic_load to read from &closure->vint, it
can be with relaxed model but the important thing is to make sure the compiler
doesn't try to read it multiple times.
That loop with r15-7894 is
.L336:
        movl    (%esi), %ebp
        movl    %ebp, %edx
        movl    %ebp, 32(%esp)
        andw    $32767, %dx
        incl    %edx
        movl    %edx, %eax
        andw    $32767, %ax
        movw    %ax, 20(%esp)
        movl    %ebp, %eax
        andw    $-32768, %ax
        orw     20(%esp), %ax
        movw    %ax, 32(%esp)
        movl    %ebp, %eax
        movl    32(%esp), %ebp
        lock cmpxchgl   %ebp, (%esi)
        jne     .L336
where %esi is closure, so it reads it from memory just once, saves a copy of it
in 32(%esp), awkwardly increments the 15-bit ref_count (I don't see the point
in masking before the increment, it could just do that after increment) in
there and preserves the  last bit.  In any case, 32(%esp) here is the tmp.vint
32-bit memory and %ebp holds the original value of closure->vint.
That same loop with r15-7895 is
.L336:
        movl    (%esi), %eax
        movl    %eax, 16(%esp)
        movzwl  (%esi), %eax
        andw    $32767, %ax
        leal    1(%eax), %edx
        movl    %edx, %eax
        andw    $32767, %ax
        movl    %eax, %ecx
        movzwl  (%esi), %eax
        andw    $-32768, %ax
        orl     %ecx, %eax
        movw    %ax, 16(%esp)
        movl    (%esi), %eax
        movl    16(%esp), %ecx
        lock cmpxchgl   %ecx, (%esi)
        jne     .L336
The RA makes completely different decisions here.  But the important thing is
that instead of reading from closure->vint once, it actually reads from there 4
times.  It isn't volatile and doesn't use __atomic_load, so IMHO why not.
Except when multiple threads attempt to increase it, one has to be lucky that
all the 4 reads read the same value.  If not, you get what you observe.
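
For illustration, a minimal sketch of the thread-safe shape described above, using the GCC __atomic builtins. The ClosureInt layout here is a hypothetical stand-in (a 15-bit count, one flag bit and padding overlaid on a 32-bit word), not glib's real structure, and closure_ref is not glib's function; the point is only that the shared word is read exactly once per attempt, through an atomic load.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, loosely modelled on the dump above. */
typedef union {
  struct {
    unsigned ref_count : 15;
    unsigned flag      : 1;
    unsigned rest      : 16;
  } closure;
  uint32_t vint;
} ClosureInt;

static void closure_ref(ClosureInt *c)
{
  ClosureInt tmp;
  uint32_t new_int;
  /* One relaxed atomic load: the compiler may not re-read c->vint
     behind our back while we compute the new value.  */
  uint32_t old_int = __atomic_load_n(&c->vint, __ATOMIC_RELAXED);

  do {
    tmp.vint = old_int;            /* work on a private copy */
    tmp.closure.ref_count += 1;    /* wraps within 15 bits */
    new_int = tmp.vint;
    /* On failure the builtin stores the current value into old_int,
       so the loop retries without a separate reload.  */
  } while (!__atomic_compare_exchange_n(&c->vint, &old_int, new_int,
                                        0, __ATOMIC_SEQ_CST,
                                        __ATOMIC_SEQ_CST));
}
```

Because old_int only ever comes from the atomic load or from the failed compare-exchange, every bit of new_int is derived from the same snapshot that the compare-exchange checks against.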

Before RA this is (I think in both cases)
(insn 68 67 69 9 (set (reg/v:SI 127 [ old_int ])
(mem/j:SI (reg/v/f:SI 144 [ closure ]) [7 MEM[(union ClosureInt
*)closure_16(D)].vint+0 S4 A32])) "../glib-2.84.0/gobject/gclosure.c":564:124
discrim 3 96 {*movsi_internal}
 (nil))
(insn 69 68 72 9 (set (mem/j/c:SI (plus:SI (reg/f:SI 19 frame)
(const_int -16 [0xfff0])) [7 MEM[(union 
*)_14].vint+0 S4 A32])
(reg/v:SI 127 [ old_int ])) "../glib-2.84.0/gobject/gclosure.c":564:114
discrim 3 96 {*movsi_internal}
 (nil))
(insn 72 69 73 9 (parallel [
(set (reg:HI 161 [ _62 ])
(and:HI (subreg:HI (reg/v:SI 127 [ old_int ]) 0)
(const_int 32767 [0x7fff])))
(clobber (reg:CC 17 flags))
]) "../glib-2.84.0/gobject/gclosure.c":564:181 discrim 3 720 {*andhi_1}
 (expr_list:REG_EQUAL (and:HI (mem/c:HI (plus:SI (reg/f:SI 19 frame)
(const_int -16 [0xfff0])) [7 MEM[(union 
*)_14]+0 S2 A32])
(const_int 32767 [0x7fff]))
(nil)))
(insn 73 72 75 9 (parallel [
(set (reg:HI 130 [ _64 ])
(plus:HI (reg:HI 161 [ _62 ])
(const_int 1 [0x1])))
(clobber (reg:CC 17 flags))
]) "../glib-2.84.0/gobject/gclosure.c":564:192 discrim 3 298 {*addhi_1}
 (expr_list:REG_DEAD (reg:HI 161 [ _62 ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(nil
(insn 75 73 77 9 (parallel [
(set (reg:HI 164)
(and:HI (reg:HI 130 [ _64 ])
(const_int 32767 [0x7fff])))
(clobber (reg:CC 17 flags))
]) "../glib-2.84.0/gobject/gclosure.c":564:192 discrim 3 720 {*andhi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(nil)))
(insn 77 75 78 9 (parallel [
(set (reg:HI 166)
(and:HI (subreg:HI (reg/v:SI 127 [ old_int ]) 0)
(const_int -32768 [0x8000])))
(clobber (reg:CC 17 flags))
]) "../glib-2.84.0/gobject/gclosure.c":564:192 discrim 3 720 {*andhi_1}
 (nil))
(insn 78 77 79 9 (parallel [
(set (reg

[Bug c/119612] [15 regression] gcc.dg/pr106465.c newly re-broken

2025-04-03 Thread uecker at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119612

uecker at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=118765
   Last reconfirmed||2025-04-03
 Ever confirmed|0   |1

--- Comment #1 from uecker at gcc dot gnu.org ---

Yes, this runs into my new checking assertion that I added for PR118765 in 

https://gcc.gnu.org/cgit/gcc/commit/?id=accbc1b90bd942aa36ac1485a21056b774ce02df

 which is weird.

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #12 from Andrew Pinski  ---
(In reply to ak from comment #11)
> #define m_CORE_AVX512 (m_SKYLAKE_AVX512 | m_CANNONLAKE \
>                        | m_ICELAKE_CLIENT | m_ICELAKE_SERVER | m_CASCADELAKE \
>                        | m_TIGERLAKE | m_COOPERLAKE | m_SAPPHIRERAPIDS \
>                        | m_ROCKETLAKE | m_GRANITERAPIDS | m_GRANITERAPIDS_D \
>                        | m_DIAMONDRAPIDS)
> 
> so if you use -mtune=sapphirerapids it should work.
> 
> Based on that I think this is invalid.

Yes and they are asking about -mtune=generic which is why it is a dup of bug
102294 which is also asking about the generic tuning to change.

*** This bug has been marked as a duplicate of bug 102294 ***

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

Alexander Monakov  changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu.org

--- Comment #16 from Alexander Monakov  ---
Mateusz, please have a look at PR 95435 for the previous round of tuning for
AMD, there's a benchmarking script linked from there in PR 43052.

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread mjguzik at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

--- Comment #14 from Mateusz Guzik  ---
So I reran the bench on AMD EPYC 9R14 and also experienced a win.

To recap gcc emits rep movsq/stosq for sizes > 40. I'm replacing that with
unrolled loops for sizes up to 256 and punting to actual funcs past that.
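
To make the strategy concrete, here is a rough C sketch of "unrolled stores up to 256 bytes, call a real function past that" for the clearing case. It is not the kernel patch: clear_bytes and SMALL_CLEAR_MAX are hypothetical names, and an optimizing compiler is free to recognise the loops and turn them back into a memset call or rep stos, so this shows the intended shape rather than guaranteed code generation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SMALL_CLEAR_MAX 256   /* cut-off taken from the comment above */

/* Hypothetical helper: zero n bytes at dst. */
static void clear_bytes(void *dst, size_t n)
{
  unsigned char *p = dst;
  const uint64_t zero = 0;

  if (n > SMALL_CLEAR_MAX) {
    memset(dst, 0, n);          /* large sizes: punt to the real function */
    return;
  }
  while (n >= 32) {             /* main loop unrolled 4x, 8 bytes per store */
    memcpy(p,      &zero, 8);
    memcpy(p + 8,  &zero, 8);
    memcpy(p + 16, &zero, 8);
    memcpy(p + 24, &zero, 8);
    p += 32;
    n -= 32;
  }
  while (n >= 8) {              /* remaining whole 8-byte words */
    memcpy(p, &zero, 8);
    p += 8;
    n -= 8;
  }
  while (n > 0) {               /* byte tail */
    *p++ = 0;
    n--;
  }
}
```

The fixed-size memcpy calls are used for the 8-byte stores so the sketch stays valid for unaligned destinations.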

All tests on fresh Linux master (a2cc6ff5ec8f91bc463fd3b0c26b61166a07eb11).

fstat() rate went from ~2.4 mln to ~2.5 mln:

before:
min:2412348 max:2412348 total:2412348
min:2412025 max:2412025 total:2412025
min:2411442 max:2411442 total:2411442

after:
min:2506723 max:2506723 total:2506723
min:2508430 max:2508430 total:2508430
min:2510306 max:2510306 total:2510306

The "hello world" build also got faster, by 1%.

Total builds during test period, excluding warmup:
before: 8069
after: 8136

Full results at the end.

Note that while running fstat() in a loop is very microbenchmark-ey, spawning
the compiler to do something is not.

Finally, I don't claim an unrolled loop is the fastest thing to do for that
specific uarch. I am claiming the old uarchs suffer a lot for rep movsq/stosq
usage for these sizes and turns out this is also a problem for the new ones.

Also note this provided a win despite increased i-cache footprint.

Do you guys need results from *old* archs? Because things sucking for those is
rather well established I think.

full results of the hello world build:

before:
warmup: 403 ops (80 ops/s)
bench: 806 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.75s user 6.22s system 99% cpu 15.01s
(15.007) total
warmup: 404 ops (80 ops/s)
bench: 806 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.71s user 6.26s system 99% cpu 15.01s
(15.013) total
warmup: 404 ops (80 ops/s)
bench: 807 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.75s user 6.22s system 99% cpu 15.01s
(15.008) total
warmup: 404 ops (80 ops/s)
bench: 807 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.72s user 6.25s system 99% cpu 15.02s
(15.019) total
warmup: 404 ops (80 ops/s)
bench: 807 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.74s user 6.24s system 99% cpu 15.02s
(15.016) total
warmup: 404 ops (80 ops/s)
bench: 807 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.83s user 6.14s system 99% cpu 15.01s
(15.006) total
warmup: 404 ops (80 ops/s)
bench: 807 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.72s user 6.25s system 99% cpu 15.01s
(15.010) total
warmup: 404 ops (80 ops/s)
bench: 807 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.75s user 6.23s system 99% cpu 15.02s
(15.025) total
warmup: 404 ops (80 ops/s)
bench: 807 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.71s user 6.26s system 99% cpu 15.01s
(15.008) total
warmup: 403 ops (80 ops/s)
bench: 808 ops (80 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.69s user 6.28s system 99% cpu 15.01s
(15.011) total

after:
warmup: 408 ops (81 ops/s)
bench: 814 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.85s user 6.12s system 99% cpu 15.01s
(15.010) total
warmup: 407 ops (81 ops/s)
bench: 813 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.83s user 6.13s system 99% cpu 15.01s
(15.009) total
warmup: 407 ops (81 ops/s)
bench: 815 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.91s user 6.07s system 99% cpu 15.01s
(15.014) total
warmup: 407 ops (81 ops/s)
bench: 812 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.81s user 6.15s system 99% cpu 15.01s
(15.009) total
warmup: 408 ops (81 ops/s)
bench: 813 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.81s user 6.15s system 99% cpu 15.01s
(15.011) total
warmup: 407 ops (81 ops/s)
bench: 813 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.86s user 6.12s system 99% cpu 15.02s
(15.024) total
warmup: 406 ops (81 ops/s)
bench: 814 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.87s user 6.10s system 99% cpu 15.01s
(15.013) total
warmup: 408 ops (81 ops/s)
bench: 813 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.86s user 6.12s system 99% cpu 15.02s
(15.021) total
warmup: 408 ops (81 ops/s)
bench: 814 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.82s user 6.15s system 99% cpu 15.02s
(15.017) total
warmup: 409 ops (81 ops/s)
bench: 815 ops (81 ops/s)
taskset --cpu-list 1 ./ccbench 10  8.83s user 6.14s system 99% cpu 15.02s
(15.020) total

[Bug c/119612] New: gcc.dg/pr106465.c newly re-broken

2025-04-03 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119612

Bug ID: 119612
   Summary: gcc.dg/pr106465.c newly re-broken
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com
  Target Milestone: ---

testsuite $ /home/dcb40b/gcc/results.20250325.ubsan/bin/gcc -w -c
./gcc.dg/pr106465.c
testsuite $ /home/dcb40b/gcc/results.20250327.ubsan/bin/gcc -w -c
./gcc.dg/pr106465.c
./gcc.dg/pr106465.c: In function ‘main’:
./gcc.dg/pr106465.c:88:16: internal compiler error: in
tagged_types_tu_compatible_p, at c/c-typeck.cc:1816
   88 | struct { char (*p)[++n]; } g4(void) { };
  |^


testsuite $ /home/dcb40b/gcc/results.20250325.ubsan/bin/gcc -v 2>&1 | grep exp
gcc version 15.0.1 20250325 (experimental) (737a5760bb24a0a9) 
testsuite $ /home/dcb40b/gcc/results.20250327.ubsan/bin/gcc -v 2>&1 | grep exp
gcc version 15.0.1 20250327 (experimental) (accbc1b90bd942aa) 


testsuite $ git log 737a5760bb24a0a9..accbc1b90bd942aa | grep -c "^commit"
57

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread mjguzik at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

--- Comment #15 from Mateusz Guzik  ---
so tl;dr

Suggested action: don't use rep for sizes <= 256 by default

[Bug middle-end/119611] Function call substitution cause confusing warning messages

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119611

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-04-03

--- Comment #1 from Andrew Pinski  ---
Confirmed, I thought this was reported before too.

[Bug target/118892] [14 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-04-03 Thread pavol at rusnak dot io via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892

--- Comment #17 from pavol at rusnak dot io ---
Is the fix going to be backported from master to 14.x release? Possibly
targeting 14.3.0 release?

[Bug cobol/119377] cobol.dg/group1/declarative_1.cob fails (segfaults, uninitialised vars)

2025-04-03 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119377

--- Comment #4 from Iain Sandoe  ---
on Darwin the newly-added tests:

INSPECT_ISO_Example_1, 2, 3, 4, 5-f, 5-r, 6 and 7 fail with the same symptoms.

[Bug ipa/119604] expand_call_inline could use an RAII for input_location

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119604

--- Comment #5 from Andrew Pinski  ---
(In reply to Richard Biener from comment #4)
> We should get rid of input_location uses in the middle-end instead ;)

Agreed, but that is a huge task. I will try to get rid of some of them once stage
1 opens up. Basically we need to pass down the location in many locations.

[Bug c++/119387] [14/15 Regression] Regression in performance by a factor of 6 when building with debugging symbols since r14-5979

2025-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119387

--- Comment #17 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:a926345f22b500a2620adb83e6821e01fb8cc8fd

commit r15-9189-ga926345f22b500a2620adb83e6821e01fb8cc8fd
Author: Patrick Palka 
Date:   Thu Apr 3 16:33:46 2025 -0400

c++: P2280R4 and speculative constexpr folding [PR119387]

Compiling the testcase in this PR uses 2.5x more memory and 6x more
time ever since r14-5979 which implements P2280R4.  This is because
our speculative constexpr folding now does a lot more work trying to
fold ultimately non-constant calls to constexpr functions, and in turn
produces a lot of garbage.  We do sometimes successfully fold more
thanks to P2280R4, but it seems to be trivial stuff like calls to
std::array::size or std::addressof.  The benefit of P2280 therefore
doesn't seem worth the cost during speculative constexpr folding, so
this patch restricts the paper to only manifestly-constant evaluation.

PR c++/119387

gcc/cp/ChangeLog:

* constexpr.cc (p2280_active_p): New.
(cxx_eval_constant_expression) : Use it to
restrict P2280 relaxations.
: Likewise.

Reviewed-by: Jason Merrill 

[Bug middle-end/103616] Improve LRA's to use literal pool more often when profitable

2025-04-03 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103616

--- Comment #8 from Vladimir Makarov  ---
I looked at the generated code  and I see only one issue with func foo:

void foo (void)
{
  double d = 0.0, e = 7.8;
  __asm ("# %0 %1" : : "m" (d), "m" (e));
}

for which GCC generates:

        movq    $0x0, -16(%rsp)
        movq    .LC1(%rip), %rax
        movq    %rax, -8(%rsp)
#APP
# 5 "b.i" 1
# -16(%rsp) -8(%rsp)  # ! .LC1(%rip) could be used instead of -8(%rsp)
# 0 "" 2
#NO_APP
        ret

I believe it is not a RA issue.

We have the following after expand:
7: r99:DF=[`*.LC1']
8: [r93:DI-0x8]=r99:DF
9: {asm_operands;clobber flags:CC;}

after vreg (changing r93 to frame):
7: r99:DF=[`*.LC1']
8: [frame:DI-0x8]=r99:DF
9: {asm_operands;clobber flags:CC;}

after combine1 (constant propagation of r99):
7: NOTE_INSN_DELETED
8: [frame:DI-0x8]=7.79999999999999982236431605997495353221893310546875e+0
9: {asm_operands;clobber flags:CC;}

after reload (generation of reload insn 15 for insn 8):
7: NOTE_INSN_DELETED
   15: ax:DF=[`*.LC1']
  REG_EQUAL 7.79999999999999982236431605997495353221893310546875e+0
8: [sp:DI-0x8]=ax:DF
9: {asm_operands;clobber flags:CC;}

Theoretically we could  do constant propagation of memory value in,
before, or after RA but it is too complicated.  Therefore I think it
should be fixed before or in expand pass.

[Bug target/119573] nvptx: PTX '.const', constant state space

2025-04-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119573

--- Comment #4 from GCC Commits  ---
The trunk branch has been updated by Thomas Schwinge :

https://gcc.gnu.org/g:5deeae29dab2af64e3342daf7a3e424c64ea

commit r15-9190-g5deeae29dab2af64e3342daf7a3e424c64ea
Author: Thomas Schwinge 
Date:   Wed Apr 2 10:25:17 2025 +0200

nvptx: Don't use PTX '.const', constant state space [PR119573]

This avoids cases where a "File uses too much global constant data" (final
executable, or single object file), and avoids cases of wrong code
generation: "error : State space incorrect for instruction 'st'"
('st.const'), or another case where an "illegal instruction was
encountered", or a lot of cases where for two compilation units (such as a
library linked with user code) we ran into "error : Memory space doesn't
match" due to differences in '.const' usage between definition and use of a
variable.

We progress:

ptxas error   : File uses too much global constant data (0x1f01a bytes,
0x1 max)
nvptx-run: cuLinkAddData failed: a PTX JIT compilation failed
(CUDA_ERROR_INVALID_PTX, 218)

... into:

PASS: 20_util/to_chars/103955.cc  -std=gnu++17 (test for excess errors)
[-FAIL:-]{+PASS:+} 20_util/to_chars/103955.cc  -std=gnu++17 execution test

We progress:

ptxas error   : File uses too much global constant data (0x36c65 bytes,
0x1 max)
nvptx-as: ptxas returned 255 exit status

... into:

[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O0  {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O1  {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O2  {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -O3 -g  {+(test for excess errors)+}
[-UNSUPPORTED:-]{+PASS:+} gcc.c-torture/compile/pr46534.c   -Os  {+(test for excess errors)+}

[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O0  (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O1  (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O2  (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -O3 -g  (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/torture/pr31863.C   -Os  (test for excess errors)

[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90   -O0  (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-1.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90   -O0  (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-4.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90   -O0  (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/bind-c-contiguous-5.f90   -O0  [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} 20_util/to_chars/double.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/double.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} 20_util/to_chars/float.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} 20_util/to_chars/float.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} special_functions/13_ellint_3/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

[-FAIL:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} tr1/5_numerical_facilities/special_functions/14_ellint_3/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

..., and progress likewise, but fail later with an unrelated error:

[-FAIL:-]{+PASS:+} ext/special_functions/hyperg/check_value.cc  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+} ext/special_functions/hyperg/check_value.cc  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}

   
[...]/libstdc++-v3/testsuite/ext/special_functions/hyperg/check_value.cc:12317:
void test(const testcase_hyperg (&)[Num], Ret) [with Ret = double;
unsigned int Num = 19]: Assertion 'max_abs_frac < toler' failed.

..., and:

[-FAIL:-]{+PASS:+}
tr1/5_numerical_facilities/special_functions/17_hyperg/check_value.cc 
-std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+FAIL:+}
tr1/5_numerical_facilities/special_functions/17_hyperg/check_val

[Bug middle-end/119613] [15 regression] ICE when building protobuf-29.4 with -O0 (purge_dead_edges, at cfgrtl.cc:3356)

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119613

--- Comment #3 from Andrew Pinski  ---
Created attachment 60978
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60978&action=edit
Reduced testcase

[Bug middle-end/119613] [15 regression] ICE when building protobuf-29.4 with -O0 (purge_dead_edges, at cfgrtl.cc:3356)

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119613

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-04-03

--- Comment #4 from Andrew Pinski  ---
Confirmed. My reduced testcase still works with clang too (had to add back the
b argument to j to get that working).

[Bug middle-end/119613] [15 regression] ICE when building protobuf-29.4 with -O0 (purge_dead_edges, at cfgrtl.cc:3356)

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119613

Andrew Pinski  changed:

   What|Removed |Added

  Attachment #60978|0   |1
is obsolete||

--- Comment #5 from Andrew Pinski  ---
Created attachment 60979
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60979&action=edit
Add back inline not to get warning about always_inline

[Bug target/119547] RISC-V: VSETVL mistakenly modified other data

2025-04-03 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547

--- Comment #12 from Robin Dapp  ---
> I recompiled the OpenCV application with current GCC (commit b6aafe9a5b), and
> it still reproduces this bug. Did you apply the patch from step 3, which
> enables the vector implementation of the cvt_64f function?

Yes, I followed your instructions step by step.

There are vector instructions but I'm not seeing a similar pattern to the one
you were showing.  Which exact function (mangled) do I need to look at?

You should be seeing an internal compiler error when building from trunk though
(see above).  Did you configure with --disable-checking?  And which other
configure flags did you use for GCC?

If I cannot get it to work, would you mind putting a breakpoint at the
beginning of the miscompiled function and getting the invocation parameters for
a failing test?  That way we can just extract the function into a run test and
use the parameters.

[Bug c++/119615] New: Divergence with Clang on musttail (differing tailcall target signature)

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119615

Bug ID: 119615
   Summary: Divergence with Clang on musttail (differing tailcall
target signature)
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: accepts-invalid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

For the following (gnarly reduction from PR119614):
```
char *h() {
  return 0;
}

const char *p(int) {
  [[clang::musttail]] return h();
}
```

We accept it, while Clang rejects it with:
```
a.cxx:6:23: error: cannot perform a tail call to function 'h' because its
signature is incompatible with the calling function
6 |   [[clang::musttail]] return h();
  |   ^
a.cxx:1:1: note: target function has different number of parameters (expected 1
but has 0)
1 | char *h() {
  | ^
a.cxx:6:5: note: tail call required by 'musttail' attribute here
6 |   [[clang::musttail]] return h();
  | ^
1 error generated.
```

[Bug target/119547] RISC-V: VSETVL mistakenly modified other data

2025-04-03 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547

--- Comment #13 from Robin Dapp  ---
Hmm, now I compiled with -O3 on top of --param logical-op-non-short-circuit=0 
(which shouldn't actually be necessary or change anything as it's the default)
but there is a segmentation fault in

_ZN2cv12cpu_baselineL13cvtScale8u64fEPKhmS2_mPhmNS_5Size_IiEEPv

Probably close enough to your issue?

[Bug c++/119615] Divergence with Clang on musttail (differing tailcall target signature)

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119615

--- Comment #1 from Sam James  ---
(In reply to Sam James from comment #0)
> For the following (gnarly reduction from PR119614):

(Ignore that bit, I changed my mind and used something simpler.)

[Bug c++/119615] Divergence with Clang on musttail (differing tailcall target signature)

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119615

--- Comment #4 from Sam James  ---
WFM.

[Bug middle-end/119613] [15 regression] ICE when building protobuf-29.4 with -O0 (purge_dead_edges, at cfgrtl.cc:3356)

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119613

--- Comment #6 from Andrew Pinski  ---
;; _9 = d (D.2936); [tail call] [must tail call]

(call_insn/j 14 13 15 3 (set (reg:DI 0 ax)
(call (mem:QI (symbol_ref:DI ("_Z1d1b") [flags 0x41]  ) [0 _Z1d1bD.2875 S1 A8])
(const_int 0 [0]))) "/app/example.cpp":12:31 -1
 (expr_list:REG_EH_REGION (const_int 2 [0x2])
(nil))
(nil))

Maybe the REG_EH_REGION  here?

[Bug c++/119564] ICE using module including boost headers

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119564

--- Comment #12 from Andrew Pinski  ---
Reducing this.

And yes it looks GC related:
```
In module gcc_repro_a, imported at t1.cc:84182:
t0.cc: In member function ‘virtual bool
boost::system::error_category::failed(int) const’:
t0.cc:154191:18: internal compiler error: in tree_node_structure_for_code, at
tree.cc:603
154191 | virtual bool failed( int ev ) const noexcept
   |  ^~
0x293961f internal_error(char const*, ...)
   
/home/apinski/src/upstream-gcc-match/gcc/gcc/diagnostic-global-context.cc:517
0xab2a85 fancy_abort(char const*, int, char const*)
/home/apinski/src/upstream-gcc-match/gcc/gcc/diagnostic.cc:1749
0x9c0a00 tree_node_structure_for_code
/home/apinski/src/upstream-gcc-match/gcc/gcc/tree.cc:603
0x9c0a00 tree_node_structure_for_code
/home/apinski/src/upstream-gcc-match/gcc/gcc/tree.cc:540
0x9c0a00 tree_node_structure(tree_node const*)
/home/apinski/src/upstream-gcc-match/gcc/gcc/tree.cc:4169
0xd7e97b gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:114
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0x111e846 gt_ggc_mx_tree_statement_list_node(void*)
/home/apinski/src/upstream-gcc-match/gcc/objdir/gcc/gtype-desc.cc:1945
0xd7ee7a gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:522
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0xd7f613 gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:355
0xd7eb8e gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:616
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
0xd7f0dd gt_ggc_mx_lang_tree_node(void*)
./gt-cp-tree.h:475
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

```

[Bug rtl-optimization/119607] [15 regression] glib miscompiled since r15-7895-gb191e8bdecf881 with -O3 -m32 -march=x86-64 -mtune=znver2 -fno-semantic-interposition

2025-04-03 Thread bugzilla at tecnocode dot co.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119607

--- Comment #19 from Philip Withnall  ---
Thanks both, that’s quite an old latent bug fixed :)

[Bug c++/119601] [OpenMP] append_args bugs with parameter packs

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119601

Sam James  changed:

   What|Removed |Added

   Last reconfirmed||2025-04-04
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

[Bug c++/119602] [OpenMP] append_args dependent prefer_type uses value from first instantiation in all instantiations

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119602

Sam James  changed:

   What|Removed |Added

   Last reconfirmed||2025-04-04
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread mjguzik at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

--- Comment #18 from Mateusz Guzik  ---
Ok, I see.

I think I also see the discrepancy here.

When you bench "libcall", you are going to glibc with SIMD-enabled routines.

In contrast, the kernel avoids SIMD for performance reasons and instead will
only do regular stores *or* rep mov/stos in these.

But this also means that your "libcall is faster" results won't hold in the
kernel, where you have to assume the thing is handled with the rep prefix.

Perhaps gcc could take -mno-sse into consideration when deciding when to punt
to libcall?

I'm going to provide results after I regain access to the hw. I'm going to
whack sizes above 512 as they don't add any value and also remove libcall as
that won't be a valid test for my purpose. Instead I'm going to add better
granularity of sizes < 512.

[Bug tree-optimization/119614] [15 regression] protobuf-29.4 fails to build with -O2 (error: cannot tail-call: call and return value are different)

2025-04-03 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119614

--- Comment #3 from Sam James  ---
Created attachment 60980
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60980&action=edit
reduced.ii

Attaching the gnarly thing cvise put out; jakub's is far more useful, but I'm
putting this here as I'm about to file another bug involving my one.

[Bug c++/119615] Divergence with Clang on musttail (differing tailcall target signature)

2025-04-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119615

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID
   See Also||https://github.com/llvm/llv
   ||m-project/issues/54964

--- Comment #3 from Andrew Pinski  ---
>while Clang rejects it with

Yes, clang rejects it because of the function signature change. GCC's definition
of musttail is less restrictive in this area, and that is a known restriction in
clang which was already reported a few years back:
https://github.com/llvm/llvm-project/issues/54964 .

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596

--- Comment #17 from Uroš Bizjak  ---
(In reply to Alexander Monakov from comment #16)
> Mateusz, please have a look at PR 95435 for the previous round of tuning for
> AMD, there's a benchmarking script linked from there in PR 43052.

FYI, this benchmarking script can also be found in contrib/bench-stringop.

[Bug c++/119615] Divergence with Clang on musttail (differing tailcall target signature)

2025-04-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119615

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
I think that is intentional.  We can tail call this thing on various
architectures (and not on others), it depends on if the arguments are passed in
registers or on the stack etc.
