date:20250626

[Bug c++/120834] New: Potential memory leak during exception handling

2025-06-26 Thread dorian.haglund at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120834

Bug ID: 120834
   Summary: Potential memory leak during exception handling
   Product: gcc
   Version: 15.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dorian.haglund at gmail dot com
  Target Milestone: ---

Running the following code with valgrind reports an error:

```cpp
struct throwing_dtor_t {
  ~throwing_dtor_t() noexcept(false) { throw 123; }
};

int main() {
  try {
    try {
      throw throwing_dtor_t{};
    } catch (throwing_dtor_t const &) {
    }
  } catch (int) {
  }
  return 0;
}
```

```bash
~ $ 0 valgrind --leak-check=full ./a.out 
==143975== Memcheck, a memory error detector
==143975== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==143975== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==143975== Command: ./a.out
==143975== 
==143975== 
==143975== HEAP SUMMARY:
==143975==     in use at exit: 129 bytes in 1 blocks
==143975==   total heap usage: 3 allocs, 2 frees, 73,989 bytes allocated
==143975== 
==143975== 129 bytes in 1 blocks are definitely lost in loss record 1 of 1
==143975==    at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==143975==    by 0x492CD4B: __cxa_allocate_exception (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33)
==143975==    by 0x1091DE: main (in /home/dorian/a.out)
==143975== 
==143975== LEAK SUMMARY:
==143975==    definitely lost: 129 bytes in 1 blocks
==143975==    indirectly lost: 0 bytes in 0 blocks
==143975==      possibly lost: 0 bytes in 0 blocks
==143975==    still reachable: 0 bytes in 0 blocks
==143975==         suppressed: 0 bytes in 0 blocks
==143975== 
==143975== For lists of detected and suppressed errors, rerun with: -s
==143975== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
```

A leak also appears if you turn on both ASan and UBSan (but not with ASan alone).

System:
```bash
~ $ 0 uname -a
Linux dorian-XPS-15-9510 6.11.0-26-generic #26~24.04.1-Ubuntu SMP
PREEMPT_DYNAMIC Thu Apr 17 19:20:47 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
```

[Bug target/119628] Need better mechanisms to manage register saves in callee for tail calls (inc. preserve_none for x86_64?)

2025-06-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628

--- Comment #21 from H.J. Lu  ---
(In reply to Ken Jin from comment #15)
> I tested again this time with taskset, turbo boost off, on a quiet system,
> with PGO. These are the results. They're quite good:
> 
> # Indirect goto + LTO + PGO
> This machine benchmarks at 576728 pystones/second
> 
> # Tail calls, no preserve_none + LTO + PGO*
> This machine benchmarks at 539522 pystones/second
> 
> # Tail calls, preserve_none + LTO + PGO*
> This machine benchmarks at 572234 pystones/second
> 
> So roughly a 6-7% gain from preserve_none on the pystones benchmark over no
> preserve_none. Thanks again H.J. for the patch.
> 
> *PGO is disabled for tail calling functions in the bytecode interpreter, but
> enabled for everything else, as it seems PGO slows down those functions. I
> used the attributes `no_instrument_function,no_profile_instrument_function`
> to turn it off for the bytecode functions.
> 
> Something strange is going on with PGO for tail calls on my system. However,
> I can't figure it out right now.
> 
> Everything is benchmarked on this branch
> https://github.com/Fidget-Spinner/cpython/pull/new/Fidget-Spinner:cpython:
> tail-call-gcc-3

Hi Ken, my patch has been merged into the GCC master branch.  Can you give it a
try?

[Bug c++/83875] [feature request] target_clones compatible SIMD capability/length check

2025-06-26 Thread mkretz at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83875

--- Comment #10 from Matthias Kretz (Vir)  ---
(In reply to Matthias Kretz (Vir) from comment #7)
> what should the following print?
> [...]

By now I think we should just let those examples continue to be ODR
violations. The C++ machinery for doing this is templates. E.g.

template
  constexpr int native_simd_width = value;

Now, when the compiler gets to

[[gnu::target_clones("default,avx,avx512f")]]
void f()
{ std::cout << native_simd_width<>; }

it creates three clones of 'f', and in each of them the default template
arguments to 'native_simd_width' are different, instantiating
'native_simd_width<16>', 'native_simd_width<32>', and 'native_simd_width<64>'.

The std::simd implementation is prepared for working like this, since it
already uses a template parameter for avoiding ODR violations on linking TUs
compiled with different compiler flags. If this is implementable, that would be
a huge win for some use cases of the simd library type.
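
To make the mechanism concrete, here is a minimal sketch for a single translation unit. The helper `detected_simd_width()` and its use of the predefined `__AVX512F__` and `__AVX__` macros are assumptions made for illustration; the default argument actually intended in the comment is not shown above. A single translation unit only ever sees one value here. The proposal would additionally need each `target_clones` clone to re-evaluate the default argument for its own target.

```cpp
// Hypothetical helper (not from the bug report): picks a vector width in
// bytes from the target macros predefined for the current compilation.
constexpr int detected_simd_width()
{
#if defined(__AVX512F__)
  return 64;
#elif defined(__AVX__)
  return 32;
#else
  return 16;
#endif
}

// Variable template whose default argument depends on the target flags.
template <int value = detected_simd_width()>
  constexpr int native_simd_width = value;

// An empty template argument list picks up the default argument.
constexpr int width_seen_here() { return native_simd_width<>; }

// Different arguments name different specializations, so code built for
// different widths refers to distinct entities instead of violating the ODR.
static_assert(native_simd_width<16> != native_simd_width<32>,
              "distinct specializations carry distinct values");
```

Because the width is part of the template-id, two translation units compiled with different flags would refer to different specializations, which is the property the comment relies on.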

[Bug middle-end/120608] [15/16 regression] error: cannot tail-call: other reasons when using address sanitizer with musttail

2025-06-26 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120608

--- Comment #19 from Jakub Jelinek  ---
Why are you using the attribute at -O0?

In any case, this boils down to roughly -O0 -fsanitize=address
[[gnu::noipa]] int
foo (int x)
{
  return x;
}

[[gnu::noipa]] void
bar (int *x, int *y, int *z)
{
  (void) x;
  (void) y;
  (void) z;
}

[[gnu::noipa]] int
baz (int x)
{
  int a = 4;
  {
    int b = 8;
    {
      int c = 10;
      bar (&a, &b, &c);
      if (a + b + c == 22)
        [[gnu::musttail]] return foo (x);
      bar (&a, &b, &c);
    }
    bar (&a, &b, &a);
  }
  bar (&a, &a, &a);
  return 42;
}

During gimplification, .ASAN_MARK (POISON, ...); calls are added as try ...
finally.
So we get something like:
  try
    {
      .ASAN_MARK (UNPOISON, &a, 4);
      a = 4;
      {
        int b;

        try
          {
            .ASAN_MARK (UNPOISON, &b, 4);
            b = 8;
            {
              int c;

              try
                {
                  .ASAN_MARK (UNPOISON, &c, 4);
                  c = 10;
...
                  D.3414 = foo (x); [must tail call]
                  // predicted unlikely by early return (on trees) predictor.
                  return D.3414;
...
                }
              finally
                {
                  .ASAN_MARK (POISON, &c, 4);
                }
...
          }
        finally
          {
            .ASAN_MARK (POISON, &b, 4);
          }
...
    }
  finally
    {
      .ASAN_MARK (POISON, &a, 4);
    }

Now, the eh pass turns those into
  D.3414 = foo (x); [must tail call]
  // predicted unlikely by early return (on trees) predictor.
  finally_tmp.3 = 0;
  goto ;
...
  :
  .ASAN_MARK (POISON, &c, 4);
  switch (finally_tmp.3) , case 1: >
  :
  goto ;
  :
  finally_tmp.4 = 0;
  goto ;
...
  :
  .ASAN_MARK (POISON, &b, 4);
  switch (finally_tmp.4) , case 1: >
  :
  goto ;
  :
  goto ;
...
  :
  .ASAN_MARK (POISON, &a, 4);
  goto ;
  :
  return D.3414;

And note we've been asked not to optimize anything and so we don't.
Before sanopt0 pass we still have
  _26 = foo (x_24(D)); [must tail call]
  // predicted unlikely by early return (on trees) predictor.
  finally_tmp.3_27 = 0;
  goto ; [INV]
...
   :
  # _6 = PHI <_26(3), _23(D)(4)>
  # finally_tmp.3_8 = PHI 
  .ASAN_MARK (POISON, &c, 4);
  if (finally_tmp.3_8 == 1)
goto ; [INV]
  else
goto ; [INV]

   :
:
  finally_tmp.4_31 = 0;
  goto ; [INV]
...
   :
  # finally_tmp.4_9 = PHI 
  .ASAN_MARK (POISON, &b, 4);
  if (finally_tmp.4_9 == 1)
goto ; [INV]
  else
goto ; [INV]
...
   :
  # _7 = PHI <_6(8), _34(9)>
  .ASAN_MARK (POISON, &a, 4);

   :
:
  return _7;

And then sanopt0 actually comes before musttail pass (the -O0 special copy of
that), so .ASAN_MARK calls
are lowered into something musttail pass has no easy way to match.

So, in order to deal with this, we'd need to do something with pass ordering:
  NEXT_PASS (pass_sanopt);
  NEXT_PASS (pass_cleanup_eh);
  NEXT_PASS (pass_musttail);
We want the musttail pass before sanopt, but I am not sure if we still rely on
cleanup_eh or not (I think the tailc/musttail
pass has workarounds for that), so maybe simply moving pass_musttail 2 lines up
would work.
And another problem is the lack of forward propagation of the finally_tmp.*
SSA_NAMEs into PHI nodes and whether
tailc/musttail will be able to deal with the GIMPLE_CONDs in there.  If we
track through which edges we go from the
musttail call to the return path and we see GIMPLE_CONDs on that path, we could
look it up; e.g. for the
if (finally_tmp.3_8 == 1) case, see it defined in # finally_tmp.3_8 = PHI

and because we came to bb 5 through the 3->5 edge, look at finally_tmp.3_27
SSA_NAME_DEF_STMT and because it is 0,
figure out the condition is false, etc.
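
The edge-based lookup described above can be sketched as a toy model. This is not GCC's internal API: `PhiArg`, `Known` and `eq_on_edge` are hypothetical names, basic blocks are plain integers, and the PHI arguments are assumptions (the dump above lost them; only the value 0 arriving over the 3->5 edge is stated in the text, and the value 1 on the other edge is a guess for illustration).

```cpp
#include <cstddef>

// Toy model of one PHI argument: the value that arrives over one incoming edge.
struct PhiArg {
  int pred_bb;       // predecessor block of the incoming edge
  bool is_constant;  // true if the argument's defining statement assigns a constant
  int value;         // that constant; meaningful only if is_constant is true
};

enum class Known { False, True, Unknown };

// Evaluate `phi_result == rhs` for a path that entered the PHI's block
// through the edge coming from pred_bb.
template <std::size_t N>
constexpr Known eq_on_edge(const PhiArg (&args)[N], int pred_bb, int rhs)
{
  for (std::size_t i = 0; i < N; ++i)
    if (args[i].pred_bb == pred_bb)
      {
        if (!args[i].is_constant)
          return Known::Unknown;
        return args[i].value == rhs ? Known::True : Known::False;
      }
  return Known::Unknown;  // no such incoming edge in this model
}

// Assumed arguments of finally_tmp.3_8: 0 over the edge from bb 3 (as the
// text states), 1 over the edge from bb 4 (an assumption).
constexpr PhiArg finally_tmp_3_8[] = { {3, true, 0}, {4, true, 1} };

// Coming from bb 3, `finally_tmp.3_8 == 1` is known to be false.
static_assert(eq_on_edge(finally_tmp_3_8, 3, 1) == Known::False,
              "the condition folds to false on the musttail path");
```

A real implementation would have to walk the actual edges from the call to the return and give up whenever a PHI argument is not a known constant, which the `Unknown` result stands in for here.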

[Bug c++/120831] Raise a diagnostic when a class/struct that is marked as final introduces a virtual method

2025-06-26 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120831

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed||2025-06-26
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Jonathan Wakely  ---
Probably not something that will trigger often, but could be useful
occasionally.

[Bug c/120833] gcc does not recognize tail calls

2025-06-26 Thread rockeet at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120833

--- Comment #1 from rockeet  ---
FYI: https://godbolt.org/z/svE61Ghzv

[Bug c/120833] New: gcc does not recognize tail calls

2025-06-26 Thread rockeet at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120833

Bug ID: 120833
   Summary: gcc does not recognize tail calls
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rockeet at gmail dot com
  Target Milestone: ---

struct S1 {
    const char* data;
    long size;
};
struct S2 {
    const char* data;
    long size;
};
__attribute__((noinline))
struct S1 get_s1(const char* s, long n) {
    struct S1 x;
    x.data = s;
    x.size = n;
    return x;
}
struct S1 get_s1_wrap(const char* s, long n) {
    return get_s1(s, n);
}
struct S2 get_s2(const char* s, long n) {
    struct S1 x = get_s1(s, n);
    struct S2 y = {x.data, x.size};
    return y;
}
struct S2 get_s2_2(const char* s, long n) {
    struct S1 x = get_s1(s, n);
    return *(struct S2*)&x;
}

The newest gcc and g++ generate this code:

"get_s1":
        mov     rax, rdi
        mov     rdx, rsi
        ret
"get_s1_wrap":
        jmp     "get_s1"
"get_s2":
        sub     rsp, 8
        call    "get_s1"
        add     rsp, 8
        ret
"get_s2_2":
        sub     rsp, 8
        call    "get_s1"
        add     rsp, 8
        ret

gcc does not recognize that the calls in get_s2 and get_s2_2 are tail calls.

clang generates ideal code even in a very old version (clang-3.0):
get_s1:                                 # @get_s1
        mov     RAX, RDI
        mov     RDX, RSI
        ret

get_s1_wrap:                            # @get_s1_wrap
        jmp     get_s1                  # TAILCALL

get_s2:                                 # @get_s2
        jmp     get_s1                  # TAILCALL

get_s2_2:                               # @get_s2_2
        jmp     get_s1                  # TAILCALL
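
As a small sketch of why the conversion in get_s2 is a no-op at the machine level, the layout of the two structs can be checked at compile time. This is a C++ rendering of the reporter's C types; the `constexpr` qualifiers are an addition so the equivalence can be checked statically, and they say nothing about code generation.

```cpp
#include <cstddef>

struct S1 { const char* data; long size; };
struct S2 { const char* data; long size; };

// Same size and same member offsets: an S1 return value and an S2 return
// value occupy the same locations, as the rax/rdx pair in the assembly above shows.
static_assert(sizeof(S1) == sizeof(S2), "same size");
static_assert(offsetof(S1, data) == offsetof(S2, data), "same offset of data");
static_assert(offsetof(S1, size) == offsetof(S2, size), "same offset of size");

constexpr S1 get_s1(const char* s, long n) { return S1{s, n}; }

// Member-wise copy from S1 to S2; with identical layout this changes no bits
// of the returned value.
constexpr S2 get_s2(const char* s, long n)
{
  S1 x = get_s1(s, n);
  S2 y = {x.data, x.size};
  return y;
}
```

Since the copy leaves the returned registers untouched, nothing remains to be done after the call, which is what makes a jump to get_s1 possible in principle.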

[Bug target/119628] Need better mechanisms to manage register saves in callee for tail calls (inc. preserve_none for x86_64?)

2025-06-26 Thread fw at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628

Florian Weimer  changed:

   What|Removed |Added

 CC||fw at gcc dot gnu.org

--- Comment #19 from Florian Weimer  ---
Note that this implementation of the preserve_none attribute is incompatible
with Clang.

[Bug fortran/120711] [15/16 regression] Growing arrays segfaults

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120711

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Andre Vehreschild :

https://gcc.gnu.org/g:dff66a690f6d47963e5cb96677d0e194b85948fa

commit r16-1696-gdff66a690f6d47963e5cb96677d0e194b85948fa
Author: Andre Vehreschild 
Date:   Wed Jun 25 09:12:35 2025 +0200

Fortran: Fix out of bounds access in structure constructor's clean up
[PR120711]

A structure constructor's generated clean up code was using an offset
variable, which was manipulated before the clean up was run leading to
an out of bounds access.

PR fortran/120711

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_array_ctor_element): Store the value
of the offset for reuse.

gcc/testsuite/ChangeLog:

* gfortran.dg/asan/array_constructor_1.f90: New test.

[Bug fortran/120711] [15/16 regression] Growing arrays segfaults

2025-06-26 Thread vehre at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120711

--- Comment #8 from Andre Vehreschild  ---
Planning to backport to gcc-15 in about a week, i.e. on July 3rd, 2025. If it
has not been backported by then, feel free to remind me or do the backport of
dff66a690f6d47963e5cb96677d0e194b85948fa.

[Bug target/119628] Need better mechanisms to manage register saves in callee for tail calls (inc. preserve_none for x86_64?)

2025-06-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628

--- Comment #20 from H.J. Lu  ---
(In reply to Florian Weimer from comment #19)
> Note that this implementation of the preserve_none attribute is incompatible
> with Clang.

This isn't a critical issue since the same compiler should be used
to compile all sources with the preserve_none attribute.

[Bug c++/96570] Warnings desired for time_t to/from int coversions

2025-06-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96570

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2025-06-26
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #16 from Richard Biener  ---
I have posted a prototype at
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687651.html

[Bug tree-optimization/120833] gcc does not recognize tail calls

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120833

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement
  Component|c   |tree-optimization
 CC||pinskia at gcc dot gnu.org

[Bug libfortran/103886] Use 64-bit time_t on 32-bit glibc targets

2025-06-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103886

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-06-26

--- Comment #6 from Richard Biener  ---
Confirmed.

[Bug driver/120832] New: Use 64bit time_t for host tools on 32bit hosts

2025-06-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120832

Bug ID: 120832
   Summary: Use 64bit time_t for host tools on 32bit hosts
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

At least gcc.cc:load_specs calls stat() which will fail after Y2038
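
A minimal sketch of how a host build could detect the problem at compile time, assuming `time_t` is an integer type as it is on glibc. The names `last_32bit_second` and `time_t_survives_y2038` are hypothetical and do not exist in GCC.

```cpp
#include <ctime>
#include <limits>

// 2038-01-19T03:14:07Z, the last second a signed 32-bit time_t can represent.
constexpr long long last_32bit_second = 2147483647LL;

// True if this build's time_t can represent timestamps after that second.
constexpr bool time_t_survives_y2038()
{
  return std::numeric_limits<std::time_t>::is_integer
         && static_cast<long long>(std::numeric_limits<std::time_t>::max())
              > last_32bit_second;
}
```

On 32-bit glibc targets, defining `_TIME_BITS=64` together with `_FILE_OFFSET_BITS=64` selects a 64-bit `time_t`; whether that is the right approach for GCC's host tools is left to the bug discussion.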

[Bug fortran/88076] Shared Memory implementation for Coarrays

2025-06-26 Thread vehre at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88076

--- Comment #22 from Andre Vehreschild  ---
(In reply to Jerry DeLisle from comment #21)
...

> Additional information with these patches.  I see the following installation
> with make install.
> 
> :~/dev/usrav/lib/gcc$ ls x86_64-pc-linux-gnu/16.0.0/
> 32 crtbeginT.o  crtfastmath.o  crtprec80.o  include-fixed   libcaf_shmem.la 
> libgcc.a plugin  crtbegin.o   crtend.o crtprec32.o  finclude
> install-tools   libcaf_single.a   libgcc_eh.a crtbeginS.o  crtendS.o   
> crtprec64.o  include  libcaf_shmem.a  libcaf_single.la  libgcov.a
> 
> You will see the various libcaf libraries installed in a subdirectory 'gcc' of
> the lib directory. The lib directory is for the 32-bit libraries.

Er, not to my understanding. lib/gcc/x86_64-pc-linux-gnu/16.0.0 contains a
mixture of 32/64-bit files. For example, the crtend.o on my 64-bit system is an
ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), with debug_info, not
stripped.

Therefore I am convinced that the libcaf_shmem.a in that directory also
contains 64-bit object files.

> There are no such files or subdirectories in the lib64 subdirectory, which
> is for the 64-bit version of the libraries.

That's correct. But caf_single is not present in lib64 either; it too is only
in lib/x86_... At the moment libcaf_shmem is provided only as a static link
library, just like caf_single. I.e., when linking against libcaf_single works,
the same should be possible for libcaf_shmem.

I can only ask you to do a clean build and maybe also drop the installation
directory. Sometimes build systems find funny things and then these oddities
happen.

[Bug target/120835] New: on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

Bug ID: 120835
   Summary: on nvptx target with openmp, gcc 15.1 computes
different results with differing -O levels.
   Product: gcc
   Version: 15.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schulz.benjamin at googlemail dot com
  Target Milestone: ---

Created attachment 61721
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61721&action=edit
openmp library

Hi there, the attached program yields different results with different -O
levels... 

It just performs a matrix multiplication, then a Cholesky decomposition, then an
LU decomposition.

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #1 from Benjamin Schulz  ---
Created attachment 61722
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61722&action=edit
main which calls the library

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #2 from Benjamin Schulz  ---
Created attachment 61723
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61723&action=edit
cmake lists.txt for easy switch of the compilers

[Bug ada/120440] [15/16 regression] gnat exception handling miscompiled (`gnat ls` crashes when bootstrapped with -march=znver3) since r15-8901-g7bec4570301c43

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120440

--- Comment #11 from Sam James  ---
Oh:
```
(rr) p Copy
$24 = (access system.exceptions.machine.gnat_gcc_exception) 0x56314ade85a0
(rr) disas
Dump of assembler code for function
ada__exceptions__exception_propagation__gnat_gcc_exception_cleanupXn:
=> 0x56312853b80e <+0>:  test   rsi,rsi
   0x56312853b811 <+3>:  je     0x56312853b825
   0x56312853b813 <+5>:  sub    rsp,0x8
   0x56312853b817 <+9>:  mov    rdi,QWORD PTR [rsi-0x8]
   0x56312853b81b <+13>: call   0x5631285713d0 <__gnat_free>
   0x56312853b820 <+18>: add    rsp,0x8
   0x56312853b824 <+22>: ret
   0x56312853b825 <+23>: ret
End of assembler dump.
(rr) p $rsi - 0x8
$25 = 94769709483416
(rr) p *($rsi - 0x8)
$26 = 721
```

[Bug ada/120440] [15/16 regression] gnat exception handling miscompiled (`gnat ls` crashes when bootstrapped with -march=znver3) since r15-8901-g7bec4570301c43

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120440

--- Comment #7 from Sam James  ---
(In reply to Richard Biener from comment #3)
> likewise whether -fno-tree-vectorize helps.

This didn't make a difference. I'll do the rest soon.

[Bug ada/120440] [15/16 regression] gnat exception handling miscompiled (`gnat ls` crashes when bootstrapped with -march=znver3) since r15-8901-g7bec4570301c43

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120440

--- Comment #9 from Sam James  ---
(In reply to Sam James from comment #8)

But for the top frame, _ada_gnatcmd is huge:
[...]
   0x55575082 <-248190>: lea    rdx,[rip+0xab707]        # 0x55620790
   0x55575089 <-248183>: lea    rdi,[rip+0xab6f0]        # 0x55620780
   0x55575090 <-248176>: lea    rsi,[rip+0xab671]        # 0x55620708
   0x55575097 <-248169>: call   0x555c9660
   0x5557509c <-248164>: mov    r12d,eax
   0x5557509f <-248161>: pop    rax
   0x555750a0 <-248160>: pop    rdx
   0x555750a1 <-248159>: lea    rdi,[rbp-0x5a0]
   0x555750a8 <-248152>: call   0x555899b0
   0x555750ad <-248147>: mov    rsi,QWORD PTR [rbp-0x690]
   0x555750b4 <-248140>: movzx  r12d,r12b
   0x555750b8 <-248136>: lea    rax,[rip+0xab6b9]        # 0x55620778
   0x555750bf <-248129>: xor    edx,edx
   0x555750c1 <-248127>: mov    rdi,rbx
   0x555750c4 <-248124>: movzx  r14d,BYTE PTR [rax+r12*1]
   0x555750c9 <-248119>: call   0x5558a8e0 <__gnat_end_handler_v1>
=> 0x555750ce <-248114>: jmp    0x555b29d5 <_ada_gnatcmd+4053>
   0x555750d3 <-248109>: mov    rbx,rax
   0x555750d6 <-248106>: mov    rax,rdx
   0x555750d9 <-248103>: mov    rsp,r12
   0x555750dc <-248100>: mov    rsp,QWORD PTR [rbp-0x6c0]
   0x555750e3 <-248093>: vzeroupper
   0x555750e6 <-248090>: dec    rax
[...]

but we're not in a loop there either AFAICT. It'll be interesting to try
decomposing -march=znver3 next to see what actually does it, given that
-fno-tree-vectorize doesn't help.

[Bug target/120839] [16 Regression] ICE on x86_64-linux-gnu: in ix86_finalize_stack_frame_flags, at config/i386/i386.cc:8738 at -O1 and above with aligned on struct

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120839

Andrew Pinski  changed:

           What    |Removed                     |Added

           Keywords|                            |needs-bisection
             Status|UNCONFIRMED                 |NEW
      Known to fail|                            |16.0
            Summary|ICE on x86_64-linux-gnu: in |[16 Regression] ICE on
                   |ix86_finalize_stack_frame_f |x86_64-linux-gnu: in
                   |lags, at                    |ix86_finalize_stack_frame_f
                   |config/i386/i386.cc:8738 at |lags, at
                   |-O1 and above with aligned  |config/i386/i386.cc:8738 at
                   |on struct                   |-O1 and above with aligned
                   |                            |on struct
   Target Milestone|---                         |16.0
      Known to work|                            |15.1.0
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2025-06-27

--- Comment #1 from Andrew Pinski  ---
Confirmed.

[Bug ada/120440] [15/16 regression] gnat exception handling miscompiled (`gnat ls` crashes when bootstrapped with -march=znver3) since r15-8901-g7bec4570301c43

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120440

--- Comment #10 from Sam James  ---
Everything seems fine(?) before we call `Free (Copy)`:

```
Breakpoint 1, ada.exceptions.exception_propagation.gnat_gcc_exception_cleanup
(reason=urc_foreign_exception_caught, excep=0x56314ade85a0) at
../rts/a-exexpr.adb:354
354   Free (Copy);
(rr) p *Copy
$7 = (
  header => (
    class => 5138137877735301376,
    cleanup => (system.address) 0x56312853b80e,
    private1 => 0,
    private2 => 140732627361808,
    private3 => 0,
    private4 => 0,
    private5 => 0,
    private6 => 0
  ),
  occurrence => (
    id => 0x5631285f9300 ,
    machine_occurrence => (system.address) 0x56314ade85a0,
    msg_length => 26,
    msg => "bad input for 'Value: ""ls""", '["00"]' ,
    exception_raised => false,
    pid => 0,
    num_tracebacks => 0,
    tracebacks => (0 )
  )
)
```

I just don't yet see where 0x2d1 materialises from:
```
Breakpoint 1, ada.exceptions.exception_propagation.gnat_gcc_exception_cleanup
(reason=urc_foreign_exception_caught, excep=0x56314ade85a0) at
../rts/a-exexpr.adb:354
354   Free (Copy);
(rr) p Copy
$10 = (access system.exceptions.machine.gnat_gcc_exception) 0x56314ade85a0
(rr) s
340   procedure GNAT_GCC_Exception_Cleanup
(rr) s
354   Free (Copy);
(rr) s

Breakpoint 5, <__gnat_free> (ptr=0x2d1) at s-memory.adb:117
```

[Bug target/120840] CPython miscompiled with preserve_none

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

--- Comment #1 from Sam James  ---
Just ftr: with ./configure --with-tail-call-interp CFLAGS="-O2
-fno-stack-protector" (to override my own defaults), it still fails with a
corrupt stack.

[Bug target/120828] [16 Regression] Unrecognized insn after recent RISC-V change for .vf support

2025-06-26 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120828

Jeffrey A. Law  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Jeffrey A. Law  ---
Should be resolved by Paul-Antoine's patch on the trunk.

[Bug target/120763] [meta-bug] Tracker for bugs to visit during weekly RISC-V meeting

2025-06-26 Thread law at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120763
Bug 120763 depends on bug 120828, which changed state.

Bug 120828 Summary: [16 Regression] Unrecognized insn after recent RISC-V 
change for .vf support
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120828

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug bootstrap/119430] profiledbootstrap fails on armv7a-unknown-linux-gnueabhif (crashes in elists__append_elmt during stagefeedback)

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119430

Sam James  changed:

   What|Removed |Added

 CC||aoliva at gcc dot gnu.org

--- Comment #6 from Sam James  ---
I was a little bit hopeful that Alex's fixes would help, but it fails unchanged
with r16-1724-gf9a6efa7a71e80.

[Bug ada/120440] [15/16 regression] gnat exception handling miscompiled (`gnat ls` crashes when bootstrapped with -march=znver3) since r15-8901-g7bec4570301c43

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120440

--- Comment #8 from Sam James  ---
(In reply to Richard Biener from comment #3)
> checking in gdb whether this is a misaligned vector access and/or whether we
> are in a early-break vectorized loop would be useful

(gdb) frame
#1  0x555c03d9 in <__gnat_free> (ptr=) at
s-memory.adb:120
120  c_free (Ptr);
(gdb) disas
Dump of assembler code for function __gnat_free:
   0x555c03d0 <+0>:  sub    rsp,0x8
   0x555c03d4 <+4>:  call   0x4740
=> 0x555c03d9 <+9>:  add    rsp,0x8
   0x555c03dd <+13>: ret
End of assembler dump.

(gdb) frame 2
#2  0x5558a820 in
ada.exceptions.exception_propagation.gnat_gcc_exception_cleanup
(reason=, excep=) at ../rts/a-exexpr.adb:354
354   Free (Copy);
(gdb) disas
Dump of assembler code for function
ada__exceptions__exception_propagation__gnat_gcc_exception_cleanupXn:
   0x5558a80e <+0>:  test   rsi,rsi
   0x5558a811 <+3>:  je     0x5558a825
   0x5558a813 <+5>:  sub    rsp,0x8
   0x5558a817 <+9>:  mov    rdi,QWORD PTR [rsi-0x8]
   0x5558a81b <+13>: call   0x555c03d0 <__gnat_free>
=> 0x5558a820 <+18>: add    rsp,0x8
   0x5558a824 <+22>: ret
   0x5558a825 <+23>: ret
End of assembler dump.

#3  0x5558a914 in <__gnat_end_handler_v1>
(gcc_exception=gcc_exception@entry=0x5572a660,
saved_cleanup=saved_cleanup@entry=0x5558a80e
,
propagating_exception=propagating_exception@entry=0x0)
at ../rts/a-exexpr.adb:519
519  Unwind_DeleteException (GCC_Exception);
(gdb) disas
Dump of assembler code for function __gnat_end_handler_v1:
   0x5558a8e0 <+0>:  mov    QWORD PTR [rdi+0x8],rsi
   0x5558a8e4 <+4>:  cmp    rdi,rdx
   0x5558a8e7 <+7>:  setne  dl
   0x5558a8ea <+10>: lea    rax,[rip+0x2f]        # 0x5558a920 <__gnat_claimed_cleanup>
   0x5558a8f1 <+17>: cmp    rsi,rax
   0x5558a8f4 <+20>: setne  al
   0x5558a8f7 <+23>: test   dl,al
   0x5558a8f9 <+25>: jne    0x5558a8fc <__gnat_end_handler_v1+28>
   0x5558a8fb <+27>: ret
   0x5558a8fc <+28>: push   rbx
   0x5558a8fd <+29>: mov    rbx,rdi
   0x5558a900 <+32>: call   QWORD PTR [rip+0xbe88a]        # 0x55649190
   0x5558a906 <+38>: cmp    rbx,QWORD PTR [rax+0x8]
   0x5558a90a <+42>: je     0x5558a916 <__gnat_end_handler_v1+54>
   0x5558a90c <+44>: mov    rdi,rbx
   0x5558a90f <+47>: call   0x55602c40 <_Unwind_DeleteException>
=> 0x5558a914 <+52>: pop    rbx
   0x5558a915 <+53>: ret
   0x5558a916 <+54>: mov    QWORD PTR [rax+0x8],0x0
   0x5558a91e <+62>: jmp    0x5558a90c <__gnat_end_handler_v1+44>
End of assembler dump.

[Bug tree-optimization/120839] New: ICE on x86_64-linux-gnu: in ix86_finalize_stack_frame_flags, at config/i386/i386.cc:8738 at -O1 and above with aligned on struct

2025-06-26 Thread jiangchangwu at smail dot nju.edu.cn via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120839

Bug ID: 120839
   Summary: ICE on x86_64-linux-gnu: in
ix86_finalize_stack_frame_flags, at
config/i386/i386.cc:8738 at -O1 and above with aligned
on struct
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jiangchangwu at smail dot nju.edu.cn
  Target Milestone: ---

Compiler Explorer: https://gcc.godbolt.org/z/e7M4d86c5

***
gcc version:
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/home/software/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/16.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --disable-multilib --disable-bootstrap
--enable-languages=c,c++ --prefix=/home/software/gcc-trunk --enable-coverage
--disable-werror --enable-checking=yes
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 16.0.0 20250613 (experimental) (GCC)

***
Program:
$ cat mutant.c
typedef struct {
  long double a, b
} c __attribute__((aligned(32)));
double d;
void e(c f) { d = f.a; }

***
Command Lines:
$ gcc -O1 mutant.c
mutant.c:3:1: warning: no semicolon at end of struct or union
3 | } c __attribute__((aligned(32)));
  | ^
during RTL pass: pro_and_epilogue
mutant.c: In function 'e':
mutant.c:5:24: internal compiler error: in ix86_finalize_stack_frame_flags, at
config/i386/i386.cc:8738
5 | void e(c f) { d = f.a; }
  |^
0x5554b38 internal_error(char const*, ...)
../../gcc/gcc/diagnostic-global-context.cc:517
0x54d2d0a fancy_abort(char const*, int, char const*)
../../gcc/gcc/diagnostic.cc:1803
0x2af82c6 ix86_finalize_stack_frame_flags
../../gcc/gcc/config/i386/i386.cc:8738
0x2b00b9e ix86_expand_epilogue(int)
../../gcc/gcc/config/i386/i386.cc:10039
0x40ca69d gen_epilogue()
../../gcc/gcc/config/i386/i386.md:21030
0x2accf69 target_gen_epilogue
../../gcc/gcc/config/i386/i386.md:20528
0x17a2a30 make_epilogue_seq
../../gcc/gcc/function.cc:6013
0x17a2c74 thread_prologue_and_epilogue_insns()
../../gcc/gcc/function.cc:6095
0x17a4f39 rest_of_handle_thread_prologue_and_epilogue
../../gcc/gcc/function.cc:6609
0x17a5387 execute
../../gcc/gcc/function.cc:6695
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

[Bug target/120840] CPython miscompiled with preserve_none

2025-06-26 Thread kenjin4096 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

--- Comment #2 from Ken Jin  ---
@Sam James

Passing CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer" to
configure fixes the crash for me. Does it do the same for you? If so, this is
probably a pretty big hint.

[Bug target/120840] New: CPython miscompiled with preserve_none

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

Bug ID: 120840
   Summary: CPython miscompiled with preserve_none
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
CC: hjl.tools at gmail dot com, kenjin4096 at gmail dot com
  Target Milestone: ---

From https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628#c24 ...

(In reply to H.J. Lu from comment #24)
> How do I reproduce it?  Do you have a step-by-step guide?  Thanks.

$ git clone https://github.com/python/cpython
$ git rev-parse HEAD
34ce1920ca33c11ca2c379ed0ef30a91010bef4f
$ ./configure --with-tail-call-interp
$ make -j$(nproc) -l$(nproc)
[...]
./_bootstrap_python ./Programs/_freeze_module.py codecs ./Lib/codecs.py
Python/frozen_modules/codecs.h
*** stack smashing detected ***: terminated
make: *** [Makefile:1912: Python/frozen_modules/codecs.h] Aborted (core dumped)
make: *** Waiting for unfinished jobs
./_bootstrap_python ./Programs/_freeze_module.py posixpath ./Lib/posixpath.py
Python/frozen_modules/posixpath.h
*** stack smashing detected ***: terminated
make: *** [Makefile:1930: Python/frozen_modules/posixpath.h] Aborted (core
dumped)
./_bootstrap_python ./Programs/_freeze_module.py abc ./Lib/abc.py
Python/frozen_modules/abc.h
*** stack smashing detected ***: terminated
make: *** [Makefile:1909: Python/frozen_modules/abc.h] Aborted (core dumped)

$ gdb --args ./_bootstrap_python ./Programs/_freeze_module.py codecs
./Lib/codecs.py Python/frozen_modules/codecs.h
Reading symbols from ./_bootstrap_python...
(gdb) r
Starting program: /tmp/cpython/_bootstrap_python ./Programs/_freeze_module.py
codecs ./Lib/codecs.py Python/frozen_modules/codecs.h
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
*** stack smashing detected ***: terminated

Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=, signo=6, no_tid=0) at
pthread_kill.c:44
44        return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0  __pthread_kill_implementation (threadid=, signo=6, no_tid=0)
at pthread_kill.c:44
#1  __pthread_kill_internal (threadid=, signo=6) at
pthread_kill.c:89
#2  __GI___pthread_kill (threadid=, signo=signo@entry=6) at
pthread_kill.c:100
#3  0x77c20dc2 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4  0x77c01383 in __GI_abort () at abort.c:73
#5  0x77c0258a in __libc_message_impl (fmt=fmt@entry=0x77de0816
"*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:134
#6  0x77d28747 in __GI___fortify_fail (msg=msg@entry=0x77de082e
"stack smashing detected") at fortify_fail.c:24
#7  0x77d29942 in __stack_chk_fail () at stack_chk_fail.c:24
#8  0x55761b67 in _PyEval_EvalFrameDefault (tstate=,
frame=, throwflag=) at Python/ceval.c:1238
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

[Bug target/119628] Need better mechanisms to manage register saves in callee for tail calls (inc. preserve_none for x86_64?)

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628

--- Comment #25 from Sam James  ---
Let's carry on in a new bug: PR120840.

[Bug c/120841] New: gcc prefer non-volatile register produces sub optimal code

2025-06-26 Thread rockeet at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120841

Bug ID: 120841
   Summary: gcc prefer non-volatile register produces sub optimal
code
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rockeet at gmail dot com
  Target Milestone: ---

#include <string.h>  /* memcpy, size_t */

void foo(char* buf, const char* src, size_t n) {
    memcpy(buf, src, n);
    memcpy(buf + n, &n, sizeof(n));
}

Both gcc and g++ produce (https://godbolt.org/z/TzW4jvToe):

        push    rbx
        mov     rbx, rdx
        call    memcpy
        mov     QWORD PTR [rax+rbx], rbx
        pop     rbx
        ret

The optimal code does not need a non-volatile register:

        push    rdx
        call    memcpy
        pop     rdx
        mov     QWORD PTR [rax+rdx], rdx
        ret

[Bug target/120840] CPython miscompiled with preserve_none

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

--- Comment #3 from Sam James  ---
(In reply to Ken Jin from comment #2)
> @Sam James
> 
> Passing CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer" to the
> configure fixes the crash for me, does it do the same for you? If so this is
> probably a pretty big hint.

Yeah, this works indeed.

[Bug target/120840] CPython miscompiled with preserve_none

2025-06-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-06-27

--- Comment #4 from H.J. Lu  ---
Reproduced.

[Bug target/120840] CPython miscompiled with preserve_none

2025-06-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

--- Comment #5 from H.J. Lu  ---
I backported the patch to GCC 15 branch, which works.  Sam, is it possible to
identify which commit on master caused the miscompilation?

[Bug target/120840] CPython miscompiled with preserve_none

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

--- Comment #6 from Sam James  ---
I'll look.

--- Comment #7 from Sam James  ---
It starts with r16-1551-g2c30f828e45078 with your patch on top. I did the
bisection without SSP as the default (vanilla).

When playing with the result on my usual trunk build, I noticed:
* -fno-stack-protector -fno-shrink-wrap-separate works
* -fstack-protector-strong -fno-shrink-wrap-separate still fails

[Bug rtl-optimization/120424] [14/15 regression] lra-elimination issues when fp2sp elimination is disabled part-way through lra

2025-06-26 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120424

Eric Botcazou  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed|2025-06-26 00:00:00 |2025-06-27
 Status|UNCONFIRMED |NEW

[Bug target/120841] gcc prefer non-volatile register produces sub optimal code

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120841

Andrew Pinski  changed:

   What|Removed |Added

  Component|c   |target
   Severity|normal  |enhancement
   Keywords||missed-optimization, ra

[Bug middle-end/42909] inefficient code for trivial tail-call with large struct parameter

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42909

--- Comment #9 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687768.html

[Bug target/120841] gcc prefer non-volatile register produces sub optimal code

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120841

--- Comment #1 from Andrew Pinski  ---
GCC is already better than LLVM at figuring out that the return value of memcpy
is its first argument.

I am not sure the one extra move is going to hurt here either.
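
For readers following the register-allocation argument, here is a minimal sketch of the same function with the memcpy return value spelled out in the source. The name foo_via_return is hypothetical and is not part of the report; the only fact relied on is that memcpy returns its destination pointer, which is what the `[rax+rbx]` addressing in the generated code takes advantage of.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical variant of the reporter's foo(): same behaviour, but the
// second store is addressed through the pointer that memcpy returns.
void foo_via_return(char* buf, const char* src, std::size_t n) {
    // memcpy returns its first argument (the destination pointer).
    char* dst = static_cast<char*>(std::memcpy(buf, src, n));
    // Append the length right after the copied bytes.
    std::memcpy(dst + n, &n, sizeof(n));
}
```

Either way, n has to survive the call to memcpy, so it must live in a callee-saved register or on the stack; the report is about which of those two GCC picks.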

[Bug fortran/120812] [regression] buffer(80:80) = C_NEW_LINE not working with gfortran 15.1 under Mac

2025-06-26 Thread christophe.peyret at onera dot fr via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120812

--- Comment #11 from Christophe Peyret  ---
same on Mac ARM :)

[Bug target/118518] gcc 14.2.1 nvptx cross compiler complains about alias definitions in a struct with two constructors that are not aliases

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118518

--- Comment #17 from Benjamin Schulz  ---
My code in

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

apparently provokes gcc to yield different results for the same computation
at different -O optimization levels...

I do not use any of the difficult tricks that C++ allows, just rather simple
C++.

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #7 from Benjamin Schulz  ---
Note that the matrix multiplication only goes wrong with increasing -O when
called within another function that does computations.


The correct results are these:

A Cholesky decomposition with the multiplication on gpu
4 12 -16 
12 37 -43 
-16 -43 98 

2 0 0 
6 1 0 
-8 5 3 

Now the cholesky decomposition is entirely done on gpu
2 0 0 
6 1 0 
-8 5 3 

Now we do the same with the lu decomposition
1 -2 -2 -3 
3 -9 0 -9 
-1 2 4 7 
-3 -6 26 2 

Just the multiplication on gpu
1 0 0 0 
3 1 0 0 
-1 -0 1 0 
-3 4 -2 1 

1 -2 -2 -3 
0 -3 6 0 
0 0 2 4 
0 0 0 1 

Entirely on gpu

1 0 0 0 
3 1 0 0 
-1 -0 1 0 
-3 4 -2 1 

1 -2 -2 -3 
0 -3 6 0 
0 0 2 4 
0 0 0 1 



And at -O1, one gets this:


A Cholesky decomposition with the multiplication on gpu
4 12 -16 
12 37 -43 
-16 -43 98 

2 0 0 
6 1 0 
-8 5 9.89949 

Now the cholesky decomposition is entirely done on gpu
2 0 0 
6 1 0 
-8 5 9.89949 

Now we do the same with the lu decomposition
1 -2 -2 -3 
3 -9 0 -9 
-1 2 4 7 
-3 -6 26 2 

Just the multiplication on gpu
1 0 0 0 
3 1 0 0 
-1 -0 1 0 
-3 4 -2 1 

1 -2 -2 -3 
0 -3 6 0 
0 0 2 4 
0 0 0 2 

Entirely on gpu
1 0 0 0 
3 1 0 0 
-1 -0 1 0 
-3 4 -2 1 

1 -2 -2 -3 
0 -3 6 0 
0 0 2 4 
0 0 0 1
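
As a host-side cross-check of the expected Cholesky factor, here is a minimal CPU-only sketch. It is not the reporter's OpenMP/nvptx code, and the names Matrix and cholesky are hypothetical. One observation on the numbers alone: the wrong entry 9.89949 equals sqrt(98), which is the value the last diagonal entry would take if the accumulated sum of products were not subtracted before the square root. That is an arithmetic observation, not a diagnosis of the bug.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Cholesky-Banachiewicz on the host: returns lower-triangular L with
// A = L * L^T. Assumes A is symmetric positive definite.
Matrix cholesky(const Matrix& a) {
    const std::size_t n = a.size();
    Matrix l(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            // Sum of products of the already computed entries in rows i and j.
            double sum = 0.0;
            for (std::size_t k = 0; k < j; ++k) sum += l[i][k] * l[j][k];
            if (i == j)
                l[i][j] = std::sqrt(a[i][i] - sum);   // diagonal entry
            else
                l[i][j] = (a[i][j] - sum) / l[j][j];  // below the diagonal
        }
    }
    return l;
}
```

For the 3x3 input quoted above, the last diagonal entry is sqrt(98 - (64 + 25)) = sqrt(9) = 3, matching the output listed as correct.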

[Bug target/120719] crc standard patterns are not implemented for x86

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120719

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:f8f7ace4f20829f2fad87662f5163c9b13427e39

commit r16-1706-gf8f7ace4f20829f2fad87662f5163c9b13427e39
Author: Uros Bizjak 
Date:   Thu Jun 26 14:13:01 2025 +0200

i386: Introduce crc_revsi4 expanders [PR120719]

Introduce crc_revsi4 expanders to generate CRC32 instruction when using
__builtin_rev_crc32_data* builtins with 0x1EDC6F41 polynomial and -mcrc32.

PR target/120719

gcc/ChangeLog:

* config/i386/i386.md (crc_revsi4): New expander.

gcc/testsuite/ChangeLog:

* gcc.target/i386/crc-builtin-crc32.c: New test.
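
For reference, a plain software sketch of the reflected CRC-32 step for the polynomial named in the commit message. This is not the GCC expander and does not call the builtins; the names crc32c_update_byte and crc32c are hypothetical, and the initial value and final XOR of 0xFFFFFFFF are the conventional CRC-32C framing, assumed here for the sake of a checkable result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 0x82F63B78 is the bit-reversed form of the polynomial 0x1EDC6F41.
constexpr std::uint32_t kCrc32cPolyReflected = 0x82F63B78u;

// One reflected CRC-32 update over a single data byte (bitwise, no table).
std::uint32_t crc32c_update_byte(std::uint32_t crc, std::uint8_t data) {
    crc ^= data;
    for (int bit = 0; bit < 8; ++bit) {
        // Shift right; fold the polynomial in when the bit shifted out was 1.
        crc = (crc >> 1) ^ ((crc & 1u) ? kCrc32cPolyReflected : 0u);
    }
    return crc;
}

// Conventional CRC-32C framing (assumption): init and final XOR 0xFFFFFFFF.
std::uint32_t crc32c(const unsigned char* buf, std::size_t len) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) crc = crc32c_update_byte(crc, buf[i]);
    return crc ^ 0xFFFFFFFFu;
}
```

With this framing, the nine ASCII bytes "123456789" give the standard CRC-32C check value 0xE3069283.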

[Bug target/120719] crc standard patterns are not implemented for x86

2025-06-26 Thread ubizjak at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120719

Uroš Bizjak  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|--- |16.0
 Target|x86_64  |x86
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Uroš Bizjak  ---
Implemented for gcc-16.

[Bug c++/120836] New: [16 regression] Including hides 'satisfaction value ... changed' diagnostic

2025-06-26 Thread m.cencora at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120836

Bug ID: 120836
   Summary: [16 regression] Including  hides
'satisfaction value ... changed' diagnostic
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: m.cencora at gmail dot com
  Target Milestone: ---

Given code below:

#include 

template <typename T>
concept printable = requires (const T& a) { a.print(); };

struct foo
{
    static constexpr bool member1 = printable<foo>;
    bool member2 = printable<foo>;

    void print() const
    {

    }

    static constexpr bool member3 = printable<foo>;
};

int main()
{
    static_assert(!foo::member1);
    static_assert(foo{}.member2);
    static_assert(foo::member3);
    static_assert(printable<foo>);
}


When compiled on gcc-16 with -std=c++23 just prints:
: In function 'int main()':
:23:24: error: static assertion failed
   23 | static_assert(foo::member3);
  |   ~^~~
Compiler returned: 1

If I use -std=c++20 or compile with gcc-15 or remove '#include ' I
get much better diagnostics:

: In function 'int main()':
:22:25: error: static assertion failed
   22 | static_assert(foo{}.member2);
  |   ~~^~~
:23:24: error: static assertion failed
   23 | static_assert(foo::member3);
  |   ~^~~
:24:19: error: static assertion failed
   24 | static_assert(printable);
  |   ^~
:24:19: note: constraints not satisfied
:4:9:   required by the constraints of 'template concept
printable'
:4:21:   in requirements with 'const T& a' [with T = foo]
:4:21: error: satisfaction value of atomic constraint 'requires(const
T& a) {a->print();} [with T = foo]' changed from 'false' to 'true'
4 | concept printable = requires (const T& a) { a.print(); };
  | ^~~~
:8:37: note: satisfaction value first evaluated to 'false' from here
8 | static constexpr bool member1 = printable;
  | ^~
Compiler returned: 1

[Bug fortran/120637] Memory leak in finalization gfortran 9.5-16.0

2025-06-26 Thread vehre at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120637

--- Comment #6 from Andre Vehreschild  ---
Hi Antony,

I could not apply your patch, neither with git am nor with patch -p1, so I
essentially had to replay it. With that applied all seems to be fine,
execution-wise. But the regression tests fail for finalize_34.f90, where
__builtin_free is now called 24 times instead of the expected 12. Looking
into the dump, one sees that the ptr2->evtlist[0] components are freed 4
times in a row. Therefore I strongly suggest not removing the was_finalized
check. Maybe you can come up with a better solution. But I don't think that
bloating the executable for the sake of removing a check in gfortran will be
very well received.

[Bug middle-end/28831] [12/13/14/15/16 Regression] Aggregate copy not elided when using a return value as a pass-by-value parameter

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28831

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42909

--- Comment #48 from Andrew Pinski  ---
Very much related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42909#c5 .

[Bug c++/120836] [16 regression] Including hides 'satisfaction value ... changed' diagnostic

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120836

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |16.0

[Bug sanitizer/120837] False-positive from -fsanitize=undefined

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120837

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org

--- Comment #3 from Sam James  ---
(In reply to Andrew Pinski from comment #2)

Thanks, sorry, I'd forgotten we still had any of these open.

[Bug sanitizer/120471] [12/13/14/15/16 regression] -fsanitize=undefined causes read of uninitialized variable when accessing element in an array at -O0 level

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120471

Andrew Pinski  changed:

   What|Removed |Added

 CC||drh at sqlite dot org

--- Comment #10 from Andrew Pinski  ---
*** Bug 120837 has been marked as a duplicate of this bug. ***

[Bug sanitizer/120837] False-positive from -fsanitize=undefined

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120837

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
This is a dup of bug 120471. The pattern for the workaround vs the original
code is exactly the same as in the testcase for bug 120471.

`(ll % 2 ? b : ib)[c % 3]`
vs:
`(nNew>nOld ? apNew : apOld)[nOld-1];`

*** This bug has been marked as a duplicate of bug 120471 ***
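
To make the shared pattern concrete, here is a minimal sketch of indexing the result of a conditional expression, next to one equivalent spelling that selects the array first. The type Page and both function names are hypothetical; the actual SQLite change is in the check-in linked from the original report and may be written differently.

```cpp
#include <cassert>

struct Page { int id; };

// The shape named in the comment above: index the result of a conditional.
Page* pick_indexed(Page** ap_new, int n_new, Page** ap_old, int n_old) {
    return (n_new > n_old ? ap_new : ap_old)[n_old - 1];
}

// Equivalent spelling (an assumption, for illustration only): choose the
// array into a local first, then index it.
Page* pick_hoisted(Page** ap_new, int n_new, Page** ap_old, int n_old) {
    Page** ap = (n_new > n_old) ? ap_new : ap_old;
    return ap[n_old - 1];
}
```

Both functions are required by the language to return the same element, so a sanitizer report for only one of them points at the instrumentation rather than at the code.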

[Bug sanitizer/120471] [12/13/14/15/16 regression] -fsanitize=undefined causes read of uninitialized variable when accessing element in an array at -O0 level

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120471

--- Comment #11 from Sam James  ---
Affects SQLite too (PR120837).

[Bug c++/120836] Including hides 'satisfaction value ... changed' diagnostic

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120836

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|16.0|---
Summary|[16 regression] Including   |Including  hides
   | hides|'satisfaction value ...
   |'satisfaction value ... |changed' diagnostic
   |changed' diagnostic |
   Keywords||GC

--- Comment #1 from Andrew Pinski  ---
Note this is not a regression.

Using the same GC parameters for GCC 15 as the trunk, we get the same behavior.

That is adding:
--param ggc-min-expand=30 --param ggc-min-heapsize=4096

to the command line of GCC 15.1.0, and we get the same behavior (or using the
GCC 15.1.0 compiler that was configured with --enable-checking=yes)

Which is what I expected, as the cache is removed once the garbage collector
has run. So the `satisfaction value ... changed` diagnostic depends on the
caching of the values.

I am not 100% sure this is a bug.

[Bug c++/120836] Including hides 'satisfaction value ... changed' diagnostic

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120836

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> Note this is not a regression.
> 
> Using the same GC parameters for GCC 15 as the trunk, we get the same
> behavior.

I forgot to mention that the GC parameters for the trunk (or
--enable-checking=yes) are always lower than the automatically determined ones
that are used for the releases (or --enable-checking=release).

[Bug c/120837] New: False-positive from -fsanitize=undefined

2025-06-26 Thread drh at sqlite dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120837

Bug ID: 120837
   Summary: False-positive from -fsanitize=undefined
   Product: gcc
   Version: 13.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drh at sqlite dot org
  Target Milestone: ---

UBSAN is raising an error on what appears to be perfectly valid code.  The
error message is: "runtime error: load of address 0x7ffe8df74630 with
insufficient space for an object of type 'struct MemPage *'".

== OUR WORK-AROUND

The SQLite check-in 2025-06-26T18:57Z
<https://sqlite.org/src/info/2025-06-26T18:57Z> clears the problem for us.  In
the very simple diff shown on the link, I think the code on both the right and
the left should be valid and should give the same result.  Yet, only the right
side works with gcc-ubsan when optimization is enabled.

== REPRO INSTRUCTIONS

  *  Download <https://sqlite.org/tmp/gcc-ubsan-20250626.tar.gz>
  *  Untar
  *  `gcc -O1 -fsanitize=undefined shell.c sqlite3.c -lm -ldl -lpthread`
  *  `./a.out https://sqlite.org/forum/forumpost/1d7c25d4a2d6f5e2>

[Bug c/120837] False-positive from -fsanitize=undefined

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120837

--- Comment #1 from Sam James  ---
The RHS working sounds like it might be SAVE_EXPR stuff again?

[Bug middle-end/42909] inefficient code for trivial tail-call with large struct parameter

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42909

--- Comment #7 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #5)
> Created attachment 61618 [details]
> Patch which I am testing for the aarch64 issue

But it miscompiles (there is no testcase for this though):
```
struct s1 { int x[1024]; };

void f(struct s1, struct s1);

void g(struct s1 x)
{
f(x, x);
}
```
Where we need a copy of the incoming argument as we can only reuse it once.

Let me think of how to fix that.
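
A runtime check for this case could look like the sketch below. The bodies of f and g are hypothetical (the snippet above only declares f); they are written so that any sharing of storage between the two by-value arguments becomes observable.

```cpp
#include <cassert>

struct s1 { int x[1024]; };

// Hypothetical callee: mutates its first by-value parameter, then reads the
// second. Returns true if the second parameter still holds the caller's value.
bool f(s1 a, s1 b) {
    a.x[0] = -1;             // must not be visible through b
    return b.x[0] == 42;
}

// Same object passed twice by value: each parameter needs its own copy.
bool g(s1 x) {
    return f(x, x);
}
```

If the incoming argument slot of g were reused for both parameters of f, the write through a would show up in b and g would return false.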

[Bug target/119628] Need better mechanisms to manage register saves in callee for tail calls (inc. preserve_none for x86_64?)

2025-06-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628

--- Comment #24 from H.J. Lu  ---
(In reply to Ken Jin from comment #22)
> Hi H.J,
> 
> Thanks a lot for your work on this. I get a crash due to a possible
> miscompile on the latest GCC commit
> (7c67f7f8d4c8aadbe8efd733c29d13bfcbb0f50f).
> 
> Unfortunately, I cannot create a minimal reproducer right now, but something
> strange is going on at the boundary where a normal function calls a
> preserve_none function, and where a preserve_none function returns back to
> the non preserve_none function. An interesting observation: passing
> `-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer` seems to produce a
> working binary that doesn't crash.
> 
> The call stack at the crash is something like _PyEval_EvalFrameDefault (not
> tail call, not preserve_none) -> _TAIL_CALL_start_frame (preserve_none) ->
> (indeterminate number of tail calls, preserve_none) ->
> _TAIL_CALL_INTERPRETER_EXIT (preserve_none)
> 
> _TAIL_CALL_INTERPRETER_EXIT function contains a bare return out of the tail
> call sequence.

How do I reproduce it?  Do you have a step-by-step guide?  Thanks.

[Bug target/120830] [16 regression] ICE when building opencv-4.11.0 (as_a, at machmode.h:391)

2025-06-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120830

H.J. Lu  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|WAITING |RESOLVED

--- Comment #13 from H.J. Lu  ---
Fixed.

[Bug target/120830] [16 regression] ICE when building opencv-4.11.0 (as_a, at machmode.h:391)

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120830

--- Comment #12 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:64c55a99746ef8efa37937ee0fef29de4f081f25

commit r16-1725-g64c55a99746ef8efa37937ee0fef29de4f081f25
Author: H.J. Lu 
Date:   Thu Jun 26 10:05:30 2025 +0800

x86: Handle vector broadcast source

Use the inner scalar mode of vector broadcast source in:

  (set (reg:V8DF 394)
   (vec_duplicate:V8DF (reg:V2DF 190 [ alpha ])))

to compute the vector mode for broadcast from vector source.

gcc/

PR target/120830
* config/i386/i386-features.cc (ix86_get_vector_cse_mode): Handle
vector broadcast source.

gcc/testsuite/

PR target/120830
* g++.target/i386/pr120830.C: New test.

Signed-off-by: H.J. Lu

[Bug rtl-optimization/120424] [14/15 regression] lra-elimination issues when fp2sp elimination is disabled part-way through lra

2025-06-26 Thread aoliva at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120424

Alexandre Oliva  changed:

   What|Removed |Added

  Component|target  |rtl-optimization
 Ever confirmed|1   |0
   Target Milestone|14.4|---
 Status|ASSIGNED|UNCONFIRMED

[Bug target/119628] Need better mechanisms to manage register saves in callee for tail calls (inc. preserve_none for x86_64?)

2025-06-26 Thread kenjin4096 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628

--- Comment #22 from Ken Jin  ---
Hi H.J,

Thanks a lot for your work on this. I get a crash due to a possible miscompile
on the latest GCC commit (7c67f7f8d4c8aadbe8efd733c29d13bfcbb0f50f).

Unfortunately, I cannot create a minimal reproducer right now, but something
strange is going on at the boundary where a normal function calls a preserve_none
function, and where a preserve_none function returns back to the non
preserve_none function. An interesting observation: passing
`-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer` seems to produce a
working binary that doesn't crash.

The call stack at the crash is something like _PyEval_EvalFrameDefault (not
tail call, not preserve_none) -> _TAIL_CALL_start_frame (preserve_none) ->
(indeterminate number of tail calls, preserve_none) ->
_TAIL_CALL_INTERPRETER_EXIT (preserve_none)

_TAIL_CALL_INTERPRETER_EXIT function contains a bare return out of the tail
call sequence.

[Bug target/120424] [14/15/16 regression] lra-elimination issues when fp2sp elimination is disabled part-way through lra

2025-06-26 Thread sjames at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120424

Sam James  changed:

   What|Removed |Added

   Target Milestone|--- |14.4
Summary|lra-elimination issues when |[14/15/16 regression]
   |fp2sp elimination is|lra-elimination issues when
   |disabled part-way through   |fp2sp elimination is
   |lra |disabled part-way through
   ||lra
 Status|REOPENED|ASSIGNED

[Bug target/119628] Need better mechanisms to manage register saves in callee for tail calls (inc. preserve_none for x86_64?)

2025-06-26 Thread kenjin4096 at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119628

--- Comment #23 from Ken Jin  ---
> Hi Ken, my patch has been merged into GCC master branch.  Can you give it a 
> try?

I did a bench, note that this is not 100% what we use in CPython release
builds, as I had to pass `-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer`
to all my configurations to get the main branch of GCC to not miscompile the
current code.

LTO+PGO enabled for all configurations, disabled PGO only around tail call
bytecode handlers as it regressed performance for those. Intel Turbo boost off.

NO preserve_none:
Pystone(1.1) time for 1000000 passes = 1.98081
This machine benchmarks at 504844 pystones/second

preserve_none:
Pystone(1.1) time for 1000000 passes = 1.7661
This machine benchmarks at 566219 pystones/second

I also took some benchmarks from the pyperformance benchmark suite that are
Python-heavy. Specifically, nbody, spectral_norm, and deltablue.

Mean +- std dev: [NO_preserve_none_nbody] 108 ms +- 2 ms ->
[preserve_none_nbody] 95.3 ms +- 2.0 ms: 1.13x faster
Mean +- std dev: [NO_preserve_none_spectralnorm] 95.7 ms +- 0.4 ms ->
[preserve_none_spectralnorm] 83.8 ms +- 0.3 ms: 1.14x faster
Mean +- std dev: [NO_preserve_none_deltablue] 3.59 ms +- 0.03 ms ->
[preserve_none_deltablue] 3.24 ms +- 0.02 ms: 1.11x faster

So it seems the actual speedup is in the ~10% range for preserve_none vs
no_preserve_none.
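
The ratios quoted above can be rechecked from the raw timings with a few lines of arithmetic. This is only a sketch; the helper names speedup and time_saved_percent are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Speedup factor: baseline time divided by new time (same unit for both).
double speedup(double baseline_time, double new_time) {
    return baseline_time / new_time;
}

// Percent reduction in run time relative to the baseline.
double time_saved_percent(double baseline_time, double new_time) {
    return 100.0 * (baseline_time - new_time) / baseline_time;
}
```

For example, speedup(108.0, 95.3) is about 1.13 for nbody, and time_saved_percent(1.98081, 1.7661) is a little under 11% for pystone.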

On my system, labels-as-values (indirect goto) performs roughly the same as
preserve_none + tail calls. However, note that PGO is disabled for the tail
call handlers, and CPython has been optimizing for indirect goto style for over
10 years! So the fact that the performance matches is actually incredibly good.

[Bug middle-end/42909] inefficient code for trivial tail-call with large struct parameter

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42909

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117098

--- Comment #6 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #4)
> For x86_64, f2 was fixed in GCC 15.

Which was recorded as PR 117098 also.

[Bug analyzer/120809] [16 Regression] gcc.dg/analyzer/state-diagram-5.c fails when "dot" isn't installed starting with r16-1631-g2334d30cd8feac

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120809

--- Comment #6 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:0e7296540be35831e791ffe9f419cd6107831fc9

commit r16-1715-g0e7296540be35831e791ffe9f419cd6107831fc9
Author: David Malcolm 
Date:   Thu Jun 26 13:28:50 2025 -0400

diagnostics, testsuite: don't assume host has "dot" [PR120809]

gcc/ChangeLog:
PR analyzer/120809
* diagnostic-format-html.cc
(html_builder::maybe_make_state_diagram): Bulletproof against the
SVG generation failing.
* xml.cc (xml::printer::push_element): Assert that the ptr is
nonnull.
(xml::printer::append): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/120809
* gcc.dg/analyzer/state-diagram-5.c: Split out into...
* gcc.dg/analyzer/state-diagram-5-html.c: ...this, adding
dg-require-dot...
* gcc.dg/analyzer/state-diagram-5-sarif.c: ...and this.

Signed-off-by: David Malcolm

[Bug fortran/88076] Shared Memory implementation for Coarrays

2025-06-26 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88076

--- Comment #23 from Jerry DeLisle  ---
(In reply to Andre Vehreschild from comment #22)
--- snip---
> 
> I can only ask you to do a clean build and maybe also drop the installation
> directory. Sometimes build systems find funny things and then these oddities
> happen.

The oddity is gone.

[Bug analyzer/120809] [16 Regression] gcc.dg/analyzer/state-diagram-5.c fails when "dot" isn't installed starting with r16-1631-g2334d30cd8feac

2025-06-26 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120809

David Malcolm  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from David Malcolm  ---
Should be fixed by the above patch.

[Bug target/55212] [SH] Switch to LRA

2025-06-26 Thread glaubitz at physik dot fu-berlin.de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55212

--- Comment #452 from John Paul Adrian Glaubitz  ---
(In reply to Oleg Endo from comment #451)
> There have been several changes and fixes to the LRA module recently. Also
> by Alex himself.  I wonder if all the hacks in the current patch set are
> still all needed or not.

I'll try to perform an LRA bootstrap tonight on master without any of the
patches and report back.

[Bug target/55212] [SH] Switch to LRA

2025-06-26 Thread glaubitz at physik dot fu-berlin.de via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55212

--- Comment #454 from John Paul Adrian Glaubitz  ---
(In reply to Oleg Endo from comment #453)
> As we have already learned from this PR here, a bootstrap is not sufficient
> evidence that everything works normally.  If it bootstraps and can't compile
> other software we're back to sqrt(1).  So it needs at least full testing of
> the included gcc testsuite.

I didn't say I would stop after the bootstrap ;-). Of course, I will perform
more tests if the LRA bootstrap works.

[Bug target/55212] [SH] Switch to LRA

2025-06-26 Thread olegendo at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55212

--- Comment #451 from Oleg Endo  ---
There have been several changes and fixes to the LRA module recently. Also by
Alex himself.  I wonder if all the hacks in the current patch set are still all
needed or not.

[Bug target/120840] CPython miscompiled with preserve_none

2025-06-26 Thread hjl.tools at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

H.J. Lu  changed:

   What|Removed |Added

   Assignee|hjl.tools at gmail dot com |unassigned at gcc dot gnu.org
 CC||lili.cui at intel dot com,
   ||liuhongt at gcc dot gnu.org,
   ||ubizjak at gmail dot com

--- Comment #8 from H.J. Lu  ---
(In reply to Sam James from comment #7)
> It starts with r16-1551-g2c30f828e45078 with your patch on top. I did the
> bisection without SSP as the default (vanilla).
> 

Lili, it looks like your shrink-wrap change caused the regression.
Can you take a look?  Thanks.

[Bug target/55212] [SH] Switch to LRA

2025-06-26 Thread olegendo at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55212

--- Comment #453 from Oleg Endo  ---
(In reply to John Paul Adrian Glaubitz from comment #452)
> (In reply to Oleg Endo from comment #451)
> > There have been several changes and fixes to the LRA module recently. Also
> > by Alex himself.  I wonder if all the hacks in the current patch set are
> > still all needed or not.
> 
> I'll try to perform an LRA bootstrap tonight on master without any of the
> patches and report back.

As we have already learned from this PR here, a bootstrap is not sufficient
evidence that everything works normally.  If it bootstraps and can't compile
other software we're back to sqrt(1).  So it needs at least full testing of the
included gcc testsuite.

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #3 from Benjamin Schulz  ---
Created attachment 61724
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61724&action=edit
correct results

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #4 from Benjamin Schulz  ---
Created attachment 61725
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61725&action=edit
results with O1, now the matrix multiplication becomes crazy

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #5 from Benjamin Schulz  ---
Created attachment 61726
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61726&action=edit
with-O2 results also wrong

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #6 from Benjamin Schulz  ---
Created attachment 61727
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61727&action=edit
with O3

[Bug target/120828] [16 Regression] Unrecognized insn after recent RISC-V change for .vf support

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120828

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Paul-Antoine Arras :

https://gcc.gnu.org/g:181cb2943d53862aa41eab49a042dff991a3d94f

commit r16-1713-g181cb2943d53862aa41eab49a042dff991a3d94f
Author: Paul-Antoine Arras 
Date:   Wed Jun 25 16:42:00 2025 +

RISC-V: update prepare_ternary_operands to handle vector-scalar case
[PR120828]

This is a followup to 92e1893e0 "RISC-V: Add patterns for vector-scalar
multiply-(subtract-)accumulate" that caused an ICE in some cases where the mult
operands were wrongly swapped.
This patch ensures that operands are not swapped in the vector-scalar case.

PR target/120828

gcc/ChangeLog:

* config/riscv/riscv-v.cc (prepare_ternary_operands): Handle the
vector-scalar case.

[Bug libstdc++/110739] std::format for chrono types compiles very slowly

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110739

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Tomasz Kaminski :

https://gcc.gnu.org/g:caac9489f62221da083684456c7c7ceca7425493

commit r16-1712-gcaac9489f62221da083684456c7c7ceca7425493
Author: Tomasz Kamiński
Date:   Wed Jun 25 16:58:31 2025 +0200

libstdc++: Lift chrono localized formatting to main chrono format loop
[PR110739]

This patch extracts calls to _M_locale_fmt and construction of the struct tm
from the functions dedicated to each specifier, to the main format loop in
_M_format_to functions. This removes duplicated code repeated for specifiers.

To allow _M_locale_fmt to only be called if localized formatting is enabled
('L' is present in chrono-format-spec), we provide implementations for
locale specific specifiers (%c, %r, %x, %X) that produce the same result
as locale::classic():
 * %c is implemented as separate _M_c method
 * %r is implemented as separate _M_r method
 * %x is implemented together with %D, as they provide same behavior,
 * %X is implemented together with %R as _M_R_X, as both of them do not include
   subseconds.

The handling of subseconds was also extracted to the _M_subsecs function that is
used by the _M_S and _M_T specifiers. The _M_T is now implemented in terms of
_M_R_X (printing time without subseconds) and _M_subsecs.

The __mod parameter responsible for triggering localized formatting was removed
from methods handling most of specifiers, except:
 * _M_S (for %S) for which it determines if subseconds should be included,
 * _M_z (for %z) for which it determines if ':' is used as separator.

PR libstdc++/110739

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_M_use_locale_fmt):
Define.
(__formatter_chrono::_M_locale_fmt): Moved to front of the class.
(__formatter_chrono::_M_format_to): Construct and initialize
struct tm and call _M_locale_fmt if needed.
(__formatter_chrono::_M_c_r_x_X): Split into separate methods.
(__formatter_chrono::_M_c, __formatter_chrono::_M_r): Define.
(__formatter_chrono::_M_D): Renamed to _M_D_x.
(__formatter_chrono::_M_D_x): Renamed from _M_D.
(__formatter_chrono::_M_R_T): Split into _M_R_X and _M_T.
(__formatter_chrono::_M_R_X): Extracted from _M_R_T.
(__formatter_chrono::_M_T): Define in terms of _M_R_X and
_M_subsecs.
(__formatter_chrono::_M_subsecs): Extracted from _M_S.
(__formatter_chrono::_M_S): Replaced __mod with __subs argument,
removed _M_locale_fmt call, and delegate to _M_subsecs.
(__formatter_chrono::_M_C_y_Y, __formatter_chrono::_M_d_e)
(__formatter_chrono::_M_H_I, __formatter_chrono::_M_m)
(__formatter_chrono::_M_u_w, __formatter_chrono::_M_U_V_W): Remove
__mod argument and call to _M_locale_fmt.

Reviewed-by: Jonathan Wakely 
Signed-off-by: Tomasz Kamiński
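
As background for the locale-specific specifiers (%c, %r, %x, %X) discussed in the commit message above, here is a minimal sketch of the fixed expansions they have in the "C" locale. It uses the C library's std::strftime rather than the libstdc++ formatter, and the helper names fmt_tm, sample_tm and classic_expansions_hold, as well as the sample date, are made up for illustration.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Hypothetical helper: format a broken-down time with std::strftime.
// A program starts in the "C" locale, so no setlocale call is needed.
std::string fmt_tm(const std::tm& t, const char* spec) {
    char buf[128];
    std::size_t n = std::strftime(buf, sizeof buf, spec, &t);
    assert(n != 0 && "result did not fit into the buffer");
    return std::string(buf, n);
}

// Sample value: Thursday 2025-06-26 16:58:31.
std::tm sample_tm() {
    std::tm t{};
    t.tm_year = 2025 - 1900;
    t.tm_mon  = 5;    // June, months are zero-based
    t.tm_mday = 26;
    t.tm_hour = 16;
    t.tm_min  = 58;
    t.tm_sec  = 31;
    t.tm_wday = 4;    // Thursday
    t.tm_yday = 176;  // zero-based day of year
    return t;
}

// In the "C" locale each locale-specific specifier has a fixed expansion.
bool classic_expansions_hold(const std::tm& t) {
    return fmt_tm(t, "%x") == fmt_tm(t, "%D")            // %m/%d/%y
        && fmt_tm(t, "%X") == fmt_tm(t, "%T")            // %H:%M:%S
        && fmt_tm(t, "%r") == fmt_tm(t, "%I:%M:%S %p")
        && fmt_tm(t, "%c") == fmt_tm(t, "%a %b %e %H:%M:%S %Y");
}
```

Calling classic_expansions_hold(sample_tm()) is expected to return true, which is the property a non-localized implementation can rely on: %x and %D coincide, and none of these expansions carries subseconds.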

[Bug tree-optimization/120833] gcc does not recognize tail calls with converting between like structs

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120833

Andrew Pinski  changed:

           What    |Removed                     |Added

            Summary|gcc does not recognize tail |gcc does not recognize tail
                   |calls                       |calls with converting
                   |                            |between like structs
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2025-06-26
             Status|UNCONFIRMED                 |NEW

[Bug tree-optimization/120833] gcc does not recognize tail calls with converting between like structs

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120833

Andrew Pinski  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
There is a 3rd function for which the tail call is not recognized either:
```
struct S2 get_s2_3(const char* s, long n) {
struct S1 x = get_s1(s, n);
return __builtin_bit_cast (struct S2, x);
}
```

(For the C++ front end, for portability you would use std::bit_cast<S2>(x)
instead.)

I will take a look at s2_2 and s2_3, they are easier; just need to prove that
the structs are returned in a similar fashion.
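
For reference, a self-contained sketch of the get_s2_3 shape from the comment above, next to a std::memcpy spelling that also works before C++20. The struct layouts and the get_s1 stand-in are assumptions: the thread only shows the field names data and size, not the real definitions. __builtin_bit_cast is a compiler builtin (GCC and Clang); get_s2_memcpy is a hypothetical name.

```cpp
#include <cstring>

// Assumed layouts: only the field names appear in the thread.
struct S1 { const char* data; long size; };
struct S2 { const char* data; long size; };

static_assert(sizeof(S1) == sizeof(S2), "a bit cast needs equally sized types");

// Stand-in callee; the reporter's get_s1 is not shown.
S1 get_s1(const char* s, long n) { return S1{s, n}; }

// Same shape as get_s2_3 above: reinterpret the returned object wholesale.
S2 get_s2_3(const char* s, long n) {
    S1 x = get_s1(s, n);
    return __builtin_bit_cast(S2, x);
}

// Equivalent conversion spelled with std::memcpy.
S2 get_s2_memcpy(const char* s, long n) {
    S1 x = get_s1(s, n);
    S2 y;
    std::memcpy(&y, &x, sizeof y);
    return y;
}
```

Both functions do nothing after the call except convert the result, which is why one would hope for a tail call when the two structs are returned the same way.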

[Bug tree-optimization/120833] gcc does not recognize tail calls with converting between like structs

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120833

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #3 from Andrew Pinski  ---
get_s2 is harder and requires store/load merging:
```
  _1 = x.data;
  _2 = x.size;
  D.4700.data = _1;
  D.4700.size = _2;
```

This does not happen because the current store/load merging pass only takes
into account accesses up to the register size, not wider ones.
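
For context, a source function of roughly the following shape would plausibly lower to the four statements in the dump above: two loads from x and two stores into the return slot. The struct layouts and the get_s1 stand-in are assumptions, since the thread does not show them.

```cpp
// Assumed layouts; only the field names appear in the dump.
struct S1 { const char* data; long size; };
struct S2 { const char* data; long size; };

// Stand-in callee; the reporter's get_s1 is not shown.
S1 get_s1(const char* s, long n) { return S1{s, n}; }

// Member-by-member conversion: each field is loaded from x and stored
// into the returned S2 separately.
S2 get_s2(const char* s, long n) {
    S1 x = get_s1(s, n);
    S2 r;
    r.data = x.data;
    r.size = x.size;
    return r;
}
```

On an LP64 target the two fields together span 16 bytes, wider than a single general-purpose register, which matches the case the comment says the pass does not merge.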

[Bug fortran/120812] [regression] buffer(80:80) = C_NEW_LINE not working with gfortran 15.1 under Mac

2025-06-26 Thread christophe.peyret at onera dot fr via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120812

--- Comment #10 from Christophe Peyret  ---
Hello,

On Mac Intel it works, but I am not sure it still works on Mac ARM.

I will test it more tomorrow, but the behaviour seems to be different.

Sincerely,
Christophe

[Bug fortran/120637] Memory leak in finalization gfortran 9.5-16.0

2025-06-26 Thread antony at cosmologist dot info via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120637

--- Comment #7 from Antony Lewis  ---
Thanks - yes certainly makes sense not to do this if there are still double
finalizations. *Why* there are still duplicate finalizations is then I guess
another issue.
(sorry, the agent must have messed up the repo commits making the patch and the
test).

[Bug libstdc++/110739] std::format for chrono types compiles very slowly

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110739

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Tomasz Kaminski :

https://gcc.gnu.org/g:4b3cefed1a08344495fedec4982d85168bd8173f

commit r16-1709-g4b3cefed1a08344495fedec4982d85168bd8173f
Author: Tomasz Kamiński
Date:   Tue Jun 24 14:07:46 2025 +0200

libstdc++: Type-erase chrono-data for formatting [PR110739]

This patch reworks the formatting for the chrono types, such that they are all
formatted in terms of the _ChronoData class, which includes all required fields.
Each required field is populated in the formatter for the specific type,
based on the chrono-spec used.

To facilitate the above, the _ChronoSpec now includes an additional _M_needed
field that represents the chrono data that is referenced by the format spec
(this value is also configured for __defSpec). This value differs from the
value of __parts passed to _M_parse, which does include all fields that can be
computed from input (e.g. weekday_indexed can be computed for year_month_day).
Later it is used to fill _ChronoData, in particular by the _M_fill_* family of
functions, to determine if a given field needs to be set, and thus its value
needs to be computed.

In consequence the _ChronoParts enum was extended with additional values that
allow more fine-grained identification:
 * _TimeOfDay is separated into _HoursMinutesSeconds and _Subseconds,
 * _TimeZone is separated into _ZoneAbbrev and _ZoneOffset,
 * _LocalDays, _WeekdayIndex are defined and included in _Date,
 * _Duration is removed, and instead _EpochUnits and _UnitSuffix are
   introduced.
Furthermore, to avoid name conflicts _ChronoParts is now defined as enum class,
with additional operators that simplify uses.
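
The enum class with helper operators described above follows a common C++ pattern, sketched below. The enumerator names mirror those in the commit message with the leading underscore dropped, but the bit values, the chosen operators, and the needs and without helpers are illustrative assumptions rather than the libstdc++ definitions.

```cpp
// Illustrative flag set; the values are made up for this sketch.
enum class ChronoParts : unsigned {
    None                = 0,
    HoursMinutesSeconds = 1u << 0,
    Subseconds          = 1u << 1,
    ZoneAbbrev          = 1u << 2,
    ZoneOffset          = 1u << 3,
    // Coarse values expressed in terms of the finer-grained ones.
    TimeOfDay           = HoursMinutesSeconds | Subseconds,
    TimeZone            = ZoneAbbrev | ZoneOffset,
};

// A scoped enum has no built-in bitwise operators, so they are added here.
constexpr ChronoParts operator|(ChronoParts a, ChronoParts b) {
    return static_cast<ChronoParts>(static_cast<unsigned>(a)
                                    | static_cast<unsigned>(b));
}

constexpr ChronoParts operator&(ChronoParts a, ChronoParts b) {
    return static_cast<ChronoParts>(static_cast<unsigned>(a)
                                    & static_cast<unsigned>(b));
}

// True if any bit of part is present in needed.
constexpr bool needs(ChronoParts needed, ChronoParts part) {
    return (needed & part) != ChronoParts::None;
}

// Clear the bits of part from parts.
constexpr ChronoParts without(ChronoParts parts, ChronoParts part) {
    return static_cast<ChronoParts>(static_cast<unsigned>(parts)
                                    & ~static_cast<unsigned>(part));
}
```

With this shape, a formatter can ask needs(spec_needed, ChronoParts::Subseconds) before computing a potentially expensive field, and the scoped enumerators cannot clash with other names in the enclosing namespace.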

In addition to fields that can be printed using chrono-spec, _ChronoData stores:
 * Total days in wall time (_M_ldays), day of year (_M_day_of_year) - used by
   struct tm construction, and for ISO calendar computation.
 * Total seconds in wall time (_M_lseconds) - this value may be different from
   sum of days, hours, minutes, seconds (e.g. see utc_time below). Included
   to allow future extension, like printing total minutes.
 * Total seconds since epoch - due to offset, different from above. Again to be
   used with future extension (e.g. %s as proposed in P2945R1).
 * Subseconds - count of attoseconds (10^(-18)), in addition to printing can
   be used to compute fractional hours, minutes.
Both total seconds fields use a single _TotalSeconds enumerator in
_ChronoParts, that when present in combination with _EpochUnits or _LocalDays
indicates that _M_eseconds (_EpochSeconds) or _M_lseconds (_LocalSeconds) are
provided/required.

To handle type formatting of time since epoch ('%Q'|_EpochUnits), we use the
format_args mechanism, where the result of +d.count() (see LWG4118) is erased
into make_format_args to local __arg_store, that is later referenced by
_M_ereps (_M_ereps.get(0)).

To handle precision values, and in preparation to allow user to configure ones,
we store the precision as third element of _M_ereps (_M_ereps.get(2)), this
allows duration with precision to be printed using "{0:{2}}". For subseconds
the precision is handled differently depending on the representation:
 * for integral reps, _M_subseconds value is used to determine fractional value,
   precision is trimmed to 18 digits;
 * for floating-points, _M_ereps stores duration initialized with only
   fractional seconds, that is later formatted with precision.
Always using _M_subseconds fields for integral duration, means that we do not
use formatter for user-defined durations that are considered to be integral
(see empty_spec.cc file change). To avoid potentially expensive computation
of _M_subseconds, we make sure that _ChronoParts::_Subseconds is set only if
_Subseconds are needed. In particular we remove this flag for localized output
in _M_parse.

Construction of the _M_ereps as described above is handled by
__formatter_duration, that is then used to format duration, hh_mm_ss and
time_points specializations.
This class also handles _UnitSuffix, the _M_units_suffix field is populated
either with predefined suffix (chrono::__detail::__units_suffix) or one produced
locally.

Finally, formatters for types listed below contain type specific logic:
 * hh_mm_ss - we do not compute total duration and seconds, unless explicitly
   requested, as such computation may overflow;
 * utc_time - for time during leap second insertion, the _M_seconds field is
   increased to 60;
 * __local_time_fmt - exception is thrown if zone offset (_ZoneOffset) or
   abbreviation (_ZoneAbbrev) is requested, but corresponding pointer is null,
   furthermore conversion from `char` to `wchar_t` for abbreviation is
[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

Benjamin Schulz  changed:

   What|Removed |Added

  Attachment #61721|0   |1
is obsolete||

--- Comment #10 from Benjamin Schulz  ---
Created attachment 61728
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61728&action=edit
openmp library

One may argue that the last code was not entirely clean.
It had offloaded a matrix with data, spans and extents,

then created a sub matrix on the host, and then gave this sub matrix to an omp
target loop. The spans and extents were passed to the omp target loop as
integers, which should be implicitly mapped. The struct of the sub matrix
should also have been implicitly mapped, I guess, and the data field of the
struct was already mapped when we offloaded the larger matrix.

In order to be entirely clean,

I have now written a better version of the openmp library.

I now also call the mapping macros explicitly for the sub matrix before the
omp target loop. And I also release the entire struct after the sub matrix was
used. The process of uploading, allocating and releasing maps the struct and
the arrays, so it should be OK.



Unfortunately, the problem remains. 


Once I use -O1, the results become crazy while without optimization, the output
is correct.

That looks and smells and is a compiler problem.

[Bug tree-optimization/120747] [16 Regression] 435.gromacs miscompares since r16-1550-g9244ea4bf55638

2025-06-26 Thread pheeck at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120747

--- Comment #12 from Filip Kastl  ---
gfortran -std=legacy -c -o innerf.o -Ofast -g -march=native -mtune=native
innerf.f

these are the compile options, btw

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #8 from Benjamin Schulz  ---
and please forgive me for the large test case.

It is probably difficult to shrink this to a small example...

I do not know why gcc behaves this way.

[Bug target/120835] on nvptx target with openmp, gcc 15.1 computes different results with differing -O levels.

2025-06-26 Thread schulz.benjamin at googlemail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835

--- Comment #9 from Benjamin Schulz  ---
Compile with -g -fopenmp -foffload=nvptx-none -fno-stack-protector -lrt -lm -lc
-lstdc++ -lmpi and various -O levels, of course.

[Bug fortran/120784] fortran: issue with use-association renames and interface

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120784

--- Comment #6 from GCC Commits  ---
The releases/gcc-15 branch has been updated by Harald Anlauf
:

https://gcc.gnu.org/g:58323d4a03274114a09e75d7aad6d766aceff256

commit r15-9867-g58323d4a03274114a09e75d7aad6d766aceff256
Author: Harald Anlauf 
Date:   Mon Jun 23 21:33:40 2025 +0200

Fortran: fix checking of renamed-on-use interface name [PR120784]

PR fortran/120784

gcc/fortran/ChangeLog:

* interface.cc (gfc_match_end_interface): If a use-associated
symbol is renamed, use the local_name for checking.

gcc/testsuite/ChangeLog:

* gfortran.dg/interface_63.f90: New test.

(cherry picked from commit 6dd1659cf10a7ad51576f902ef3bc007db30c990)

[Bug c++/120831] Raise a diagnostic when a class/struct that is marked as final introduces a virtual method

2025-06-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120831

Andrew Pinski  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/120747] [16 Regression] 435.gromacs miscompares since r16-1550-g9244ea4bf55638

2025-06-26 Thread pheeck at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120747

--- Comment #11 from Filip Kastl  ---
So the file that is getting "miscompiled" is innerf.f.

I found out by compiling this gromacs source file with r16-1550 GCC and all the
other source files with r16-1549 GCC and then linking that together.

I'll see if I can figure out which part of innerf.f the problem is in.

[Bug c/120780] Missed __builtin_dynamic_object_size optimization(?)

2025-06-26 Thread siddhesh at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120780

Siddhesh Poyarekar  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2025-06-26
   Assignee|unassigned at gcc dot gnu.org  |siddhesh at gcc dot gnu.org

--- Comment #17 from Siddhesh Poyarekar  ---
Created attachment 61729
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61729&action=edit
Candidate fix

This is what I'm testing at the moment, basically drill down the container type
and look for the inner type at the same offset.  If it exists, adjust the
wholesize to reflect that we're looking at the size from the context of the
container object.

[Bug target/120424] [14/15/16 regression] lra-elimination issues when fp2sp elimination is disabled part-way through lra

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120424

--- Comment #14 from GCC Commits  ---
The master branch has been updated by Alexandre Oliva :

https://gcc.gnu.org/g:b49473448966b045460a23794ed9a309e503fa3b

commit r16-1721-gb49473448966b045460a23794ed9a309e503fa3b
Author: Alexandre Oliva 
Date:   Thu Jun 26 21:01:24 2025 -0300

[lra] rework deactivation of fp2sp elimination [PR120424]

Deactivating the fp2sp elimination in lra_update_fp2sp_elimination
prevents update_reg_eliminate from propagating the fp2sp elimination
offset to the next chosen elimination, so it may retain -1 as the
prev_offset, and prev_offset will be taken as an already-applied
offset that needs to be compensated in the next round of spilling and
reloading.  This affects, for example, crtbegin.o's
__do_global_dtors_aux on arm-linux-gnueabihf in a {BOOT_C,T}FLAGS='-O2
-g -fnon-call-exceptions -fstack-clash-protection' bootstrap.

Alas, just retaining that elimination causes spills to use the fp2sp
elimination, including applying sp offsets, which breaks e.g. an
x86_64-linux-gnu native bootstrap with ix86_frame_pointer_required
modified to return true on nonzero frame size.

The middle-ground solution is to keep the elimination active, so that
its offsets are applied and propagated on to the subsequent fp
elimination, but without introducing sp offsets, so that
e.g. pr103973-18.c on the modified x86_64-linux-gnu doesn't get
adjacent argument pushes of two adjacent on-stack temporaries ending
up pushing the same temporary because of undesired adjustments.


for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (lra_update_fp2sp_elimination):
Avoid sp offsets in further fp2sp eliminations...
(update_reg_eliminate): ... and restore to_rtx before assert
checking.

[Bug target/120424] [14/15/16 regression] lra-elimination issues when fp2sp elimination is disabled part-way through lra

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120424

--- Comment #12 from GCC Commits  ---
The master branch has been updated by Alexandre Oliva :

https://gcc.gnu.org/g:6c554467623ec53ae228d127cbec9c4ba3cdc027

commit r16-1719-g6c554467623ec53ae228d127cbec9c4ba3cdc027
Author: Alexandre Oliva 
Date:   Thu Jun 26 21:01:21 2025 -0300

[genoutput] mark scratch outputs as eliminable [PR120424]

acats' fdd2a00.read is miscompiled on arm-linux-gnu with -O2
-fstack-clash-protection -march=armv7-a -marm: a clobbered scratch
register in a *iorsi3_compare0_scratch pattern gets initially assigned
to the frame pointer register, but at some point during lra the frame
size grows to nonzero, arm_frame_pointer_required flips to true, and
the fp2sp elimination has to be disabled, so the scratch register gets
spilled to a stack slot.

It needs to get the sfp elimination at that point, because later
rounds of elimination will assume the previous round's offset has
already been applied.  But since scratch matches are not regarded as
eliminable by genoutput, we don't attempt elimination in the clobbered
stack slot MEM rtx.

Later on, lra issues a reload for that slot, using a new pseudo
allocated to a hardware register, that gets stored in the stack slot
after the original insn.  Elimination in that reload store insn
eventually updates the elimination offset, but it's an incremental
update, assuming that the offset so far has already been applied.

Without applying the initial offset, the store ends up overlapping
with the function's register save area, corrupting a caller's
call-saved register.

AFAICT the old reload's elimination wouldn't be harmed by allowing
elimination in scratch operands, so I'm enabling eliminable for them
regardless.  Should it be found to make a difference, we could
presumably set a different bit in eliminable to enable reload and lra
to tell them apart and behave accordingly.


for  gcc/ChangeLog

PR rtl-optimization/120424
* genoutput.cc (scan_operands): Make MATCH_SCRATCHes eliminable.
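
The failure mode described in the commit message above can be illustrated with a toy arithmetic model. This is not LRA code: the struct, the function names, and the offsets 16 and 24 are all hypothetical. It only shows why an incremental update goes wrong for an address that never received the initial offset.

```cpp
// Toy arithmetic model of incremental elimination updates; this is not
// LRA code and every name here is hypothetical.
struct Elimination {
    long previous_offset;  // offset the earlier round is assumed to have applied
    long offset;           // offset that must be in effect after this round
};

// First-time rewrite of a displacement: apply the whole offset.
long eliminate_full(long disp, const Elimination& e) {
    return disp + e.offset;
}

// Later-round rewrite: add only the change since the previous round.
long eliminate_update(long disp, const Elimination& e) {
    return disp + (e.offset - e.previous_offset);
}

// Displacement of a stack slot after two rounds, depending on whether the
// first round actually rewrote it.
long final_displacement(bool rewritten_in_first_round) {
    const Elimination round1{0, 16};
    const Elimination round2{16, 24};
    long disp = 0;
    if (rewritten_in_first_round)
        disp = eliminate_full(disp, round1);
    return eliminate_update(disp, round2);
}
```

In this model final_displacement(true) yields 24, while final_displacement(false) yields 8: the slot that skipped the first round ends up short by exactly the initial offset, which is the kind of misplacement that lets a store land in the register save area.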

[Bug target/120424] [14/15/16 regression] lra-elimination issues when fp2sp elimination is disabled part-way through lra

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120424

--- Comment #13 from GCC Commits  ---
The master branch has been updated by Alexandre Oliva :

https://gcc.gnu.org/g:be547188b632d8c1072341c431af339b7384c4a6

commit r16-1720-gbe547188b632d8c1072341c431af339b7384c4a6
Author: Alexandre Oliva 
Date:   Thu Jun 26 21:01:22 2025 -0300

[lra] recompute ranges upon disabling fp2sp elimination [PR120424]

If the frame size grows to nonzero, arm_frame_pointer_required may
flip to true under -fstack-clash-protection -fnon-call-exceptions, and
that may disable the fp2sp elimination part-way through lra.

If pseudos had got assigned to the frame pointer register before that,
they have to be spilled, and that requires complete live range
information.  If !lra_reg_spill_p, lra_spill won't have live ranges
for such pseudos, and they could end up sharing spill slots with other
pseudos whose live ranges actually overlap.

This affects at least Ada.Strings.Wide_Superbounded.Super_Insert and
.Super_Replace_Slice in libgnat/a-stwisu.adb, when compiled with -O2
-fstack-clash-protection -march=armv7 (implied Thumb2), causing
acats-4's cdd2a01 to fail.

Recomputing live ranges including registers may renumber and compress
points, so we have to recompute the aggregated live ranges for
already-assigned spill slots as well.

As a safety net, reject empty live ranges when computing slot sharing.


for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (lra_update_fp2sp_elimination):
Compute complete live ranges and recompute slots' live ranges
if needed.
* lra-lives.cc (lra_reset_live_range_list): New.
(lra_complete_live_ranges): New.
* lra-spills.cc (assign_spill_hard_regs): Reject empty live
ranges.
(add_pseudo_to_slot): Likewise.
(lra_recompute_slots_live_ranges): New.
* lra-int.h (lra_reset_live_range_list): Declare.
(lra_complete_live_ranges): Declare.
(lra_recompute_slots_live_ranges): Declare.
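
To illustrate why incomplete live range information is dangerous for slot sharing, as the commit message above explains, here is a toy model. It does not use LRA's data structures; LiveRange, ranges_intersect and may_share_slot are hypothetical names, and rejecting an empty range is modelled here as refusing to share, which may differ from how the patch rejects it.

```cpp
#include <vector>

// Toy model, not LRA's data structures: a live range is a closed interval
// of program points, and a pseudo's liveness is a list of such intervals.
struct LiveRange { int start; int finish; };
using LiveRanges = std::vector<LiveRange>;

// True if any interval of a overlaps any interval of b.
bool ranges_intersect(const LiveRanges& a, const LiveRanges& b) {
    for (const LiveRange& x : a)
        for (const LiveRange& y : b)
            if (x.start <= y.finish && y.start <= x.finish)
                return true;
    return false;
}

// Two pseudos may share a spill slot only if both have known, non-empty
// live ranges that never overlap.
bool may_share_slot(const LiveRanges& a, const LiveRanges& b) {
    if (a.empty() || b.empty())
        return false;  // missing information: refuse to share
    return !ranges_intersect(a, b);
}
```

Without the emptiness check, a pseudo with no recorded ranges would never intersect anything and could be given a slot that is still live for another pseudo, which is the sharing hazard the message describes.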

[Bug target/120424] [14/15/16 regression] lra-elimination issues when fp2sp elimination is disabled part-way through lra

2025-06-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120424

--- Comment #11 from GCC Commits  ---
The master branch has been updated by Alexandre Oliva :

https://gcc.gnu.org/g:7ce8a87f78122509334c5cfeebb624f634ccf96e

commit r16-1718-g7ce8a87f78122509334c5cfeebb624f634ccf96e
Author: Alexandre Oliva 
Date:   Thu Jun 26 21:01:19 2025 -0300

[lra] inactivate disabled fp2sp elimination [PR120424]

Even after we disable the fp2sp elimination when it is the active
elimination for the fp, spilling might use it before
update_reg_eliminate runs and inactivates it for good.  If it is used,
update_reg_eliminate will fail the check that fp2sp was not used.

Since we keep track of uses of this specific elimination, and
lra_update_fp2sp_elimination checks it before disabling it, we know it
hasn't been used, so we can inactivate it without any ill effects.

This fixes the pr118591-1.c avr-none regression exposed by the
PR120424 fix.


for  gcc/ChangeLog

PR rtl-optimization/120424
* lra-eliminations.cc (lra_update_fp2sp_elimination):
Inactivate the unused fp2sp elimination right away.
