[Bug target/120840] Functions with no_callee_saved_registers attribute should preserve frame pointer

2025-06-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

--- Comment #14 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:9a602ce3f4c95648c0c48d3f26fc52f69d012505

commit r16-1771-g9a602ce3f4c95648c0c48d3f26fc52f69d012505
Author: H.J. Lu 
Date:   Sat Jun 28 09:39:41 2025 +0800

x86: Preserve frame pointer for no_callee_saved_registers attribute

Update functions with no_callee_saved_registers/preserve_none attribute
to preserve frame pointer since caller may use it to save the current
stack:

pushq   %rbp
movq    %rsp, %rbp
...
call    function
...
leave
ret

If callee changes frame pointer without restoring it, caller will fail
to restore its stack after callee returns as LEAVE does

mov %rbp, %rsp
pop %rbp

The corrupted frame pointer will corrupt stack pointer in caller.

There are no regressions on Linux/x86-64.  Also tested with

https://github.com/python/cpython

configured with "./configure --with-tail-call-interp".

gcc/

PR target/120840
* config/i386/i386-expand.cc (ix86_expand_call): Don't mark
hard frame pointer as clobber.
* config/i386/i386-options.cc (ix86_set_func_type): Use
TYPE_NO_CALLEE_SAVED_REGISTERS instead of
TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
* config/i386/i386.cc (ix86_function_ok_for_sibcall): Remove the
TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP check.
(ix86_save_reg): Merge TYPE_NO_CALLEE_SAVED_REGISTERS and
TYPE_PRESERVE_NONE with TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
* config/i386/i386.h (call_saved_registers_type): Remove
TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP.
* doc/extend.texi: Update no_callee_saved_registers documentation.

gcc/testsuite/

PR target/120840
* gcc.target/i386/no-callee-saved-1.c: Updated.
* gcc.target/i386/no-callee-saved-2.c: Likewise.
* gcc.target/i386/no-callee-saved-7.c: Likewise.
* gcc.target/i386/no-callee-saved-8.c: Likewise.
* gcc.target/i386/no-callee-saved-9.c: Likewise.
* gcc.target/i386/no-callee-saved-10.c: Likewise.
* gcc.target/i386/no-callee-saved-18.c: Likewise.
* gcc.target/i386/no-callee-saved-19a.c: Likewise.
* gcc.target/i386/no-callee-saved-19c.c: Likewise.
* gcc.target/i386/no-callee-saved-19d.c: Likewise.
* gcc.target/i386/pr119784a.c: Likewise.
* gcc.target/i386/preserve-none-6.c: Likewise.
* gcc.target/i386/preserve-none-7.c: Likewise.
* gcc.target/i386/preserve-none-12.c: Likewise.
* gcc.target/i386/preserve-none-13.c: Likewise.
* gcc.target/i386/preserve-none-14.c: Likewise.
* gcc.target/i386/preserve-none-15.c: Likewise.
* gcc.target/i386/preserve-none-23.c: Likewise.
* gcc.target/i386/pr120840-1a.c: New test.
* gcc.target/i386/pr120840-1b.c: Likewise.
* gcc.target/i386/pr120840-1c.c: Likewise.
* gcc.target/i386/pr120840-1d.c: Likewise.

Signed-off-by: H.J. Lu 
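
The failure mode described in the commit message above can be traced with a toy model. This is only a sketch and not GCC code: `struct toy_cpu`, `run_caller` and the 16-slot stack are hypothetical names made up for illustration, and the model tracks nothing but the two registers that LEAVE touches. It runs the caller's prologue, optionally lets the "callee" overwrite the frame pointer, and then executes LEAVE.

```c
#include <stdint.h>

/* Toy model, not real hardware: just %rsp, %rbp and a 16-slot stack.
   rsp and rbp hold indices into stack[]; the stack grows downward. */
struct toy_cpu {
    uint64_t rsp;
    uint64_t rbp;
    uint64_t stack[16];
};

static void toy_push(struct toy_cpu *c, uint64_t v) { c->stack[--c->rsp] = v; }
static uint64_t toy_pop(struct toy_cpu *c) { return c->stack[c->rsp++]; }

/* LEAVE is equivalent to: mov %rbp, %rsp ; pop %rbp */
static void toy_leave(struct toy_cpu *c)
{
    c->rsp = c->rbp;
    c->rbp = toy_pop(c);
}

/* Returns how far %rsp ends up from its value on entry (0 means balanced). */
uint64_t run_caller(int callee_clobbers_rbp)
{
    struct toy_cpu c = { .rsp = 16, .rbp = 0xAAAA };
    uint64_t rsp_on_entry = c.rsp;

    toy_push(&c, c.rbp);   /* pushq %rbp */
    c.rbp = c.rsp;         /* movq %rsp, %rbp */
    c.rsp -= 4;            /* reserve space for locals */

    if (callee_clobbers_rbp)
        c.rbp = 12;        /* callee used %rbp as scratch and never restored it */

    toy_leave(&c);         /* leave */
    return rsp_on_entry - c.rsp;
}
```

With an intact frame pointer the model returns 0, because the stack pointer is back where it started. With the clobbered frame pointer, LEAVE loads the stack pointer from the wrong value and the caller would return through the wrong stack slot.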

[Bug target/120840] Functions with no_callee_saved_registers attribute should preserve frame pointer

2025-06-29 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120840

--- Comment #15 from H.J. Lu  ---
Fixed for GCC 16 so far.

[Bug target/120870] CPython miscompiled with preserve_none and CFLAGS="-O2 -march=znver2 -ggdb3"

2025-06-29 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

--- Comment #9 from H.J. Lu  ---
-march=x86-64-v3 fails.

[Bug tree-optimization/120871] missing tail call optimization on RVO

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||71761
   Severity|normal  |enhancement

--- Comment #7 from Andrew Pinski  ---
Looks like this depends on PR 71761 after all.

But I have a patch for the tree part which detects the tail call.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71761
[Bug 71761] missing tailcall optimization (structure return)

[Bug tree-optimization/120871] missing tail call optimization on RVO

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

--- Comment #8 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #7)
> But I have a patch for the tree part which detects the tail call.

And it detects the tail recursion too. So at least I could submit it
separately for that.

[Bug target/120870] CPython miscompiled with preserve_none and CFLAGS="-O2 -march=znver2 -ggdb3"

2025-06-29 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

--- Comment #8 from H.J. Lu  ---
-march=x86-64-v4 also failed.

[Bug tree-optimization/120872] New: Dovecot test-json-istream test miscompiled with -ftrivial-auto-var-init={zero,pattern}

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120872

Bug ID: 120872
   Summary: Dovecot test-json-istream test miscompiled with
-ftrivial-auto-var-init={zero,pattern}
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: false-positive, wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

This comes from Dovecot's testsuite (test-json-istream.c) where it's
miscompiled with -ftrivial-auto-var-init={zero,pattern}. It also shows up as
-Wmaybe-uninitialized with those options.

```
enum { JSON_TYPE_NULL, JSON_TYPE_TEXT } type;
char *data_val;

void frobnicate();

__attribute__((pure)) char *json_node_get_data(long *size_r) {
  *size_r = 0;
  switch (type) {
  case JSON_TYPE_TEXT:
    break;
  case JSON_TYPE_NULL:
    return "";
  default:
    frobnicate();
  }
  return 0;
}

void test_json_istream_read_string() {
  long size_val;

  frobnicate();

  data_val = json_node_get_data(&size_val);
  if (__builtin_memcmp(data_val, 0, size_val))
    frobnicate();
}
```

```
$ gcc test-json-istream.i -O2 -ftrivial-auto-var-init=zero -c
-Werror=uninitialized
test-json-istream.i: In function ‘test_json_istream_read_string’:
test-json-istream.i:25:7: error: ‘size_val’ may be used uninitialized
[-Werror=maybe-uninitialized]
   25 |   if (__builtin_memcmp(data_val, 0, size_val))
  |   ^~~
test-json-istream.i:20:8: note: ‘size_val’ was declared here
   20 |   long size_val;
  |^~~~
cc1: some warnings being treated as errors
```

[Bug tree-optimization/120872] Dovecot test-json-istream test miscompiled with -ftrivial-auto-var-init={zero,pattern}

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120872

--- Comment #2 from Sam James  ---
The original has:
```

static inline ATTR_PURE const unsigned char *
json_node_get_data(const struct json_node *jnode, size_t *size_r)
{
    const char *literal;

    switch (jnode->type) {
    case JSON_TYPE_STRING:
    case JSON_TYPE_NUMBER:
    case JSON_TYPE_TEXT:
        break;
    case JSON_TYPE_TRUE:
        literal = "true";
        *size_r = strlen(literal);
        return (const unsigned char *)literal;
    case JSON_TYPE_FALSE:
        literal = "false";
        *size_r = strlen(literal);
        return (const unsigned char *)literal;
    case JSON_TYPE_NULL:
        literal = "null";
        *size_r = strlen(literal);
        return (const unsigned char *)literal;
    default:
        i_unreached();
    }
    return json_value_get_data(&jnode->value, size_r);
}
```

That looks bogus to me indeed.

[Bug tree-optimization/120872] Dovecot test-json-istream test miscompiled with -ftrivial-auto-var-init={zero,pattern}

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120872

Sam James  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Sam James  ---
Ah, wait, it modifies its argument -> pure is invalid. I will check if the
original has this.
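
For illustration, one way to keep a `pure` annotation valid is to return the size as part of the return value instead of storing it through a pointer argument. The sketch below is not Dovecot's code: `struct json_data`, `json_literal_data` and the trimmed-down enum are hypothetical names, and it covers only the literal cases.

```c
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__)
# define ATTR_PURE __attribute__((pure))
#else
# define ATTR_PURE
#endif

enum json_type { JSON_TYPE_NULL, JSON_TYPE_TRUE, JSON_TYPE_FALSE };

/* Hypothetical result type: pointer and length travel together in the
   return value, so nothing is written through a caller-supplied pointer. */
struct json_data {
    const char *data;
    size_t size;
};

ATTR_PURE struct json_data json_literal_data(enum json_type type)
{
    struct json_data r = { NULL, 0 };

    switch (type) {
    case JSON_TYPE_TRUE:  r.data = "true";  break;
    case JSON_TYPE_FALSE: r.data = "false"; break;
    case JSON_TYPE_NULL:  r.data = "null";  break;
    }
    if (r.data != NULL)
        r.size = strlen(r.data);
    return r;
}
```

The other obvious option is to keep the `size_r` out-parameter and simply drop the `pure` attribute.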

[Bug target/120870] CPython without GIL ("free-threaded") miscompiled with preserve_none and PGO

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

--- Comment #4 from Sam James  ---
(In reply to H.J. Lu from comment #3)
> Does it fail without --with-tail-call-interp?

No, works then.

[Bug c/120753] is_device_ptr does not compile if given a pointer which is a member of a struct, i.e. is_device_ptr(u.ptr), where mystruct u; and struct mystruct{double *ptr;int something;}; will fail

2025-06-29 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120753

--- Comment #3 from Benjamin Schulz  ---
Oh, that last comment should have been made in another bug about an internal
compiler error. Sorry.

[Bug tree-optimization/120869] gcc does not eliminate short-circuiting tail calls

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120869

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization,
   ||tail-call
 CC||pinskia at gcc dot gnu.org

--- Comment #1 from Andrew Pinski  ---
This might be a middle-end issue.

I will take a look in a few hours to see what is going wrong.

[Bug target/120870] CPython without GIL ("free-threaded") miscompiled with preserve_none and PGO

2025-06-29 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2025-06-30
 Status|UNCONFIRMED |WAITING

--- Comment #3 from H.J. Lu  ---
Does it fail without --with-tail-call-interp?

[Bug target/120714] RISC-V: incorrect frame pointer CFA address for stack-clash protection loops

2025-06-29 Thread alexey.merzlyakov at samsung dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120714

--- Comment #3 from Alexey Merzlyakov  ---
I've checked the patch from the link suggested above. Unfortunately, it does
not work for the test case reported in this ticket: GDB still fails to stop at
the goo() breakpoint.

I also tested it on the test case taken from the patch for this PR:
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687773.html, which uses an
internal "backtrace()" function for verification. The code produced with the
patch suggested above also crashes:

  $ cat run.sh 
  #!/bin/bash
  CC="riscv64-rise-linux-gnu-gcc"
  QEMU="qemu-riscv64"
  SYSROOT=`$CC --print-sysroot`
  TEST="pr120714"

  OPTS="-O1 -Os -Og -Oz"
  SC_OPT="-fstack-clash-protection"
  for OPT in $OPTS; do
    echo "OPT: $OPT"
    set -e
    $CC $OPT -g $SC_OPT -I$SYSROOT/usr/include/ -static ${TEST}.c -o ${TEST}.x
    set +e
    $QEMU -L $SYSROOT ${TEST}.x
  done

  $ ./run.sh
  OPT: -O1
  ./run.sh: line 14: 114555 Segmentation fault  (core dumped) $QEMU -L
$SYSROOT ${TEST}.x
  OPT: -Os
  ./run.sh: line 14: 114563 Segmentation fault  (core dumped) $QEMU -L
$SYSROOT ${TEST}.x
  OPT: -Og
  ./run.sh: line 14: 114571 Segmentation fault  (core dumped) $QEMU -L
$SYSROOT ${TEST}.x
  OPT: -Oz
  ./run.sh: line 14: 114579 Segmentation fault  (core dumped) $QEMU -L
$SYSROOT ${TEST}.x

[Bug c++/120868] "unexpected AST of kind switch_expr" in constexpr template function

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120868

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
  Known to fail||10.4.0
 Ever confirmed|0   |1
   Last reconfirmed||2025-06-30

--- Comment #5 from Andrew Pinski  ---
Confirmed.

[Bug c++/120868] "unexpected AST of kind switch_expr" in constexpr template function

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120868

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||error-recovery,
   ||ice-on-invalid-code

--- Comment #4 from Andrew Pinski  ---
Maybe it is only the invalid case where this ICE/sorry shows up.

[Bug c++/120868] "unexpected AST of kind switch_expr" in constexpr template function

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120868

Andrew Pinski  changed:

   What|Removed |Added

 Blocks||55004

--- Comment #3 from Andrew Pinski  ---
Here is a reduced testcase for an invalid case:
```
struct AssertionResult {
explicit AssertionResult(const auto &success) {}
operator bool() const;
};
template <typename U> constexpr auto constexpr_iterator_test(U opt) -> bool {
switch (0)
case 0:
if (const AssertionResult gtest_ar{1})
return true;
return false;
};
static_assert(constexpr_iterator_test(0));
```

I have not checked if the original code is valid or not.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55004
[Bug 55004] [meta-bug] constexpr issues

[Bug tree-optimization/120871] missing tail call optimization on RVO

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

--- Comment #9 from Andrew Pinski  ---
Created attachment 61759
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61759&action=edit
Patch which is in testing but needs testcases

This only fixes the tree level part; there is a middle-end part which I think
is PR 71761 but I didn't test with the patch yet.

[Bug tree-optimization/120358] [15/16 regression] qtbase-6.9.0 miscompiled since r15-580-gf3e5f4c58591f5

2025-06-29 Thread holger--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358

--- Comment #23 from Holger Hoffstätte  ---
(In reply to Holger Hoffstätte from comment #22)
> Created attachment 61758 [details]
> Readable reproducer for debugging
> 
> Slightly more reduced, no need for the join business.

..and not for the factory function either.

I'm now sure that the added std::move is not accidental state reuse but
"simply" a matter of instantiating one template working for both cases instead
of two, where the second one goes off the rails. The executable differs in
size.
Why the second form (direct rvalue reference) makes a mess is unfortunately a
mystery.

[Bug modula2/117203] Add Delete procedure function to FIO

2025-06-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117203

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Gaius Mulley :

https://gcc.gnu.org/g:620a40fa8843dd7f80547bbd63549abc8bbe9521

commit r16-1767-g620a40fa8843dd7f80547bbd63549abc8bbe9521
Author: Gaius Mulley 
Date:   Mon Jun 30 00:26:03 2025 +0100

[PR modula2/117203] Followup add Delete procedure function

This patch provides GetFileName procedure function for
FIO.File, FileSystem.File and IOChan.ChanId.  The
return result from these procedures can be passed into
StringFileSysOp.Unlink to complete the required delete.

gcc/m2/ChangeLog:

PR modula2/117203
* gm2-libs-log/FileSystem.def (GetFileName): New
procedure function.
(WriteString): New procedure.
* gm2-libs-log/FileSystem.mod (GetFileName): New
procedure function.
(WriteString): New procedure.
* gm2-libs/SFIO.def (GetFileName): New procedure function.
* gm2-libs/SFIO.mod (GetFileName): New procedure function.
* gm2-libs-iso/IOChanUtils.def: New file.
* gm2-libs-iso/IOChanUtils.mod: New file.

libgm2/ChangeLog:

PR modula2/117203
* libm2iso/Makefile.am (M2DEFS): Add IOChanUtils.def.
(M2MODS): Add IOChanUtils.mod.
* libm2iso/Makefile.in: Regenerate.

gcc/testsuite/ChangeLog:

PR modula2/117203
* gm2/isolib/run/pass/testdelete2.mod: New test.
* gm2/pimlib/logitech/run/pass/testdelete2.mod: New test.
* gm2/pimlib/run/pass/testdelete.mod: New test.

Signed-off-by: Gaius Mulley 

[Bug tree-optimization/120869] gcc does not eliminate short-circuiting tail calls

2025-06-29 Thread kndevl at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120869

Karthik Nishanth  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Karthik Nishanth  ---
Apologies! gcc trunk is able to eliminate all tail calls. 
https://godbolt.org/z/hzEs33q97

I am marking the bug resolved. I'm curious about what changed between 15.1.0
and trunk.

[Bug target/120870] CPython without GIL ("free-threaded") miscompiled with preserve_none and PGO

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

--- Comment #1 from Sam James  ---
This is with:
```
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/16/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-16.0./work/gcc-16.0./configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/16
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/16/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/16
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/16/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/16/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/16/include/g++-v16
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/16/python
--enable-libphobos --enable-objc-gc
--enable-languages=c,c++,d,objc,obj-c++,fortran,ada,cobol,m2,rust
--enable-obsolete --enable-secureplt --disable-werror --with-system-zlib
--enable-nls --without-included-gettext --disable-libunwind-exceptions
--enable-checking=yes,extra,rtl --with-bugurl=https://bugs.gentoo.org/
--with-pkgversion='Gentoo Hardened 16.0. p, commit
ae7d2c2f05513ca58a1f4cf98220fd8710fd354c' --with-gcc-major-version-only
--enable-libstdcxx-time --enable-lto --disable-libstdcxx-pch --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
--enable-multilib --with-multilib-list=m32,m64 --disable-fixed-point
--enable-targets=all --enable-offload-defaulted
--enable-offload-targets=nvptx-none --enable-libgomp --disable-libssp
--enable-libada --disable-cet --disable-systemtap --enable-valgrind-annotations
--disable-vtable-verify --disable-libvtv --with-zstd --with-isl
--disable-isl-version-check --enable-default-pie --enable-host-pie
--enable-host-bind-now --enable-default-ssp --disable-fixincludes
--with-gxx-libcxx-include-dir=/usr/include/c++/v1 --enable-linker-build-id
--with-build-config='bootstrap-O3 bootstrap-lto'
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 16.0.0 20250629 (experimental)
8dcb922452516ebbf362e7c202b48d8ef547edce (Gentoo Hardened 16.0. p, commit
ae7d2c2f05513ca58a1f4cf98220fd8710fd354c)
```

and with
https://inbox.sourceware.org/gcc-patches/CAMe9rOpBxdLoJrGZqt5_JhPAuvWUOpGWMLiGg=dskkytdqv...@mail.gmail.com/
on top.

[Bug c/120860] add ability to silence -Wmissing-field-initializers warnings on selected structs

2025-06-29 Thread bruno at clisp dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120860

--- Comment #4 from Bruno Haible  ---
> Attribute on the fields marking them as optional seems like a better option

I tend to agree: In most cases, we can distinguish "required" from "optional"
fields. With the attribute on the field, the warning remains useful for the
case of a missing "required" field. And the attribute on the fields is only
minimally less convenient to use than the attribute on the struct.
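
As a point of comparison, here is a minimal sketch of what C code can already do today. The per-field attribute under discussion is only a proposal in this thread, so the sketch does not use it. `struct command` and its members are hypothetical; the comments mark which fields are meant to be "required" and which "optional".

```c
#include <stddef.h>

/* Hypothetical struct: 'name' and 'handler' are meant to be required,
   'flags' and 'user_data' are meant to be optional. */
struct command {
    const char *name;
    int (*handler)(void);
    unsigned flags;
    void *user_data;
};

static int do_nothing(void) { return 0; }

/* Designated initializers: omitted members are zero-initialized, and
   GCC's -Wmissing-field-initializers is documented not to warn here in C. */
const struct command quiet_cmd = {
    .name = "quiet",
    .handler = do_nothing,
};
```

The drawback, and the motivation for the request, is that this silences the warning for every omitted member, so a forgotten "required" field such as `handler` would go unnoticed as well.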

[Bug libgomp/120865] gimple-expr.cc:484

2025-06-29 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120865

--- Comment #3 from Benjamin Schulz  ---
Options were:

-O1 -fopenmp -foffload=nvptx-none  -fno-stack-protector  -Wall


Note that without -O I get the following:

(i.e. without optimization, the program terminates ordinarily...)


Ordinary matrix multiplication, on gpu
1 2 3 4 
5 6 7 8 
9 10 11 12 
13 14 15 16 

0 1 2 3 
4 5 6 7 
8 9 10 11 
12 13 14 15 

80 90 100 110 
176 202 228 254 
272 314 356 398 
368 426 484 542 

A Cholesky decomposition with the multiplication on gpu
4 12 -16 
12 37 -43 
-16 -43 98 

2 0 0 
6 1 0 
-8 5 3 

Now the cholesky decomposition is entirely done on gpu
2 0 0 
6 1 0 
-8 5 3 

Now we do the same with the lu decomposition
1 -2 -2 -3 
3 -9 0 -9 
-1 2 4 7 
-3 -6 26 2 

Just the multiplication on gpu
1 0 0 0 
3 1 0 0 
-1 -0 1 0 
-3 4 -2 1 

1 -2 -2 -3 
0 -3 6 0 
0 0 2 4 
0 0 0 1 

Entirely on gpu
1 0 0 0 
3 1 0 0 
-1 -0 1 0 
-3 4 -2 1 

1 -2 -2 -3 
0 -3 6 0 
0 0 2 4 
0 0 0 1 

Now we do the same with the qr decomposition
12 -51 4 
6 167 -68 
-4 24 -41 

Just the multiplication on gpu
0.857143 -0.394286 -0.331429 
0.428571 0.902857 0.0342857 
-0.285714 0.171429 -0.942857 

14 21 -14 
-2.22045e-16 175 -70 
-3.10862e-15 -4.79616e-14 35 

Entirely on gpu
0.857143 -0.394286 -0.626059 
0.428571 0.902857 -0.127334 
-0.285714 0.171429 -0.769309 

14 21 -14 
-2.22045e-16 175 -70 
-5.19947 -7.7992 37.6962 


Process returned 0 (0x0)   execution time : 0.829 s
Press ENTER to continue.

[Bug libgomp/120865] New: gimple-expr.cc:484

2025-06-29 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120865

Bug ID: 120865
   Summary: gimple-expr.cc:484
   Product: gcc
   Version: 15.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schulz.benjamin at googlemail dot com
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 61750
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61750&action=edit
mdspan_omp.h

during GIMPLE pass: omplower
In file included from
/home/benni/projects/arraylibrary/openmp/main_omp.cpp:15:
/home/benni/projects/arraylibrary/openmp/mdspan_omp.h: In function ‘bool
qr_decomposition(const mdspan&, mdspan&, mdspan&, matrix_multiplication_parameters, size_t, bool, bool, int) [with T
= double; CA = std::vector]’:
/home/benni/projects/arraylibrary/openmp/mdspan_omp.h:5230:13: internal
compiler error: in create_tmp_var, at gimple-expr.cc:484
 5230 | for (size_t i = 0; i < pextv[0]; ++i)
  | ^~~
Please submit a full bug report in English;

[Bug libgomp/120865] gimple-expr.cc:484

2025-06-29 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120865

--- Comment #2 from Benjamin Schulz  ---
Created attachment 61752
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61752&action=edit
cmakelists.txt

[Bug libgomp/120865] gimple-expr.cc:484

2025-06-29 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120865

--- Comment #1 from Benjamin Schulz  ---
Created attachment 61751
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61751&action=edit
mein_omp.cpp

[Bug c/120753] is_device_ptr does not compile if given a pointer which is a member of a struct, i.e. is_device_ptr(u.ptr), where mystruct u; and struct mystruct{double *ptr;int something;}; will fail

2025-06-29 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120753

--- Comment #2 from Benjamin Schulz  ---
Created attachment 61753
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61753&action=edit
mdspan_omp.h

mdspan_omp.h with the #pragma omp target teams loop changed to the
#pragma omp target teams parallel for loop, which makes the outputs of the QR
decomposition correct.

But of course the ICE is still there once we optimize with -O1.

[Bug libgomp/120865] ICE in gimple-expr.cc:484

2025-06-29 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120865

--- Comment #4 from Benjamin Schulz  ---
Note that the values in the last computation are also wrong.

I can, however, make them correct by writing

 #pragma omp target  parallel for device(devicenum) 
or
 #pragma omp target teams distribute parallel for device(devicenum)

in line 5203 of mdspan_omp.h. I do not know why this is so, since this is an
outer loop which should be independent for every outer iteration.
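
For readers unfamiliar with the second construct, here is a minimal sketch of it applied to an outer loop whose iterations are independent. This is not taken from mdspan_omp.h: `scale_rows`, the row-major layout and the `map` clause are assumptions made for illustration. Built without -fopenmp the pragma is ignored and the loop runs serially on the host.

```c
#include <stddef.h>

/* Doubles every element of an n x n row-major matrix.
   Each outer iteration touches only its own row. */
void scale_rows(double *a, size_t n, int devicenum)
{
    (void)devicenum; /* otherwise unused when built without OpenMP */

    #pragma omp target teams distribute parallel for device(devicenum) map(tofrom: a[0:n*n])
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            a[i * n + j] *= 2.0;
}
```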

[Bug libgomp/120865] ICE in gimple-expr.cc:484

2025-06-29 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120865

Benjamin Schulz  changed:

   What|Removed |Added

  Attachment #61750|0   |1
is obsolete||

--- Comment #5 from Benjamin Schulz  ---
Created attachment 61754
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61754&action=edit
mdspan_omp.h

mdspan_omp.h with the changes that the outputs are correct without
optimization. With optimization, the ICE is still there...

[Bug tree-optimization/120871] missing tail call optimization on RVO

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=71761

--- Comment #4 from Andrew Pinski  ---
The secondary part would be PR 71761.

[Bug tree-optimization/119493] [12/13/14/15/16 Regression] missing tail call to self with struct in some cases

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119493

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Target Milestone|12.5|16.0
 Status|ASSIGNED|RESOLVED

--- Comment #22 from Andrew Pinski  ---
This was fixed in GCC 16 by r16-180 .

[Bug tree-optimization/119493] [12/13/14/15/16 Regression] missing tail call to self with struct in some cases

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119493

Andrew Pinski  changed:

   What|Removed |Added

 CC||kndevl at outlook dot com

--- Comment #23 from Andrew Pinski  ---
*** Bug 120869 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/120869] gcc does not eliminate short-circuiting tail calls

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120869

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |16.0

--- Comment #3 from Andrew Pinski  ---
Eliminated tail recursion in bb 5 : _20 = go (D.21560);

Oh yes this was fixed in GCC trunk by r16-180-gb0120fa9838f8f .

[Bug tree-optimization/97640] GCC doesn't optimize-out outside-affecting lambdas within y-combinator while clang does.

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97640
Bug 97640 depends on bug 119493, which changed state.

Bug 119493 Summary: [12/13/14/15/16 Regression] missing tail call to self with 
struct in some cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119493

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug target/120870] New: CPython without GIL ("free-threaded") miscompiled with preserve_none and PGO

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

Bug ID: 120870
   Summary: CPython without GIL ("free-threaded") miscompiled with
preserve_none and PGO
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
CC: hjl.tools at gmail dot com
  Target Milestone: ---

```
$ git clone https://github.com/python/cpython
$ git rev-parse HEAD
698bab5a4031c8f54e04e1dd42bcbe3e4564eba5
$ ./configure --with-tail-call-interp --disable-gil --enable-optimizations
CFLAGS="-O2 -march=znver2 -ggdb3"
$ make -j$(nproc) -l$(nproc)
[...]
./_bootstrap_python ./Programs/_freeze_module.py runpy ./Lib/runpy.py
Python/frozen_modules/runpy.h
make[2]: *** [Makefile:1948: Python/frozen_modules/runpy.h] Segmentation fault
make[2]: Leaving directory '/tmp/cpython'
make[2]: *** Waiting for unfinished jobs
```

Fortunately, it fails in the instrumentation stage with -fprofile-generate.

```
$ gdb --args ./_bootstrap_python ./Programs/_freeze_module.py runpy
./Lib/runpy.py Python/frozen_modules/runpy.h
Reading symbols from ./_bootstrap_python...
(gdb) r
Starting program: /tmp/cpython/_bootstrap_python ./Programs/_freeze_module.py
runpy ./Lib/runpy.py Python/frozen_modules/runpy.h
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x558b277c in _Py_Check_ArgsIterable
(tstate=tstate@entry=0x5603d858 <_PyRuntime+346200>,
func=func@entry=0x7fffccf0, args=args@entry=0x77f91410)
at Python/ceval.c:3266
3266        if (Py_TYPE(args)->tp_iter == NULL && !PySequence_Check(args)) {
(gdb) bt
#0  0x558b277c in _Py_Check_ArgsIterable
(tstate=tstate@entry=0x5603d858 <_PyRuntime+346200>,
func=func@entry=0x7fffccf0, args=args@entry=0x77f91410)
at Python/ceval.c:3266
#1  0x558e6382 in _TAIL_CALL_CALL_FUNCTION_EX (frame=0x77f91388,
stack_pointer=, tstate=0x5603d858 <_PyRuntime+346200>,
next_instr=0x32f623c03f8, oparg=0)
at ./Include/internal/pycore_stackref.h:364
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb)
```

[Bug tree-optimization/103720] bogus warning from -Wuninitialized + -ftrivial-auto-var-init + O2

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103720

Sam James  changed:

   What|Removed |Added

   Target Milestone|--- |12.0

[Bug c++/120871] missing tail call optimization on RVO

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||missed-optimization,
   ||tail-call
 CC||pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
This might be a dup of a bug which I am working on and have already posted a
patch for.

[Bug c++/120868] "unexpected AST of kind switch_expr" in constexpr template function

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120868

--- Comment #2 from Andrew Pinski  ---
Trying to reduce it ...

[Bug tree-optimization/120871] missing tail call optimization on RVO

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

--- Comment #5 from Andrew Pinski  ---
Note clang rejects musttail here:
```
tail call requires that the return value, all parameters, and any temporaries
created by the expression are trivially destructible
```

:)

[Bug target/120870] CPython without GIL ("free-threaded") miscompiled with preserve_none and PGO

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

--- Comment #2 from Sam James  ---
I am looking at another miscompilation first (unrelated to shrink-wrapping or
tail calls or ...) so not poking further at least right now.

[Bug c++/120871] missing tail call optimization on RVO

2025-06-29 Thread rockeet at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

--- Comment #1 from rockeet  ---
If `~Slice();` is deleted, g++ and clang recognize the tail call, but icc fails.

[Bug c++/120871] New: missing tail call optimization on RVO

2025-06-29 Thread rockeet at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

Bug ID: 120871
   Summary: missing tail call optimization on RVO
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rockeet at gmail dot com
  Target Milestone: ---

struct Slice {
const char* data;
unsigned long size;
~Slice();
};
Slice get_s_impl();
Slice get_s() { return get_s_impl(); }

g++ produces:

get_s():
push rbx
mov rbx, rdi
call get_s_impl()
mov rax, rbx
pop rbx
ret

icc recognizes the tail call and produces:

get_s():
..B1.1: # Preds ..B1.0
jmp   get_s_impl()   #7.24

clang also misses this tail call optimization.
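
A rough C analog of the same shape is sketched below, for experimenting outside C++. It rests on an assumption: a plain C struct larger than 16 bytes is returned in memory on x86-64 just as `Slice` is (there because of its non-trivial destructor), but whether it goes through exactly the same compiler path has not been checked here. `struct slice3` and its extra `capacity` member are hypothetical.

```c
/* Larger than 16 bytes, so the x86-64 System V ABI returns it in memory
   through a hidden pointer supplied by the caller. */
struct slice3 {
    const char *data;
    unsigned long size;
    unsigned long capacity;
};

/* Stand-in for an external function; noinline keeps the call visible. */
#if defined(__GNUC__)
__attribute__((noinline))
#endif
struct slice3 get_s_impl(void)
{
    struct slice3 s = { "abc", 3, 8 };
    return s;
}

/* Candidate tail call: the hidden return pointer can be forwarded as-is. */
struct slice3 get_s(void)
{
    return get_s_impl();
}
```

Compiling this with -O2 -S and looking for `jmp get_s_impl` versus `call get_s_impl` in `get_s` shows whether a given compiler forwards the return slot.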

[Bug tree-optimization/120869] New: gcc does not eliminate short-circuiting tail calls

2025-06-29 Thread kndevl at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120869

Bug ID: 120869
   Summary: gcc does not eliminate short-circuiting tail calls
   Product: gcc
   Version: 15.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kndevl at outlook dot com
  Target Milestone: ---

https://godbolt.org/z/eGYfxK4PE

Reproducer (C++23)

```
#include <span>
using namespace std;

template 
inline bool go(span s) {
if (s.size() <= 1) return true;
if (s.front() == s.back()) return go(s.subspan(1, s.size() - 2));
if constexpr (can_delete) {
return go(s.subspan(1, s.size() - 1))
|| go(s.subspan(0, s.size() - 1));
} else {
return false;
}
}

bool validPalindrome(span s) {
return go(s);
}
```

clang 20.1.0 for x86-64 compiles `validPalindrome` to a tight loop with `-O2
-std=c++23` whereas gcc generates explicit calls to go. clang also
results in ~5x fewer instructions.

With `-Os`, clang produces the same code as with `-O2`, whereas gcc eliminates
all but one of the tail calls to `go`.
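
For reference, the loop that tail-call elimination is expected to produce can be written by hand. This is only a sketch under assumptions: `go_loop` is a hypothetical name, `std::string_view` stands in for the `span` of the reproducer, and the template arguments of the inner calls (lost in the text above) are assumed to be `false`.

```cpp
#include <cassert>
#include <string_view>

// Hand-written loop form of the recursive go() from the reproducer.
// Assumption: the deleting branch calls the non-deleting instantiation.
template <bool can_delete>
bool go_loop(std::string_view s) {
    // The self tail call on matching ends becomes one loop iteration.
    while (s.size() > 1) {
        if (s.front() != s.back()) {
            if constexpr (can_delete) {
                // `a || b` with b in tail position: test a first, then let
                // the result of b be the result of the whole function.
                if (go_loop<false>(s.substr(1, s.size() - 1))) return true;
                return go_loop<false>(s.substr(0, s.size() - 1));
            } else {
                return false;
            }
        }
        s = s.substr(1, s.size() - 2);  // drop the matching first and last element
    }
    return true;
}

// Small self-check, usable from main().
inline void check_go_loop() {
    assert(go_loop<true>("abca"));    // deleting 'b' or 'c' leaves a palindrome
    assert(!go_loop<false>("abca"));  // no deletion allowed
    assert(!go_loop<true>("abc"));
}
```

The second call in the deleting branch is the short-circuiting tail call from the summary: once the first operand is false, nothing remains to be done in the caller after the second call returns.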

[Bug tree-optimization/120871] missing tail call optimization on RVO

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

Andrew Pinski  changed:

   What|Removed |Added

  Component|c++ |tree-optimization
   Last reconfirmed||2025-06-30
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |pinskia at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Andrew Pinski  ---
Oh no but it is closely related to it.

Cannot tail-call: return value in memory: *_2(D) = get_s_impl (); [return slot
optimization]


I will look into fixing this.
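
The "return value in memory" case can be pictured with a hand-lowered version of the test case. This is only a sketch: `SliceData`, `get_s_impl_lowered` and `get_s_lowered` are hypothetical names, and it assumes the x86-64 SysV convention, where the caller passes the address of the return slot as a hidden first argument and the callee returns that same address.

```cpp
#include <cassert>

// Plain-data stand-in for Slice. The real Slice has a user-declared
// destructor, which is what forces the return value into memory.
struct SliceData {
    const char* data;
    unsigned long size;
};

// Hypothetical lowered form of `Slice get_s_impl();`:
// the hidden argument points at the caller's return slot.
inline SliceData* get_s_impl_lowered(SliceData* ret) {
    ret->data = "abc";
    ret->size = 3;
    return ret;  // the slot address is handed back to the caller
}

// Hypothetical lowered form of `Slice get_s() { return get_s_impl(); }`.
// get_s forwards its own return slot, so no work is left after the call
// and the call can in principle be a plain jump.
inline SliceData* get_s_lowered(SliceData* ret) {
    return get_s_impl_lowered(ret);
}

// Small self-check, usable from main().
inline void check_return_slot() {
    SliceData s{};
    assert(get_s_lowered(&s) == &s);
    assert(s.size == 3);
}
```

In this form the `mov rbx, rdi` / `mov rax, rbx` pair in the g++ output above is redundant, since the callee already returns the slot address.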

[Bug tree-optimization/120869] gcc does not eliminate short-circuiting tail calls

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120869

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|FIXED   |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
So just a dup then.

*** This bug has been marked as a duplicate of bug 119493 ***

[Bug libstdc++/120527] Native platform wait on darwin

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120527

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-06-30

--- Comment #11 from Andrew Pinski  ---
.

[Bug target/120870] CPython without GIL ("free-threaded") miscompiled with preserve_none and PGO

2025-06-29 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

H.J. Lu  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #5 from H.J. Lu  ---
(In reply to Sam James from comment #0)
> ```
> $ git clone https://github.com/python/cpython
> $ git rev-parse HEAD
> 698bab5a4031c8f54e04e1dd42bcbe3e4564eba5
> $ ./configure --with-tail-call-interp --disable-gil --enable-optimizations
> CFLAGS="-O2 -march=znver2 -ggdb3"
>

$ ./configure --with-tail-call-interp CFLAGS="-O2 -march=znver2 -ggdb3"

also failed to build.

[Bug target/120870] CPython miscompiled with preserve_none and CFLAGS="-O2 -march=znver2 -ggdb3"

2025-06-29 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

--- Comment #6 from H.J. Lu  ---
(In reply to H.J. Lu from comment #5)
> (In reply to Sam James from comment #0)
> > ```
> > $ git clone https://github.com/python/cpython
> > $ git rev-parse HEAD
> > 698bab5a4031c8f54e04e1dd42bcbe3e4564eba5
> > $ ./configure --with-tail-call-interp --disable-gil --enable-optimizations
> > CFLAGS="-O2 -march=znver2 -ggdb3"
> >
> 
> $ ./configure --with-tail-call-interp CFLAGS="-O2 -march=znver2 -ggdb3"
> 
> also failed to build.

$ ./configure --with-tail-call-interp --disable-gil --enable-optimizations

works.

[Bug target/120866] [16 Regression] pdp11-aout, powerpc-ibm-aix7.1 and powerpc-ibm-aix7.2 crosscompilers fail to build

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120866

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Priority|P3  |P2
   Last reconfirmed||2025-06-30
 Ever confirmed|0   |1

--- Comment #5 from Andrew Pinski  ---
.

[Bug tree-optimization/120871] missing tail call optimization on RVO

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120871

--- Comment #6 from Andrew Pinski  ---
A simple tail recursion case:
```
struct Slice {
const char* data;
unsigned long size;
~Slice();
};
Slice get_s_impl();
Slice get_s(bool t)
{
  if (!t)
    return get_s(!t);
  return {};
}
```

Which is also missed with `-O2 -fno-inline` (note -fno-inline is needed as GCC
knows how to handle inlining here).

[Bug target/120870] CPython miscompiled with preserve_none and CFLAGS="-O2 -march=znver2 -ggdb3"

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120870

--- Comment #7 from Sam James  ---
Ah, sorry for missing it. With my usual setup, only --disable-gil failed (with
more options), so I started with a bad assumption.

[Bug target/120866] [16 Regression] pdp11-aout crosscompiler fails to build

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120866

--- Comment #1 from Sam James  ---
Huh, it's really a trunk regression? I can't yet think of which change would've
done this.

[Bug target/120866] [16 Regression] pdp11-aout, powerpc-ibm-aix7.1 and powerpc-ibm-aix7.2 crosscompilers fail to build

2025-06-29 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120866

Filip Kastl  changed:

   What|Removed |Added

Summary|[16 Regression] pdp11-aout  |[16 Regression] pdp11-aout,
   |crosscompiler fails to  |powerpc-ibm-aix7.1 and
   |build   |powerpc-ibm-aix7.2
   ||crosscompilers fail to
   ||build
 Target|pdp11-aout  |pdp11-aout,
   ||powerpc-ibm-aix7.1,
   ||powerpc-ibm-aix7.2

--- Comment #2 from Filip Kastl  ---
A very similar thing also happens with powerpc-ibm-aix7.1 and
powerpc-ibm-aix7.2:

g++  -fno-PIE -c   -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE  
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
-Wno-error=narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings  -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I/home/fkastl/gcc/src/gcc -I/home/fkastl/gcc/src/gcc/.
-I/home/fkastl/gcc/src/gcc/../include 
-I/home/fkastl/gcc/src/gcc/../libcpp/include
-I/home/fkastl/gcc/src/gcc/../libcody 
-I/home/fkastl/gcc/src/gcc/../libdecnumber
-I/home/fkastl/gcc/src/gcc/../libdecnumber/dpd -I../libdecnumber
-I/home/fkastl/gcc/src/gcc/../libbacktrace   -o tree.o -MT tree.o -MMD -MP -MF
./.deps/tree.TPo /home/fkastl/gcc/src/gcc/tree.cc
In file included from ./tm.h:25,
 from /home/fkastl/gcc/src/gcc/backend.h:28,
 from /home/fkastl/gcc/src/gcc/tree.cc:33:
/home/fkastl/gcc/src/gcc/tree.cc: In function ‘tree_node*
generate_internal_label(const char*)’:
/home/fkastl/gcc/src/gcc/config/rs6000/xcoff.h:206:30: error:
‘rs6000_xcoff_strip_dollar’ was not declared in this scope
  206 |   sprintf (LABEL, "*%s..%u", rs6000_xcoff_strip_dollar (PREFIX),
(unsigned) (NUM))
  |  ^
/home/fkastl/gcc/src/gcc/tree.cc:819:3: note: in expansion of macro
‘ASM_GENERATE_INTERNAL_LABEL’
  819 |   ASM_GENERATE_INTERNAL_LABEL (tmp, prefix, num++);
  |   ^~~

[Bug target/120866] [16 Regression] pdp11-aout crosscompiler fails to build

2025-06-29 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120866

Filip Kastl  changed:

   What|Removed |Added

   Target Milestone|--- |16.0

[Bug tree-optimization/120358] [15/16 regression] qtbase-6.9.0 miscompiled since r15-580-gf3e5f4c58591f5

2025-06-29 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358

--- Comment #16 from rguenther at suse dot de  ---
> On 29.06.2025 at 02:08, sjames at gcc dot gnu.org wrote:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358
> 
> --- Comment #13 from Sam James  ---
> FWIW, it fails with -fno-strict-aliasing still (and the original).

Does -fno-lifetime-dse help?


[Bug tree-optimization/120358] [15/16 regression] qtbase-6.9.0 miscompiled since r15-580-gf3e5f4c58591f5

2025-06-29 Thread holger--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358

--- Comment #17 from Holger Hoffstätte  ---
Created attachment 61755
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61755&action=edit
Readable reproducer for debugging

(In reply to rguent...@suse.de from comment #16)
> Does -fno-lifetime-dse help?

It does not (in any example). I also get the impression that small.cxx now
demonstrates a different problem than before, but that might just be fallout
from what I'm about to show; this hole goes deeper.

With my "readable.cpp" attachment (still needs Qt):

holger>gdb a.out
(gdb) b dump
Breakpoint 1 at 0x148f: dump. (2 locations)
(gdb) r
Starting program: /mnt/tux/holger/Projects/qt6-gcc15/a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1.1, dump > (tok=...) at readable.cpp:8
8   for (auto &x : tok) {
(gdb) p tok.m_haystack 
$1 = {m_size = 19, m_data = 0x5556f650 u"a.ab.abc.abcd.abcde"}

--> looks fine

(gdb) c
Continuing.
1 2 3 4 5 

--> as expected

Breakpoint 1.2, dump > (tok=...) at
readable.cpp:8
8   for (auto &x : tok) {
(gdb) p tok.m_haystack 
$2 = {m_size = 93824992343616, m_data = 0x5556f650 u"a.ab.abc.abcd.abcde"}

--> m_size is obviously bogus

(gdb) c
Continuing.
1 2 3 4 5 18446744073709551615 

The interesting difference here is that the tokenizer "pins" moved-into
haystacks and that's when things go wrong.

Unpinned (good):

Breakpoint 1.1, dump > (tok=...) at readable.cpp:8
8   for (auto &x : tok) {
(gdb) p tok
$1 = (QStringTokenizer &) :
{> =
{> = {}, }, > =
{> = {}, }, > =
{ = {
  m_sb = {> =
{> = {static IntegerSize = 4, 
i = }, }, }, m_cs =
}, m_haystack = {m_size = 19, m_data = 0x5556f650
u"a.ab.abc.abcd.abcde"}, 
m_needle = {ucs = }}, }
(gdb) c
Continuing.
1 2 3 4 5 

Pinned (wrong):

Breakpoint 1.2, dump > (tok=...) at
readable.cpp:8
8   for (auto &x : tok) {
(gdb) p tok
$2 = (QStringTokenizer &) @0x7fffde20:
{> =
{> = {m_string = {d = {
  d = 0x002e, ptr = 0x5556f650 u"a.ab.abc.abcd.abcde", 
  size = 19}}}, },
> = {> = {}, },
> = { = {
  m_sb = {> =
{> = {static IntegerSize = 4, 
i = 1}, }, }, m_cs =
Qt::CaseSensitive}, m_haystack = {m_size = 93824992343616, m_data =
0x5556f650 u"a.ab.abc.abcd.abcde"}, 
m_needle = {ucs = 46 u'.'}}, }
(gdb) c
Continuing.
1 2 3 4 5 18446744073709551615 

We can see that Pinning has size=19 (correct), but m_haystack
has m_size = 93824992343616 (wrong).
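
To make the expected invariant explicit, here is a minimal sketch of the pinning pattern using only the standard library. All names (`Pinned`, `TokBase`, `Tokenizer`) are hypothetical stand-ins, not the Qt classes, and the layout is an assumption based on the gdb dumps above: an owning base that keeps a moved-in haystack alive, and a second base holding a non-owning view of it.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for the pinning base: owns a moved-in haystack.
struct Pinned {
    std::string m_string;
};

// Hypothetical stand-in for the tokenizer base: non-owning view.
struct TokBase {
    std::string_view m_haystack;
};

struct Tokenizer : Pinned, TokBase {
    explicit Tokenizer(std::string&& h)
        : Pinned{std::move(h)},  // bases are initialized in declaration order
          TokBase{m_string}      // the view is taken from the pinned copy
    {}
    // Copying would leave the view pointing into the source object.
    Tokenizer(const Tokenizer&) = delete;
    Tokenizer& operator=(const Tokenizer&) = delete;
};

// Small self-check, usable from main().
inline void check_pinning() {
    Tokenizer tok(std::string("a.ab.abc.abcd.abcde"));
    // The view and the pinned string must agree on pointer and size.
    assert(tok.m_haystack.data() == tok.m_string.data());
    assert(tok.m_haystack.size() == tok.m_string.size());
    assert(tok.m_haystack.size() == 19);
}
```

In the failing dump the pointer still agrees but the size does not, which is the invariant checked by the last two assertions.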

[Bug tree-optimization/120358] [15/16 regression] qtbase-6.9.0 miscompiled since r15-580-gf3e5f4c58591f5

2025-06-29 Thread holger--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358

--- Comment #18 from Holger Hoffstätte  ---
Furthermore, adding an explicit std::move to the first test like this:

auto joined = expected.join(u'.');
auto tok = qTokenize(std::move(joined), QLatin1Char('.'),
Qt::CaseSensitive, Qt::SkipEmptyParts);

results in *both* tests using HaystackPinning and working as expected - though
in the second case that might be accidental state reuse, which is also
concerning:

Breakpoint 1, dump > (tok=...) at
/usr/include/qt6/QtCore/qstringtokenizer.h:377
377 auto QStringTokenizerBase::next(tokenizer_state
state) const noexcept -> next_result
(gdb) p tok
$6 = (QStringTokenizer &) @0x7fffde40:
{> =
{> = {m_string = {d = {
  d = 0x5556f640, ptr = 0x5556f650 u"a.ab.abc.abcd.abcde", 
  size = 19}}}, },
> = {> = {}, },
> = { = {
  m_sb = {> =
{> = {static IntegerSize = 4, 
i = 1}, }, }, m_cs =
Qt::CaseSensitive}, m_haystack = {m_size = 19, m_data = 0x5556f650
u"a.ab.abc.abcd.abcde"}, m_needle = {
  ucs = 46 u'.'}}, }
(gdb) c
Continuing.
1 2 3 4 5 

Breakpoint 1, dump > (tok=...) at
/usr/include/qt6/QtCore/qstringtokenizer.h:377
377 auto QStringTokenizerBase::next(tokenizer_state
state) const noexcept -> next_result
(gdb) p tok
$7 = (QStringTokenizer &) @0x7fffde40:
{> =
{> = {m_string = {d = {
  d = 0x5556f640, ptr = 0x5556f650 u"a.ab.abc.abcd.abcde", 
  size = 19}}}, },
> = {> = {}, },
> = { = {
  m_sb = {> =
{> = {static IntegerSize = 4, 
i = 1}, }, }, m_cs =
Qt::CaseSensitive}, m_haystack = {m_size = 19, m_data = 0x5556f650
u"a.ab.abc.abcd.abcde"}, m_needle = {
  ucs = 46 u'.'}}, }
(gdb) c
Continuing.
1 2 3 4 5 

Consequently the problematic second test (testNotOK) can be "fixed" by:

  auto tok = qTokenize(std::move(expected.join(u'.')), QLatin1Char('.'),
Qt::CaseSensitive, Qt::SkipEmptyParts);

Hope this helps.

[Bug tree-optimization/120358] [15/16 regression] qtbase-6.9.0 miscompiled since r15-580-gf3e5f4c58591f5

2025-06-29 Thread holger--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358

--- Comment #19 from Holger Hoffstätte  ---
(In reply to Holger Hoffstätte from comment #18)
> Consequently the problematic second test (testNotOK) can be "fixed" by:
> 
>   auto tok = qTokenize(std::move(expected.join(u'.')), QLatin1Char('.'),
> Qt::CaseSensitive, Qt::SkipEmptyParts);

Sorry, no - this does *not* fix it. Sorry.

[Bug c++/120868] New: "unexpected AST of kind switch_expr" in constexpr template function

2025-06-29 Thread sdowney at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120868

Bug ID: 120868
   Summary: "unexpected AST of kind switch_expr" in constexpr
template function
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sdowney at gmail dot com
  Target Milestone: ---

Calling the google test macro EXPECT_EQ in a constexpr template function
results in:

:5:16: sorry, unimplemented: unexpected AST of kind switch_expr
:5:16: internal compiler error: in potential_constant_expression_1, at
cp/constexpr.cc:11183
0x2844425 diagnostic_context::diagnostic_impl(rich_location*,
diagnostic_metadata const*, diagnostic_option_id, char const*, __va_list_tag
(*) [1], diagnostic_t)
???:0
0x2866fc6 internal_error(char const*, ...)
???:0
0xaede72 fancy_abort(char const*, int, char const*)
???:0
0xb6afc5 require_potential_rvalue_constant_expression(tree_node*)
???:0
0xb6b1a5 explain_invalid_constexpr_fn(tree_node*)
???:0
0xb66b53 cxx_constant_value(tree_node*, tree_node*, int)
???:0
0xda189d finish_static_assert(tree_node*, tree_node*, unsigned long, bool,
bool)
???:0
0xd166e3 c_parse_file()
???:0
0xe7e469 c_common_parse_file()
???:0

https://compiler-explorer.com/z/nE3x3dzE5 (macros expanded) 

Removing the switch reveals a different constexpr error. 

Reproduced with trunk as well as with GCC-15 built locally. 

```c++
#include 
#include 

template 
constexpr auto constexpr_iterator_test(U opt) -> bool {
using iterator = typename std::remove_reference_t::iterator;
switch (0)
case 0:
default:
if (const ::testing ::AssertionResult gtest_ar =
(::testing ::internal ::EqHelper ::Compare(
"opt.begin()", "iterator()", opt.begin(), iterator(
;
else
::testing ::internal ::AssertHelper(
::testing ::TestPartResult ::kNonFatalFailure,
"/home/sdowney/src/Optional26/constexpr-20/tests/beman/"
"optional/optional_range_support.t.cpp",
444, gtest_ar.failure_message()) = ::testing ::Message();
return true;
};

int main() {
static_assert(constexpr_iterator_test(std::vector{}));
EXPECT_TRUE(constexpr_iterator_test(std::vector{}));
}
```

[Bug target/120866] New: [16 Regression] pdp11-aout crosscompiler fails to build

2025-06-29 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120866

Bug ID: 120866
   Summary: [16 Regression] pdp11-aout crosscompiler fails to
build
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: build
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pheeck at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: pdp11-aout

If I try to build a pdp11-aout crosscompiler (without bootstrap):

/home/fkastl/gcc/src/configure --disable-bootstrap --enable-checking
--disable-libsanitizer --prefix=/home/fkastl/gcc/inst --config-cache
--target=pdp11-aout

I run into this error:

g++  -fno-PIE -c   -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE  
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall
-Wno-error=narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings  -DHAVE_CONFIG_H -fno-PIE -I. -I.
-I/home/fkastl/gcc/src/gcc -I/home/fkastl/gcc/src/gcc/.
-I/home/fkastl/gcc/src/gcc/../include 
-I/home/fkastl/gcc/src/gcc/../libcpp/include
-I/home/fkastl/gcc/src/gcc/../libcody 
-I/home/fkastl/gcc/src/gcc/../libdecnumber
-I/home/fkastl/gcc/src/gcc/../libdecnumber/dpd -I../libdecnumber
-I/home/fkastl/gcc/src/gcc/../libbacktrace   -o tree.o -MT tree.o -MMD -MP -MF
./.deps/tree.TPo /home/fkastl/gcc/src/gcc/tree.cc
In file included from ./tm.h:21,
 from /home/fkastl/gcc/src/gcc/backend.h:28,
 from /home/fkastl/gcc/src/gcc/tree.cc:33:
/home/fkastl/gcc/src/gcc/tree.cc: In function ‘tree_node*
generate_internal_label(const char*)’:
/home/fkastl/gcc/src/gcc/config/pdp11/pdp11.h:561:3: error:
‘pdp11_gen_int_label’ was not declared in this scope
  561 |   pdp11_gen_int_label ((LABEL), (PREFIX), (NUM))
  |   ^~~
/home/fkastl/gcc/src/gcc/tree.cc:819:3: note: in expansion of macro
‘ASM_GENERATE_INTERNAL_LABEL’
  819 |   ASM_GENERATE_INTERNAL_LABEL (tmp, prefix, num++);
  |   ^~~

[Bug target/120866] [16 Regression] pdp11-aout, powerpc-ibm-aix7.1 and powerpc-ibm-aix7.2 crosscompilers fail to build

2025-06-29 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120866

Andreas Schwab  changed:

   What|Removed |Added

 Blocks||118904

--- Comment #4 from Andreas Schwab  ---
commit 0337e3c2743 added a call to ASM_GENERATE_INTERNAL_LABEL without
including "tm_p.h".


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118904
[Bug 118904] [modules] ICE with std::source_location::current in inline
function

[Bug target/120866] [16 Regression] pdp11-aout, powerpc-ibm-aix7.1 and powerpc-ibm-aix7.2 crosscompilers fail to build

2025-06-29 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120866

--- Comment #3 from Filip Kastl  ---
(In reply to Sam James from comment #1)
> Huh, it's really a trunk regression? I can't yet think of which change
> would've done this.

It seems to be.  I've just tested this with trunk.

[Bug c++/120868] "unexpected AST of kind switch_expr" in constexpr template function

2025-06-29 Thread sdowney at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120868

--- Comment #1 from Steve Downey  ---
Created attachment 61756
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61756&action=edit
-freport-bug preprocessed output -- gzipped for size

[Bug libstdc++/120527] Native platform wait on darwin

2025-06-29 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120527

--- Comment #10 from Jonathan Wakely  ---
Oh nice, there's an official, documented API for it now:
https://developer.apple.com/documentation/os/os_sync_wait_on_address

[Bug middle-end/120858] __builtin_rev_crc64_data64 poorly optimised when computing crc32

2025-06-29 Thread sh1.gccbug at tikouka dot nz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120858

--- Comment #5 from Simon H.  ---
Created attachment 61757
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61757&action=edit
Source code confirming equivalent behaviour for four different variants.

As well as the generic clmul optimisation for 64-bit, there's also the need to
detect that the operation can be replaced with a single instruction.  The
example implementation of crc_04C11DB7_u64() using the crc64 builtin gives the
same results as the crc32x instruction.  It merely stuffs zeroes in the bottom
of the polynomial to make it emulate a 32-bit crc because there's no 32-bit crc
builtin with 64-bit argument.

Demonstration code attached (same as the godbolt page but with an extra
column).
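
The zero-stuffing trick can be checked independently of the builtins with a bitwise model. This is a sketch, not the attached code: `reflect`, `rev_crc_data64` and `check_crc_padding` are hypothetical helpers, and the model assumes the polynomial is given in normal form without the leading term.

```cpp
#include <cassert>
#include <cstdint>

// Bit-reverse the low `width` bits of v.
constexpr std::uint64_t reflect(std::uint64_t v, int width) {
    std::uint64_t r = 0;
    for (int i = 0; i < width; ++i)
        if ((v >> i) & 1)
            r |= std::uint64_t{1} << (width - 1 - i);
    return r;
}

// Bitwise model of a reflected (LSB-first) CRC over 64 data bits.
// `poly` is in normal form without the leading term; `width` is the
// CRC width in bits.
constexpr std::uint64_t rev_crc_data64(std::uint64_t crc, std::uint64_t data,
                                       std::uint64_t poly, int width) {
    const std::uint64_t rpoly = reflect(poly, width);
    for (int i = 0; i < 64; ++i) {
        const bool bit = ((crc ^ (data >> i)) & 1) != 0;
        crc >>= 1;
        if (bit)
            crc ^= rpoly;
    }
    return crc;
}

// A 32-bit CRC and a 64-bit CRC whose polynomial has 32 zero bits stuffed
// in at the bottom give the same result for the same input.
inline void check_crc_padding() {
    const std::uint64_t crc = 0xFFFFFFFFu;
    const std::uint64_t data = 0x0123456789ABCDEFu;
    assert(rev_crc_data64(crc, data, 0x04C11DB7u, 32) ==
           rev_crc_data64(crc, data, std::uint64_t{0x04C11DB7u} << 32, 64));
}
```

Both polynomials reflect to the same 32-bit constant, so the upper half of the 64-bit state stays zero and the two loops perform identical steps.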

[Bug fortran/119905] [OpenMP] Fortran deep mapping of allocatable components: Recursive types and FIRSTPRIVATE/PRIVATE not handled

2025-06-29 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119905

--- Comment #2 from Tobias Burnus  ---
This also needs to be handled with the ALLOCATE clause, cf. PR113436; see also PR95506.

[Bug middle-end/113436] [OpenMP] 'allocate' clause has no effect for (first)private on 'target' directives

2025-06-29 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113436

--- Comment #3 from Tobias Burnus  ---
Patch by Kwok: https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687685.html

Follow-up for allocatables: see PR119905.

[Bug testsuite/77684] many tree-prof testsuite failures in parallel make check

2025-06-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77684

--- Comment #11 from Jan Hubicka  ---
*** Bug 86404 has been marked as a duplicate of this bug. ***

[Bug testsuite/86404] UNRESOLVED/UNSUPPORTED gcov test results due to Permission error mapping pages

2025-06-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86404

Jan Hubicka  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org
 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #5 from Jan Hubicka  ---
This is a dup.

*** This bug has been marked as a duplicate of bug 77684 ***

[Bug middle-end/120608] [15/16 regression] error: cannot tail-call: other reasons when using address sanitizer with musttail

2025-06-29 Thread carlosgalvezp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120608

--- Comment #21 from Carlos Galvez  ---
> Why are you using the attribute at -O0?

We typically run our sanitizer builds at -O0 to ensure no UB is optimized away
before the sanitizer gets a chance to detect it. Is there a more suitable
optimization level for sanitizer?

I can also mention the problematic code is not ours, but as good practice we
aim to build all source code (internal and external) with the same flags.

[Bug testsuite/77684] many tree-prof testsuite failures in parallel make check

2025-06-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77684

Jan Hubicka  changed:

   What|Removed |Added

 Blocks||120867
 CC||hubicka at gcc dot gnu.org
 Status|NEW |WAITING

--- Comment #10 from Jan Hubicka  ---
For me parallel check is quite good.  I get 3 failures in peeling testcases
that probably should be disabled for AutoFDO.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120867
[Bug 120867] [metabug] AutoFDO issues

[Bug lto/66229] LTO fails with -fauto-profile on mcf

2025-06-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66229

Jan Hubicka  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED
 CC||hubicka at gcc dot gnu.org

--- Comment #5 from Jan Hubicka  ---
Let's say it is fixed. Mcf builds for me now.

[Bug tree-optimization/120867] New: [metabug] AutoFDO issues

2025-06-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120867

Bug ID: 120867
   Summary: [metabug] AutoFDO issues
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

[Bug tree-optimization/120867] [metabug] AutoFDO issues

2025-06-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120867

Jan Hubicka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
  Alias||AutoFDO
   Last reconfirmed||2025-06-29

[Bug middle-end/120614] 525.x264_r is ~30% slower with AutoFDO

2025-06-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120614

Jan Hubicka  changed:

   What|Removed |Added

   Last reconfirmed||2025-06-29
 Blocks||120867
 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING

--- Comment #14 from Jan Hubicka  ---
x264 is around 4.5% slower on my setup. Is the large regression still
reproducible?


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120867
[Bug 120867] [metabug] AutoFDO issues

[Bug gcov-profile/120229] [GCOV] AutoFDO cannot distinguish privatized functions within an LTO partition

2025-06-29 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120229

Jan Hubicka  changed:

   What|Removed |Added

 Blocks||120867
 Ever confirmed|0   |1
   Last reconfirmed||2025-06-29
 Status|UNCONFIRMED |NEW
 CC||hubicka at gcc dot gnu.org

--- Comment #1 from Jan Hubicka  ---
confirmed


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120867
[Bug 120867] [metabug] AutoFDO issues

[Bug libstdc++/120527] Native platform wait on darwin

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120527

Andrew Pinski  changed:

   What|Removed |Added

   See Also|https://github.com/llvm/llv |https://github.com/llvm/llv
   |m-project/issues/146223 |m-project/issues/146142

--- Comment #9 from Andrew Pinski  ---
I don't know how that happened, but I copied the wrong issue number.
https://github.com/llvm/llvm-project/issues/146142

[Bug libstdc++/120527] Native platform wait on darwin

2025-06-29 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120527

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://github.com/llvm/llv
   ||m-project/issues/146223

--- Comment #8 from Andrew Pinski  ---
See https://github.com/llvm/llvm-project/issues/146223

[Bug tree-optimization/120358] [15/16 regression] qtbase-6.9.0 miscompiled since r15-580-gf3e5f4c58591f5

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358

--- Comment #20 from Sam James  ---
(In reply to Holger Hoffstätte from comment #17)
>  I also get the impression that small.cxx now demonstrates a different 
> problem than before, but that might just be fallout from what I'm about to 
> show; this hole goes deeper.

I can always re-run it with some specific condition keeping the list
comparisons, but given it still kept the QList and dies w/o
-fno-tree-pta, I figured it was the same. But I'm happy to spend some CPU on it
if it provides some insight if people want ofc.

(I had the same thought and went back and forth on it.)

Next step if nobody has any insight yet is to do the debug counter I think and
also see if anything leaps out from dumps, now that the file is much smaller.

[Bug tree-optimization/120358] [15/16 regression] qtbase-6.9.0 miscompiled since r15-580-gf3e5f4c58591f5

2025-06-29 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358

--- Comment #21 from Sam James  ---
(In reply to Holger Hoffstätte from comment #18)
> results in *both* tests using HaystackPinning and working as expected -
> though in the second case that might be accidental state reuse, which is
> also concerning:

-fstack-reuse=none doesn't help, but that doesn't prove much if it's a C++ FE
issue.

[Bug tree-optimization/120358] [15/16 regression] qtbase-6.9.0 miscompiled since r15-580-gf3e5f4c58591f5

2025-06-29 Thread holger--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120358

Holger Hoffstätte  changed:

   What|Removed |Added

  Attachment #61755|0   |1
is obsolete||

--- Comment #22 from Holger Hoffstätte  ---
Created attachment 61758
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61758&action=edit
Readable reproducer for debugging

Slightly more reduced, no need for the join business.