[Bug target/118890] ubsan bootstrap failure for powerpc64le-unknown-linux-gnu

2025-06-20 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118890

Sam James  changed:

   What|Removed |Added

   Last reconfirmed||2025-06-20
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

[Bug c++/120716] [16 regression] ICE on https://eel.is/c++draft/expr.const#example-3 in C++23 since r16-149

2025-06-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120716

Jakub Jelinek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
 CC||jason at gcc dot gnu.org
   Last reconfirmed||2025-06-20
Summary|[16 regression] ICE on  |[16 regression] ICE on
   |https://eel.is/c++draft/exp |https://eel.is/c++draft/exp
   |r.const#example-3 in C++23  |r.const#example-3 in C++23
   ||since r16-149

--- Comment #2 from Jakub Jelinek  ---
Indeed, started with r16-149-g44e31eb265ba1984638908466a88095744a88709

[Bug c++/120716] [16 regression] ICE on https://eel.is/c++draft/expr.const#example-3 in C++23

2025-06-20 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120716

Sam James  changed:

   What|Removed |Added

   Target Milestone|--- |16.0
Summary|ICE on  |[16 regression] ICE on
   |https://eel.is/c++draft/exp |https://eel.is/c++draft/exp
   |r.const#example-3 in C++23  |r.const#example-3 in C++23

--- Comment #1 from Sam James  ---
15 (ofc with checking) doesn't ICE for me.

[Bug target/120708] ix86_expand_set_or_cpymem ignores MOVE_MAX

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120708

--- Comment #2 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:050b1708ea532ea4840e97d85fad4ca63d4cd631

commit r16-1588-g050b1708ea532ea4840e97d85fad4ca63d4cd631
Author: H.J. Lu 
Date:   Thu Jun 19 05:03:48 2025 +0800

x86: Get the widest vector mode from MOVE_MAX

Since MOVE_MAX defines the maximum number of bytes that an instruction
can move quickly between memory and registers, use it to get the widest
vector mode in vector loop when inlining memcpy and memset.

gcc/

PR target/120708
* config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): Use
MOVE_MAX to get the widest vector mode in vector loop.

gcc/testsuite/

PR target/120708
* gcc.target/i386/memcpy-pr120708-1.c: New test.
* gcc.target/i386/memcpy-pr120708-2.c: Likewise.
* gcc.target/i386/memcpy-pr120708-3.c: Likewise.
* gcc.target/i386/memcpy-pr120708-4.c: Likewise.
* gcc.target/i386/memcpy-pr120708-5.c: Likewise.
* gcc.target/i386/memcpy-pr120708-6.c: Likewise.
* gcc.target/i386/memset-pr120708-1.c: Likewise.
* gcc.target/i386/memset-pr120708-2.c: Likewise.
* gcc.target/i386/memcpy-strategy-1.c: Drop dg-skip-if.  Replace
-march=atom with -mno-avx -msse2 -mtune=generic
-mtune-ctrl=^sse_typeless_stores.
* gcc.target/i386/memcpy-strategy-2.c: Likewise.
* gcc.target/i386/memcpy-vector_loop-1.c: Likewise.
* gcc.target/i386/memcpy-vector_loop-2.c: Likewise.
* gcc.target/i386/memset-vector_loop-1.c: Likewise.
* gcc.target/i386/memset-vector_loop-2.c: Likewise.

Signed-off-by: H.J. Lu 
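
As a rough illustration of the idea in the commit message above (this is not the i386 backend code), the sketch below picks the widest vector width in bytes that does not exceed a MOVE_MAX-like limit. The name `widest_vector_bytes`, the parameter `move_max`, and the candidate widths 64/32/16 are assumptions made for the example.

```cpp
#include <cstddef>

// Hypothetical helper: return the widest vector width (in bytes) among the
// candidates 64, 32 and 16 that fits within move_max, or 0 if none fits.
// This only mirrors the idea of capping the vector loop's mode by MOVE_MAX;
// it is not how ix86_expand_set_or_cpymem is written.
std::size_t widest_vector_bytes(std::size_t move_max)
{
    const std::size_t candidates[] = {64, 32, 16};
    for (std::size_t width : candidates)
        if (width <= move_max)
            return width;
    return 0;
}
```

With a limit of 32 bytes this would select 32-byte vectors even if wider ones exist, which is the kind of cap the commit describes.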

[Bug c++/120716] [16 regression] ICE on https://eel.is/c++draft/expr.const#example-3 in C++23 since r16-149

2025-06-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120716

--- Comment #3 from Jakub Jelinek  ---
Guess the important part of the patch is
  tree **use = const_vars.get (var);
- if (use && TREE_CODE (**use) == DECL_EXPR)
+ if (TREE_CODE (**use) == DECL_EXPR)   
{   
  /* All uses of this capture were folded away, leaving only the
 proxy declaration.  */ 
So, shall it instead
  if (use == NULL)
gcc_assert (seen_error ());
  else if (TREE_CODE (**use) == DECL_EXPR)
?

[Bug tree-optimization/120639] vect: Strided memory access type, stores with gaps?

2025-06-20 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639

--- Comment #3 from Robin Dapp  ---
> We could use scatter stores, building the index vector somehow cleverly with
> i_width contiguous indexes interspaced by i_dst_stride.  In fact this vector
> could be built as inductions when building the i_height number of vectors
> to store and concatenated the same way?

Interesting, so you mean having a strided index vector
[0, 1, ..., vector_size, vector_size + 1, ..., i_width, 0 + stride, 1 + stride,
...]?

What about something like i_width = 12 and a 64-bit strided element (that
doesn't cover all of i_width but would require another 32-bit strided element)?
 Wouldn't we still need a mechanism to "fill" up to i_width?

[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598

Richard Biener  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #9 from Richard Biener  ---
(In reply to Segher Boessenkool from comment #8)
> (In reply to Jeevitha from comment #6)
> > The following dot_product function gets vectorized with the latest GCC trunk
> > and gcc 15.1.0:
> > 
> > #include 
> > #include 
> > extern float dot_product(const int16_t *v1, const int16_t *v2, size_t len);
> > float dot_product(const int16_t *v1, const int16_t *v2, size_t len)
> > {
> > int64_t d = 0;
> > for (size_t i = 0; i < len; i++)
> > d += int32_t(v1[i]) * int32_t(v2[i]);
> > return static_cast<float>(d);
> > }
> > 
> > 
> > I observed that -O2 was used during compilation. However, for GCC versions
> > earlier than 15, vectorization of this loop requires -O3. Since they are
> > using the -O2 flag, GCC 15 is necessary in this case.
> 
> Is that what the original code does?  Or does it convert every number to
> float
> and then sum over that?

The above is from the preprocessed source.

> And, can you try to find out what patch to GCC 15 made this work at -O2?  In
> case we want to backport anything, but also just to get a better grip on what
> is happening  here :-)

With GCC 15 we allow peeling for niter at -O2, with GCC 14 and earlier at -O2
we effectively only ever vectorize loops with constant number of iterations
(divisible by vector size).
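
To make "peeling for niter" concrete, here is a hand-written model of the loop shape it permits: a main loop that consumes a fixed number of elements per iteration, followed by a scalar epilogue for the leftover iterations. The function name `dot_product_peeled` and the vectorization factor of 4 are assumptions for illustration; the vectorizer's actual output will differ.

```cpp
#include <cstddef>
#include <cstdint>

// Model of a vectorized loop with an epilogue.  VF = 4 is an arbitrary
// assumption, not what GCC would necessarily choose.
float dot_product_peeled(const int16_t *v1, const int16_t *v2, std::size_t len)
{
    constexpr std::size_t VF = 4;
    int64_t lanes[VF] = {0, 0, 0, 0};
    std::size_t i = 0;

    // "Vector" body: runs only while a full group of VF elements remains.
    for (; i + VF <= len; i += VF)
        for (std::size_t l = 0; l < VF; l++)
            lanes[l] += int32_t(v1[i + l]) * int32_t(v2[i + l]);

    // Reduce the per-lane partial sums.
    int64_t d = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    // Scalar epilogue: the peeled iterations when len is not a multiple of VF.
    for (; i < len; i++)
        d += int32_t(v1[i]) * int32_t(v2[i]);

    return static_cast<float>(d);
}
```

Without the epilogue, only loops whose trip count is a known multiple of VF could use the main loop, which matches the GCC 14 behaviour at -O2 described above.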

I'd say this is "fixed" (it was reported against GCC 15), but the function
is 'static' in the preprocessed sources and thus likely inlined.  I'll
also note that plain SSE2 is a bit inefficient for this loop.

So maybe the reporter can clarify "We’ve observed that while functions in the
PGVector library benefit from both loop unrolling and auto-vectorization (even
with earlier versions of GCC, like 13.3 and 11.5), the same does not hold true
for the dot_product function in the MariaDB library" - does this mean
the autovectorization makes the function slower?  That would mean our cost
model isn't good enough here.

[Bug cobol/120730] parse.cc doesn't compile with bison 3.5.1

2025-06-20 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120730

Rainer Orth  changed:

   What|Removed |Added

   Target Milestone|--- |16.0

[Bug libfortran/114895] [Y2038] Build failure with !HAVE_WORKING_STAT

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114895

Richard Biener  changed:

   What|Removed |Added

   Keywords||build
   Last reconfirmed||2025-06-20
  Known to fail||14.3.0, 15.1.1

--- Comment #12 from Richard Biener  ---
Re-confirmed with GCC 15.

[Bug tree-optimization/120639] vect: Strided memory access type, stores with gaps?

2025-06-20 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639

--- Comment #4 from rguenther at suse dot de  ---
On Fri, 20 Jun 2025, rdapp at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
> 
> --- Comment #3 from Robin Dapp  ---
> > We could use scatter stores, building the index vector somehow cleverly with
> > i_width contiguous indexes interspaced by i_dst_stride.  In fact this vector
> > could be built as inductions when building the i_height number of vectors
> > to store and concatenated the same way?
> 
> Interesting, so you mean having a strided index vector
> [0, 1, ..., vector_size, vector_size + 1, ..., i_width, 0 + stride, 1 + stride,
> ...]?
> 
> What about something like i_width = 12 and a 64-bit strided element (that
> doesn't cover all of i_width but would require another 32-bit strided element)?
>  Wouldn't we still need a mechanism to "fill" up to i_width?

Well, consider the desired index vector being a real induction (just
store it somewhere).  If we can handle that, we should be able to
handle the scatter.  If not, we can't handle the scatter.

[Bug target/70308] memset generates rep stosl instead of rep stosq

2025-06-20 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70308

Sam James  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug target/120427] [12/13/14/15 Regression] "and $0,mem" is generated without -Oz since r12-6106-gef26c151c14a87

2025-06-20 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120427

Sam James  changed:

   What|Removed |Added

Summary|[12/13/14/15/16 Regression] |[12/13/14/15 Regression]
   |"and $0,mem" is generated   |"and $0,mem" is generated
   |without -Oz since   |without -Oz since
   |r12-6106-gef26c151c14a87|r12-6106-gef26c151c14a87
 Status|NEW |ASSIGNED
  Known to work||16.0
   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

[Bug target/118276] memset 88 uses rep stosq while 80 uses SSE

2025-06-20 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118276

Sam James  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug target/120728] New: vmovdqu8 is used on YMM unnecessarily

2025-06-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120728

Bug ID: 120728
   Summary: vmovdqu8 is used on YMM unnecessarily
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
CC: liuhongt at gcc dot gnu.org
  Target Milestone: ---
Target: x86-64

[hjl@gnu-tgl-3 tmp]$ cat x.c
typedef char __v32qi __attribute__ ((__vector_size__ (32)));
typedef char __v32qi_u __attribute__ ((__vector_size__ (32),
   __aligned__ (1)));

extern __v32qi_u y;

void
foo (__v32qi x)
{
  y = x;
}
[hjl@gnu-tgl-3 tmp]$ gcc -S -march=x86-64-v4 -O2 x.c
[hjl@gnu-tgl-3 tmp]$ cat x.s
.file   "x.c"
.text
.p2align 4
.globl  foo
.type   foo, @function
foo:
.LFB0:
.cfi_startproc
vmovdqu8 %ymm0, y(%rip)
 Should be vmovdqa
ret
.cfi_endproc
.LFE0:
.size   foo, .-foo
.ident  "GCC: (GNU) 15.1.1 20250521 (Red Hat 15.1.1-2)"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-tgl-3 tmp]$

[Bug c++/120557] [14/15/16 Regression] ICE: tree check: expected record_type or union_type or qual_union_type, have integer_type in finish_non_static_data_member, at cp/semantics.cc:2809

2025-06-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120557

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||waffl3x at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
I wonder if it wouldn't be better in case we have erroneous this argument to
just change it to pointer or reference to current class.
Or perhaps
--- gcc/cp/parser.cc.jj 2025-06-19 08:55:04.0 +0200
+++ gcc/cp/parser.cc 2025-06-20 10:22:47.571385574 +0200
@@ -20985,7 +20985,8 @@ cp_parser_simple_type_specifier (cp_pars
  break;
}

- if (cxx_dialect >= cxx14)
+ if (cxx_dialect >= cxx14
+ || (current_class_type && LAMBDA_TYPE_P (current_class_type)))
{
  type = synthesize_implicit_template_parm (parser, NULL_TREE);
  type = TREE_TYPE (type);

(or maybe do it for all the cases, so call synthesize_implicit_template_parm
unconditionally instead of setting it to error_mark_node for C++11?)

[Bug target/120728] vmovdqu8 is used on YMM unnecessarily

2025-06-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120728

H.J. Lu  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

--- Comment #3 from H.J. Lu  ---
Created attachment 61670
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61670&action=edit
A patch

I am testing this.

[Bug target/120728] vmovdqu8 is used on YMM unnecessarily

2025-06-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120728

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-06-20
 Ever confirmed|0   |1

--- Comment #2 from H.J. Lu  ---
(In reply to Richard Biener from comment #1)
> Huh?  You made 'y' 1-byte aligned, so I'd expect vmovdqu?

Correct.

[Bug middle-end/120721] [16 regression] ICE when building llvm-20.1.7 on arm64 (instantiate_virtual_regs_in_insn, at function.cc:1737)

2025-06-20 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120721

Richard Sandiford  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rsandifo at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #9 from Richard Sandiford  ---
Looks like another case for force_subreg.

[Bug gcov-profile/120634] Memory leak in prime-paths.cc selftests (and possibly in general?)

2025-06-20 Thread j at lambda dot is via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120634

--- Comment #2 from Jørgen Kvalsvik  ---
I posted a fix on gcc-patches

https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687131.html

[Bug target/120728] vmovdqu8 is used on YMM unnecessarily

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120728

Richard Biener  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org
   Keywords||missed-optimization

--- Comment #1 from Richard Biener  ---
Huh?  You made 'y' 1-byte aligned, so I'd expect vmovdqu?
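
For readers following the alignment point, a small sketch using the same GNU vector-extension typedefs as the report shows that the `__aligned__ (1)` typedef only promises byte alignment, so an unaligned move is the expected instruction class. The function names are made up for this example, and `__alignof__` is the GNU spelling (GCC and Clang).

```cpp
#include <cstddef>

// Same shape as the typedefs in the report (GNU vector extensions).
typedef char v32qi __attribute__((__vector_size__(32)));
typedef char v32qi_u __attribute__((__vector_size__(32), __aligned__(1)));

// Alignment of the plain 32-byte vector type.
constexpr std::size_t plain_alignment() { return __alignof__(v32qi); }

// Alignment of the typedef that was explicitly lowered to 1 byte.
constexpr std::size_t under_aligned_alignment() { return __alignof__(v32qi_u); }
```

A store through a `v32qi_u` lvalue therefore cannot assume 32-byte alignment of the destination.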

[Bug tree-optimization/120724] [missed optimization] bit test in a loop

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120724

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2025-06-20
  Component|target  |tree-optimization
 CC||rguenth at gcc dot gnu.org
   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
 Target|X86_64  |x86_64-*-*
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
bool isvalid3(uint64_t state) {
return ((state & state<<1) & 0xull) == 0;
}

is what the reporter expects this to reduce to.  While this might be
viewed as a reduction and thus fit to be handled in SCCP pattern detection
we have the issue that the "reduction" result is control computed.

For better vectorizing this we'd have to think of 'state' in state & 0xf as
input and >>= 4 / & 0xf as indexing.  Then in principle we could "unpack"
the nibbles to bytes and have a vector compare against 11 so there's the
chance that this could be more generally useful for vectorizing a loop
working on some bits of a scalar at a time.

I do not see a very good place to handle the whole thing as pattern.  The
loop is dead apart from the PHI <0, 1> that's dependent on the two exits,
so it still would be SCCP of some sorts.

[Bug c++/120729] New: Compilation takes forever with -Wuninitialized

2025-06-20 Thread harald at gigawatt dot nl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120729

Bug ID: 120729
   Summary: Compilation takes forever with -Wuninitialized
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: harald at gigawatt dot nl
  Target Milestone: ---

Created attachment 61669
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61669&action=edit
X86RecognizableInstr.cpp.ii.gz

I'm not really sure where the problem is and how to reduce this. When building
LLVM for ARM, I'm seeing a file that appears to take forever to compile, it's
been going for over seven hours.

Preprocessed non-reduced source attached, I am not sure how to reduce this.

$ arm-linux-gnueabihf-g++ --version
arm-linux-gnueabihf-g++ (Debian 14.2.0-19) 14.2.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ arm-linux-gnueabihf-g++ -v
Using built-in specs.
COLLECT_GCC=arm-linux-gnueabihf-g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc-cross/arm-linux-gnueabihf/14/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Debian 14.2.0-19'
--with-bugurl=file:///usr/share/doc/gcc-14/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-14 --enable-shared
--enable-linker-build-id --libexecdir=/usr/libexec --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-libstdcxx-backtrace
--enable-gnu-unique-object --disable-libitm --disable-libquadmath
--disable-libquadmath-support --enable-plugin --enable-default-pie
--with-system-zlib --enable-libphobos-checking=release
--without-target-system-zlib --enable-multiarch --disable-sjlj-exceptions
--with-arch=armv7-a+fp --with-float=hard --with-mode=thumb --disable-werror
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=arm-linux-gnueabihf --program-prefix=arm-linux-gnueabihf-
--includedir=/usr/arm-linux-gnueabihf/include
--with-build-config=bootstrap-lto-lean --enable-link-serialization=3
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.2.0 (Debian 14.2.0-19)
$ time arm-linux-gnueabihf-g++ -O3 -std=c++17 -o X86RecognizableInstr.cpp.o -c
X86RecognizableInstr.cpp.ii
2.13user 0.03system 0:02.17elapsed 99%CPU (0avgtext+0avgdata
255492maxresident)k
0inputs+144outputs (0major+8012minor)pagefaults 0swaps
$ time arm-linux-gnueabihf-g++ -O3 -std=c++17 -o X86RecognizableInstr.cpp.o -c
X86RecognizableInstr.cpp.ii -Wall
(wait seemingly forever)

GDB does not load all the symbols of this build, but hopefully enough that it
is still useful:

$ DEBUGINFOD_URLS="https://debuginfod.debian.net"; gdb -p 1216625
GNU gdb (Debian 16.3-1) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 1216625
Reading symbols from /usr/libexec/gcc-cross/arm-linux-gnueabihf/14/cc1plus...
(No debugging symbols found in
/usr/libexec/gcc-cross/arm-linux-gnueabihf/14/cc1plus)
Reading symbols from /lib/x86_64-linux-gnu/libisl.so.23...
Downloading 2.67 M separate debug info for /lib/x86_64-linux-gnu/libisl.so.23
Reading symbols from
/home/harald/.cache/debuginfod_client/fc68a4fd6db9308fb652d1ad0fdd4888d37701e7/debuginfo...
Reading symbols from /lib/x86_64-linux-gnu/libmpc.so.3...
Downloading 172.84 K separate debug info for /lib/x86_64-linux-gnu/libmpc.so.3
Reading symbols from
/home/harald/.cache/debuginfod_client/11f66948b38ade8fe459238cf606fe1f09f89779/debuginfo...
Reading symbols from /lib/x86_64-linux-gnu/libmpfr.so.6...
Downloading 658.44 K separate debug info for /lib/x86_64-linux-gnu/libmpfr.so.6
Reading symbols from
/home/harald/.cache/debuginfod_client/9a93db6d26747abdd9785fe0372dce524ba7c6f4/debuginfo...
Reading symbols from /lib/x86_64-linux-gnu/libgmp.so.10...
Downloading 724.35 K separate debug info for /lib/x86_64-linux-gnu/libgmp.so.10
Reading symbols from
/home/harald/.cache/debuginfod_client/dff5c2156ec812613c5e4431005c576b21

[Bug modula2/120731] Possible error in Strings.Pos causing sigsegv

2025-06-20 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120731

Gaius Mulley  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2025-06-20

--- Comment #1 from Gaius Mulley  ---
Confirmed behaviour.

[Bug target/120722] [16 Regression][gcn] ICE in gen_highpart, at emit-rtl.cc:1674 since r16-1565-g2dcc6dbd8a00ca when building libgcc/strub.c for gfx1036

2025-06-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120722

--- Comment #2 from Andrew Stubbs  ---
Previously simplify_gen_subreg worked with this pattern:

(reg:DI 106 vcc_lo [orig:693 loop_mask_26 ] [693])

gave

(subreg:SI (reg:DI 106 vcc_lo [orig:693 loop_mask_26 ] [693]) 4)

Now it fails.

[Bug tree-optimization/120639] vect: Strided memory access type, stores with gaps?

2025-06-20 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639

--- Comment #5 from Robin Dapp  ---
> Well, consider the desired index vector being a real induction (just
> store it somewhere).  If we can handle that, we should be able to
> handle the scatter.  If not, we can't handle the scatter.

Hmm, I think I misunderstood.  You are arguing that we could build an induction
variable based on the i_height loop, right?  So roughly like 

  vect_vec_iv = {0, 1, ..., i_width};
  for (... i_height)
   {
  ...
  idxs = "[vect_vec_iv, vect_vec_iv + {i_dst_stride, ...}, ...]"
  IFN_SCATTER_STORE (dst, idxs);
  vect_vec_iv += {i_dst_stride, i_dst_stride, ...};
   }?

I guess this can always be implemented as a scatter one way or another?
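
A scalar model of the index construction may help here. The sketch below builds the flat index list for i_height rows of i_width contiguous elements, with rows i_dst_stride apart, and then emulates a scatter store with it. The helper names are hypothetical, and it assumes each row covers indices 0 to i_width - 1; it says nothing about how the vectorizer would materialize the induction.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build [0, 1, ..., i_width-1, stride, stride+1, ..., 2*stride, ...].
std::vector<std::size_t> build_scatter_indices(std::size_t i_width,
                                               std::size_t i_height,
                                               std::size_t i_dst_stride)
{
    std::vector<std::size_t> idxs;
    idxs.reserve(i_width * i_height);
    for (std::size_t row = 0; row < i_height; row++)
        for (std::size_t col = 0; col < i_width; col++)
            idxs.push_back(row * i_dst_stride + col);
    return idxs;
}

// Scalar stand-in for a scatter store: dst[idxs[k]] = src[k].
void scatter_store(uint8_t *dst, const std::vector<std::size_t> &idxs,
                   const uint8_t *src)
{
    for (std::size_t k = 0; k < idxs.size(); k++)
        dst[idxs[k]] = src[k];
}
```

Each row's indices are the previous row's indices plus i_dst_stride, which is the induction step in the pseudocode above.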

But my objective is actually two-fold in that I want to use the full vector
size and also conflate as many elements as possible into a single one (i.e. 8
chars into one uint64_t).  The second part helps gather/scatter as well as
strided loads/stores independently as it reduces the number of individual
elements (thus reducing the scatter/gather latency).

So I think in order to make full use of the vector size the induction approach
can work as we construct the index vector appropriately.

For conflating/reinterpreting a subset of dynamic indices we IMHO need static
code that is dynamically dispatched as described in my previous message.

I.e. a loop over i_width:
  while (rem > 0)
   {
 if (rem == 8)
"scatter/strided store with 64-bit elements"
 if (rem == 4)
"scatter/strided store with 32-bit elements"
 rem -= elsz;
   }
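
The dispatch loop above can be modelled in scalar code roughly as follows. This is only a sketch under stated assumptions: the function name is hypothetical, the source rows are assumed to be packed contiguously, a 1-byte fallback is added for widths that are not a multiple of 4, and `rem >= 8` is used instead of the exact-equality tests in the pseudocode so that wider rows also work.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Store i_height rows of i_width bytes each into dst, rows i_dst_stride
// apart, using the widest "element" that still fits the remaining row width.
void strided_store_rows(uint8_t *dst, const uint8_t *src, std::size_t i_width,
                        std::size_t i_height, std::size_t i_dst_stride)
{
    std::size_t col = 0;
    std::size_t rem = i_width;
    while (rem > 0) {
        // Pick the element size for this pass: 8, 4, or 1 byte(s).
        std::size_t elsz = rem >= 8 ? 8 : rem >= 4 ? 4 : 1;
        // One "strided store" with elsz-byte elements: one element per row.
        for (std::size_t row = 0; row < i_height; row++)
            std::memcpy(dst + row * i_dst_stride + col,
                        src + row * i_width + col, elsz);
        col += elsz;
        rem -= elsz;
    }
}
```

For the i_width = 12 case raised earlier in the thread, this performs one pass with 64-bit elements followed by one pass with 32-bit elements.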

I realize that's not something we do at all right now, hence my initial
question.  Irrespective of how/if something like that could be implemented (I
can only imagine virtual/composition modes right now), is it even desirable in
any way?  I know that it would help our uarch at least.

[Bug libstdc++/120717] Passing reference to incomplete type to std::move_only_function emits false-positive '-Wsfinae-incomplete' warning

2025-06-20 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120717

Patrick Palka  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||ppalka at gcc dot gnu.org
   Last reconfirmed||2025-06-20

--- Comment #1 from Patrick Palka  ---
> 1. Refine the implementation-detail metafunction 
> std::__is_complete_or_unbounded so that it correctly handles such reference 
> to incomplete types.
Makes sense, the warning demonstrates that __is_complete_or_unbounded
needlessly tries to instantiate the underlying type of a reference type (or
array of unknown bound).

> Option 1 appears to be unimplementable because such metafunction can be 
> called several times in different contexts, which can result in ODR-violation.
I think ODR violation is only a risk if the sizeof check happens inside the
template definition. If the sizeof check only happens during template argument
deduction (e.g. encoded as a default template argument as it is now), that
shouldn't introduce ODR violations since the result of template argument
deduction can vary across a program, unlike template instantiation, IIUC.
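
A minimal sketch of that shape, assuming C++17: the sizeof probe sits in a default template argument of a function template, and reference types are answered before any probe of the referenced type happens. All names here (`opaque_sketch`, `is_complete_probe`, `complete_or_reference`) are invented for the example; this is not libstdc++'s __is_complete_or_unbounded.

```cpp
#include <cstddef>
#include <type_traits>

struct opaque_sketch;  // hypothetical type, never completed in this example

// Viable only if sizeof(T) is well-formed.  The check lives in a default
// template argument, so it is evaluated during template argument deduction.
template <typename T, std::size_t = sizeof(T)>
constexpr bool is_complete_probe(int) { return true; }

// Fallback, selected when the overload above is removed by SFINAE.
template <typename T>
constexpr bool is_complete_probe(long) { return false; }

// Reference types are accepted without looking at the referenced type, so
// a reference to an incomplete type never triggers the sizeof probe.
template <typename T>
constexpr bool complete_or_reference()
{
    if constexpr (std::is_reference_v<T>)
        return true;
    else
        return is_complete_probe<T>(0);
}
```

The `if constexpr` branch is what keeps `opaque_sketch&` from ever reaching `sizeof`, which is the refinement proposed as option 1.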

[Bug tree-optimization/120729] Compilation takes forever with -Wuninitialized

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120729

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
So the most useful thing might be still the bisection on what "fixed" this. 
I'm testing a patch.

[Bug tree-optimization/120729] Compilation takes forever with -Wuninitialized

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120729

--- Comment #3 from Richard Biener  ---
So the code checks _all_ paths through the CFG, visited_flag_phis only prunes
backedges during the recursion of processing one path.  To be unlucky
one has to have a very deep back-to-back PHI def chain, with each two args
you then get 2^N exponential number of paths to check.
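
The blow-up and the effect of a search limit can be seen in a toy model. The sketch below counts paths through a chain of `depth` two-argument PHIs by naive recursion and gives up once a budget is exceeded. It is not the uninit analysis code; `count_paths` and `budget` are hypothetical names.

```cpp
#include <cstdint>

// Walk every path through a chain of `depth` back-to-back two-argument PHIs.
// Returns true if the walk completed without visiting more than `budget`
// complete paths; `visited` receives the number of paths seen so far.
bool count_paths(unsigned depth, std::uint64_t budget, std::uint64_t &visited)
{
    if (depth == 0) {
        // Reached the end of one complete path.
        return ++visited <= budget;
    }
    // Each PHI has two incoming arguments, so the walk forks in two.
    for (int arg = 0; arg < 2; arg++)
        if (!count_paths(depth - 1, budget, visited))
            return false;
    return true;
}
```

A chain of depth 10 already yields 1024 paths; at depth 40 an unbounded walk would need about 10^12 visits, whereas the budgeted walk stops right after the limit is hit.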

I have a patch to limit the search, but then you get a diagnostic:

/home/harald/llvm-project/main/llvm/utils/TableGen/X86RecognizableInstr.cpp: In
static member function ‘static llvm::X86Disassembler::OperandType
llvm::X86Disassembler::RecognizableInstr::typeFromString(llvm::StringRef, bool,
uint8_t)’:
/home/harald/llvm-project/main/llvm/utils/TableGen/X86RecognizableInstr.cpp:1171:12:
warning: ‘*(unsigned int*)((char*)&Switch +
offsetof(llvm::StringSwitch,llvm::StringSwitch::Result.std::optional::.std::_Optional_base::))’ may be used uninitialized [-Wmaybe-uninitialized]
/home/harald/llvm-project/main/llvm/utils/TableGen/X86RecognizableInstr.cpp:1026:29:
note: ‘*(unsigned int*)((char*)&Switch +
offsetof(llvm::StringSwitch,llvm::StringSwitch::Result.std::optional::.std::_Optional_base::))’ was declared here

Note with GCC 15+ we might simply have optimized enough to avoid this.  Even
with a very very large amount of work (1000 pruned path elements allowed)
we diagnose this.  With an -O0 compiler, uninit analysis then takes 30%
of the compile time.

[Bug c++/120557] [14/15/16 Regression] ICE: tree check: expected record_type or union_type or qual_union_type, have integer_type in finish_non_static_data_member, at cp/semantics.cc:2809

2025-06-20 Thread waffl3x at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120557

--- Comment #3 from Alex  ---
If I remember correctly, originally we decided that deducing this as a
pre-c++23 extension should be pretty benign.  Fixing this to work in
c++11 would be going beyond that though, as generic lambdas are c++14,
so maybe we shouldn't do that.

I've rewritten my explanation like 3 times now and end up including too
many details, I'm just going to try to stick it and hopefully be
concise this time.  I also haven't looked at the code so I'm mostly
going by memory and inferring from context posted, and this part of
deducing this was largely written by Jason, I have a hard time
reasoning about some of it.

(In reply to Jakub Jelinek from comment #2)
> I wonder if it wouldn't be better in case we have erroneous this argument to
> just change it to pointer or reference to current class.

I'm pretty sure this wouldn't work, we don't store a current class
ref/ptr in xobj member functions, and the error is (at least I am
fairly sure) happening before substitution so we can't possibly know
what it is yet.

I have a vague idea of what's happening but I'm unable to articulate it
properly, the dummy object needs a parameter for substitution later or
something.  I'm not going to try to articulate it, but having the class
isn't good enough; it's done this way because we don't know the class
type until the function is instantiated.

> Or perhaps
> --- gcc/cp/parser.cc.jj   2025-06-19 08:55:04.0 +0200
> +++ gcc/cp/parser.cc  2025-06-20 10:22:47.571385574 +0200
> @@ -20985,7 +20985,8 @@ cp_parser_simple_type_specifier (cp_pars
> break;
>   }
>  
> -   if (cxx_dialect >= cxx14)
> +   if (cxx_dialect >= cxx14
> +   || (current_class_type && LAMBDA_TYPE_P (current_class_type)))
>   {
> type = synthesize_implicit_template_parm (parser, NULL_TREE);
> type = TREE_TYPE (type);
> 
> (or maybe do it for all the cases, so call synthesize_implicit_template_parm
> unconditionally instead of setting it to error_mark_node for C++11?)

For here, as I said I'm pretty sure current_class_type won't be
populated in an xobj member function.  Other than that, if we are happy
to extend limited support for generic lambdas in c++11 the spirit of
this change seems fine.  If the FUNCTION_DECL is available there, you
can just check for DECL_XOBJ_MEMBER_FUNCTION_P on it.

With that said, I still think it's probably going too far to support
generic lambdas in c++11, so I think the correct fix is to just bail
out of add_default_capture/add_capture for error_mark_node.  If I had
to guess, it's probably better in add_capture as I assume that will
handle explicit captures as well.  I am somewhat surprised it wasn't
already checked for, but I guess it just never happens outside of
-std=c++11.

[Bug target/119971] [15 Regression] RISC-V: Wrong code with bitmanip extension

2025-06-20 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119971

Jeffrey A. Law  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Jeffrey A. Law  ---
I've cherry-picked the patch to the release branch which should resolve this
issue.

[Bug target/119971] [15 Regression] RISC-V: Wrong code with bitmanip extension

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119971

--- Comment #6 from GCC Commits  ---
The releases/gcc-15 branch has been updated by Jeff Law :

https://gcc.gnu.org/g:bf284e8f5edf0d039b1dd9af5d62e355ce85a7ba

commit r15-9850-gbf284e8f5edf0d039b1dd9af5d62e355ce85a7ba
Author: Jeff Law 
Date:   Mon May 5 17:14:29 2025 -0600

[RISC-V][PR target/119971] Avoid losing shift count masking

As is outlined in the PR, we have a few define_insn_and_split patterns which
optimize away explicit masking of shift/bit positions when the masking matches
the hardware's behavior.

A small number of those define_insn_and_split patterns generate a single
instruction.  It's fairly elegant in that we were essentially just rewriting
the RTL to match an existing pattern.

In one case we'd do the rewriting and later turn a 32bit shift into a bset.
That's not safe because the masking of a 32bit shift uses 0x1f while masking on
bset uses 0x3f on rv64.   The net was incorrect code as seen in the BZ entry.

The fix is pretty simple.  There's no real reason we need to use a
define_insn_and_split.  It was just convenient.  Instead we can use a simple
define_insn.  That avoids a change in the masking behavior for the shift
count/bit position and the masking stays in the RTL.

I quickly scanned the entire port and didn't see any additional
define_insn_and_splits that obviously generated a single instruction outside
the shift/rotate space, though in the vector space that's nontrivial to
ascertain.

This has been run through my tester for the cross configurations, but not the
native bootstrap/regression test (yet).

PR target/119971
gcc/
* config/riscv/bitmanip.md (rotation with masked count): Rewrite
as define_insn patterns.  Fix formatting.
* config/riscv/riscv.md (shift with masked count): Similarly.

gcc/testsuite
* gcc.target/riscv/pr119971.c: New test.
* gcc.target/riscv/zbb-rol-ror-03.c: Adjust test slightly.

(cherry picked from commit 05d75c5bfcf923bc0258b79a08c5861590c5a2b9)
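
To make the masking mismatch described above concrete, here is a small software model. It is only a sketch, not code from the RISC-V port: it assumes the usual rv64 semantics in which a 32-bit shift uses the low 5 bits of the count (0x1f) and bset uses the low 6 bits of the bit index (0x3f). The function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit left shift of the constant 1 on rv64:
// the count is masked with 0x1f and the 32-bit result is sign-extended.
std::int64_t model_shift32_one(unsigned count) {
  std::uint32_t r = std::uint32_t{1} << (count & 0x1f);
  return static_cast<std::int64_t>(static_cast<std::int32_t>(r));
}

// Hypothetical model of bset on rv64: the bit index is masked with 0x3f.
std::uint64_t model_bset64(std::uint64_t x, unsigned index) {
  return x | (std::uint64_t{1} << (index & 0x3f));
}
```

For a count of 33 the first model yields 2 while the second sets bit 33, so the two only agree when the count is already known to be below 32. That is why the explicit mask has to stay in the RTL.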

[Bug middle-end/118443] [Meta bug] Bugs triggered by and blocking more smtgcc testing

2025-06-20 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118443
Bug 118443 depends on bug 119971, which changed state.

Bug 119971 Summary: [15 Regression] RISC-V: Wrong code with bitmanip extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119971

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/120722] [16 Regression][gcn] ICE in gen_highpart, at emit-rtl.cc:1674 since r16-1565-g2dcc6dbd8a00ca when building libgcc/strub.c for gfx1036

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120722

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Andrew Stubbs :

https://gcc.gnu.org/g:95752870fb5c51868a084b94705a83d728be52c8

commit r16-1593-g95752870fb5c51868a084b94705a83d728be52c8
Author: Andrew Stubbs 
Date:   Fri Jun 20 16:43:37 2025 +

amdgcn: allow SImode in VCC_HI [PR120722]

This patch isn't fully tested yet, but it fixes the build failure, so that
will do for now.  SImode was not allowed in VCC_HI because there were issues,
way back before the port went upstream, so it's possible we'll find out what
those issues were again soon.

gcc/ChangeLog:

PR target/120722
* config/gcn/gcn.cc (gcn_hard_regno_mode_ok): Allow SImode in
VCC_HI.

[Bug tree-optimization/120639] vect: Strided memory access type, stores with gaps?

2025-06-20 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639

--- Comment #6 from rguenther at suse dot de  ---
> Am 20.06.2025 um 16:17 schrieb rdapp at gcc dot gnu.org 
> :
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120639
> 
> --- Comment #5 from Robin Dapp  ---
>> Well, consider the desired index vector being a real induction (just
>> store it somewhere).  If we can handle that, we should be able to
>> handle the scatter.  If not, we can't handle the scatter.
> 
> Hmm, I think I misunderstood.  You are arguing that we could build an 
> induction
> variable based on the i_height loop, right?  So roughly like
> 
>  vect_vec_iv = {0, 1, ..., i_width};
>  for (... i_height)
>   {
>  ...
>  idxs = "[vect_vec_iv, vect_vec_iv + {i_dst_stride, ...}, ...]"
>  IFN_SCATTER_STORE (dst, idxs);
>  vect_vec_iv += {i_dst_stride, i_dst_stride, ...};
>   }?
> 
> I guess this can always be implemented as a scatter one way or another?
> 
> But my objective is actually two-fold in that I want to use the full vector
> size and also conflate as many elements as possible into a single one (i.e. 8
> chars into one uint64_t).  The second part helps gather/scatter as well as
> strided loads/stores independently as it reduces the number of individual
> elements (thus reducing the scatter/gather latency).
> 
> So I think in order to make full use of the vector size the induction approach
> can work as we construct the index vector appropriately.
> 
> For conflating/reinterpreting a subset of dynamic indices we IMHO need static
> code that is dynamically dispatched as described in my previous message.
> 
> I.e. a loop over i_width:
>  while (rem > 0)
>   {
> if (rem == 8)
>"scatter/strided store with 64-bit elements"
> if (rem == 4)
>"scatter/strided store with 32-bit elements"
> rem -= elsz;
>   }
> 
> I realize that's not something we do at all right now, hence my initial
> question.  Irrespective of how/if something like that could be implemented (I
> can only imagine virtual/composition modes right now), is it even desirable in
> any way?  I know that it would help our uarch at least.

It would be possible to devise a versioning scheme plus eventually an in-loop
dispatch for this.  We currently cannot version for multiple vector variants,
but we need to ensure rem is handled?

> 
> --
> You are receiving this mail because:
> You are on the CC list for the bug.

[Bug tree-optimization/120729] Compilation takes forever with -Wuninitialized

2025-06-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120729

Jakub Jelinek  changed:

   What|Removed |Added

   Keywords|needs-bisection |
 CC||jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
r15-571-g1e0ae1f52741f7e0133661659ed2d210f939a398 fixed it.

[Bug c++/120720] __builtin_rev_crc32_data8 (and family) should be constexpr functions

2025-06-20 Thread sh1.gccbug at tikouka dot nz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120720

--- Comment #2 from Simon H.  ---
I'm not sure what the canonical non-builtin entrypoint would be in this case. 
Some kind of idiom detection?

I tried the is_constant_evaluated() test but it only picks up things which are
explicitly constexpr rather than optimising away all such calls (see the
difference between tests f3() and g3()).

[Bug target/101366] memset codegen for constant sized does not use SSE instructions

2025-06-20 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101366

Sam James  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug target/108585] memset uses SSE stores but afterwards does not but if used "" will use them

2025-06-20 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108585

Sam James  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug tree-optimization/120654] [14/15/16 Regression] Crash at -O3: during GIMPLE pass: ivopts

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120654

--- Comment #3 from Richard Biener  ---
So we get

[irange] UNDEFINED

into range_fits_type_p and then

  /* We can only handle integral and pointer types.  */
  src_type = vr->type (); 
  if (!INTEGRAL_TYPE_P (src_type)
  && !POINTER_TYPE_P (src_type))
return false;

this check is a bit dubious given an 'irange' should guarantee this.  We
are later checking

 /* Now we can only handle ranges with constant bounds.  */
  if (vr->undefined_p () || vr->varying_p ())
return false;

anyway, so do this early.
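
The reordering can be illustrated with a toy stand-in for a value range. This is a sketch under stated assumptions, not GCC's irange or the real range_fits_type_p: `ToyRange`, its members and `toy_range_fits_p` are hypothetical, and the type is reduced to a bit width.

```cpp
// Toy stand-in for a value range; not GCC's irange.
struct ToyRange {
  bool undefined = true;   // models an UNDEFINED range
  bool varying = false;    // models a VARYING range
  int bits = 0;            // width of the range's type
  long long lo = 0, hi = 0;
  bool undefined_p() const { return undefined; }
  bool varying_p() const { return varying; }
  // Only meaningful when !undefined_p (); the real accessor is assumed
  // to be invalid for an undefined range.
  int type_bits() const { return bits; }
};

// Returns true if every value in r fits a signed type of dst_bits bits
// (dst_bits is assumed to be between 1 and 63).
bool toy_range_fits_p(const ToyRange& r, int dst_bits) {
  // Guard first: only ranges with constant bounds carry a usable type.
  if (r.undefined_p() || r.varying_p()) return false;
  if (r.type_bits() <= dst_bits) return true;
  long long min = -(1LL << (dst_bits - 1));
  long long max = (1LL << (dst_bits - 1)) - 1;
  return r.lo >= min && r.hi <= max;
}
```

Because the undefined/varying guard runs before `type_bits ()`, an undefined range is rejected without its type ever being touched.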

[Bug tree-optimization/120654] [14/15/16 Regression] Crash at -O3: during GIMPLE pass: ivopts

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120654

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:6bd1223bd55ed60fa5dbfd4a8444e133e5e933f5

commit r16-1589-g6bd1223bd55ed60fa5dbfd4a8444e133e5e933f5
Author: Richard Biener 
Date:   Fri Jun 20 11:14:38 2025 +0200

tree-optimization/120654 - ICE with range query from IVOPTs

The following ICEs as we hand down an UNDEFINED range to where it
isn't expected.  Put the guard that's there earlier.

PR tree-optimization/120654
* vr-values.cc (range_fits_type_p): Check for undefined_p ()
before accessing type ().

* gcc.dg/torture/pr120654.c: New testcase.

[Bug tree-optimization/116674] [15 regression] ICE in vectorizable_simd_clone_call bisected to r15-3509-gd34cda72098867

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116674

--- Comment #7 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Richard Biener
:

https://gcc.gnu.org/g:638e90e5e8000b6b6b320b02229310c63c441b9f

commit r14-11855-g638e90e5e8000b6b6b320b02229310c63c441b9f
Author: Richard Biener 
Date:   Wed Sep 11 13:54:33 2024 +0200

tree-optimization/116674 - vectorizable_simd_clone_call and re-analysis

When SLP analysis scraps an instance because it fails to analyze we
can end up calling vectorizable_* in analysis mode on a node that
was analyzed during the analysis of that instance again.
vectorizable_simd_clone_call wasn't expecting that and instead
guarded analysis/transform code on populated data structures.
The following changes it so it survives re-analysis.

PR tree-optimization/116674
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Support
re-analysis.

* g++.dg/vect/pr116674.cc: New testcase.

(cherry picked from commit 09a514fbb67caf7e33a6ceddf524ee21024c33c5)

[Bug tree-optimization/120654] [14/15 Regression] Crash at -O3: during GIMPLE pass: ivopts

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120654

Richard Biener  changed:

   What|Removed |Added

Summary|[14/15/16 Regression] Crash |[14/15 Regression] Crash at
   |at -O3: during GIMPLE pass: |-O3: during GIMPLE pass:
   |ivopts  |ivopts
  Known to work||16.0

--- Comment #5 from Richard Biener  ---
Fixed on trunk so far.

[Bug tree-optimization/120666] ICE when building with fast-math, O3 and LTO

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120666

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Richard Biener  ---
I've picked it, so this one is a duplicate now.

*** This bug has been marked as a duplicate of bug 116674 ***

[Bug cobol/120730] New: parse.cc doesn't compile with bison 3.5.1

2025-06-20 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120730

Bug ID: 120730
   Summary: parse.cc doesn't compile with bison 3.5.1
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: cobol
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ro at gcc dot gnu.org
CC: jklowden at gcc dot gnu.org, rdubner at gcc dot gnu.org
  Target Milestone: ---

When trying to build COBOL on Ubuntu 20.04 with the bundled bison 3.5.1,
compilation fails:

/vol/gcc/src/hg/master/local/gcc/cobol/parse.y: In function ‘const char*
keyword_str(int)’:
/vol/gcc/src/hg/master/local/gcc/cobol/parse.y:11412:8: error: ‘YYerror’ was
not declared in this scope; did you mean ‘yyerror’?
11412 |   case YYerror: return "YYerror";
  |^~~
  |yyerror
/vol/gcc/src/hg/master/local/gcc/cobol/parse.y:11413:8: error: ‘YYUNDEF’ was
not declared in this scope
11413 |   case YYUNDEF: return "invalid token";
  |^~~

While both symbols are defined in parse.h when using bison 3.8.2, they are
missing with 3.5.1.

Either install.texi should be updated to document the correct requirements
(it currently states that 3.5.1 is enough) or parse.y changed to avoid the
dependency.

[Bug tree-optimization/116674] [15 regression] ICE in vectorizable_simd_clone_call bisected to r15-3509-gd34cda72098867

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116674

Richard Biener  changed:

   What|Removed |Added

 CC||eebssk1 at godaftwithebk dot pub

--- Comment #8 from Richard Biener  ---
*** Bug 120666 has been marked as a duplicate of this bug. ***

[Bug gcov-profile/120634] Memory leak in prime-paths.cc selftests (and possibly in general?)

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120634

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Jorgen Kvalsvik :

https://gcc.gnu.org/g:246c33ac8e8e1967c74ae20c07454a24ef02822a

commit r16-1590-g246c33ac8e8e1967c74ae20c07454a24ef02822a
Author: Jørgen Kvalsvik 
Date:   Thu Jun 19 20:56:30 2025 +0200

Free buffer on function exit [PR120634]

Using auto_vec ensures that the buffer is always free'd when the
function returns.

PR gcov-profile/120634

gcc/ChangeLog:

* prime-paths.cc (trie::paths): Use auto_vec.
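
The effect of switching to auto_vec can be shown with a self-contained analogy. The types below are toys, not GCC's vec or auto_vec; `toy_vec`, `toy_auto_vec`, `sum_first` and the `live_buffers` counter are hypothetical names used only to show a buffer being released on every exit path.

```cpp
#include <cstddef>
#include <cstdlib>

static int live_buffers = 0;  // counts buffers not yet released

// Toy vector with manual lifetime: the owner must call release ().
struct toy_vec {
  int* data = nullptr;
  std::size_t len = 0;
  void create(std::size_t n) {
    data = static_cast<int*>(std::calloc(n, sizeof(int)));
    if (data) { len = n; ++live_buffers; }
  }
  void release() {
    if (data) { std::free(data); data = nullptr; len = 0; --live_buffers; }
  }
};

// Toy counterpart of an auto-releasing vector: frees in the destructor.
struct toy_auto_vec : toy_vec {
  toy_auto_vec() = default;
  toy_auto_vec(const toy_auto_vec&) = delete;
  toy_auto_vec& operator=(const toy_auto_vec&) = delete;
  ~toy_auto_vec() { release(); }
};

int sum_first(std::size_t n, bool early_exit) {
  toy_auto_vec buf;
  buf.create(n);
  if (early_exit) return -1;  // buffer is still freed on this path
  int s = 0;
  for (std::size_t i = 0; i < buf.len; ++i) s += buf.data[i] + 1;
  return s;
}
```

With a plain `toy_vec` the early return would leak unless every path called `release ()` by hand.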

[Bug gcov-profile/120634] Memory leak in prime-paths.cc selftests (and possibly in general?)

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120634

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Jorgen Kvalsvik :

https://gcc.gnu.org/g:69725b13e9dc8bdb17ec8a7d554071b6b517ad47

commit r16-1591-g69725b13e9dc8bdb17ec8a7d554071b6b517ad47
Author: Jørgen Kvalsvik 
Date:   Thu Jun 19 21:00:07 2025 +0200

Use auto_vec in prime paths selftests [PR120634]

The selftests had a bunch of memory leaks that showed up in make
selftest-valgrind as a result of not using auto_vec or otherwise
explicitly calling release.  Replacing vec with auto_vec makes the
problem go away.  The auto_vec_vec helper is made constructable from a
vec so that objects returned from functions can be automatically
managed too.

PR gcov-profile/120634

gcc/ChangeLog:

* prime-paths.cc (struct auto_vec_vec): Add constructor from
vec.
(test_split_components): Use auto_vec_vec.
(test_scc_internal_prime_paths): Ditto.
(test_scc_entry_exit_paths): Ditto.
(test_complete_prime_paths): Ditto.
(test_entry_prime_paths): Ditto.
(test_singleton_path): Ditto.

[Bug c++/120729] Compilation takes forever with -Wuninitialized

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120729

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Keywords||compile-time-hog
 CC||rguenth at gcc dot gnu.org
   Last reconfirmed||2025-06-20

--- Comment #1 from Richard Biener  ---
I see that the recursion to PHI defs isn't limited (all paths should be), so
this is likely the obvious issue here.  Re-using
param_uninit_control_dep_attempts might be a good fit here (but its default is
quite high).
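
A minimal sketch of what limiting such a recursion could look like, assuming a simple visit budget. This is not the uninit analysis itself: `ToyGraph` and `walk_defs` are hypothetical, and the budget is a plain integer rather than the real parameter.

```cpp
#include <vector>

// Toy def graph: node i lists the nodes its PHI arguments come from.
using ToyGraph = std::vector<std::vector<int>>;

// Walk the definitions reachable from `node`, giving up once `budget`
// visits have been used.  Returns false when the budget ran out.
bool walk_defs(const ToyGraph& g, int node, int& budget, int& visited) {
  if (budget <= 0) return false;
  --budget;
  ++visited;
  for (int def : g[node])
    if (!walk_defs(g, def, budget, visited)) return false;
  return true;
}
```

On a chain of 20 nodes where each node references its successor twice, the unbounded walk would make about two million visits. With a budget of 1000 it stops after exactly 1000.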

[Bug tree-optimization/120704] [14/15/16 regression] Wrong code at -O3

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120704

Richard Biener  changed:

   What|Removed |Added

   Assignee|rguenth at gcc dot gnu.org |unassigned at gcc dot gnu.org
 Status|ASSIGNED|NEW
   Keywords||needs-reduction

--- Comment #10 from Richard Biener  ---
-O2 -funswitch-loops is enough to trigger

t.c:6:11: runtime error: signed integer overflow: -2147203927 * -14 cannot be
represented in type 'int'
t.c:6:11: runtime error: signed integer overflow: -2146205926 * -14 cannot be
represented in type 'int'
t.c:6:23: runtime error: signed integer overflow: -2840684 - 2147483647 cannot
be represented in type 'int'

the unswitching looks OK, so it must be some followup transform messing
things up.  Disabling phiopt and LIM after unswitching doesn't cure things
though.  We do (late) thread the hell out of this though, creating many
loop (copies?).  -fdisable-tree-thread2 -fdisable-tree-threadfull2 removes one
UBSAN diagnostic.

The flow through the program is quite complicated (at -O0), so it's difficult
to verify correctness :/  I'm not working on this for the moment.

[Bug target/115842] [15/16 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2025-06-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842

--- Comment #12 from Tamar Christina  ---
(In reply to Hongtao Liu from comment #11)
> (In reply to Tamar Christina from comment #9)
> > (In reply to Hongtao Liu from comment #8)
> > > (In reply to Tamar Christina from comment #7)
> > > > (In reply to Hongtao Liu from comment #6)
> > > > >  I noticed some double-counting of cost in group-candidate (regarding 
> > > > > loop
> > > > > invariant expressions), this modification reduces the number of 
> > > > > instructions
> > > > > executed by ~8% for exchange_r binary compiled with -march=x86-64-v3 
> > > > > -O2.
> > > > > 
> > > > 
> > > > Note that this patch causes regressions on AArch64.  While exchange 
> > > > improves
> > > > slightly I see regressions in: leela, -5%, mcf, xz, x264, deepsjeng -2%,
> > > > geomean -1%
> > > 
> > > What options do you use, we have an AmpereOne machine, like to try to see 
> > > if
> > > it's reproduciable on it.
> > 
> > This was on Neoverse-V2, but probably reproducible on AmpereOne, the flags
> > was -mcpu=native -Ofast -fomit-framepointer -flto=auto
> 
> I tested my patch against latest trunk, and use the same option, can't
> reproduce those regression on AWS graviton4.
> 

Sorry for the slow response.  I did rebase and retry with latest trunk and
indeed I no longer see any slowdowns with current trunk.

[Bug fortran/120723] [13/14/15/16 Regression][OpenACC] '!$acc ... attach(scalar)' – ICE 'unexpected pointer mapping node'

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120723

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.5

[Bug target/120722] [16 Regression][gcn] ICE in gen_highpart, at emit-rtl.cc:1674 since r16-1565-g2dcc6dbd8a00ca when building libgcc/strub.c for gfx1036

2025-06-20 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120722

--- Comment #1 from Tobias Burnus  ---
Fails for:

Breakpoint 1, gen_highpart (mode=E_SImode, x=0x77195378) at
/home/tob/repos/gcc/gcc/emit-rtl.cc:1674
1674  gcc_assert (result && !MEM_P (result));

(gdb) p result
$1 = (rtx) 0x0

(gdb) p debug_rtx(x)
(reg:DI 106 vcc_lo [orig:693 loop_mask_26 ] [693])


with:

1670  result = simplify_gen_subreg (mode, x, GET_MODE (x),
1671   subreg_highpart_offset (mode, GET_MODE
(x)));

Caller:
#1  0x00cb7376 in gen_highpart_mode (outermode=E_SImode,
innermode=E_DImode, exp=0x77195348)

called in #1  0x00cb7376 in gen_highpart_mode (outermode=E_SImode,
innermode=E_DImode, exp=0x77195348)

668   [RM  ,v   ;flat ,12,*,*  ] global_store_dwordx2\t%A0, %1%O0%g0
669   [RM  ,a   ;flat ,12,cdna2,*  ] ^
670   }
671   "reload_completed
672&& ((!MEM_P (operands[0]) && !MEM_P (operands[1])
673 && !gcn_sgpr_move_p (operands[0], operands[1]))
674|| (GET_CODE (operands[1]) == CONST_INT
(gdb) 
675&& !gcn_constant64_p (operands[1])))"
676   [(set (match_dup 0) (match_dup 1))
677(set (match_dup 2) (match_dup 3))]
678   {
679 rtx inlo = gen_lowpart (SImode, operands[1]);
680 rtx inhi = gen_highpart_mode (SImode, mode, operands[1]);
681 rtx outlo = gen_lowpart (SImode, operands[0]);
682 rtx outhi = gen_highpart_mode (SImode, mode, operands[0]);
683
684 /* Ensure that overlapping registers aren't corrupted.  */

[Bug c++/120729] Compilation takes forever with -Wuninitialized

2025-06-20 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120729

Richard Biener  changed:

   What|Removed |Added

   Keywords||needs-bisection
  Component|tree-optimization   |c++
  Known to fail||14.3.1
 Blocks|24639   |
  Known to work||15.1.1, 16.0

--- Comment #2 from Richard Biener  ---
Interestingly compiling the preprocessed source with an arm cross from x86-64
does not show the problem with GCC trunk (where I happen to have a built cross
lying around).  -ftime-report there shows

 uninit var analysis:   0.01 (  0%)  1400  (  0%)
 TOTAL  :  18.00  277M

The uninit code didn't see any changes since GCC 14 (surrounding optimization
of course could).

I can confirm this on the GCC 14 branch with a cross.  We're doing analysis
in typeFromString.  GCC 15 isn't affected either.  It would be nice to know
what fixed this.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639
[Bug 24639] [meta-bug] bug to track all Wuninitialized issues

[Bug tree-optimization/120701] [16 regression] ICE at -O{2,3} on x86_64-linux-gnu: in verify_range, at value-range.cc:1546 since r16-1550-g9244ea4bf55638

2025-06-20 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120701

Andrew Macleod  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

--- Comment #4 from Andrew Macleod  ---
During operator_mult::fold_range, the result of the operation is

[irange] int [-INF, -INF][1101344938, 1101344938][+INF, +INF]
and it determines the resulting bitmask should also be

MASK 0xfffe VALUE 0x0

When update_bitmask () is called, the bound snapping routine was supposed to be
checking for overflows and underflows, and removing the ranges if that happens.
Unfortunately the way I was checking did not work.

The new patch is in testing and corrects this oversight.  The [+INF, +INF] will
be correctly removed instead of being replaced with nonsense.

It was tricky to figure out exactly where it was happening because it turns out
that verify_range () only checks that each subrange pair has a lower bound
that is lower than the upper bound of the pair.  It does NOT check if the
current pair lower bound is greater than the previous pair upper bound.  Thus
the failure wasn't detected until much further in the pipeline when a value was
loaded from storage.

I have added that check to verify_range as well, and tested that this test case
would have trapped in exactly the correct spot as the bad range was created.
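
The two checks described above can be sketched on a toy representation. This is an illustration, not the code in value-range.cc: `ToySubrange` and `toy_verify_range` are hypothetical, and bounds are plain integers. Single-value pairs such as [x, x] are assumed to be valid.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using ToySubrange = std::pair<long long, long long>;  // [lower, upper]

// Checks that each pair is ordered and that consecutive pairs are sorted
// and disjoint.
bool toy_verify_range(const std::vector<ToySubrange>& pairs) {
  for (std::size_t i = 0; i < pairs.size(); ++i) {
    // Existing style of check: lower bound not above the upper bound.
    if (pairs[i].first > pairs[i].second) return false;
    // Additional check: lower bound must exceed the previous upper bound.
    if (i > 0 && pairs[i].first <= pairs[i - 1].second) return false;
  }
  return true;
}
```

A range whose pairs are each well formed but out of order passes the first check alone and is only rejected by the second.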

Patch is undergoing bootstrap/regression testing now

[Bug modula2/120731] New: Possible error in Strings.Pos causing sigsegv

2025-06-20 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120731

Bug ID: 120731
   Summary: Possible error in Strings.Pos causing sigsegv
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: gaius at gcc dot gnu.org
  Target Milestone: ---

Forwarded from the gm2 mailing list:

$ cat essai5.mod

MODULE essai5;

IMPORT InOut,Strings;

VAR
   ligne : ARRAY[1..256] OF CHAR;
   position : CARDINAL;

(* the ligne content is just a random line *)

BEGIN
   ligne := "erreur: In program module « essai3 »: attempting to pass
(1) parameters to procedure";
   InOut.WriteString(ligne);
   InOut.WriteLn;
   position := Strings.Pos ("IMPORT", ligne);
END essai5.

$ gm2 -o essai5 essai5.mod
$ ./essai5
erreur: In program module « essai5 »: attempting to pass (1) parameters
to procedure
Erreur de segmentation

[Bug c++/120732] New: Compiler doesn't generate a call to a vector function

2025-06-20 Thread ibogosavljevic at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120732

Bug ID: 120732
   Summary: Compiler doesn't generate a call to a vector function
   Product: gcc
   Version: 15.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ibogosavljevic at gmail dot com
  Target Milestone: ---

Created attachment 61671
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61671&action=edit
Test example

The compiler doesn't generate a call to a vector function. The example is
attached.

I declared a function 
__attribute__ ((__simd__ ("notinbranch"))) 
double square(double x);

However, the loop below


for (size_t i = 0; i < s; i++) {
d[i] = square(d[i]);
}


doesn't call the vectorized version of square, just the plain scalar version.
If I put #pragma omp simd before the loop, the compiler generates a call to
the vector version of square.

Compilation line:
 g++ -c -O3 -fopenmp-simd -ffast-math -mavx2 test.cpp -S -o test.asm

Why doesn't the compiler generate a call to vectorized version of square, since
all prerequisites are met?
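
For reference, the variant the reporter says does produce the vector call can be sketched as below. The wrapper `square_all` is a hypothetical name, and a definition of `square` is supplied only so the sketch builds on its own. In the report the function is merely declared, and with a visible definition the compiler may inline it instead of calling a vector clone.

```cpp
#include <cstddef>

// Declared as in the report; the definition below is an assumption added
// so the example is self-contained.
__attribute__ ((__simd__ ("notinbranch")))
double square(double x);

double square(double x) { return x * x; }

// The loop from the report with "#pragma omp simd" placed before it.
void square_all(double* d, std::size_t s) {
#pragma omp simd
  for (std::size_t i = 0; i < s; i++) {
    d[i] = square(d[i]);
  }
}
```

Built with the reporter's options (-O3 -fopenmp-simd -ffast-math -mavx2), the pragma is what the report credits with enabling the vector call. Without -fopenmp-simd the pragma is ignored and the loop is plain scalar code.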

[Bug target/120722] [16 Regression][gcn] ICE in gen_highpart, at emit-rtl.cc:1674 since r16-1565-g2dcc6dbd8a00ca when building libgcc/strub.c for gfx1036

2025-06-20 Thread ams at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120722

--- Comment #3 from Andrew Stubbs  ---
It looks like this is the hunk that changes the behaviour.

--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -975,8 +975,9 @@ validate_subreg (machine_mode omode, machine_mode imode,

   /* Verify that the offset is representable.  */

-  /* For hard registers, we already have most of these rules collected in
- subreg_offset_representable_p.  */
+  /* Ensure that subregs of hard registers can be folded.  In other words,
+ the hardware register must be valid in the subreg's outer mode,
+ and consequently the subreg can be replaced with a hardware register.  */
   if (reg && REG_P (reg) && HARD_REGISTER_P (reg))
 {
   unsigned int regno = REGNO (reg);
@@ -987,7 +988,9 @@ validate_subreg (machine_mode omode, machine_mode imode,
   else if (!REG_CAN_CHANGE_MODE_P (regno, imode, omode))
return false;

-  return subreg_offset_representable_p (regno, imode, offset, omode);
+  /* Pass true to allow_stack_regs because targets like x86
+expect to be able to take subregs of the stack pointer.  */
+  return simplify_subreg_regno (regno, imode, offset, omode, true) >= 0;
 }
   /* Do not allow normal SUBREG with stricter alignment than the inner MEM.

The actual code that rejects the pattern is this (unchanged in the patch):

  /* See whether (reg:YMODE YREGNO) is valid.

 ??? We allow invalid registers if (reg:XMODE XREGNO) is also invalid.
 This is a kludge to work around how complex FP arguments are passed
 on IA-64 and should be fixed.  See PR target/49226.  */
  if (!targetm.hard_regno_mode_ok (yregno, ymode)
  && targetm.hard_regno_mode_ok (xregno, xmode))
return -1;

The problem is that GCN vcc_lo is an SImode hardreg, so DImode also uses
vcc_hi, but vcc_hi is not allowed to be used as a register on its own, because
the compiler did all sorts of buggy nonsense if it was allowed. We achieved
this, years ago, by saying SImode is disallowed (even though it would
technically work), but there's an assumption here that the upper part of a
DImode register pair must allow SImode. Until now it was fine because it could
still be expressed as a subreg when necessary.

[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

2025-06-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598

Segher Boessenkool  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Segher Boessenkool  ---
(In reply to Richard Biener from comment #9)
> (In reply to Segher Boessenkool from comment #8)
> > (In reply to Jeevitha from comment #6)
> > > The following dot_product function gets vectorized with the latest GCC 
> > > trunk
> > > and gcc 15.1.0:
> > > 
> > > #include 
> > > #include 
> > > extern float dot_product(const int16_t *v1, const int16_t *v2, size_t 
> > > len);
> > > float dot_product(const int16_t *v1, const int16_t *v2, size_t len)
> > > {
> > > int64_t d = 0;
> > > for (size_t i = 0; i < len; i++)
> > > d += int32_t(v1[i]) * int32_t(v2[i]);
> > > return static_cast<float>(d);
> > > }
> > > 
> > > 
> > > I observed that -O2 was used during compilation. However, for GCC versions
> > > earlier than 15, vectorization of this loop requires -O3. Since they are
> > > using the -O2 flag, GCC 15 is necessary in this case.
> > 
> > Is that what the original code does?  Or does it convert every number to
> > float
> > and then sum over that?
> 
> The above is from the preprocessed source.

Ah heh.

> > And, can you try to find out what patch to GCC 15 made this work at -O2?  In
> > case we want to backport anything, but also just to get a better grip on 
> > what
> > is happening  here :-)
> 
> With GCC 15 we allow peeling for niter at -O2,

Yup, I added that :-)  Well, Jiu Fu did (48f657953fe5), but heh.

> with GCC 14 and earlier at -O2
> we effectively only ever vectorize loops with constant number of iterations
> (divisible by vector size).
> 
> I'd say this is "fixed" (it was reported against GCC 15), but the function
> is 'static' in the preprocessed sources and thus likely inlined.

Yup.

> I'll
> also note that plain SSE2 is a bit inefficient for this loop.

On Power that thankfully does not exist at all, and this bug was reported as
a target bug ;-)

> So maybe the reporter can clarify "We’ve observed that while functions in
> the PGVector library benefit from both loop unrolling and auto-vectorization
> (even with earlier versions of GCC, like 13.3 and 11.5), the same does not
> hold true for the dot_product function in the MariaDB library" - does this
> mean
> the autovectorization makes the function slower?  That would mean our cost
> model isn't good enough here.

All of that was tested on Power.  P9 probably, not sure.

I'll mark it as fixed.  Thanks everyone!

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2025-06-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 120598, which changed state.

Bug 120598 Summary: Compiler is unable to vectorise scalar code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/120737] New: #pragma omp atomic fails on nvptx

2025-06-20 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120737

Bug ID: 120737
   Summary: #pragma omp atomic fails on nvptx
   Product: gcc
   Version: 15.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schulz.benjamin at googlemail dot com
  Target Milestone: ---

Created attachment 61673
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61673&action=edit
compile with -g -O3 -fopenmp -foffload=nvptx-none  -fno-stack-protector
-std=c++23 -lm -lc result will yield 0 instead of 2740. removing the target
statement in the last loop will yield correct result

Hi there, I noticed that 

#pragma omp target teams distribute does, by the OpenMP standard 5x, not
support a reduction clause 

https://www.openmp.org/spec-html/5.0/openmpse15.html#x57-910002.7
https://www.openmp.org/spec-html/5.0/openmpsu73.html

GCC nevertheless seems to allow it as an extension and seems to work correctly

if we fill an array:

size_t elements=20;
std::vector v1(elements),v2(elements);
#pragma omp parallel for simd
for(size_t i=1;ihttps://www.openmp.org/spec-html/5.0/openmpse15.html#x57-910002.7


#pragma omp target teams distribute shared(tmp) is standards compliant if tmp
was mapped.

And now, one has #pragma omp atomic so that only one thread can access a single
variable at a time. Since teams distribute is just a league of threads, omp
atomic should work in the same way.

The following should then mimic the reduction in an OpenMP standard compliant
way. It just fails


#include 
#include 
#include 
int main()
{
size_t elements=20;
std::vector v1(elements),v2(elements);

#pragma omp parallel for simd
for(size_t i=1;i

[Bug target/120741] New: [16 Regression] ICE on mingw-w64-12.0.0: during RTL pass: pro_and_epilogue ICE in ix86_expand_prologue, at config/i386/i386.cc:9446

2025-06-20 Thread slyfox at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120741

Bug ID: 120741
   Summary: [16 Regression] ICE on mingw-w64-12.0.0: during RTL
pass: pro_and_epilogue ICE in ix86_expand_prologue, at
config/i386/i386.cc:9446
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: slyfox at gcc dot gnu.org
  Target Milestone: ---

After a PR120697 fix (I did not check before) gcc-master ICEs on
mingw-w64-12.0.0 for the `i686-w64-mingw32` target.

Extracted example:

// $ cat mingw_wvfscanf.c.c
typedef __builtin_va_list va_list;
short *__mingw_swformat_format;
va_list __mingw_swformat_arg;
int __mingw_swformat_fc;
double __mingw_wcstod();
void *memset();
typedef struct {
  void *fp;
  int bch[1024];
} _IFP;
static int __mingw_swformat(_IFP *s, va_list argp) {
  short *f = __mingw_swformat_format;
  __mingw_swformat_arg = argp;
  if (!s || s->fp)
return '.';
  while (f)
switch (__mingw_swformat_fc) {
case 'A':
  double d = __mingw_wcstod();
  *__builtin_va_arg(__mingw_swformat_arg, double *) = d;
}
  _IFP ifp;
  return __mingw_swformat(&ifp, argp);
}
void __mingw_vswscanf(va_list argp) {
  _IFP ifp;
  memset(&ifp, 0, sizeof(_IFP));
  __mingw_swformat(&ifp, argp);
}

$ i686-w64-mingw32-gcc -std=gnu99   -O2 -c mingw_wvfscanf.c.c -o bug.o

during RTL pass: pro_and_epilogue
mingw_wvfscanf.c.c: In function '__mingw_vswscanf':
mingw_wvfscanf.c.c:29:1: internal compiler error: in ix86_expand_prologue, at
config/i386/i386.cc:9446
   29 | }
  | ^
0x21b2e98 diagnostic_context::diagnostic_impl(rich_location*,
diagnostic_metadata const*, diagnostic_option_id, char const*, __va_list_tag
(*) [1], diagnostic_t)
???:0
0x21cd32a internal_error(char const*, ...)
???:0
0x7990c9 fancy_abort(char const*, int, char const*)
???:0
0x7816db ix86_expand_prologue() [clone .cold]
???:0
0x1a96d1d gen_prologue()
???:0
0x12cbcdc target_gen_prologue()
???:0
0xad1d1f make_prologue_seq()
???:0
0xad2425 thread_prologue_and_epilogue_insns()
???:0
0xad252a (anonymous
namespace)::pass_late_thread_prologue_and_epilogue::execute(function*)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

$ i686-w64-mingw32-gcc -v |& unnix
Using built-in specs.
COLLECT_GCC=/<>/i686-w64-mingw32-nolibc-gcc-16.0.0./bin/i686-w64-mingw32-gcc
COLLECT_LTO_WRAPPER=/<>/i686-w64-mingw32-nolibc-gcc-16.0.0./libexec/gcc/i686-w64-mingw32/16.0.0/lto-wrapper
Target: i686-w64-mingw32
Configured with: ../source/configure
--prefix=/<>/i686-w64-mingw32-nolibc-gcc-16.0.0.
--with-gmp-include=/<>/gmp-with-cxx-6.3.0-dev/include
--with-gmp-lib=/<>/gmp-with-cxx-6.3.0/lib
--with-mpfr-include=/<>/mpfr-4.2.2-dev/include
--with-mpfr-lib=/<>/mpfr-4.2.2/lib --with-mpc=/<>/libmpc-1.3.1
--program-prefix=i686-w64-mingw32- --enable-lto --disable-libstdcxx-pch
--without-included-gettext --with-system-zlib --enable-checking=release
--enable-static --enable-languages=c --disable-multilib --disable-shared
--enable-plugin --with-isl=/<>/isl-0.20
--with-as=/<>/i686-w64-mingw32-binutils-wrapper-2.44/bin/i686-w64-mingw32-as
--disable-libssp --disable-nls --without-headers --disable-threads
--disable-libgomp --disable-libquadmath --disable-shared --disable-libatomic
--disable-decimal-float --disable-libmpx
--with-headers=/<>/mingw_w64-headers-12.0.0/include --with-gcc
--with-gnu-as --with-gnu-ld --disable-debug --disable-win32-registry
--enable-hash-synchronization --enable-libssp --disable-nls
--enable-fully-dynamic-string --enable-sjlj-exceptions --with-dwarf2
--disable-bootstrap --build=x86_64-unknown-linux-gnu
--host=x86_64-unknown-linux-gnu --target=i686-w64-mingw32
--with-build-sysroot=/build/source/..
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 16.0.0  (experimental) (GCC)

[Bug c++/120742] New: ICE: tree check: expected tree_vec, have type_pack_expansion in coerce_template_parameter_pack, at cp/pt.cc:9042

2025-06-20 Thread rush102333 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120742

Bug ID: 120742
   Summary: ICE: tree check: expected tree_vec, have
type_pack_expansion in coerce_template_parameter_pack,
at cp/pt.cc:9042
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Keywords: ice-checking, ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rush102333 at gmail dot com
  Target Milestone: ---

Simplified test:

~~

template  struct A
{
  template  class... TP, class U> struct B;
  template <, typename U> class B{ };

};

~~

Stack Dump: 

~~

:4:13: error: expected identifier before ',' token [-Wtemplate-body]
4 |   template <, typename U> class B{ };
  | ^
:4:35: error: 'TP' was not declared in this scope; did you mean 'T'?
[-Wtemplate-body]
4 |   template <, typename U> class B{ };
  |   ^~
  |   T
:4:40: internal compiler error: tree check: expected tree_vec, have
type_pack_expansion in coerce_template_parameter_pack, at cp/pt.cc:9042
4 |   template <, typename U> class B{ };
  |^
0x2839bc5 diagnostic_context::diagnostic_impl(rich_location*,
diagnostic_metadata const*, diagnostic_option_id, char const*, __va_list_tag
(*) [1], diagnostic_t)
???:0
0x285b356 internal_error(char const*, ...)
???:0
0x9eb228 tree_check_failed(tree_node const*, char const*, int, char const*,
...)
???:0
0xd3b3ff lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*,
int)
???:0
0xd88e1c finish_template_type(tree_node*, tree_node*, int)
???:0
0xd07843 c_parse_file()
???:0
0xe6f329 c_common_parse_file()
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

~~

See https://godbolt.org/z/EMcf6P1cE

[Bug c++/80151] Add a warning to catch implicit string to bool conversion (-Wstring-conversion)

2025-06-20 Thread doug at cs dot dartmouth.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80151

doug mcilroy  changed:

   What|Removed |Added

 CC||doug at cs dot dartmouth.edu

--- Comment #3 from doug mcilroy  ---
Generalize this bug to "inconsistently warning about converting a pointer to a
bool". Only the second of the initializations below draws a warning that the
conversion must yield `true'. 

extern int f(bool);
int main() {
int a = f("hello");
int b = f(&a);
return 0;
}

[Bug tree-optimization/120701] [16 regression] ICE at -O{2,3} on x86_64-linux-gnu: in verify_range, at value-range.cc:1546 since r16-1550-g9244ea4bf55638

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120701

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Andrew Macleod :

https://gcc.gnu.org/g:b03e0d69b37f6ea7aef220652635031a89f56a11

commit r16-1594-gb03e0d69b37f6ea7aef220652635031a89f56a11
Author: Andrew MacLeod 
Date:   Fri Jun 20 08:50:39 2025 -0400

Fix range wrap check and enhance verify_range.

When snapping range bounds to satisfy bitmask constraints, end bound overflow
and underflow checks were not working properly.
Also adjust some comments, and enhance verify_range to make sure range pairs
are sorted properly.

PR tree-optimization/120701
gcc/
* value-range.cc (irange::verify_range): Verify range pairs are
sorted properly.
(irange::snap): Check for over/underflow properly.

gcc/testsuite/
* gcc.dg/pr120701.c: New.
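
For readers who have not looked at value-range.cc, here is a rough, standalone sketch of the two ideas the commit message mentions: snapping a bound to the nearest value that satisfies known low bits while detecting wrap-around, and checking that range pairs are sorted. This is not the GCC code. The names snap_up, snap_down and pairs_sorted, the use of plain unsigned values, and the "known low bits under a power-of-two alignment" model are all assumptions made for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: smallest x >= lb with (x & (align - 1)) == value.
   ALIGN must be a power of two and VALUE < ALIGN.
   Returns false if no such x fits in an unsigned (overflow).  */
bool snap_up(unsigned lb, unsigned align, unsigned value, unsigned *out)
{
    unsigned cand = (lb & ~(align - 1)) | value;
    if (cand < lb) {
        if (cand > UINT_MAX - align)
            return false;               /* stepping up would wrap */
        cand += align;
    }
    *out = cand;
    return true;
}

/* Hypothetical: largest x <= ub with (x & (align - 1)) == value.
   Returns false if no such x exists (underflow).  */
bool snap_down(unsigned ub, unsigned align, unsigned value, unsigned *out)
{
    unsigned cand = (ub & ~(align - 1)) | value;
    if (cand > ub) {
        if (cand < align)
            return false;               /* stepping down would wrap */
        cand -= align;
    }
    *out = cand;
    return true;
}

/* Hypothetical: BOUNDS holds NPAIRS [lower, upper] pairs back to back.
   Each pair must be ordered and pairs must be strictly increasing.  */
bool pairs_sorted(const unsigned *bounds, size_t npairs)
{
    for (size_t i = 0; i < npairs; ++i) {
        if (bounds[2 * i] > bounds[2 * i + 1])
            return false;
        if (i > 0 && bounds[2 * i - 1] >= bounds[2 * i])
            return false;
    }
    return true;
}
```

The point of the explicit checks is that a snapped bound is only usable when the step did not wrap; a caller would have to treat a false return as "this end of the range cannot be tightened".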

[Bug target/118241] RISC-V ICE: internal compiler error: in int_mode_for_mode, at stor-layout.cc:407 caused by prefetch instructions

2025-06-20 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118241

--- Comment #7 from Jeffrey A. Law  ---
Ah nuts!  My mistake, I conflated this issue which is "prefetch" with the other
issue in the "prefetchi" space.  "prefetchi" is broken and likely requires a
deeper fix than what we need here.

I don't see a good path for Andrew's suggestion.  The "prefetch" pattern name
is special and does not include a mode.  So there can be one and only one. 
That in turn means that there's no way to support different mode operands,
which is important as a pointer might be SI or DI.  So while we could have a
prefetch expander, its operands would continue to be modeless and would then
match a X mode on their associated insn.  I'm not sure that's an improvement
over what we've currently got.

The most straightforward path here is to tighten up the operands which probably
needs to be done anyway.  I think the biggest question is whether or not we try
to support offsets.  They don't seem difficult, we just need to reject those
where the bottom 4 bits are nonzero.  I'll play a bit with that to see if there
are any surprises.

[Bug tree-optimization/120701] [16 regression] ICE at -O{2,3} on x86_64-linux-gnu: in verify_range, at value-range.cc:1546 since r16-1550-g9244ea4bf55638

2025-06-20 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120701

Andrew Macleod  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #5 from Andrew Macleod  ---
fixed

[Bug tree-optimization/120701] [16 regression] ICE at -O{2,3} on x86_64-linux-gnu: in verify_range, at value-range.cc:1546 since r16-1550-g9244ea4bf55638

2025-06-20 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120701

--- Comment #6 from Andrew Macleod  ---
Not sure why the commit isn't showing up... regardless:

commit b03e0d69b37f6ea7aef220652635031a89f56a11
Author: Andrew MacLeod 
Date:   Fri Jun 20 08:50:39 2025 -0400

Fix range wrap check and enhance verify_range.

When snapping range bounds to satisfy bitmask constraints, end bound overflow
and underflow checks were not working properly.
Also adjust some comments, and enhance verify_range to make sure range pairs
are sorted properly.

PR tree-optimization/120701
gcc/
* value-range.cc (irange::verify_range): Verify range pairs are
sorted properly.
(irange::snap): Check for over/underflow properly.

gcc/testsuite/
* gcc.dg/pr120701.c: New.

[Bug target/120734] New: -m32 -march=i686 doesn't use "mov $0, mem"

2025-06-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120734

Bug ID: 120734
   Summary: -m32 -march=i686 doesn't use "mov $0, mem"
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hjl.tools at gmail dot com
  Target Milestone: ---
Target: i386

[hjl@gnu-tgl-3 gcc]$ cat /tmp/x.c 
extern char *aligned_heap_area;

void
bar (void)
{
  aligned_heap_area = 0;
}
[hjl@gnu-tgl-3 gcc]$ ./xgcc -B./ -S -m32 -march=i686 -O2 /tmp/x.c
[hjl@gnu-tgl-3 gcc]$ cat x.s
.file   "x.c"
.text
.p2align 4
.globl  bar
.type   bar, @function
bar:
.LFB0:
.cfi_startproc
xorl%eax, %eax
movl%eax, aligned_heap_area
^^^ "movl $0, aligned_heap_area" should be used.
ret
.cfi_endproc
.LFE0:
.size   bar, .-bar
.ident  "GCC: (GNU) 16.0.0 20250620 (experimental)"
.section.note.GNU-stack,"",@progbits
[hjl@gnu-tgl-3 gcc]$

[Bug target/120733] New: [16 Regression][aarch64] ICE in gen_highpart, at lra.cc:1484 since r16-1565-g2dcc6dbd8a00ca

2025-06-20 Thread dimitar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120733

Bug ID: 120733
   Summary: [16 Regression][aarch64] ICE in gen_highpart, at
lra.cc:1484 since r16-1565-g2dcc6dbd8a00ca
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dimitar at gcc dot gnu.org
  Target Milestone: ---

Commit r16-1565-g2dcc6dbd8a00ca,
   emit-rtl: Use simplify_subreg_regno to validate hardware subregs [PR119966]

causes ICE when building the following (reduced from
gcc.c-torture/compile/pass.c):

  int foo (int a)
  {
return a + 1;
  }
  int bar (void)
  {
int q;
return foo ((int) & q);
  }

Expand produces the following RTL:
  (debug_insn 7 6 8 2 (var_location:SI aD.4606 (plus:SI (subreg:SI (reg/f:DI 96
virtual-stack-vars) 0)
  (const_int -4 [0xfffffffffffffffc])))
"../../gcc/gcc/testsuite/gcc.c-torture/compile/pass.c":16:10 -1
 (nil))

which contains an invalid subreg, according to aarch64_hard_regno_mode_ok:
if (regno == FRAME_POINTER_REGNUM || regno == ARG_POINTER_REGNUM)
  return mode == Pmode;

LRA replaces the invalid subreg with nil:
  (debug_insn 7 6 8 2 (var_location:SI aD.4606 (plus:SI (nil)
  (const_int 12 [0xc])))
"../../gcc/gcc/testsuite/gcc.c-torture/compile/pass.c":16:10 -1
   (nil))

which in turn leads to ICE:

0x2a74282 internal_error(char const*, ...)
   
/mnt/nvme/dinux/local-workspace/gcc/gcc/diagnostic-global-context.cc:517
0xe94332 crash_signal
/mnt/nvme/dinux/local-workspace/gcc/gcc/toplev.cc:321
0xbc98a5 add_regs_to_insn_regno_info
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra.cc:1484
0xbc9c21 add_regs_to_insn_regno_info
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra.cc:1568
0xbca0f0 lra_update_insn_regno_info(rtx_insn*)
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra.cc:1661
0xbeea48 process_insn_for_elimination
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra-eliminations.cc:1398
0xb5 lra_eliminate(bool, bool)
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra-eliminations.cc:1506
0xbe2b6f lra_constraints(bool)
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra-constraints.cc:5368
0xbcbfa6 lra(_IO_FILE*, int)
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra.cc:2455
0xb73ddc do_reload
/mnt/nvme/dinux/local-workspace/gcc/gcc/ira.cc:5986
0xb74292 execute
/mnt/nvme/dinux/local-workspace/gcc/gcc/ira.cc:6174

[Bug target/119966] [16 regression] pru: Invalid register in RTL expression starting with r16-160-ge6f89d78c1a752

2025-06-20 Thread dimitar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119966

Dimitar Dimitrov  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=120733
  Known to work|15.1.0  |

--- Comment #17 from Dimitar Dimitrov  ---
(In reply to Jakub Jelinek from comment #15)
> For debug insns, we generally want to be far more tolerant in what subregs
> they accept.

[Bug target/119966] [16 regression] pru: Invalid register in RTL expression starting with r16-160-ge6f89d78c1a752

2025-06-20 Thread dimitar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119966

--- Comment #18 from Dimitar Dimitrov  ---
(In reply to Jakub Jelinek from comment #15)
> For debug insns, we generally want to be far more tolerant in what subregs
> they accept.

I filed PR120733 to track this particular fallout. I'm not sure if it is "too
much breakage" to warrant a revert, as suggested in
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685492.html

[Bug target/120734] -m32 -march=i686 doesn't use "mov $0, mem"

2025-06-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120734

H.J. Lu  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2025-06-20

--- Comment #1 from H.J. Lu  ---
It is done on purpose with

/* X86_TUNE_SPLIT_LONG_MOVES: Avoid instructions moving immediates
   directly to memory.  */
DEF_TUNE (X86_TUNE_SPLIT_LONG_MOVES, "split_long_moves", m_PPRO)

However, it only applies to MOV, not other store instructions, like ADD, XOR,
AND, ...

[Bug target/120734] -m32 -march=i686 doesn't use "mov $0, mem"

2025-06-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120734

--- Comment #2 from H.J. Lu  ---
(In reply to H.J. Lu from comment #1)
> It is done on purpose with
> 
> /* X86_TUNE_SPLIT_LONG_MOVES: Avoid instructions moving immediates
>directly to memory.  */
> DEF_TUNE (X86_TUNE_SPLIT_LONG_MOVES, "split_long_moves", m_PPRO)
> 
> However, it only applies to MOV, not other store instructions, like ADD, XOR,
> AND, ...

This was added by

commit e075ae69f9d1263e78376038ab138c03e279f391
Author: Richard Henderson 
Date:   Wed Sep 1 21:20:21 1999 -0700

Merge new ia32 backend from the branch!

From-SVN: r29044

[Bug c++/120735] New: -Warray-bounds error via std::vector::data after unsigned int overflow potential

2025-06-20 Thread drahflow at gmx dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120735

Bug ID: 120735
   Summary: -Warray-bounds error via std::vector::data after
unsigned int overflow potential
   Product: gcc
   Version: 15.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: drahflow at gmx dot de
  Target Milestone: ---

Created attachment 61672
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61672&action=edit
Reproducer

Consider https://godbolt.org/z/7hne4hfva

unsigned char f(unsigned int n) {
unsigned int dataSize = 4;
//assert(dataSize + n > dataSize); /* works with this assert enabled */
dataSize += n;
//assert(dataSize > 7);/* works with this assert enabled */
vector buf;
buf.resize(dataSize);
unsigned char *ptr = buf.data();   /* works with a raw array (cf. godbolt)
*/
ptr += 1;  /* works without this calculation */
memcpy(ptr, "abc", 3);

return *ptr;
}

it results in
:17:11: error: 'void* memcpy(void*, const void*, size_t)' offset [0, 2]
is out of the bounds [0, 0] [-Werror=array-bounds=]
   17 | memcpy(ptr, "abc", 3);

This error appears for 11.1 up to trunk.
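
One possible reading, not confirmed anywhere in this report, is that the warning comes from the path where `dataSize += n` wraps around: both asserts that silence it rule out small sizes. The sketch below only shows the wrap-around arithmetic itself. The function name is hypothetical and mirrors the first two statements of f().

```c
#include <limits.h>

/* Mirrors `unsigned int dataSize = 4; dataSize += n;` from the reproducer.
   Unsigned arithmetic wraps modulo UINT_MAX + 1, so the result can be
   smaller than 4, including 0.  */
unsigned int data_size_after_add(unsigned int n)
{
    unsigned int dataSize = 4;
    dataSize += n;
    return dataSize;
}
```

For n == UINT_MAX - 3 the result is 0, in which case the resized buffer would have no room for the 3 bytes copied at offset 1.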

[Bug target/120736] New: [16 Regression] RISC-V: Miscompile at -O[23] since r15-5375-gbeec291225b

2025-06-20 Thread ewlu at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120736

Bug ID: 120736
   Summary: [16 Regression] RISC-V: Miscompile at -O[23] since
r15-5375-gbeec291225b
   Product: gcc
   Version: 16.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ewlu at rivosinc dot com
  Target Milestone: ---

Testcase:
unsigned char aa (unsigned char ab, int o) { return ab > o ? ab : 0; }
int p;
int s;
static unsigned char q = 255;
int r;
int *v = &s;
int main() {
  p = v != 0;
  for (; r < 8; ++r) {
if (s)
  break;
s = aa(p * q++, 6) <= 0;
  }
  __builtin_printf("%d\n", q);
}

Commands:
# -O3
> /scratch/ewlu/daily-upstream-build/build-gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -O3 ./runtime-bisect/red.c -o user-config.out -fsigned-char 
> -fno-strict-aliasing -fwrapv 
> QEMU_CPU=rv64,vlen=128,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0,zve32f=true,zve64f=true
>  timeout --verbose -k 0.1 4 
> /scratch/ewlu/daily-upstream-build/build-gcv/bin/qemu-riscv64 user-config.out 
> 1
7

# -O2
> /scratch/ewlu/daily-upstream-build/build-gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -O2 ./runtime-bisect/red.c -o user-config.out -fsigned-char 
> -fno-strict-aliasing -fwrapv 
> QEMU_CPU=rv64,vlen=128,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0,zve32f=true,zve64f=true
>  timeout --verbose -k 0.1 4 
> /scratch/ewlu/daily-upstream-build/build-gcv/bin/qemu-riscv64 user-config.out 
> 1
7

# -O1
> /scratch/ewlu/daily-upstream-build/build-gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -O1 ./runtime-bisect/red.c -o user-config.out -fsigned-char 
> -fno-strict-aliasing -fwrapv 
> QEMU_CPU=rv64,vlen=128,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0,zve32f=true,zve64f=true
>  timeout --verbose -k 0.1 4 
> /scratch/ewlu/daily-upstream-build/build-gcv/bin/qemu-riscv64 user-config.out 
> 1
1

Found via fuzzer
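
For reference, by my reading of the C semantics the -O1 output (1) is the expected one. The sketch below is the same computation with the globals turned into locals so it can be called as a function; the names aa_ref and expected_q are made up for this illustration. In the first iteration `p * q++` is 255 and q wraps to 0; in the second the argument is 0, aa returns 0, s becomes 1, and q is 1 when the loop breaks.

```c
/* Same as aa() in the test case.  */
unsigned char aa_ref(unsigned char ab, int o) { return ab > o ? ab : 0; }

/* Same loop as main() in the test case, returning q instead of printing it.  */
int expected_q(void)
{
    int s = 0;
    int r = 0;
    unsigned char q = 255;
    int *v = &s;
    int p = v != 0;

    for (; r < 8; ++r) {
        if (s)
            break;
        s = aa_ref(p * q++, 6) <= 0;
    }
    return q;
}
```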

[Bug debug/120739] New: ctf: Emit CTF_K_ARRAY for GNU vector types

2025-06-20 Thread bruce.mcculloch at oracle dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120739

Bug ID: 120739
   Summary: ctf: Emit CTF_K_ARRAY for GNU vector types
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bruce.mcculloch at oracle dot com
  Target Milestone: ---

Currently, there is a check in gen_ctf_array_type that prevents GNU vectors
generated by the vector attribute from being emitted (e.g. typedef int v8si
__attribute__ ((vector_size (32)));). Because this check happens in
dwarf2ctf.cc, this prevents GNU vectors from being emitted not only in CTF,
but also in BTF. This is a problem, as there are a handful of GNU vectors
present in the kernel that are not being accurately represented in the
vmlinux.{ctfa,btfa}. Additionally, BTF generated by clang emits these vectors
as arrays.
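
To make the request concrete, here is a small sketch of what such a type looks like from the C side, using the typedef quoted above. It assumes a 4-byte int and a compiler that supports the GNU vector_size attribute and vector subscripting (GCC and clang do); the helper names are hypothetical.

```c
#include <stddef.h>

/* The GNU vector type from the report: 32 bytes of int lanes.  */
typedef int v8si __attribute__ ((vector_size (32)));

/* Number of lanes, i.e. the element count an array-like description
   of the type would carry.  */
size_t v8si_lanes(void)
{
    return sizeof(v8si) / sizeof(int);
}

/* Lanes can be indexed like array elements.  */
int v8si_lane_sum(void)
{
    v8si v = {1, 2, 3, 4, 5, 6, 7, 8};
    int sum = 0;
    for (size_t i = 0; i < v8si_lanes(); ++i)
        sum += v[i];
    return sum;
}
```

In terms of size and element access, the type behaves like `int[8]`, which is presumably why an array representation is being proposed.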

[Bug target/120652] [16 Regression] RISC-V: ICE in require, at machmode.h:323 with rv64gcv_zvl256b

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120652

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Pan Li :

https://gcc.gnu.org/g:52582b40a9bf839ae3771de1557ce6691eb8eedd

commit r16-1597-g52582b40a9bf839ae3771de1557ce6691eb8eedd
Author: Pan Li 
Date:   Thu Jun 19 18:58:17 2025 +0800

RISC-V: Fix ICE for expand_select_vldi [PR120652]

There will be an ICE during the expand pass; the backtrace is similar to the one below.

during RTL pass: expand
red.c: In function 'main':
red.c:20:5: internal compiler error: in require, at machmode.h:323
   20 | int main() {
  | ^~~~
0x2e0b1d6 internal_error(char const*, ...)
../../../gcc/gcc/diagnostic-global-context.cc:517
0xd0d3ed fancy_abort(char const*, int, char const*)
../../../gcc/gcc/diagnostic.cc:1803
0xc3da74 opt_mode::require() const
../../../gcc/gcc/machmode.h:323
0xc3de2f opt_mode::require() const
../../../gcc/gcc/poly-int.h:1383
0xc3de2f riscv_vector::expand_select_vl(rtx_def**)
../../../gcc/gcc/config/riscv/riscv-v.cc:4218
0x21c7d22 gen_select_vldi(rtx_def*, rtx_def*, rtx_def*)
../../../gcc/gcc/config/riscv/autovec.md:1344
0x134db6c maybe_expand_insn(insn_code, unsigned int, expand_operand*)
../../../gcc/gcc/optabs.cc:8257
0x134db6c expand_insn(insn_code, unsigned int, expand_operand*)
../../../gcc/gcc/optabs.cc:8288
0x11b21d3 expand_fn_using_insn
../../../gcc/gcc/internal-fn.cc:318
0xef32cf expand_call_stmt
../../../gcc/gcc/cfgexpand.cc:3097
0xef32cf expand_gimple_stmt_1
../../../gcc/gcc/cfgexpand.cc:4264
0xef32cf expand_gimple_stmt
../../../gcc/gcc/cfgexpand.cc:4411
0xef95b6 expand_gimple_basic_block
../../../gcc/gcc/cfgexpand.cc:6472
0xefb66f execute
../../../gcc/gcc/cfgexpand.cc:7223

The select_vl op_1 and op_2 may be the same const_int, like (const_int 32).
maybe_legitimize_operands will then:

1. First move the const op_1 to a reg.
2. Reuse the reg of op_1 for op_2, as op_1 and op_2 are equal.

That breaks the assumption that the op_2 of select_vl is an immediate,
or something like CONST_INT_POLY.

The test suites below passed for this patch series.
* The full rv64gcv regression test.

PR target/120652

gcc/ChangeLog:

* config/riscv/autovec.md: Add immediate_operand for
select_vl operand 2.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr120652-1.c: New test.
* gcc.target/riscv/rvv/autovec/pr120652-2.c: New test.
* gcc.target/riscv/rvv/autovec/pr120652-3.c: New test.
* gcc.target/riscv/rvv/autovec/pr120652.h: New test.

Signed-off-by: Pan Li 

[Bug target/120603] Improve addition/subtraction on RISC-V for out of range constants

2025-06-20 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120603

Jeffrey A. Law  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |law at gcc dot gnu.org

--- Comment #1 from Jeffrey A. Law  ---
Just a note, Shreya is working on this.  She has applied for a bugzilla
account, but until that's ready I'll just assign this to me.

[Bug debug/120738] ctf: use CTF_FP_LDOUBLE encoding for 128-bit floats

2025-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120738

--- Comment #4 from Andrew Pinski  ---
Long double could be 64-bit while _Float128 exists too. This is even allowed
with C23. Even _Float16 cannot be represented.

[Bug debug/120738] ctf: use CTF_FP_LDOUBLE encoding for 128-bit floats

2025-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120738

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |SUSPENDED

--- Comment #3 from Andrew Pinski  ---
Looks like ctf is not descriptive enough yet.

So suspended.

[Bug debug/120738] ctf: use CTF_FP_LDOUBLE encoding for 128-bit floats

2025-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120738

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2025-06-21
 Status|UNCONFIRMED |WAITING

--- Comment #2 from Andrew Pinski  ---
The spec says long double, so I think this is wrong.

[Bug debug/120739] ctf: Emit CTF_K_ARRAY for GNU vector types

2025-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120739

--- Comment #2 from Andrew Pinski  ---
A patch for this was already rejected:
https://patchwork.sourceware.org/project/gcc/patch/20250501213426.2252847-2-bruce.mccull...@oracle.com/

[Bug debug/120739] ctf: Emit CTF_K_ARRAY for GNU vector types

2025-06-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120739

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||btf-debug

--- Comment #1 from Andrew Pinski  ---
Is there a spec that says vectors should be represented like this? Or are you
saying this based on what clang does?

[Bug target/120734] *mov_(and|or) can be used for TARGET_SPLIT_LONG_MOVES

2025-06-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120734

H.J. Lu  changed:

   What|Removed |Added

 CC||liuhongt at gcc dot gnu.org
Summary|-m32 -march=i686 doesn't|*mov_(and|or) can be
   |use "mov $0, mem"   |used for
   ||TARGET_SPLIT_LONG_MOVES

--- Comment #3 from H.J. Lu  ---
Since there is

/* X86_TUNE_SPLIT_LONG_MOVES: Avoid instructions moving immediates
   directly to memory.  */  
DEF_TUNE (X86_TUNE_SPLIT_LONG_MOVES, "split_long_moves", m_PPRO)

to avoid long move instructions, like

c7 02 00 00 00 00       movl   $0x0,(%rdx)
c7 02 ff ff ff ff       movl   $0xffffffff,(%rdx)

*mov_(and|or) can be enabled for TARGET_SPLIT_LONG_MOVES with

83 22 00                andl   $0x0,(%rdx)
83 0a ff                orl    $0xffffffff,(%rdx)
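
At the value level, the substitution relies on two identities: ANDing with 0 always stores 0, and ORing with -1 always stores all ones, whatever was in memory before. A minimal C sketch of that equivalence follows. The function names are made up for illustration, and the compiler is of course free to pick any instruction for either form, so this shows only the semantics, not the encodings.

```c
/* Plain stores of the two constants.  */
void store_zero_mov(int *p) { *p = 0; }
void store_ones_mov(int *p) { *p = -1; }

/* Read-modify-write forms that always produce the same stored value,
   independent of the previous contents of *p.  */
void store_zero_and(int *p) { *p &= 0; }
void store_ones_or(int *p)  { *p |= -1; }
```

Note that the and/or forms read the destination before writing it, which the plain mov forms do not.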

[Bug target/120734] *mov_(and|or) can be used for TARGET_SPLIT_LONG_MOVES

2025-06-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120734

H.J. Lu  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

--- Comment #4 from H.J. Lu  ---
Created attachment 61675
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61675&action=edit
A patch

[Bug debug/120738] New: ctf: use CTF_FP_LDOUBLE encoding for 128-bit floats

2025-06-20 Thread bruce.mcculloch at oracle dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120738

Bug ID: 120738
   Summary: ctf: use CTF_FP_LDOUBLE encoding for 128-bit floats
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bruce.mcculloch at oracle dot com
  Target Milestone: ---

Currently, gen_ctf_base_type will obtain the bit_size of a given DWARF DIE based
on the system GCC is compiling for. For DIEs with a DW_ATE_float encoding, this
is used to determine whether to classify a given DIE as a single, double, or
long double. However, on some systems, a long double will not have a bit_size
of 128 (usually it will be 80). This means that a __float128 (_Float128) type
will not be classified as a long double, and will actually be dropped entirely.
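
A rough sketch of the behaviour described above, as I understand it. This is not the code in gen_ctf_base_type: the enum, the function name and the parameters are hypothetical, and the target sizes are passed in explicitly so the effect is easy to see.

```c
/* Hypothetical stand-ins for the CTF float encodings.  */
enum fp_kind { FP_NONE, FP_SINGLE, FP_DOUBLE, FP_LDOUBLE };

/* Classify a float DIE purely by comparing its bit size with the
   target's float, double and long double sizes.  */
enum fp_kind classify_by_target_sizes(unsigned bits,
                                      unsigned float_bits,
                                      unsigned double_bits,
                                      unsigned ldouble_bits)
{
    if (bits == float_bits)
        return FP_SINGLE;
    if (bits == double_bits)
        return FP_DOUBLE;
    if (bits == ldouble_bits)
        return FP_LDOUBLE;
    return FP_NONE;   /* no match: the type is dropped */
}
```

With a long double size of 80, as in the report, a 128-bit type matches nothing and falls through to FP_NONE, which corresponds to the __float128 type being dropped.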

[Bug target/120659] ICE: in riscv_sched_variable_issue, at config/riscv/riscv.cc:9879 with -O2 -mcpu=sifive-x280

2025-06-20 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120659

Jeffrey A. Law  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2025-06-21
 CC||kito at gcc dot gnu.org
 Status|UNCONFIRMED |NEW

--- Comment #1 from Jeffrey A. Law  ---
Someone from SiFive really needs to fix their pipeline descriptions.

Every insn must have a type and every type must map to an insn reservation,
even if the port doesn't typically use those instructions.

[Bug cobol/120621] COBOL isn't built with STRICT_WARN

2025-06-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120621

--- Comment #10 from GCC Commits  ---
The master branch has been updated by James K. Lowden :

https://gcc.gnu.org/g:007392c0f93cf46b9e87aebdd04e123e3381fc07

commit r16-1595-g007392c0f93cf46b9e87aebdd04e123e3381fc07
Author: James K. Lowden 
Date:   Fri Jun 20 12:43:51 2025 -0400

cobol: Correct diagnostic strings for 32-bit builds.

Avoid %z for printf-family.  Cast pid_t to long.  Avoid use of YYUNDEF
for old Bison versions.

PR cobol/120621

gcc/cobol/ChangeLog:

* genapi.cc (parser_compile_ecs): Cast argument to unsigned long.
(parser_compile_dcls): Same.
(parser_division): RAII.
(inspect_tally): Cast argument to unsigned long.
* lexio.cc (cdftext::lex_open): Cast pid_t to long.
* parse.y: hard-code values for old versions of Bison, and message
format.
* scan_ante.h (wait_for_the_child): Cast pid_t to long.
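
As a generic illustration of the casting pattern the ChangeLog describes (this is not the cobol front-end code, and the helper names are invented), a size_t or pid_t value can be printed without %z by casting to a type whose printf conversion is the same on 32-bit and 64-bit hosts:

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Print a size_t with %lu by casting to unsigned long.  */
int format_count(char *buf, size_t len, size_t n)
{
    return snprintf(buf, len, "%lu", (unsigned long) n);
}

/* Print a pid_t with %ld by casting to long.  */
int format_pid(char *buf, size_t len, pid_t pid)
{
    return snprintf(buf, len, "%ld", (long) pid);
}
```

The cast keeps the format string and the argument type in agreement regardless of how wide size_t or pid_t is on the host.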

[Bug target/120737] #pragma omp atomic fails on nvptx

2025-06-20 Thread schulz.benjamin at googlemail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120737

Benjamin Schulz  changed:

   What|Removed |Added

  Attachment #61673|0   |1
is obsolete||

--- Comment #1 from Benjamin Schulz  ---
Created attachment 61674
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61674&action=edit
main.cpp

compile with 

-g -O3 -fopenmp -foffload=nvptx-none  -fno-stack-protector -std=c++23 -lm -lc

If one replaces the last

#pragma omp target parallel for shared(tmp)

with

#pragma omp parallel for shared(tmp)

then the omp critical statement works and yields the correct result,
2740. Similarly, omp atomic does not work correctly on the target in this
example. With target distribute it is the same. 

Strangely, however, a reduction clause works instead, even if it is not
officially supported with target distribute by the standard...  But one should
still have critical and atomic working correctly with shared variables on the
target...
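
For readers without the attachment at hand, this is the general shape of the pattern in question, reduced to a host-only sketch. It is not the attached main.cpp and it does not reproduce the nvptx problem; the function name and the summed values are assumptions chosen for illustration. Built without -fopenmp the pragmas are ignored and the loop simply runs serially.

```c
/* Sum 1..n into a shared variable, protecting each update with
   `omp atomic` instead of using a reduction clause.  */
long atomic_sum(int n)
{
    long total = 0;

    #pragma omp parallel for shared(total)
    for (int i = 1; i <= n; ++i) {
        #pragma omp atomic
        total += i;
    }
    return total;
}
```

Whatever the thread count, every update is applied exactly once, so the result should match the serial sum. The report is that the equivalent construct on the offload target does not give the expected value.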

[Bug target/118241] RISC-V ICE: internal compiler error: in int_mode_for_mode, at stor-layout.cc:407 caused by prefetch instructions

2025-06-20 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118241

--- Comment #8 from Jeffrey A. Law  ---
So I think the way to go for the data prefetching issue is:

1. Define a proper predicate for valid prefetch operands.
   a. REG
   b. REG+D where D is a simm12 with D & 0xf == 0
   c. Immediate where the immediate is a simm12 with D & 0xf == 0

2. Define a suitable constraint which matches the same.

3. Adjust the two patterns to use the new predicate & constraint.

4. Testcase which verifies those cases all work as expected.
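
The offset condition in 1b and 1c can be sketched as a plain C check. This is only an illustration of the arithmetic, not the predicate that would go into the machine description; the function name is hypothetical. It takes simm12 to mean a signed 12-bit immediate, i.e. -2048 to 2047. Note that in C the test needs parentheses, since `d & 0xf == 0` would parse as `d & (0xf == 0)`.

```c
#include <stdbool.h>

/* True if D fits in a signed 12-bit immediate and its bottom
   four bits are zero.  */
bool valid_prefetch_offset(long d)
{
    return d >= -2048 && d <= 2047 && (d & 0xf) == 0;
}
```

A testcase along the lines of step 4 would presumably want to cover both ends of the range (-2048 and 2032) as well as misaligned and out-of-range offsets.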