[Bug middle-end/117173] can_vec_perm_const_p does not consider costs

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173

--- Comment #1 from Richard Biener  ---
So on the GIMPLE side this is mostly canonicalization - the "theory" is that
the backend should be able to implement VEC_PERM_EXPR 
using the same two cheap permutes facing this permute request.

I don't see where the first permute has a single source vector though?

So to answer your question:

 - yes!  costing permutes more accurately would be good
 - no!  match.pd should not look at costs

Can you clarify, with the very concrete permutes that happen in x264,
how the original two are cheap and the new single one is not?

[Bug target/117170] [15 regression] Failed bootstrap comparison in tree-vect-data-refs.o on sparcv9-sun-solaris2.11

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117170

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |15.0
 Target||sparcv9-sun-solaris2.11
   Keywords||needs-bisection

[Bug target/117169] Missed opportunity to combine sign and bitmask tests

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117169

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-10-16
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.  I think there is code in both reassoc and match.pd to handle this,
eventually, if written as

 if (x < 0)
   return 1;
 if (x & 3)
   return 1;

and also in gimple-fold.cc maybe_fold_or_comparisons (though nowadays match.pd
is preferred).

Usually x < 0 "combines" better so it's chosen as canonical form.

[Bug c++/116731] Incorrect behavior of -Wrange-loop-construct in GCC 14

2024-10-16 Thread sunil.dora1988 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116731

--- Comment #4 from Sunil Dora  ---
Dear Community,

Are there any plans to backport this to GCC 13.3?

[Bug tree-optimization/117050] [15 Regression] ice in vect_build_slp_tree_2

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117050

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:ae224de0631a7fcac37ac1384f457f1dc1a487b2

commit r15-4382-gae224de0631a7fcac37ac1384f457f1dc1a487b2
Author: Richard Biener 
Date:   Thu Oct 10 11:02:47 2024 +0200

tree-optimization/117050 - fix ICE with non-grouped .MASK_LOAD SLP

The following is a more complete fix for PR117050, restoring the
ability to permute non-grouped .MASK_LOAD with.

PR tree-optimization/117050
* tree-vect-slp.cc (vect_build_slp_tree_2): Properly handle
non-grouped masked loads when handling permutations.

[Bug middle-end/117173] can_vec_perm_const_p does not consider costs

2024-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173

--- Comment #2 from Robin Dapp  ---
In x264, before the optimization we have:

_42 = VEC_PERM_EXPR ;
...
_44 = VEC_PERM_EXPR ;
_45 = VEC_PERM_EXPR ;

The first one (_42) is "monotonic" and can be implemented by a vmerge.  This
implies a load and one instruction.

_44 and _45 can be implemented by one vrgather each because they have a single
source.


After the optimization we have: 

_838 = VEC_PERM_EXPR ;
_846 = VEC_PERM_EXPR ;

Both of those have two sources and, generally, require two vrgathers (each, one
possibly masked) and we need to arrange the indices properly.
I don't think our current implementation for this generic approach is ideal but
it will never be as cheap as the two non-merged permutes.

Of course we could try deconstructing the index to arrive at the "before"
but... :)
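
The distinction drawn above can be sketched on plain index arrays.  This is
only an illustration, not the riscv backend's logic: the helper names are
hypothetical, and reading "monotonic" as "lane i comes from lane i of either
input" is an assumption about what makes a permute vmerge-friendly.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if every selector element reads the same input vector, i.e. all
   indices are below n or all are in [n, 2n).  Such a permute needs only
   one gather over one source.  */
bool uses_single_source(const unsigned *sel, size_t n)
{
    bool from_first = false, from_second = false;
    for (size_t i = 0; i < n; i++) {
        if (sel[i] % (2 * n) < n)
            from_first = true;
        else
            from_second = true;
    }
    return !(from_first && from_second);
}

/* True if lane i takes lane i of either input (sel[i] is i or n + i),
   i.e. a per-lane blend.  ASSUMPTION: this is our reading of the
   "monotonic" case that a merge instruction can implement.  */
bool is_lanewise_blend(const unsigned *sel, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (sel[i] % n != i)
            return false;
    return true;
}
```

A selector that fails both checks is the generic two-source case, which is
the one described above as needing two gathers.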

[Bug target/117173] can_vec_perm_const_p does not consider costs

2024-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173

Andrew Pinski  changed:

   What|Removed |Added

  Component|middle-end  |target
   Keywords||missed-optimization
 Target||riscv

--- Comment #3 from Andrew Pinski  ---
I think the target should unmerge it during expand since a user could just use
the perm values themselves (via __builtin_shuffle/__builtin_shufflevector).

[Bug rtl-optimization/116550] [lra][avr] internal compiler error: in final_scan_insn_1, at final.cc:2807

2024-10-16 Thread denisc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116550

--- Comment #15 from denisc at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #13)
> Yeah :-)  So post an actual patch, to gcc-patches@?  :-)

PING ...
I sent a patch.

[Bug middle-end/117123] [14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4 since r14-6536-gcd794c39610177 (sccopy)

2024-10-16 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123

Filip Kastl  changed:

   What|Removed |Added

   Last reconfirmed||2024-10-16
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |pheeck at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

--- Comment #4 from Filip Kastl  ---
Currently looking into this.  Looks like disabling the first instance of the
sccopy pass gets us the same code as GCC 13.3 https://godbolt.org/z/E5M3bMqnx

[Bug tree-optimization/117172] New: FAIL: gcc.target/i386/pr111820-2.c and gcc.target/i386/pr111820-3.c with forced SLP

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117172

Bug ID: 117172
   Summary: FAIL: gcc.target/i386/pr111820-2.c and
gcc.target/i386/pr111820-3.c with forced SLP
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

There is still

  /* TODO: Support slp for nonlinear iv. There should be separate vector iv
     update for each iv and a permutation to generate wanted vector iv.  */
  if (slp_node)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "SLP induction not supported for nonlinear"
                         " induction.\n");
      return false;
    }
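
For readers unfamiliar with the term, a nonlinear induction is a loop
variable updated by something other than adding a constant.  The loops below
are illustrative sketches only; they are not the contents of the failing
testsuite files, and the function names are made up.

```c
/* Multiplicative induction: x is scaled by a constant each iteration.
   Unsigned arithmetic is used so wrap-around is well defined.  */
void fill_mul(unsigned *a, int n, unsigned x)
{
    for (int i = 0; i < n; i++) {
        a[i] = x;
        x *= 3;
    }
}

/* Shift induction: x is shifted left by one each iteration.  */
void fill_shl(unsigned *a, int n, unsigned x)
{
    for (int i = 0; i < n; i++) {
        a[i] = x;
        x <<= 1;
    }
}

/* Negation induction: x alternates sign each iteration.  */
void fill_neg(int *a, int n, int x)
{
    for (int i = 0; i < n; i++) {
        a[i] = x;
        x = -x;
    }
}
```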

[Bug tree-optimization/117172] FAIL: gcc.target/i386/pr111820-2.c and gcc.target/i386/pr111820-3.c with forced SLP

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117172

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Blocks||116578

--- Comment #1 from Richard Biener  ---
A single-lane variant is supposedly "easy".


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116578
[Bug 116578] vectorizer SLP transition issues / dependences

[Bug middle-end/117173] New: can_vec_perm_const_p does not consider costs

2024-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173

Bug ID: 117173
   Summary: can_vec_perm_const_p does not consider costs
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rdapp at gcc dot gnu.org
CC: law at gcc dot gnu.org, rguenth at gcc dot gnu.org
  Target Milestone: ---

I only noticed this on riscv but it's actually a target-independent issue.

In match.pd:10927 we have the following pattern

/* Merge
 c = VEC_PERM_EXPR ;
 d = VEC_PERM_EXPR ;
   to
 d = VEC_PERM_EXPR ;  */

which merges a permutation of a permutation into a single one.
It correctly checks that the newly built permutation is supported by the target
via can_vec_perm_const_p.

On riscv we have a generic fallback for permutations that is more costly than
simpler permutations which, of course, returns true when asked whether a
certain permutation is supported.  I understand that's the way most targets
handle it.

The match pattern above causes a (GCC 15) regression in x264 because we replace
two simple permutations (both have a single source vector) with a complex one
that is more expensive.

We statically know, roughly, how expensive a certain permutation constant is
and in this case our costing would indicate that it's not profitable to merge
the permutations.

Would it be acceptable to add a cost parameter to can_vec_perm_const_p?
I realize the original intent was for this function to only return true if the
requested permutation is directly supported (as in by a single instruction) but
IMHO that's not how it is actually used nowadays.
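
The merge itself is just selector composition.  The sketch below models it on
plain arrays, assuming the outer permute reads only the result of the inner
one (the exact operands of the match.pd pattern are not reproduced here);
perm2 and merge_selectors are hypothetical helper names, not GCC functions.

```c
#include <stddef.h>

/* out[i] = concat(v0, v1)[sel[i] mod 2n]: the element-wise meaning of a
   two-input constant permute on n-element vectors.  */
void perm2(const int *v0, const int *v1, const unsigned *sel,
           int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t idx = sel[i] % (2 * n);
        out[i] = idx < n ? v0[idx] : v1[idx - n];
    }
}

/* Fold  c = perm2(a, b, sel0);  d = perm2(c, c, sel1);  into one selector
   so that  d = perm2(a, b, merged).  Both inputs of the outer permute are
   c, so only sel1[i] mod n matters.  */
void merge_selectors(const unsigned *sel0, const unsigned *sel1,
                     unsigned *merged, size_t n)
{
    for (size_t i = 0; i < n; i++)
        merged[i] = (unsigned) (sel0[sel1[i] % n] % (2 * n));
}
```

With sel0 = {0, 5, 2, 7} and sel1 = {3, 7, 1, 4} the merged selector is
{7, 7, 5, 0}: the outer permute had a single source, but the merged one reads
from both a and b, which is the cost difference this report is about.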

[Bug tree-optimization/117171] New: FAIL: gcc.dg/vect/vect-early-break_82.c with forced SLP

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117171

Bug ID: 117171
   Summary: FAIL: gcc.dg/vect/vect-early-break_82.c with forced
SLP
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

gcc.dg/vect/vect-early-break_82.c fails because we are asked to create a
vector(2)  external.  The

 _23 = _21 & _22

with the external _22 = x$imag_6 == $t$imag_16 stmt isn't covered by
pattern detection.  For non-SLP we are doing the "obvious" here:

note:  created new init_stmt: _81 = _22 ? -1 : 0;
note:  created new init_stmt: vect_cst__82 = {_81, _81};

but with AVX512 this shows

  vector(8)  vect_cst__82;
  _81 = _22 ? -1 : 0;
  vect_cst__82 = {_81, _81, _81, _81, _81, _81, _81, _81};

which is sub-optimal, esp. when there'll be more than one def (SLP!)
involved.  Instead we'd want to generate

  _81 = { _22, _22, ... };
  vect_cst__82 = _81 != { 0, 0, ... };

thus a vector compare we need to have support for.

This is in the end the famous vect_maybe_update_slp_op_vectype

  /* For external defs refuse to produce VECTOR_BOOLEAN_TYPE_P, those
     should be handled by patterns.  Allow vect_constant_def for now.  */
  if (VECTOR_BOOLEAN_TYPE_P (vectype)
      && SLP_TREE_DEF_TYPE (op) == vect_external_def)
    return false;

we'd need to improve.  Alternatively operations on mask bools need to be
pattern-handled as well.
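
The two code-generation strategies can be mimicked at source level with GNU
vector extensions.  This is only an analogy for the GIMPLE shown above, under
the assumption of a compiler supporting vector_size types and vector
comparisons (GCC or Clang); the function names are hypothetical.

```c
typedef int v4si __attribute__((vector_size(16)));

/* Scalar select first, then splat: mirrors  _81 = _22 ? -1 : 0;
   vect_cst = {_81, _81, ...}.  */
v4si mask_from_scalar_select(int cond)
{
    int m = cond ? -1 : 0;
    return (v4si){m, m, m, m};
}

/* Splat the scalar, then do one vector compare against zero: mirrors
   _81 = {_22, _22, ...};  vect_cst = _81 != {0, 0, ...}.  A vector
   comparison yields -1 in true lanes and 0 in false lanes.  */
v4si mask_from_vector_compare(int cond)
{
    v4si splat = {cond, cond, cond, cond};
    v4si zero = {0, 0, 0, 0};
    return splat != zero;
}
```

Both produce the same all-ones or all-zeros lanes; the second form needs
only one vector compare however many scalar defs feed the vector.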

[Bug tree-optimization/117171] FAIL: gcc.dg/vect/vect-early-break_82.c with forced SLP

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117171

Richard Biener  changed:

   What|Removed |Added

 Blocks||116578
 CC||tnfchris at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-10-16

--- Comment #1 from Richard Biener  ---
I am going to try to fix this on the vect_maybe_update_slp_op_vectype and
constant code generation side.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116578
[Bug 116578] vectorizer SLP transition issues / dependences

[Bug libstdc++/117094] ranges::fill misses std::move for output_iterator

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117094

--- Comment #2 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:cbb1814ffa29acc390bb0de46be49a24d09948d1

commit r14-10789-gcbb1814ffa29acc390bb0de46be49a24d09948d1
Author: Jonathan Wakely 
Date:   Sun Oct 13 22:48:43 2024 +0100

libstdc++: Use std::move for iterator in ranges::fill [PR117094]

Input iterators aren't required to be copyable.

libstdc++-v3/ChangeLog:

PR libstdc++/117094
* include/bits/ranges_algobase.h (__fill_fn): Use std::move for
iterator that might not be copyable.
* testsuite/25_algorithms/fill/constrained.cc: Check
non-copyable iterator with sized sentinel.

(cherry picked from commit 03623fa91ff36ecb9faa3b55f7842a39b759594e)

[Bug libstdc++/117085] chrono formatting: %c does not honor locale after expansion

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117085

--- Comment #4 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:f1436fde43215659554418220aa45830a5e7ae61

commit r14-10792-gf1436fde43215659554418220aa45830a5e7ae61
Author: Jonathan Wakely 
Date:   Fri Oct 11 09:40:38 2024 +0100

libstdc++: Fix localized %c formatting for <chrono> [PR117085]

When formatting a time point with %c we call std::vformat_to using the
formatting locale's D_T_FMT string, but we weren't adding the L option
to the format string. This meant we always interpreted D_T_FMT in the C
locale, instead of using the formatting locale as obviously intended
when %c is used.

libstdc++-v3/ChangeLog:

PR libstdc++/117085
* include/bits/chrono_io.h (__formatter_chrono::_M_c): Add L
option to format string.
* testsuite/std/time/format.cc: Move to...
* testsuite/std/time/format/format.cc: ...here.
* testsuite/std/time/format/pr117085.cc: New test.

(cherry picked from commit 4ad697bb7f1aad252e1398c6f13eed3fa6d0ca5b)

[Bug libstdc++/90276] PSTL tests fail in Debug Mode

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276

--- Comment #20 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:f1cee9d1a049a3bc7cae24245fcc3c415fd12764

commit r14-10794-gf1cee9d1a049a3bc7cae24245fcc3c415fd12764
Author: Jonathan Wakely 
Date:   Wed Jun 12 17:11:23 2024 +0100

libstdc++: Increase timeouts for PSTL tests in debug mode [PR90276]

These tests compile very slowly in debug mode.

libstdc++-v3/ChangeLog:

PR libstdc++/90276
*
testsuite/25_algorithms/pstl/alg_modifying_operations/rotate_copy.cc:
Increase timeout for debug mode.
*
testsuite/25_algorithms/pstl/alg_modifying_operations/transform_binary.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_nonmodifying/mismatch.cc:
Likewise.
*
testsuite/25_algorithms/pstl/alg_sorting/lexicographical_compare.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/minmax_element.cc:
Likewise.
*
testsuite/25_algorithms/pstl/alg_sorting/set_symmetric_difference.cc:
Likewise.

(cherry picked from commit e65b6627a36869b01bbe128a5324e4b415b28880)

[Bug libstdc++/117135] 22_locale/time_get/get/wchar_t/5.cc fails on arm since gcc-15-4016-gc534e37facc

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117135

--- Comment #14 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:8f181a2f878e8b97a91d68214161cb96a2b7

commit r14-10790-g8f181a2f878e8b97a91d68214161cb96a2b7
Author: Jonathan Wakely 
Date:   Tue Sep 24 23:20:56 2024 +0100

libstdc++: Populate generic std::time_get's wide %c format [PR117135]

I missed out the __timepunct specialization for the "generic"
implementation when defining the %c format in r15-4016-gc534e37faccf48.

libstdc++-v3/ChangeLog:

PR libstdc++/117135
* config/locale/generic/time_members.cc
(__timepunct::_M_initialize_timepunc): Set
_M_date_time_format for C locale. Set %Ex formats to the same
values as the %x formats.

(cherry picked from commit 707d84efee7f7eb5a336935f386e094402f267a6)

[Bug tree-optimization/116973] FAIL: gcc.dg/vect/pr52252-st.c with forced SLP

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116973

--- Comment #3 from Richard Biener  ---
Similar case for

FAIL: gcc.target/i386/pr70021.c

where the group is power-of-two size.

[Bug tree-optimization/117172] FAIL: gcc.target/i386/pr111820-2.c and gcc.target/i386/pr111820-3.c with forced SLP

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117172

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-10-16
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #2 from Richard Biener  ---
Same issue for

FAIL: gcc.target/i386/pr103144-mul-1.c
FAIL: gcc.target/i386/pr103144-neg-1.c
FAIL: gcc.target/i386/pr103144-shift-1.c

[Bug target/113932] [meta-bug] Targets which should be ported to LRA

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932
Bug 113932 depends on bug 113952, which changed state.

Bug 113952 Summary: Finish LRA transition for sparc by removing -mlra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113952

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug target/113952] Finish LRA transition for sparc by removing -mlra

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113952

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Sam James:

https://gcc.gnu.org/g:b388f65abc71c951167175aa502476f1bfaa2a83

commit r15-4373-gb388f65abc71c951167175aa502476f1bfaa2a83
Author: Sam James 
Date:   Mon Oct 14 11:53:52 2024 -0700

sparc: drop -mlra

The sparc port gained LRA support in r7-5076-gf99bd883fb0d05 and has
defaulted to LRA since r7-5642-g70a6dbe7e37e69.

Let's finish the transition by dropping -mlra entirely.

Tested on sparc64-unknown-linux-gnu with no regressions.

gcc/ChangeLog:
PR target/113952
* config/sparc/sparc.cc (sparc_lra_p): Delete.
(TARGET_LRA_P): Ditto.
(sparc_option_override): Don't use MASK_LRA.
* config/sparc/sparc.md (disabled,enabled): Drop lra attribute.
* config/sparc/sparc.opt: Delete -mlra.
* config/sparc/sparc.opt.urls: Ditto.
* doc/invoke.texi (SPARC options): Drop -mlra and -mno-lra.

[Bug target/113952] Finish LRA transition for sparc by removing -mlra

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113952

Sam James  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |15.0
 Resolution|--- |FIXED

--- Comment #4 from Sam James  ---
Done (and thanks to Eric for the speedy review too) for 15.

[Bug target/116655] RISC-V: ICE with -mrvv-max-lmul=dynamic in compute_nregs_for_mode

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116655

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Robin Dapp:

https://gcc.gnu.org/g:cc217a1ecb04c9234b2cce7ba3c27701a050e402

commit r15-4378-gcc217a1ecb04c9234b2cce7ba3c27701a050e402
Author: Robin Dapp 
Date:   Tue Oct 15 12:10:48 2024 +0200

RISC-V: Use biggest_mode as mode for constants.

In compute_nregs_for_mode we expect that the current variable's mode is
at most as large as the biggest mode to be used for vectorization.

This might not be true for constants as they don't actually have a mode.
In that case, just use the biggest mode so max_number_of_live_regs
returns 1.

This fixes several test cases in the test suite.

gcc/ChangeLog:

PR target/116655

* config/riscv/riscv-vector-costs.cc (max_number_of_live_regs):
Use biggest mode instead of constant's saved mode.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr116655.c: New test.

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

--- Comment #1 from Sam James  ---
Will look.

[Bug tree-optimization/117140] [15 regression] RISC-V: ICE in initialize_flags_in_bb for rv32gcv

2024-10-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117140

--- Comment #7 from Tamar Christina  ---
For this statement somehow the location of the gsi ends up having
first == last, so gsi_insert_before just silently ignores the insert.

The ICE happens because for this one BB, no vector statement is ever emitted to
the BB.

Reverting to the original code I had

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 8727246c27a..c028594e18b 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -11128,7 +11128,8 @@ vectorize_slp_instance_root_stmt (vec_info *vinfo, slp_tree node, slp_instance i
 can't support lane > 1 at this time.  */
   gcc_assert (instance->root_stmts.length () == 1);
   auto root_stmt_info = instance->root_stmts[0];
-  auto last_stmt = STMT_VINFO_STMT (root_stmt_info);
+  auto last_stmt = vect_find_first_scalar_stmt_in_slp (node)->stmt;
   gimple_stmt_iterator rgsi = gsi_for_stmt (last_stmt);
   gimple *vec_stmt = NULL;
   gcc_assert (!SLP_TREE_VEC_DEFS (node).is_empty ());

works correctly, but I'd like to figure out why.  So looking into that first.

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

Eric Botcazou  changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot gnu.org

--- Comment #6 from Eric Botcazou  ---
Created attachment 59359
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59359&action=edit
Tentative fix

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

--- Comment #7 from Eric Botcazou  ---
I totally forgot about the quirks of the 'U' constraint with LRA when reviewing
the original patch, sorry about that.  If you can give a try to this one...

[Bug tree-optimization/116758] [15 Regression] 25-40% binary size increase and up to 177% compile time increase for SPEC CPU wrf with Ofast since r15-3529-g506417dbc8b1cb

2024-10-16 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116758

Filip Kastl  changed:

   What|Removed |Added

Summary|[15 Regression] 25-40%  |[15 Regression] 25-40%
   |binary size increase and up |binary size increase and up
   |to 177% compile time|to 177% compile time
   |increase for SPEC CPU wrf   |increase for SPEC CPU wrf
   |with Ofast  |with Ofast since
   ||r15-3529-g506417dbc8b1cb
 CC||amacleod at redhat dot com

--- Comment #4 from Filip Kastl  ---
Bisected to r15-3529-g506417dbc8b1cb.  Cc-ing Andrew.

This is strange though.  Andrew's change is in VRP.  Is it possible that VRP
would generate so much more code?  Maybe VRP optimizations caused a different
pass to produce the excess code.

[Bug c/117167] New: ICE: ‘verify_type’ failed with attribute const and -flto during IPA pass

2024-10-16 Thread iamanonymous.cs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117167

Bug ID: 117167
   Summary: ICE: ‘verify_type’ failed with attribute const and
-flto during IPA pass
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iamanonymous.cs at gmail dot com
  Target Milestone: ---

***
OS and Platform:
$ uname -a:
Linux 65dac7c84719 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC
2023 x86_64 x86_64 x86_64 GNU/Linux
***
gcc version:
Using built-in specs.
COLLECT_GCC=/home/software/gcc-trunk-3aa004f/bin/gcc
COLLECT_LTO_WRAPPER=/home/software/gcc-trunk-3aa004f/libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --disable-multilib --disable-bootstrap
--enable-languages=c,c++ --prefix=/home/software/gcc-trunk-3aa004f
--enable-coverage
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20240630 (experimental) (GCC) 

***
Program:
$ cat mutant.c
struct {
} typedef *a;
a __attribute__((const)) b(const a) {}
void main() { a c = b(c); }

It was reduced by Creduce.
***
Command Lines:
$ gcc -flto mutant.c
mutant.c:4:1: error: type variant has different 'TREE_TYPE'
4 | void main() { a c = b(c); }
  | ^~~~
 >
unsigned DI
size 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7f15294115e8>
QI
size  constant 8>
unit-size  constant 1>
align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7f1529411930
arg-types 
chain >>
pointer_to_this >
mutant.c:4:1: error: type variant's 'TREE_TYPE'
 >
unsigned DI
size  constant 64>
unit-size  constant 8>
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7f15294115e8>
mutant.c:4:1: error: type's 'TREE_TYPE'
 >
unsigned DI
size  constant 64>
unit-size  constant 8>
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7f15294115e8>
 >
unsigned DI
size 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7f15294115e8>
readonly QI
size  constant 8>
unit-size  constant 1>
align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7f15294390a8
arg-types 
unsigned DI size  unit-size

align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7f15294115e8>
chain >>
pointer_to_this >
during IPA pass: *free_lang_data
mutant.c:4:1: internal compiler error: 'verify_type' failed
0x5071bcf diagnostic_context::report_diagnostic(diagnostic_info*)
???:0
0x50724a1 diagnostic_context::diagnostic_impl(rich_location*,
diagnostic_metadata const*, int, char const*, __va_list_tag (*) [1],
diagnostic_t)
???:0
0x50924c7 internal_error(char const*, ...)
???:0
0x26ada95 verify_type(tree_node const*)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

It can be compiled normally without const attribute or without -flto.
Also ICE on trunk.
Compiler Explorer: https://godbolt.org/z/chq8a9Yaj

[Bug target/116994] [15 regression] GCC trunk generates larger code than GCC 14 at -Os since r15-313-gd826f794560904

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116994

Sam James  changed:

   What|Removed |Added

Summary|[15 regression] GCC trunk   |[15 regression] GCC trunk
   |generates larger code than  |generates larger code than
   |GCC 14 at -Os   |GCC 14 at -Os since
   ||r15-313-gd826f794560904

--- Comment #3 from Sam James  ---
r15-313-gd826f794560904

[Bug middle-end/117123] [14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4 since r14-6536-gcd794c39610177 (sccopy)

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123

Sam James  changed:

   What|Removed |Added

   Keywords||missed-optimization
Summary|[14/15 regression]  |[14/15 regression]
   |Generated code at -Os on|Generated code at -Os on
   |trunk is larger than GCC|trunk is larger than GCC
   |14.4|14.4 since
   ||r14-6536-gcd794c39610177
   ||(sccopy)
   Target Milestone|--- |14.3

--- Comment #3 from Sam James  ---
r14-6536-gcd794c39610177

[Bug middle-end/117160] [15 regression] GCC trunk generates larger code than GCC 14 at -Os/-Oz (progressed in 14) since r15-3986-g3e1bd6470e4deb

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117160

Sam James  changed:

   What|Removed |Added

   Keywords||missed-optimization
   Target Milestone|--- |15.0
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116906
 CC||sjames at gcc dot gnu.org
Summary|[15 regression] GCC trunk   |[15 regression] GCC trunk
   |generates larger code than  |generates larger code than
   |GCC 14 at -Os/-Oz   |GCC 14 at -Os/-Oz
   |(progressed in 14)  |(progressed in 14) since
   ||r15-3986-g3e1bd6470e4deb

--- Comment #1 from Sam James  ---
r15-3986-g3e1bd6470e4deb

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

--- Comment #8 from Sam James  ---
No, thank you for helping (and sorry for the mess).

Trying it now, although with a revert of my commit, I get a comparison failure
later on in gcc/tree-vect-data-refs.o which I'll report separately.

[Bug target/117170] New: [15 regression] Failed bootstrap comparison in tree-vect-data-refs.o on sparcv9-sun-solaris2.11

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117170

Bug ID: 117170
   Summary: [15 regression] Failed bootstrap comparison in
tree-vect-data-refs.o on sparcv9-sun-solaris2.11
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: build
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
CC: glaubitz at physik dot fu-berlin.de, ro at gcc dot gnu.org
  Target Milestone: ---

When reverting my LRA commit for PR117168, I hit a separate issue w/
bootstrapping w/ tree-vect-data-refs.o miscomparison.

```
$ diff -u <(~/build/binutils-gdb/binutils/objdump -S
./stage2-gcc/tree-vect-data-refs.o) <(~/build/binutils-gdb/binutils/objdump -S
./stage3-gcc/tree-vect-data-refs.o)
[...]
 Disassembly of section .text:
@@ -20065,21 +20065,21 @@
 f1cc:  90 10 00 1a mov  %i2, %o0
 f1d0:  40 00 00 00 call  f1d0
<_Z30vect_analyze_data_ref_accessesP8vec_infoP3vecIi7va_heap6vl_ptrE+0x698>
 f1d4:  c4 3f bf 90 sttw  %g2, [ %fp + -112 ]
-f1d8:  94 10 00 08 mov  %o0, %o2
+f1d8:  b4 10 00 08 mov  %o0, %i2
  if (type_size_a == 0
-f1dc:  80 92 80 09 orcc  %o2, %o1, %g0
+f1dc:  80 96 80 09 orcc  %i2, %o1, %g0
 f1e0:  02 40 00 78 be,pn   %icc, f3c0
<_Z30vect_analyze_data_ref_accessesP8vec_infoP3vecIi7va_heap6vl_ptrE+0x888>
-f1e4:  b4 10 00 09 mov  %o1, %i2
+f1e4:  96 10 00 09 mov  %o1, %o3
  || (((unsigned HOST_WIDE_INT)init_b - init_a)
 f1e8:  c4 1f bf 90 ldtw  [ %fp + -112 ], %g2
 f1ec:  98 10 00 1c mov  %i4, %o4
  % type_size_a != 0))
-f1f0:  96 10 00 09 mov  %o1, %o3
+f1f0:  d2 27 bf 7c st  %o1, [ %fp + -132 ]
  || (((unsigned HOST_WIDE_INT)init_b - init_a)
 f1f4:  9a 10 00 10 mov  %l0, %o5
 f1f8:  92 a4 00 03 subcc  %l0, %g3, %o1
  % type_size_a != 0))
-f1fc:  d4 27 bf 7c st  %o2, [ %fp + -132 ]
+f1fc:  94 10 00 1a mov  %i2, %o2
  || (((unsigned HOST_WIDE_INT)init_b - init_a)
 f200:  90 67 00 02 subc  %i4, %g2, %o0
 f204:  d8 3f bf 80 sttw  %o4, [ %fp + -128 ]
@@ -20100,10 +20100,10 @@
 f230:  d8 1f bf 80 ldtw  [ %fp + -128 ], %o4
 f234:  86 a3 40 0b subcc  %o5, %o3, %g3
 f238:  84 63 00 0a subc  %o4, %o2, %g2
-f23c:  d4 07 bf 7c ld  [ %fp + -132 ], %o2
-f240:  80 a0 80 0a cmp  %g2, %o2
-f244:  12 40 00 5f bne,pn   %icc, f3c0
<_Z30vect_analyze_data_ref_accessesP8vec_infoP3vecIi7va_heap6vl_ptrE+0x888>
-f248:  80 a0 c0 1a cmp  %g3, %i2
+f23c:  80 a0 80 1a cmp  %g2, %i2
+f240:  12 40 00 60 bne,pn   %icc, f3c0
<_Z30vect_analyze_data_ref_accessesP8vec_infoP3vecIi7va_heap6vl_ptrE+0x888>
+f244:  d6 07 bf 7c ld  [ %fp + -132 ], %o3
+f248:  80 a0 c0 0b cmp  %g3, %o3
 f24c:  32 40 00 5e bne,a,pn   %icc, f3c4
<_Z30vect_analyze_data_ref_accessesP8vec_infoP3vecIi7va_heap6vl_ptrE+0x88c>
 f250:  a4 10 00 18 mov  %i0, %l2
  if (tree_fits_shwi_p (DR_STEP (dra)))
```

On cfarm216.

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread glaubitz at physik dot fu-berlin.de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

--- Comment #3 from John Paul Adrian Glaubitz  ---
(In reply to Sam James from comment #2)
> Reproduced. Reverting my r15-4373-gb388f65abc71c9 seems to help. I'll spend
> time on it today and see how I get on. If I hit a dead-end, I'll revert that
> for now.

Could that be simply a missed case of using LRA instead of Reload on Solaris?

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

--- Comment #4 from Sam James  ---
(In reply to John Paul Adrian Glaubitz from comment #3)
> (In reply to Sam James from comment #2)
> > Reproduced. Reverting my r15-4373-gb388f65abc71c9 seems to help. I'll spend
> > time on it today and see how I get on. If I hit a dead-end, I'll revert that
> > for now.
> 
> Could that be simply a missed case of using LRA instead of Reload on Solaris?

This is what I'm wondering -- if we were somehow using reload all this time on
Solaris.

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

--- Comment #5 from Eric Botcazou  ---
The problem very likely comes from the 3 patterns that had:

   (set_attr "lra"

attributes: the alternatives corresponding to the "disabled" setting must be
removed now.

[Bug target/117169] New: Missed opportunity to combine sign and bitmask tests

2024-10-16 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117169

Bug ID: 117169
   Summary: Missed opportunity to combine sign and bitmask tests
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64*-*-*

int f1(int x) { return x < 0 || x & 3; }

on aarch64 produces:

f1:
tbnz w0, #31, .L3
tst x0, 3
cset w0, ne
ret
.L3:
mov w0, 1
ret

But this could be a single TST.  Clang produces:

f1:
tst w0, #0x80000003
cset w0, ne
ret

GCC does produce single bitmask tests for non-sign tests, such as:

int f2(int x) { return x & 0x100 || x & 3; }

but of course:

int f3(int x) { return x & 0x80000000 || x & 3; }

is canonicalised to the original form.
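
As a rough illustration of the equivalence being asked for (a sketch only, assuming a 32-bit `int`; `f1_combined` and `forms_agree` are names invented here and are not part of the report), the sign test and the low-bit test can be folded into a single mask test:

```cpp
#include <cstdint>
#include <limits>

static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");

// Form from the report: sign test OR low-two-bits test.
int f1(int x) { return x < 0 || x & 3; }

// Hypothetical hand-combined form: bit 31 is the sign bit, so a single
// mask covers both conditions.
int f1_combined(int x) {
    return (static_cast<std::uint32_t>(x) & UINT32_C(0x80000003)) != 0;
}

// Returns true if both forms agree on a handful of boundary values.
bool forms_agree() {
    const int samples[] = {0, 1, 2, 3, 4, 0x100, -1, -4,
                           std::numeric_limits<int>::min(),
                           std::numeric_limits<int>::max()};
    for (int x : samples)
        if (f1(x) != f1_combined(x)) return false;
    return true;
}
```

This only demonstrates that the two source forms compute the same value; it says nothing about which form a given GCC version actually emits.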

[Bug rtl-optimization/116550] [lra][avr] internal compiler error: in final_scan_insn_1, at final.cc:2807

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116550

Sam James  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #16 from Sam James  ---
https://inbox.sourceware.org/gcc-patches/70ac546b-984c-4f84-9b5c-d3ff71f8c...@gmail.com/

[Bug c++/70536] g++ doesn't emit DW_AT_name for DW_TAG_GNU_formal_parameter_pack

2024-10-16 Thread michal.staromiejski+gnu at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70536

Michał Staromiejski  changed:

   What|Removed |Added

 CC||michal.staromiejski+gnu@gma
   ||il.com

--- Comment #7 from Michał Staromiejski  ---
I am not an expert on DWARF but I work with heavily templated code and I was
really surprised that the basic feature of having parameters from pack
instantiation is not available.

As per my investigation, the following code snippet is partially supported by
the official GCC release (tested on 14.1 only):


template <typename ... Us>
struct A {
  static int f(Us ... us) {
    return 0; // breakpoint
  }
};

template <typename ... Ts>
int g(Ts ... ts) {
  return A<Ts ...>::f(ts ...);
}

int main() {
  return g(1, 2.5);
}


with (unpatched) GDB stacktrace for the breakpoint at line 4:


#0  A::f (us#0=1, us#1=2.5) at vt.cpp:4
#1  0x5171 in g () at vt.cpp:10
#2  0x5147 in main () at vt.cpp:14


It seems that the issue with variadic parameters is only for function templates
(`f` is not one). After examining GIMPLE output, we can see that the names for
parameters are already there, they are just not exported in respective DIEs:


main ()
{
  int D.2099;

  D.2099 = g (1, 2.5e+0);
  return D.2099;
  D.2099 = 0;
  return D.2099;
}

g (int ts#0, double ts#1)
{
  int D.2102;

  D.2102 = A::f (ts#0, ts#1);
  return D.2102;
}

A::f (int us#0, double us#1)
{
  int D.2104;

  D.2104 = 0;
  return D.2104;
}


I tried changing `false` to `true` just below your change (to emit names),
but the problem is that GDB does not support it (`DW_TAG_formal_parameter` DIEs
should appear directly under the respective `DW_TAG_subprogram` DIE).
So instead, I suggest modifying the `while (generic_decl_parm || parm)` loop at
gcc/dwarf2out.cc:24103 (line numbers wrt the 14.1 tag) to export all the
`DW_TAG_formal_parameter` DIEs and, in a separate loop, all the
`DW_TAG_GNU_formal_parameter_pack` DIEs, without exporting formal parameters
again:


  // emit all formal parameter packs
  while (generic_decl_parm)
    {
      if (lang_hooks.function_parameter_pack_p (generic_decl_parm))
        gen_formal_parameter_pack_die (generic_decl_parm, NULL, subr_die,
                                       NULL);
      generic_decl_parm = DECL_CHAIN (generic_decl_parm);
    }

  // emit all formal parameters
  while (parm)
    {
      dw_die_ref parm_die = gen_decl_die (parm, NULL, NULL, subr_die);

      if (early_dwarf
          && parm == DECL_ARGUMENTS (decl)
          && TREE_CODE (TREE_TYPE (decl)) == METHOD_TYPE
          && parm_die
          && (dwarf_version >= 3 || !dwarf_strict))
        add_AT_die_ref (subr_die, DW_AT_object_pointer, parm_die);

      parm = DECL_CHAIN (parm);
    }


This seems to work as expected (also with mixed normal + variadic parameters)
with official GDB.
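
For the mixed case, a test along these lines could be used (a hypothetical sketch; `h` and `call_h` are names made up here and are not taken from the patch or the GCC testsuite). It combines one ordinary parameter with a trailing pack:

```cpp
#include <cstddef>

// Hypothetical test case: `first` is an ordinary parameter, `rest` is a
// function parameter pack.
template <typename T, typename... Us>
std::size_t h(T first, Us... rest) {
    (void)first;
    return sizeof...(rest);  // a convenient line for a breakpoint
}

// Instantiates h<int, double, char>, so the pack expands to two parameters.
std::size_t call_h() { return h(1, 2.5, 'c'); }
```

Breaking inside `h` and looking at the frame arguments would then show whether both the ordinary and the expanded pack parameters are named.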

Any thoughts?

[Bug tree-optimization/116578] vectorizer SLP transition issues / dependences

2024-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116578
Bug 116578 depends on bug 116655, which changed state.

Bug 116655 Summary: RISC-V: ICE with -mrvv-max-lmul=dynamic in compute_nregs_for_mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116655

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/116655] RISC-V: ICE with -mrvv-max-lmul=dynamic in compute_nregs_for_mode

2024-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116655

Robin Dapp  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Robin Dapp  ---
Fixed.

[Bug tree-optimization/117067] false warning: array subscript 'int (**)(...)[ 0]' is partly outside array bounds

2024-10-16 Thread lobel.krivic at proton dot me via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117067

--- Comment #5 from Lobel Krivic  ---
Sorry, I am not able to follow you. Could you please explain it a bit more? Is
this a bug in the code or in the compiler?

[Bug preprocessor/117166] [15 regression] ICE when building lxml-5.3.0 with LTO (linemap_line_start, at libcpp/line-map.cc:949)

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117166

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |15.0
  Component|lto |preprocessor
   Keywords||lto

[Bug c/117167] ICE: ‘verify_type’ failed with attribute const and -flto during IPA pass

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117167

Richard Biener  changed:

   What|Removed |Added

   Keywords||ice-checking,
   ||ice-on-valid-code

--- Comment #1 from Richard Biener  ---
Note it's -flto triggering type verification (else we don't do that, -g also
triggers some of it).

[Bug tree-optimization/117067] false warning: array subscript 'int (**)(...)[ 0]' is partly outside array bounds

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117067

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org

--- Comment #6 from Sam James  ---
It's a compiler bug; the earlier comments are just notes on debugging and
fixing it.

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

Sam James  changed:

   What|Removed |Added

   Target Milestone|--- |15.0
   Keywords||build, ice-on-valid-code
 Status|UNCONFIRMED |NEW
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=113952
   Last reconfirmed||2024-10-16
 Ever confirmed|0   |1

--- Comment #2 from Sam James  ---
Reproduced. Reverting my r15-4373-gb388f65abc71c9 seems to help. I'll spend
time on it today and see how I get on. If I hit a dead-end, I'll revert that
for now.

[Bug middle-end/117160] [15 regression] GCC trunk generates larger code than GCC 14 at -Os/-Oz (progressed in 14) since r15-3986-g3e1bd6470e4deb

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117160

--- Comment #2 from Sam James  ---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116906#c4

[Bug middle-end/117160] [15 regression] GCC trunk generates larger code than GCC 14 at -Os/-Oz (progressed in 14) since r15-3986-g3e1bd6470e4deb

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117160

--- Comment #3 from Richard Biener  ---
That's a correctness fix expected to cause some fallout.  As mentioned it's a
bit too conservative, assuming p->size could trap but it's not trivially
easy to fix that given we'd have to compute "known to not trap up-to" and
somehow make use of that.

There's an effective duplicate of this bug.

[Bug middle-end/117160] [15 regression] GCC trunk generates larger code than GCC 14 at -Os/-Oz (progressed in 14) since r15-3986-g3e1bd6470e4deb

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117160

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-10-16
   Priority|P3  |P2

[Bug target/117048] Failure to combine into XAR instruction

2024-10-16 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117048

ktkachov at gcc dot gnu.org changed:

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
 Resolution|FIXED   |---

--- Comment #6 from ktkachov at gcc dot gnu.org ---
There are more cases where we fail to match XAR:
#include <arm_neon.h>

uint64x2_t
func_shl_eor (uint64x2_t a, uint64x2_t b) {
  uint64x2_t c = veorq_u64 (a, b);
  return veorq_u64(vshlq_n_u64(c, 1), vshrq_n_u64(c, 63));
}

uint64x2_t
func_add_eor (uint64x2_t a, uint64x2_t b) {
  uint64x2_t c = veorq_u64 (a, b);
  return veorq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
}

uint64x2_t
func_shl_orr (uint64x2_t a, uint64x2_t b) {
  uint64x2_t c = veorq_u64 (a, b);
  return vorrq_u64(vshlq_n_u64(c, 1), vshrq_n_u64(c, 63));
}

uint64x2_t
func_add_orr (uint64x2_t a, uint64x2_t b) {
  uint64x2_t c = veorq_u64 (a, b);
  return vorrq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
}

uint64x2_t
func_shl_add (uint64x2_t a, uint64x2_t b) {
  uint64x2_t c = veorq_u64 (a, b);
  return vaddq_u64(vshlq_n_u64(c, 1), vshrq_n_u64(c, 63));
}

uint64x2_t
func_add_add (uint64x2_t a, uint64x2_t b) {
  uint64x2_t c = veorq_u64 (a, b);
  return vaddq_u64(vaddq_u64(c, c), vshrq_n_u64(c, 63));
}

I'll handle them in follow-up patches.
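
A scalar sketch of why all six forms are the same operation (an illustration only; the `lane_*` helpers are invented here and model a single 64-bit lane rather than the NEON intrinsics): `c << 1` and `c >> 63` never have a set bit in common, so XOR, OR and ADD of the two all give a rotate left by 1 (equivalently a rotate right by 63), and `c + c` equals `c << 1` modulo 2^64.

```cpp
#include <cstdint>

// One 64-bit lane of each variant from the comment above.
std::uint64_t lane_shl_eor(std::uint64_t c) { return (c << 1) ^ (c >> 63); }
std::uint64_t lane_add_eor(std::uint64_t c) { return (c + c) ^ (c >> 63); }
std::uint64_t lane_shl_orr(std::uint64_t c) { return (c << 1) | (c >> 63); }
std::uint64_t lane_add_orr(std::uint64_t c) { return (c + c) | (c >> 63); }
std::uint64_t lane_shl_add(std::uint64_t c) { return (c << 1) + (c >> 63); }
std::uint64_t lane_add_add(std::uint64_t c) { return (c + c) + (c >> 63); }

// Returns true if every variant agrees with lane_shl_eor for the value c.
bool all_variants_agree(std::uint64_t c) {
    const std::uint64_t ref = lane_shl_eor(c);
    return lane_add_eor(c) == ref && lane_shl_orr(c) == ref &&
           lane_add_orr(c) == ref && lane_shl_add(c) == ref &&
           lane_add_add(c) == ref;
}
```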

[Bug c/117164] ICE building gcc.dg/nested-func-12.c with -std=gnu23

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117164

--- Comment #1 from Richard Biener  ---
For the aggregate return the type of the LHS and the type of the declared
return type have to be compatible according to TYPE_CANONICAL.

[Bug lto/117166] [15 regression] ICE when building lxml-5.3.0 with LTO (linemap_line_start, at libcpp/line-map.cc:949)

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117166

--- Comment #2 from Richard Biener  ---
  linemap_assert (SOURCE_LINE (map, r) == to_line);

Maybe there is an overflow somewhere that isn't caught.

[Bug rtl-optimization/116550] [lra][avr] internal compiler error: in final_scan_insn_1, at final.cc:2807

2024-10-16 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116550

--- Comment #17 from Georg-Johann Lay  ---
(In reply to denisc from comment #15)
> I sent a patch.

What might help is to CC the respective maintainer as listed in MAINTAINERS.

[Bug target/117168] New: Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread glaubitz at physik dot fu-berlin.de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

Bug ID: 117168
   Summary: Bootstrap fails with ICE: in curr_insn_transform, at
lra-constraints.cc:4283
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: glaubitz at physik dot fu-berlin.de
CC: ro at gcc dot gnu.org, sjames at gcc dot gnu.org
  Target Milestone: ---
  Host: sparc-sun-solaris2.11

Trying to bootstrap master sparc-sun-solaris2.11 fails with:

In file included from
../../../../../libstdc++-v3/src/c++17/floating_to_chars.cc:124:
../../../../../libstdc++-v3/src/c++17/ryu/generic_128.c: In function
‘{anonymous}::ryu::generic128::floating_decimal_128
{anonymous}::ryu::generic128::generic_binary_to_decimal({anonymous}::uint128_t,
uint32_t, bool, uint32_t, uint32_t, bool)’:
../../../../../libstdc++-v3/src/c++17/ryu/generic_128.c:237:1: error: unable to
generate reloads for:
  237 | }
  | ^
(insn 3615 8022 3616 75 (set (mem/c:DI (plus:SI (reg/f:SI 101 %sfp)
(const_int -112 [0xff90])) [59 %sfp+-112 S8 A64])
(mem/c:DI (plus:SI (reg/f:SI 101 %sfp)
(const_int -32 [0xffe0])) [9 MEM[(long long
unsigned int[4] *)_2631][0]+0 S8 A64]))
"../../../../../libstdc++-v3/src/c++17/ryu/generic_128.h":483:52 discrim 2 124
{*movdi_insn_sp32}
 (nil))
during RTL pass: reload
../../../../../libstdc++-v3/src/c++17/ryu/generic_128.c:237:1: internal
compiler error: in curr_insn_transform, at lra-constraints.cc:4283
0x104519b5b internal_error(char const*, ...)
../../gcc/diagnostic-global-context.cc:517
0x1044e1e83 fancy_abort(char const*, int, char const*)
../../gcc/diagnostic.cc:1535
0x1025b085f _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
../../gcc/rtl-error.cc:108
0x1022b7917 curr_insn_transform
../../gcc/lra-constraints.cc:4283
0x1022be52f lra_constraints(bool)
../../gcc/lra-constraints.cc:5496
0x102297b03 lra(__FILE*, int)
../../gcc/lra.cc:2445
0x10220e2b3 do_reload
../../gcc/ira.cc:5977
0x10220ec6f execute
../../gcc/ira.cc:6165
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

This might be related to PR target/113952.

Reproduced on cfarm216.

[Bug ada/114065] gnat build with -D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64 fails on 32bit archs

2024-10-16 Thread nicolas at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114065

Nicolas Boulenguez  changed:

   What|Removed |Added

  Attachment #58252|0   |1
is obsolete||
  Attachment #58591|0   |1
is obsolete||

--- Comment #30 from Nicolas Boulenguez  ---
Created attachment 59360
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59360&action=edit
version 12: Inline, indentation, changelog

[Bug c/106762] incorrect array bounds warning (-Warray-bounds) at -O2 on memset()

2024-10-16 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106762

--- Comment #5 from qinzhao at gcc dot gnu.org ---
with my work-in-progress patch + -fdiagnostics-explain-harder:

t_106762.c: In function ‘bug’:
t_106762.c:16:2: warning: ‘memset’ offset [0, 7] is out of the bounds [0, 0]
[-Warray-bounds=]
   16 |  memset(&obj->field1, 0xff, sizeof(obj->field1));
  |  ^~~
  ‘bug’: events 1-2
   12 |  if (idx < ary->objcnt)
  | ^
  | |
  | (1) when the condition is evaluated to false
..
   16 |  memset(&obj->field1, 0xff, sizeof(obj->field1));
  |  ~~~
  |  |
  |  (2) out of array bounds here

Looks like a nice improvement to the diagnostic.

[Bug go/113143] Remove usage of ucontext.h

2024-10-16 Thread ian at airs dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113143

--- Comment #14 from Ian Lance Taylor  ---
Note that on x86_64 Linux libgo no longer uses ucontext.h.  (Well, the #include
is still there, but it is not used.)  So we have a clear path to removing
ucontext.h on any given system, but it does unfortunately require writing
assembly code.

[Bug target/117173] can_vec_perm_const_p does not consider costs

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173

--- Comment #5 from Richard Biener  ---
That said - maybe you shouldn't advertise two-input permutes that are not
blends as supported?  Of course interleaving { 0, 8, 1, 9, 2, 10 ... }
is such a kind.  It should be doable with a vslideup instead of the
blend part and then a single input vgather.  So that's another special-case
to consider.
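
To make the blend/interleave distinction concrete, here is a small scalar model (a sketch under my own assumptions; `vec_perm`, `is_blend` and `interleave_lo` are illustrative names, not GCC or backend functions). Selector values below N pick from the first input, values from N upwards pick from the second:

```cpp
#include <array>
#include <cstddef>

// Scalar model of a two-input constant permute: sel[i] < N selects
// a[sel[i]], otherwise b[sel[i] - N].
template <std::size_t N>
std::array<int, N> vec_perm(const std::array<int, N>& a,
                            const std::array<int, N>& b,
                            const std::array<std::size_t, N>& sel) {
    std::array<int, N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out[i] = sel[i] < N ? a[sel[i]] : b[sel[i] - N];
    return out;
}

// A selector is a blend if lane i only ever picks lane i of a or of b.
template <std::size_t N>
bool is_blend(const std::array<std::size_t, N>& sel) {
    for (std::size_t i = 0; i < N; ++i)
        if (sel[i] % N != i) return false;
    return true;
}

// Interleave-low selector { 0, N, 1, N+1, 2, N+2, ... }.
template <std::size_t N>
std::array<std::size_t, N> interleave_lo() {
    std::array<std::size_t, N> sel{};
    for (std::size_t i = 0; i < N; ++i)
        sel[i] = i / 2 + (i % 2 ? N : 0);
    return sel;
}
```

With N = 8 the interleave selector is { 0, 8, 1, 9, 2, 10, 3, 11 }, which moves elements across lanes and therefore is not a blend in the sense above.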

[Bug target/117173] can_vec_perm_const_p does not consider costs

2024-10-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173

--- Comment #6 from Jakub Jelinek  ---
Please have a look at the i386 backend, where for constant permutes it tries a
sequence of 1, 2, 3, 4 or even 5 instructions to do the various permutations. 
It isn't perfect and surely misses some cases that could be done more
optimally, but starting in the backend with undoing the case you've filed this
for would be useful.
As Andrew wrote, the user could have written it using
__builtin_shuffle/__builtin_shufflevector using the combined permutation from
the start.

[Bug middle-end/117123] [14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4 since r14-6536-gcd794c39610177 (sccopy)

2024-10-16 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123

--- Comment #5 from Filip Kastl  ---
PRE is the pass that should be able to optimize most of the code in the
testcase away.  It doesn't remove the code if sccopy1 has run.  I looked at
gimple dumps before PRE for compiler runs with sccopy1 and without sccopy1.  I
think that the only difference is here

with sccopy

---
 43[local count: 19861441]:
 44   _36 = a_24(D) & 1;
 45   if (_36 == 0)
 46 goto ; [66.00%]
 47   else
 48 goto ; [34.00%]
 49
 50[local count: 13108551]:
 51   if (a_24(D) != 10)
 52 goto ; [66.00%]
 53   else
 54 goto ; [34.00%]
 55
 56[local count: 21441448]:
 57   # spud$size_22 = PHI <10(4), a_24(D)(2), a_24(D)(3)>
 58   _5 = a_24(D) == -100;
 59   goto ; [100.00%]
---

without sccopy

---
 43[local count: 21441448]:
 44   # spud$size_55 = PHI 
 45   _7 = a_31(D) == -100;
 46   goto ; [100.00%]
 47 
 48[local count: 19861441]:
 49   _44 = a_31(D) & 1;
 50   if (_44 == 0)
 51 goto ; [66.00%]
 52   else
 53 goto ; [34.00%]
 54 
 55[local count: 13108551]:
 56   if (a_31(D) != 10)
 57 goto ; [66.00%]
 58   else
 59 goto ; [34.00%]
---

If we ignore ssa name version numbers, these snippets are basically the same. 
The only difference is the order of basic blocks.  For some reason, this
difference is important to PRE.  Note that I have only a basic idea about what
PRE does and how it does it.

[Bug tree-optimization/117140] [15 regression] RISC-V: ICE in initialize_flags_in_bb for rv32gcv

2024-10-16 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117140

--- Comment #8 from rguenther at suse dot de  ---
On Wed, 16 Oct 2024, tnfchris at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117140
> 
> --- Comment #7 from Tamar Christina  ---
> For this statement somehow the location of the gsi ends up having
> first == last, so gsi_insert_before just silently ignores the insert.
> 
> The ICE happens because for this one BB, no vector statement is ever emitted
> to the BB.
> 
> Reverting to the original code I had
> 
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 8727246c27a..c028594e18b 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -11128,7 +11128,8 @@ vectorize_slp_instance_root_stmt (vec_info *vinfo,
> slp_tree node, slp_instance i
>  can't support lane > 1 at this time.  */
>gcc_assert (instance->root_stmts.length () == 1);
>auto root_stmt_info = instance->root_stmts[0];
> -  auto last_stmt = STMT_VINFO_STMT (root_stmt_info);

If this is a pattern, you'd want
STMT_VINFO_STMT (vect_orig_stmt (root_stmt_info)).

[Bug middle-end/117123] [14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4 since r14-6536-gcd794c39610177 (sccopy)

2024-10-16 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123

--- Comment #6 from rguenther at suse dot de  ---
On Wed, 16 Oct 2024, pheeck at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123
> 
> --- Comment #5 from Filip Kastl  ---
[...]
> If we ignore ssa name version numbers, these snippets are basically the same. 
> The only difference is the order of basic blocks.  For some reason, this
> difference is important to PRE.  Note that I have only a basic idea about what
> PRE does and how it does it.

Unless it's in a cycle it shouldn't make a difference.

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

Eric Botcazou  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #11 from Eric Botcazou  ---
Thanks, that's good enough I think.

[Bug c++/116167] "static_cast" of member function pointer (non-noexcept) to noexcept erroneously succeeds if not overloaded

2024-10-16 Thread admin at hexadigm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116167

--- Comment #3 from Larry Smith  ---
Just a follow-up (discovered since the original post): if the function in
question is inherited, then the issue disappears:

class BaseClass
{
public:

// Same as example in my original post but now
// inherited so "static_cast" below successfully
// fails compilation (unlike original example where
// it wasn't inherited - move this back to class
// "Test" below that is and it erroneously succeeds
// compilation as per my original post)

void Whatever(float)
{
}
};

class Test : public BaseClass
{
public:
};

int main()
{
constexpr auto pWhatever = static_cast<void (Test::*)(float) noexcept>(&Test::Whatever);

return 0;
}
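
For reference, the conversion rules that the cast is expected to follow can be checked with type traits alone (a minimal sketch, assuming C++17 or later; `Sample`, `PlainPtr` and `NoexceptPtr` are names invented for this illustration):

```cpp
#include <type_traits>

struct Sample {  // hypothetical class, for illustration only
    void plain(float) {}
    void quiet(float) noexcept {}
};

using PlainPtr = void (Sample::*)(float);
using NoexceptPtr = void (Sample::*)(float) noexcept;

// Since C++17, noexcept is part of the function type, so these differ.
static_assert(!std::is_same_v<PlainPtr, NoexceptPtr>);

// Dropping noexcept is an allowed implicit conversion...
static_assert(std::is_convertible_v<NoexceptPtr, PlainPtr>);

// ...but adding noexcept is not, which is why the static_cast in the
// report is expected to be rejected.
static_assert(!std::is_convertible_v<PlainPtr, NoexceptPtr>);
```

This checks the types only; it does not exercise the `static_cast` on the address of a non-overloaded member, which is where the reported misbehaviour shows up.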

[Bug c++/117174] New: Compiler seems to incorrectly cache SFINAE condition evaluation results

2024-10-16 Thread ivan.solovev at qt dot io via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117174

Bug ID: 117174
   Summary: Compiler seems to incorrectly cache SFINAE condition
evaluation results
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ivan.solovev at qt dot io
  Target Milestone: ---

https://godbolt.org/z/197vedfYz

```
namespace A {
template 
bool func(T *) { return true; }
} // namespace A

namespace B {

namespace C {
using A::func;

template 
constexpr bool inline hasFunc = false;

template 
constexpr bool inline hasFunc<
T, std::void_t()))>
> = true;
} // namespace C

template 
constexpr bool inline hasCustomFunc = false;

template 
constexpr bool inline hasCustomFunc<
T, std::void_t()))>
> = true;

} // namespace B

struct S {};

bool func(S *) { return false; }

static_assert(B::hasCustomFunc);
static_assert(!B::hasCustomFunc); // FAILS here!
static_assert(B::C::hasFunc);
```

This code uses the same condition 

  std::void_t()))>

to figure out if the function overload is available for type T. 
The difference is that `B::C::hasFunc` also sees the template from namespace A,
while `B::hasCustomFunc` should not see it.

It looks like GCC evaluates the condition only once and reuses it in the second
check, which is incorrect. Updating the second condition to look like

 std::void_t()) == true)>

fixes the issue.

[Bug preprocessor/117118] [12/13/14/15 Regression] ICE with pragma message and raw strings since r11-498

2024-10-16 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117118

Lewis Hyatt  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2024-October
   ||/665652.html

--- Comment #4 from Lewis Hyatt  ---
Submitted the one-line patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665652.html

[Bug c/117024] [C2y] Implement N3349, Abs Without Undefined Behavior

2024-10-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117024

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-10-16
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 CC||jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Jakub Jelinek  ---
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665651.html

[Bug go/113143] Remove usage of ucontext.h

2024-10-16 Thread joel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113143

Joel Sherrill  changed:

   What|Removed |Added

 CC||joel at gcc dot gnu.org

--- Comment #13 from Joel Sherrill  ---
RTEMS also had support for Go a long time ago -- it was a GSoC 2010 project. It
worked until the dependency on <ucontext.h> was added. RTEMS has tracked the
lack of <ucontext.h> needed for Go for 7 years
(https://gitlab.rtems.org/rtems/rtos/rtems/-/issues/2832). It was likely broken
for a while before one of us filed the ticket.

We use newlib, and looking at the source, Cygwin has its own <ucontext.h>. My
msys2 install has that version of the file in /usr/include.

RTEMS does not have <ucontext.h>. The ticket we have encourages porting the
implementation from *BSD but that will only address a handful of the 18
architectures we currently support.

I agree it is obsolete per POSIX and not universally supported. I also give Ian
the benefit of doubt that there is no adequate replacement -- certainly not in
POSIX. That leaves the solution of having an alternative in Go or an
implementation that is easy to drop into OSes other than Linux or *BSD.

Looks like neither RTEMS nor Haiku has seen someone step up as a volunteer in
the long time this has been an issue.

[Bug middle-end/117123] [14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4 since r14-6536-gcd794c39610177 (sccopy)

2024-10-16 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123

--- Comment #7 from Filip Kastl  ---
It is not in a loop.  I guess I'll double-check that there aren't any
differences which I didn't notice.  There is one here:

57   # spud$size_22 = PHI <10(4), a_24(D)(2), a_24(D)(3)>
44   # spud$size_55 = PHI 

The constant 10 is in a different position in the PHI function.  However, I
think that this also shouldn't have any effect on PRE.

[Bug middle-end/116510] [15 Regression] ice in decompose, at wide-int.h:1049

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116510

--- Comment #14 from GCC Commits  ---
The trunk branch has been updated by Andi Kleen :

https://gcc.gnu.org/g:d5a05db80fa95dcae1ebc177f7790e1d34fa73ed

commit r15-4387-gd5a05db80fa95dcae1ebc177f7790e1d34fa73ed
Author: Andi Kleen 
Date:   Tue Oct 15 13:16:02 2024 -0700

PR116510: Add missing fold_converts into tree switch if conversion

Passes test suite. Ok to commit?

gcc/ChangeLog:

PR middle-end/116510
* tree-if-conv.cc (predicate_bbs): Add missing fold_converts.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-switch-ifcvt-3.c: New test.

[Bug rtl-optimization/116550] [lra][avr] internal compiler error: in final_scan_insn_1, at final.cc:2807

2024-10-16 Thread denisc at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116550

--- Comment #18 from denisc at gcc dot gnu.org ---
(In reply to Georg-Johann Lay from comment #17)
> (In reply to denisc from comment #15)
> > I sent a patch.
> 
> What might help is to CC the respective maintainer as listed in MAINTAINERS.

Done.
"PING - [PATCH][LRA][PR116550] Reuse scratch registers generated by LRA"
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665646.html

I think that Richard can approve it as a general maintainer.

[Bug target/113952] Finish LRA transition for sparc by removing -mlra

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113952

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Eric Botcazou :

https://gcc.gnu.org/g:935b7fbd03373c91bae065c6fe862a9fc7d1a901

commit r15-4385-g935b7fbd03373c91bae065c6fe862a9fc7d1a901
Author: Eric Botcazou 
Date:   Wed Oct 16 13:59:50 2024 +0200

Fix bootstrap on 32-bit SPARC/Solaris

The 'U' constraint cannot be used with LRA.

gcc/
PR target/113952
PR target/117168
* config/sparc/constraints.md ('U'): Delete.
* config/sparc/sparc.md (*movdi_insn_sp32): Remove U alternatives.
(*movdf_insn_sp32): Likewise.
(*mov_insn_sp32): Likewise.
* doc/md.texi (SPARC constraints): Remove entry for 'U'.

[Bug tree-optimization/117093] Missing detection of REV64 vector permute

2024-10-16 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117093

--- Comment #4 from ktkachov at gcc dot gnu.org ---
(In reply to ktkachov from comment #3)
> If we remove the casts:
> uint32x4_t ror32_neon_tgt_gcc_bad(uint32x4_t r) {
> uint32x4_t a = r;
> uint32_t t;
> t = a[0]; a[0] = a[1]; a[1] = t;
> t = a[2]; a[2] = a[3]; a[3] = t;
> return a;
> }
> Then this is successfully recognised as:
>   a_2 = VEC_PERM_EXPR ;

In this case it's forwprop1 that optimises it.

[Bug c++/117175] New: Internal compiler error in gimple_add_tmp_var, at gimplify.cc:802

2024-10-16 Thread kernalex256 at yandex dot ru via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117175

Bug ID: 117175
   Summary: Internal compiler error in gimple_add_tmp_var, at
gimplify.cc:802
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kernalex256 at yandex dot ru
  Target Milestone: ---

The following code produces an ICE under GCC 14.2.0 on x86_64 Linux:

#include <coroutine>
#include <string>
#include <vector>

struct promise;
struct coroutine : std::coroutine_handle<promise> {
using promise_type = ::promise;
};

struct promise {
coroutine get_return_object() { return {coroutine::from_promise(*this)}; }
std::suspend_always initial_suspend() noexcept { return {}; }
std::suspend_always final_suspend() noexcept { return {}; }
void return_void() {}
void unhandled_exception() {}
};

struct awaitable {
bool await_ready() { return false; }
void await_suspend(std::coroutine_handle<> h) {}
void await_resume() {}
};

struct s1 {
std::string data;
};

awaitable g(std::vector<s1> b);

coroutine f() {
co_await g(
{
{.data = "4"},
{.data = "3"},
}
);
}

The code is compiled with --std=c++23 flag, no other flags are needed.

The error message is:

<source>: In function 'void f(f()::_Z1fv.Frame*)':
<source>:34:22: internal compiler error: in gimple_add_tmp_var, at
gimplify.cc:802
   34 | {.data = "4"},
  |  ^~~
0x2031cbc internal_error(char const*, ...)
???:0
0x77895f fancy_abort(char const*, int, char const*)
???:0
0xc4f7db gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4f825 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc55a18 gimplify_arg(tree_node**, gimple**, unsigned int, bool)
???:0
0xc4f0eb gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4dadd gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4f914 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4dadd gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4e333 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4dadd gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4f8fe gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4f8fe gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4dadd gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4e333 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4dadd gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4f8fe gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4dadd gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc4f8fe gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*),
int)
???:0
0xc51ede gimplify_body(tree_node*, bool)
???:0
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
Compiler returned: 1

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-16 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116

Uroš Bizjak  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Assignee|uros at gcc dot gnu.org|ubizjak at gmail dot com
   Target Milestone|15.0|12.5
 Resolution|--- |FIXED

--- Comment #20 from Uroš Bizjak  ---
Fixed everywhere.

[Bug libstdc++/117085] chrono formatting: %c does not honor locale after expansion

2024-10-16 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117085

Jonathan Wakely  changed:

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
 Resolution|FIXED   |---

--- Comment #6 from Jonathan Wakely  ---
Please leave it open, it still needs to be backported.

[Bug tree-optimization/117176] New: [15 regression] ICE when building netpbm-11.8.0

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117176

Bug ID: 117176
   Summary: [15 regression] ICE when building netpbm-11.8.0
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 59362
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59362&action=edit
pnmpsnr.i.xz

```
$ gcc -c ./pnmpsnr.i -O2 -march=znver2
during GIMPLE pass: vect
pnmpsnr.c: In function ‘main’:
pnmpsnr.c:576:1: internal compiler error: in operator[], at vec.h:910
  576 | main (int argc, const char **argv) {
  | ^~~~
0x5f7e773cecfd internal_error(char const*, ...)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/diagnostic-global-context.cc:517
0x5f7e773ac190 fancy_abort(char const*, int, char const*)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/diagnostic.cc:1535
0x5f7e763e00e0 vec<_slp_tree*, va_heap, vl_embed>::operator[](unsigned int)
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/vec.h:910
0x5f7e763e00e0 vec<_slp_tree*, va_heap, vl_ptr>::operator[](unsigned int)
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/vec.h:1599
0x5f7e763e00e0 vect_is_simple_use(vec_info*, _stmt_vec_info*, _slp_tree*,
unsigned int, tree_node**, _slp_tree**, vect_def_type*, tree_node**,
_stmt_vec_info**)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vect-stmts.cc:14216
0x5f7e77eafeda vectorizable_comparison_1
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vect-stmts.cc:12809
0x5f7e77c47d1a vectorizable_early_exit(vec_info*, _stmt_vec_info*,
gimple_stmt_iterator*, gimple**, _slp_tree*, vec*)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vect-stmts.cc:13142
0x5f7e77c47d1a vectorizable_early_exit(vec_info*, _stmt_vec_info*,
gimple_stmt_iterator*, gimple**, _slp_tree*, vec*)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vect-stmts.cc:13021
0x5f7e77b4fa2d vect_slp_analyze_operations(vec_info*)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vect-slp.cc:8158
0x5f7e77edbea5 vect_analyze_loop_2
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vect-loop.cc:2976
0x5f7e77eda8d6 vect_analyze_loop_1
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vect-loop.cc:3454
0x5f7e77c1063a vect_analyze_loop(loop*, gimple*, vec_info_shared*)
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vect-loop.cc:3614
0x5f7e77ed965d try_vectorize_loop_1
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vectorizer.cc:1072
0x5f7e77ed965d try_vectorize_loop
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vectorizer.cc:1189
0x5f7e77c0e185 execute
   
/usr/src/debug/sys-devel/gcc-15.0./gcc-15.0./gcc/tree-vectorizer.cc:1305
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```

```
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/15/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with:
/var/tmp/portage/sys-devel/gcc-15.0./work/gcc-15.0./configure
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/15
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/15/include
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/15
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/15/man
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/15/info
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15
--disable-silent-rules --disable-dependency-tracking
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/15/python
--enable-objc-gc --enable-languages=c,c++,d,go,objc,obj-c++,fortran,ada,m2,rust
--enable-obsolete --enable-secureplt --disable-werror --with-system-zlib
--enable-nls --without-included-gettext --disable-libunwind-exceptions
--enable-checking=yes,extra,rtl --with-bugurl=https://bugs.gentoo.org/
--with-pkgversion='Gentoo Hardened 15.0. p, commit
b4a852b00ffc7c9436d15867d602c28837c97758' --with-gcc-major-version-only
--enable-libstdcxx-time --enable-lto --disable-libstdcxx-pch --enable-shared
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
--enable-multilib --with-multilib-list=m32,m64 --disable-fixed-point
--enable-targets=all --enable-libgomp --disable-libssp --enable-libada
--disable-cet --disable-systemtap --enable-valgrind-annotations
--disable-vtable-verify --disable-libvtv --with-zstd --with-isl
--disable-isl-version-check --enable-default-pie --enable-host-pie
--enable-host-bind-now --enable-default-ssp --disable-fixincludes
--with-build-c

[Bug target/117173] can_vec_perm_const_p does not consider costs

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173

--- Comment #7 from Richard Biener  ---
To say that - we expect the backends to output the optimal sequence for each
permute.  The limit is that for a series of permutes this doesn't consider
the case where there might be more "complex" permutes that through sharing
common parts would be cheaper overall.

Given the "magic" is all in the backends there's no good way to implement
a generic "global" optimization for this.

One baby-step towards this would be to have a can_vec_perm_const that
fails when the permute requires more than one "sub-"permute on the target
or even better, give back the series of "sub-"permutes that would be
performed in a symbolic way.

But it's possibly "easier" to scrape the .mds for define_insns that
implement vector permutations.
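
For illustration only, here is a toy sketch of what a "symbolic" answer could look like. The cost model (one instruction is either a lane-preserving blend of two inputs or an arbitrary permute of a single input) and the names `symbolic_steps` and `single_step_p` are assumptions made up for this sketch. They are not existing GCC interfaces, and real targets have far more irregular permute instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical: describe a two-input constant permute as a list of
// symbolic sub-permutes.  sel[i] indexes into concat(a, b), so values
// range over 0 .. 2*n-1 for n-lane vectors.
std::vector<std::string> symbolic_steps(const std::vector<std::size_t>& sel) {
    const std::size_t n = sel.size();
    bool uses_a = false, uses_b = false, in_place = true;
    for (std::size_t i = 0; i < n; ++i) {
        assert(sel[i] < 2 * n);
        (sel[i] < n ? uses_a : uses_b) = true;
        if (sel[i] % n != i)
            in_place = false;  // lane i is moved, not just selected
    }
    if (in_place)
        // Every lane stays where it is: either a plain copy of one input
        // (no instruction) or a blend of both.
        return (uses_a && uses_b) ? std::vector<std::string>{"blend"}
                                  : std::vector<std::string>{};
    if (!(uses_a && uses_b))
        return {"permute1"};   // only one input is actually read
    // Generic fallback in this toy model: move the needed lanes of each
    // input into place, then blend the two results.
    return {"permute1", "permute1", "blend"};
}

// Hypothetical stricter query: succeeds only if a single sub-permute suffices.
bool single_step_p(const std::vector<std::size_t>& sel) {
    return symbolic_steps(sel).size() <= 1;
}
```

With such an answer a generic pass could at least see that two requests share a "permute1" or "blend" step, which a plain yes/no query hides.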

[Bug tree-optimization/116575] [15 Regression] blender in SPEC2017 fails to use mask_load_lanes

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116575

--- Comment #9 from Richard Biener  ---
I'll note the mask for SLP load-lanes is also off, we discover

mask_struct_load_2.c:39:1: note:   node 0x5feccd0 (max_nunits=16, refcnt=2)
vector([16,16]) 
mask_struct_load_2.c:39:1: note:   op template: _21 = _3 != 0;
mask_struct_load_2.c:39:1: note:stmt 0 _21 = _3 != 0;
mask_struct_load_2.c:39:1: note:stmt 1 _21 = _3 != 0;
mask_struct_load_2.c:39:1: note:stmt 2 _21 = _3 != 0;
mask_struct_load_2.c:39:1: note:children 0x5fece98 0x5fecf30
mask_struct_load_2.c:39:1: note:   node (constant) 0x5fece98 (max_nunits=1,
refcnt=1)
mask_struct_load_2.c:39:1: note:{ 0, 0, 0 }
mask_struct_load_2.c:39:1: note:   node 0x5fecf30 (max_nunits=16, refcnt=2)
vector([16,16]) signed char
mask_struct_load_2.c:39:1: note:   op: VEC_PERM_EXPR
mask_struct_load_2.c:39:1: note:stmt 0 _3 = *_2;
mask_struct_load_2.c:39:1: note:stmt 1 _3 = *_2;
mask_struct_load_2.c:39:1: note:stmt 2 _3 = *_2;
mask_struct_load_2.c:39:1: note:lane permutation { 0[0] 0[0] 0[0] }
mask_struct_load_2.c:39:1: note:children 0x5fec9d8

as the mask - but in the end the actual CPU instruction only needs a third
of the lanes, and the above specific permute (the "splat") isn't supported.

load/store-lanes are a difficult beast and the SLP representation we
currently use might be sub-optimal.


SLP pattern matching could be another place to discover load-lanes.

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Eric Botcazou :

https://gcc.gnu.org/g:935b7fbd03373c91bae065c6fe862a9fc7d1a901

commit r15-4385-g935b7fbd03373c91bae065c6fe862a9fc7d1a901
Author: Eric Botcazou 
Date:   Wed Oct 16 13:59:50 2024 +0200

Fix bootstrap on 32-bit SPARC/Solaris

The 'U' constraint cannot be used with LRA.

gcc/
PR target/113952
PR target/117168
* config/sparc/constraints.md ('U'): Delete.
* config/sparc/sparc.md (*movdi_insn_sp32): Remove U alternatives.
(*movdf_insn_sp32): Likewise.
(*mov_insn_sp32): Likewise.
* doc/md.texi (SPARC constraints): Remove entry for 'U'.

[Bug tree-optimization/117093] Missing detection of REV64 vector permute

2024-10-16 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117093

--- Comment #3 from ktkachov at gcc dot gnu.org ---
I think it's the VIEW_CONVERT_EXPRs that are hurting us (more complete dump
before expand):
  _1 = VIEW_CONVERT_EXPR(r_3(D));
  t_4 = BIT_FIELD_REF ;
  a_5 = VEC_PERM_EXPR <_1, _1, { 1, 1, 2, 3 }>;
  a_6 = BIT_INSERT_EXPR ;
  t_7 = BIT_FIELD_REF ;
  _2 = BIT_FIELD_REF ;
  a_8 = BIT_INSERT_EXPR ;
  a_9 = BIT_INSERT_EXPR ;
  _10 = VIEW_CONVERT_EXPR(a_9);
  return _10;

If we remove the casts:
uint32x4_t ror32_neon_tgt_gcc_bad(uint32x4_t r) {
uint32x4_t a = r;
uint32_t t;
t = a[0]; a[0] = a[1]; a[1] = t;
t = a[2]; a[2] = a[3]; a[3] = t;
return a;
}
Then this is successfully recognised as:
  a_2 = VEC_PERM_EXPR ;
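
As a sanity check on which permute is expected here, the two element swaps can be modelled on plain arrays. The selector { 1, 0, 3, 2 } used in the usage note is derived from the swaps in the testcase, not copied from the dump, and `swap_pairs` / `permute_lanes` are names invented for this sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

using V4 = std::array<std::uint32_t, 4>;

// Element-wise swaps, mirroring the body of the reduced testcase.
V4 swap_pairs(V4 a) {
    std::swap(a[0], a[1]);
    std::swap(a[2], a[3]);
    return a;
}

// Single-input permute with a constant selector: result[i] = v[sel[i]].
V4 permute_lanes(const V4& v, const std::array<std::size_t, 4>& sel) {
    V4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = v[sel[i]];
    return r;
}
```

For any input `v`, `swap_pairs(v)` and `permute_lanes(v, {1, 0, 3, 2})` give the same result, i.e. each adjacent pair of 32-bit lanes is reversed.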

[Bug middle-end/117123] [14/15 regression] Generated code at -Os on trunk is larger than GCC 14.4 since r14-6536-gcd794c39610177 (sccopy)

2024-10-16 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123

--- Comment #8 from rguenther at suse dot de  ---
On Wed, 16 Oct 2024, pheeck at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117123
> 
> --- Comment #7 from Filip Kastl  ---
> It is not in a loop.  I guess I'll double-check that there aren't any
> differences which I didn't notice.  There is one here:
> 
> 57   # spud$size_22 = PHI <10(4), a_24(D)(2), a_24(D)(3)>
> 44   # spud$size_55 = PHI 
> 
> The constant 10 is on a different position in the PHI function.  However, I
> think that this also shouldn't have any effect on PRE.

It shouldn't, but the order of edges has an influence on the order
of sets processed in compute_antic (but the result should be independent
of the order ... in theory).

You can compare -fdump-tree-pre-details, specifically the ANTIC_IN
sets.

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116

--- Comment #17 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:8be94d5643176ecd2dcdceaf4448c3b89318037c

commit r14-10797-g8be94d5643176ecd2dcdceaf4448c3b89318037c
Author: Uros Bizjak 
Date:   Tue Oct 15 16:51:33 2024 +0200

i386: Fix expand_vector_set for VEC_MERGE/VEC_DUPLICATE RTX [PR117116]

Middle end can generate SYMBOL_REF RTX as a value "val" in the call
to expand_vector_set, but SYMBOL_REF RTX is not accepted in
_pinsr insn pattern, generated via
VEC_MERGE/VEC_DUPLICATE RTX path.

Force the value into a register before VEC_MERGE/VEC_DUPLICATE RTX
is generated if it doesn't satisfy nonimmediate_operand predicate.

PR target/117116

gcc/ChangeLog:

* config/i386/i386-expand.cc (expand_vector_set): Force "val"
into a register before VEC_MERGE/VEC_DUPLICATE RTX is generated
if it doesn't satisfy nonimmediate_operand predicate.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117116.c: New test.

(cherry picked from commit 80d7032067a3a5b76aecd657d9b35b0a8f5a941d)

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116

--- Comment #18 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:dc295054c4ba28e44d4856bb68d148e9ac272d05

commit r13-9119-gdc295054c4ba28e44d4856bb68d148e9ac272d05
Author: Uros Bizjak 
Date:   Tue Oct 15 16:51:33 2024 +0200

i386: Fix expand_vector_set for VEC_MERGE/VEC_DUPLICATE RTX [PR117116]

Middle end can generate SYMBOL_REF RTX as a value "val" in the call
to expand_vector_set, but SYMBOL_REF RTX is not accepted in
_pinsr insn pattern, generated via
VEC_MERGE/VEC_DUPLICATE RTX path.

Force the value into a register before VEC_MERGE/VEC_DUPLICATE RTX
is generated if it doesn't satisfy nonimmediate_operand predicate.

PR target/117116

gcc/ChangeLog:

* config/i386/i386-expand.cc (expand_vector_set): Force "val"
into a register before VEC_MERGE/VEC_DUPLICATE RTX is generated
if it doesn't satisfy nonimmediate_operand predicate.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117116.c: New test.

(cherry picked from commit 80d7032067a3a5b76aecd657d9b35b0a8f5a941d)

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116

--- Comment #19 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:a8bd38de88715fdbf0d064ff0d50e2b8734de939

commit r12-10774-ga8bd38de88715fdbf0d064ff0d50e2b8734de939
Author: Uros Bizjak 
Date:   Tue Oct 15 16:51:33 2024 +0200

i386: Fix expand_vector_set for VEC_MERGE/VEC_DUPLICATE RTX [PR117116]

Middle end can generate SYMBOL_REF RTX as a value "val" in the call
to expand_vector_set, but SYMBOL_REF RTX is not accepted in
_pinsr insn pattern, generated via
VEC_MERGE/VEC_DUPLICATE RTX path.

Force the value into a register before VEC_MERGE/VEC_DUPLICATE RTX
is generated if it doesn't satisfy nonimmediate_operand predicate.

PR target/117116

gcc/ChangeLog:

* config/i386/i386-expand.cc (expand_vector_set): Force "val"
into a register before VEC_MERGE/VEC_DUPLICATE RTX is generated
if it doesn't satisfy nonimmediate_operand predicate.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr117116.c: New test.

(cherry picked from commit 80d7032067a3a5b76aecd657d9b35b0a8f5a941d)

[Bug c++/117174] Compiler seems to incorrectly cache SFINAE condition evaluation results

2024-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117174

--- Comment #1 from Andrew Pinski  ---
Created attachment 59361
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59361&action=edit
Full testcase

[Bug c++/117175] Internal compiler error in gimple_add_tmp_var, at gimplify.cc:802

2024-10-16 Thread lozko.roma at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117175

Roman Lozko  changed:

   What|Removed |Added

 CC||lozko.roma at gmail dot com

--- Comment #1 from Roman Lozko  ---
trunk shows a different stack trace for the same error
https://godbolt.org/z/49PdrEj4W

: In function 'void f(_Z1fv.Frame*)':
:33:22: internal compiler error: in cxx_eval_constant_expression, at
cp/constexpr.cc:7680
   33 | {.data = "4"},
  |  ^~~
0x286ad45 diagnostic_context::diagnostic_impl(rich_location*,
diagnostic_metadata const*, diagnostic_option_id, char const*, __va_list_tag
(*) [1], diagnostic_t)
???:0
0x287eb25 internal_error(char const*, ...)
???:0
0xa8c6e8 fancy_abort(char const*, int, char const*)
???:0
0xb04520 maybe_constant_value(tree_node*, tree_node*, mce_value)
???:0
0xb2ffbc cp_fully_fold_init(tree_node*)
???:0
0xd76afe split_nonconstant_init(tree_node*, tree_node*)
???:0
0x176305c walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x176374d walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x176374d walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x1763272 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x176374d walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x176374d walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x1763272 walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x176374d walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0x176374d walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
???:0
0xb27783 cp_fold_function(tree_node*)
???:0
0xb78616 finish_function(bool)
???:0
0xb19eaa cp_coroutine_transform::finish_transforms()
???:0
0xb7869f finish_function(bool)
???:0
0xc99cda c_parse_file()
???:0

[Bug c++/117129] [14/15 Regression] internal compiler error: Segmentation fault at gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)

2024-10-16 Thread simartin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117129

Simon Martin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |simartin at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #2 from Simon Martin  ---
Working on it.

[Bug target/117173] can_vec_perm_const_p does not consider costs

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117173

--- Comment #4 from Richard Biener  ---
I can see the difficulty here.  Note a strategy for decomposition would be
to produce a monotonic permute of any two-input vector permute by noting
the elements needed in the final result.  If it's possible to aggregate all
of them then do that.  This transform isn't unique, as elements not needed
can be selected from either vector, so it's likely the now present two
permutes will end up as four - two blends and two single-input permutes -
where the blends might not agree and are not CSEable unless we have a clever
way of encoding "don't care" and CSE of those.

But yeah, it should be still cheaper than two gathers and applicable to
a subset of all two input permutes.
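
A minimal sketch of one reading of this decomposition, on plain arrays rather than GIMPLE: first blend the needed lanes of the two inputs into one vector without moving them, then apply a single-input permute. It only succeeds when no lane position is needed from both inputs. All names (`BlendThenPermute`, `decompose`, `permute2`, `apply_decomposition`) are invented for the example, and unused lanes are arbitrarily taken from the first input, which is exactly the "don't care" freedom mentioned above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

constexpr std::size_t N = 4;
using Vec = std::array<int, N>;
using Sel = std::array<std::size_t, N>;  // indices into concat(a, b): 0 .. 2N-1

struct BlendThenPermute {
    std::array<bool, N> take_b;  // blend mask: lane p comes from b if true
    Sel perm;                    // single-input permute of the blended vector
};

// Reference semantics of the two-input permute.
Vec permute2(const Vec& a, const Vec& b, const Sel& sel) {
    Vec r{};
    for (std::size_t i = 0; i < N; ++i) {
        assert(sel[i] < 2 * N);
        r[i] = sel[i] < N ? a[sel[i]] : b[sel[i] - N];
    }
    return r;
}

// Try to split the permute into one blend plus one single-input permute.
std::optional<BlendThenPermute> decompose(const Sel& sel) {
    std::array<int, N> owner;  // -1: lane unused, 0: needed from a, 1: from b
    owner.fill(-1);
    BlendThenPermute out{};
    for (std::size_t i = 0; i < N; ++i) {
        const std::size_t lane = sel[i] % N;
        const int src = sel[i] < N ? 0 : 1;
        if (owner[lane] != -1 && owner[lane] != src)
            return std::nullopt;  // a[lane] and b[lane] are both needed
        owner[lane] = src;
        out.perm[i] = lane;
    }
    for (std::size_t p = 0; p < N; ++p)
        out.take_b[p] = (owner[p] == 1);  // "don't care" lanes default to a
    return out;
}

Vec apply_decomposition(const BlendThenPermute& d, const Vec& a, const Vec& b) {
    Vec blended{};
    for (std::size_t p = 0; p < N; ++p)
        blended[p] = d.take_b[p] ? b[p] : a[p];
    Vec r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = blended[d.perm[i]];
    return r;
}
```

Two permutes that need the same lanes could then share the blend, but only if their arbitrary choices for the unused lanes happen to agree, which is the CSE problem described above.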

[Bug c++/117099] [14/15 Regression] internal compiler error: Segmentation fault in finalize_nrv(tree_node*, tree_node*)

2024-10-16 Thread simartin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117099

Simon Martin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |simartin at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #3 from Simon Martin  ---
Working on it.

[Bug target/117168] Bootstrap fails with ICE: in curr_insn_transform, at lra-constraints.cc:4283

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117168

--- Comment #9 from Sam James  ---
(In reply to Eric Botcazou from comment #6)
> Created attachment 59359 [details]
> Tentative fix

This gets me to PR117170 (same as when reverting my commit) on Solaris and
Linux bootstrapped fine. I'll check the testsuites later.

[Bug tree-optimization/116575] [15 Regression] blender in SPEC2017 fails to use mask_load_lanes

2024-10-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116575

Richard Biener  changed:

   What|Removed |Added

 CC||tnfchris at gcc dot gnu.org

--- Comment #8 from Richard Biener  ---
FAIL: gcc.target/aarch64/sve/mask_struct_load_2.c

for example fails because of this.  We now correctly do SLP discovery to

mask_struct_load_2.c:39:1: note:   node 0x4ed9940 (max_nunits=16, refcnt=2)
vector([16,16]) signed char
mask_struct_load_2.c:39:1: note:   op: VEC_PERM_EXPR
mask_struct_load_2.c:39:1: note:stmt 0 _7 = .MASK_LOAD (_6, 8B, _21);
mask_struct_load_2.c:39:1: note:lane permutation { 0[0] }
mask_struct_load_2.c:39:1: note:children 0x4ed99d8
mask_struct_load_2.c:39:1: note:   node 0x4ed9e00 (max_nunits=16, refcnt=2)
vector([16,16]) signed char
mask_struct_load_2.c:39:1: note:   op: VEC_PERM_EXPR
mask_struct_load_2.c:39:1: note:stmt 0 _11 = .MASK_LOAD (_10, 8B, _21);
mask_struct_load_2.c:39:1: note:lane permutation { 0[1] }
mask_struct_load_2.c:39:1: note:children 0x4ed99d8
mask_struct_load_2.c:39:1: note:   node 0x4ed9f30 (max_nunits=16, refcnt=2)
vector([16,16]) signed char
mask_struct_load_2.c:39:1: note:   op: VEC_PERM_EXPR
mask_struct_load_2.c:39:1: note:stmt 0 _16 = .MASK_LOAD (_15, 8B, _21);
mask_struct_load_2.c:39:1: note:lane permutation { 0[2] }
mask_struct_load_2.c:39:1: note:children 0x4ed99d8
mask_struct_load_2.c:39:1: note:   node 0x4ed99d8 (max_nunits=16, refcnt=4)
vector([16,16]) signed char
mask_struct_load_2.c:39:1: note:   op template: _7 = .MASK_LOAD (_6, 8B, _21);
mask_struct_load_2.c:39:1: note:stmt 0 _7 = .MASK_LOAD (_6, 8B, _21);
mask_struct_load_2.c:39:1: note:stmt 1 _11 = .MASK_LOAD (_10, 8B, _21);
mask_struct_load_2.c:39:1: note:stmt 2 _16 = .MASK_LOAD (_15, 8B, _21);
mask_struct_load_2.c:39:1: note:children 0x4ed9a70 

but this representation is not marked as ->ldst_p - it doesn't require
further lowering (there's no permute on the actual load) and that's what
currently sets the want-to-use-load-lanes flag.

For masked load lanes we need some other place setting this - it could
be as late as during permute optimization (where we conveniently have
backward edges for the SLP graph).  I do not want to set the flag during
SLP discovery (which now splits nodes as seen above).

FAIL: gcc.target/aarch64/sve/mask_struct_load_1.c

fails the same way, though IMO it's questionable whether ld2 is really
profitable here.

[Bug tree-optimization/117140] [15 regression] RISC-V: ICE in initialize_flags_in_bb for rv32gcv

2024-10-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117140

--- Comment #9 from Tamar Christina  ---
(In reply to rguent...@suse.de from comment #8)
> On Wed, 16 Oct 2024, tnfchris at gcc dot gnu.org wrote:
> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > index 8727246c27a..c028594e18b 100644
> > --- a/gcc/tree-vect-slp.cc
> > +++ b/gcc/tree-vect-slp.cc
> > @@ -11128,7 +11128,8 @@ vectorize_slp_instance_root_stmt (vec_info *vinfo,
> > slp_tree node, slp_instance i
> >  can't support lane > 1 at this time.  */
> >gcc_assert (instance->root_stmts.length () == 1);
> >auto root_stmt_info = instance->root_stmts[0];
> > -  auto last_stmt = STMT_VINFO_STMT (root_stmt_info);
> 
> if this is a pattern you'd want STMT_VINFO_STMT (vect_orig_stmt 
> (root_stmt_info))

Ah of course.. doh..

running regression tests..

[Bug libstdc++/117085] chrono formatting: %c does not honor locale after expansion

2024-10-16 Thread xu2k3l4 at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117085

XU Kailiang  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from XU Kailiang  ---
Thank you, the output looks much nicer now!

Marking as fixed.

[Bug fortran/80235] ICE: coarrays, submodule

2024-10-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80235

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Andre Vehreschild :

https://gcc.gnu.org/g:e32fff675c3bb040fa79854f6b0654c16bc38997

commit r15-4405-ge32fff675c3bb040fa79854f6b0654c16bc38997
Author: Andre Vehreschild 
Date:   Tue Sep 24 14:30:52 2024 +0200

Fix ICE with coarrays and submodules [PR80235]

Exposing a variable in a module and referencing it in a submodule made
the compiler ICE, because the external variable was not sorted into the
correct module.  In fact the module name was not set where the variable
got built.

gcc/fortran/ChangeLog:

PR fortran/80235

* trans-decl.cc (gfc_build_qualified_array): Make sure the array
is associated to the correct module and being marked as extern.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/add_sources/submodule_1_sub.f90: New test.
* gfortran.dg/coarray/submodule_1.f90: New test.

[Bug tree-optimization/115274] [14/15 regression] Bogus -Wstringop-overread in SQLite source code

2024-10-16 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115274

--- Comment #12 from Sam James  ---
(In reply to D. Richard Hipp from comment #6)
> The source file that causes the problem can now be downloaded from
> .

If you can reproduce it with the non-amalgamation build, that would be much
easier to analyse.

[Bug tree-optimization/116939] rewrite_to_defined_overflow/gimple_with_undefined_signed_overflow should also rewrite VCEs (from/to integral types) into casts

2024-10-16 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116939

--- Comment #2 from Andrew Pinski  ---
Created attachment 59368
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59368&action=edit
patch in test

Depends on a few others which either have been approved or posted.

[Bug rtl-optimization/115879] ICE: verify_flow_info failed: missing REG_EH_REGION note at the end of bb 6 with -O -fnon-call-exceptions -finstrument-functions and invalid memory access

2024-10-16 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115879

Zdenek Sojka  changed:

   What|Removed |Added

Summary|ICE: verify_flow_info   |ICE: verify_flow_info
   |failed: missing |failed: missing
   |REG_EH_REGION note at the   |REG_EH_REGION note at the
   |end of bb 6 with -O |end of bb 6 with -O
   |-fnon-call-exceptions   |-fnon-call-exceptions
   |-finstrument-functions and  |-finstrument-functions and
   |_BitInt()   |invalid memory access

--- Comment #2 from Zdenek Sojka  ---
_BitInt() is not needed; this can also be triggered at least by memset() and
memcpy() with the memory operand at a negative offset from the address of a
variable.
