[Bug c++/106812] Throwing a non-copyable exception

2024-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106812

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #5 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 103048 ***

[Bug c/117178] -Wunterminated-string-initialization should ignore trailing NUL byte for nonstring char arrays

2024-12-15 Thread kees at outflux dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117178

--- Comment #21 from Kees Cook  ---
Okay, now with tests and an updated truncation message. :)

Please see:
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671714.html

(Hopefully I have managed to get coding and commit log style correct...)
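
For readers who have not followed the PR, here is a minimal sketch in C of the case under discussion. The array `tag`, the macro `NONSTRING` and the helper `tag_matches` are made up for illustration; only the `nonstring` attribute and the warning name come from this bug. The initializer fills the array exactly, so the literal's trailing NUL is dropped, which is what -Wunterminated-string-initialization reports. The request here is that the attribute should be taken as a statement that this is intentional.

```c
#include <assert.h>
#include <string.h>

/* Fall back to nothing on compilers without the attribute. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif
#if __has_attribute(nonstring)
# define NONSTRING __attribute__((nonstring))
#else
# define NONSTRING
#endif

/* Hypothetical 4-byte tag: "ABCD" needs 5 bytes with its NUL, so the
   NUL is not stored.  The array is a byte sequence, not a C string. */
static const char tag[4] NONSTRING = "ABCD";

/* Compare with memcmp, never with strcmp: tag has no terminator. */
int tag_matches(const char *buf)
{
    assert(buf != NULL);
    return memcmp(buf, tag, sizeof tag) == 0;
}
```

Whether a given GCC release warns on the `tag` line depends on which version of the patch it carries, so no expected diagnostics are shown.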

[Bug tree-optimization/118055] New: [15 Regression] gcc.dg/tree-ssa/pr83403-1.c and -2 for CRIS and m68k since r15-6097-gee2f19b0937b5e

2024-12-15 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055

Bug ID: 118055
   Summary: [15 Regression] gcc.dg/tree-ssa/pr83403-1.c and -2 for
CRIS and m68k since r15-6097-gee2f19b0937b5e
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hp at gcc dot gnu.org
CC: liuhongt at gcc dot gnu.org
  Target Milestone: ---
Target: cris-elf, m68k-unknown-linux-gnu

Commit r15-6097-gee2f19b0937b5e caused gcc.dg/tree-ssa/pr83403-1.c and
gcc.dg/tree-ssa/pr83403-2.c to regress for cris-elf and, judging from
https://gcc.gnu.org/pipermail/gcc-testresults/2024-December/832621.html and
https://gcc.gnu.org/pipermail/gcc-testresults/2024-December/832703.html, also
for m68k-unknown-linux-gnu, like so:
FAIL: gcc.dg/tree-ssa/pr83403-1.c scan-tree-dump-times lim2 "Executing store motion of" 10
FAIL: gcc.dg/tree-ssa/pr83403-2.c scan-tree-dump-times lim2 "Executing store motion of" 10

Surprisingly, among the targets with results posted around that time, this
apparently happens only for (CRIS and) m68k.

Possibly arm*-*-* was also affected, as the commit forces "--param
max-completely-peeled-insns=300" for that target, but gives no explanation.

[Bug tree-optimization/118032] Bootstrap slowdown for risc-v

2024-12-15 Thread jvdelisle at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118032

Jerry DeLisle  changed:

   What|Removed |Added

 CC||jvdelisle at gcc dot gnu.org

--- Comment #8 from Jerry DeLisle  ---
FWIW, I do not bootstrap; I only build C, C++, and Fortran. I have noticed
the build time increasing from about 15 minutes to 18 minutes.

[Bug tree-optimization/118032] Bootstrap slowdown for risc-v

2024-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118032

--- Comment #9 from Andrew Pinski  ---
(In reply to Andreas Schwab from comment #7)
> 20241206:  2d 12:19:19
> 20241213:  2d 20:08:16

The big change between these two is 64-bit location_t. I wonder whether that
introduced the slowdown.
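
To put the two quoted figures in proportion, a small sketch in C. The function names are made up for this note; the durations are the ones quoted from comment #7.

```c
#include <assert.h>

/* Convert a "Nd hh:mm:ss" duration into seconds. */
long duration_seconds(long days, long hours, long minutes, long seconds)
{
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds;
}

/* Relative slowdown of `after` versus `before`, in percent. */
double slowdown_percent(long before, long after)
{
    assert(before > 0);
    return 100.0 * (double)(after - before) / (double)before;
}
```

duration_seconds(2, 12, 19, 19) gives 217159 and duration_seconds(2, 20, 8, 16) gives 245296, a difference of 28137 seconds (7:48:57), or roughly 13%.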

[Bug tree-optimization/118055] [15 Regression] gcc.dg/tree-ssa/pr83403-1.c and -2 for CRIS and m68k since r15-6097-gee2f19b0937b5e

2024-12-15 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055

--- Comment #2 from Hans-Peter Nilsson  ---
(In reply to Hongtao Liu from comment #1)
> I explained in the thread.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671289.html
> 
> -
> BTW arm ci reported 2 regressed testcase so I added
> * gcc.dg/tree-ssa/pr83403-1.c: Add --param max-completely-peeled-insns=300
> for arm*-*-*.
> * gcc.dg/tree-ssa/pr83403-2.c: Ditto.
> 
> For 32-bit arm, there're more stmts in the innermost loop,
> and removal of the reduction prevents completely unrolling for them.
> For aarch64, it looks fine.

Thanks for the quick reply.  I found the post after entering the bug.  Still,
that doesn't explain why it happens for these targets and not for others.

Is the test perhaps brittle, that is, mostly target-specific despite being at
the tree level, so that the scan-test should instead be restricted to a
specific list of known-matching targets?

> So the fix for cris-elf/m68k could be similar as arm.

While I know the "fix" works for cris-elf, I'd prefer to use a meaningful
effective target rather than accumulate a list of random targets.

[Bug target/116979] [12/13/14/15 regression] fma not always used in complex product

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116979

--- Comment #21 from Richard Biener  ---
(In reply to Richard Biener from comment #20)
> (In reply to Jakub Jelinek from comment #17)
> > Not marking as fixed for GCC 15 yet, as there is the scalar cost computation
> > issue unresolved.
> 
> I will look into this part.

So this is because all load lanes and the FMADDSUB lanes are "live" and we
try to be conservative with respect to costing.  Given we cannot be sure
to be able to use lane-extraction to code-generate "live" (because of
scheduling issues), the intent was to simply not cost the scalar stmts
for the whole "live" subgraph.  The latter doesn't seem to work because
of the check in vect_bb_slp_scalar_cost:

  /* If there is a non-vectorized use of the defs then the scalar
     stmt is kept live in which case we do not account it or any
     required defs in the SLP children in the scalar cost.  This
     way we make the vectorization more costly when compared to
     the scalar cost.  */
  if (!STMT_VINFO_LIVE_P (stmt_info))

which keeps live[i] == false if STMT_VINFO_LIVE_P (stmt_info).  Fixing this
would make the issue only worse - what we fail to realize is that

note: node 0x4dc9ac0 (max_nunits=2, refcnt=1) vector(2) float
note: op template: _8 = REALPART_EXPR f>;
note:   [l] stmt 0 _8 = REALPART_EXPR f>;
note:   [l] stmt 1 _8 = REALPART_EXPR f>;
note:   load permutation { 0 0 }

should be better handled as extern { _8, _8 } rather than costed as
vector load + permute (cost of 4, compared to 12 + 4).  But then the
conservative handling of "live" would cause a scalar cost of zero.

The vector construction from the __mulsc3 result also fails to consume
the REAL/IMAGPART_EXPRs, but I guess that is harder to fix.
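
As background for the REALPART/IMAGPART and FMADDSUB lanes mentioned here, a minimal sketch in C of a complex product. The function names are made up, and the actual test case for this PR may look different. A plain `a * b` on `float complex` may go through the libgcc helper __mulsc3 to handle NaN and infinity corner cases; the by-parts form is the arithmetic that remains once those checks are out of the way, and, as far as I understand the dump, it is the subtract/add pair that the FMADDSUB node covers.

```c
#include <complex.h>

/* What the source typically looks like. */
float complex cmul(float complex a, float complex b)
{
    return a * b;
}

/* The same product written out by parts (finite inputs assumed). */
float complex cmul_by_parts(float complex a, float complex b)
{
    float ar = crealf(a), ai = cimagf(a);
    float br = crealf(b), bi = cimagf(b);
    float re = ar * br - ai * bi;   /* lane 0: multiply, then subtract */
    float im = ar * bi + ai * br;   /* lane 1: multiply, then add */
    return re + im * I;
}
```

Both functions agree for ordinary finite inputs; they can differ when a NaN or infinity is involved, which is what the __mulsc3 path exists for.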

That said, the proper solution is to make "live" handling less conservative
and compute scheduling constraints so we know whether extracted lanes can
substitute for the original scalar ones (uses are not after the vector def).

Fixing the above conservativeness without further changes results in the
following

gcc.target/i386/pr116979.c:15:10: note: Cost model analysis:
_21 1 times scalar_stmt costs 12 in body
_22 1 times scalar_stmt costs 12 in body
_21 = PHI <_16(5), _19(3)> 1 times scalar_stmt costs 12 in body
_22 = PHI <_17(5), _20(3)> 1 times scalar_stmt costs 12 in body
REALPART_EXPR f> 1 times vec_perm costs 4 in body
REALPART_EXPR f> 1 times unaligned_load (misalign -1) costs 12 in body
REALPART_EXPR f> 1 times unaligned_load (misalign -1) costs 12 in body
IMAGPART_EXPR f> 1 times vec_perm costs 4 in body
IMAGPART_EXPR f> 1 times unaligned_load (misalign -1) costs 12 in body
IMAGPART_EXPR f> 1 times vec_perm costs 4 in body
IMAGPART_EXPR f> 1 times unaligned_load (misalign -1) costs 12 in body
_9 * _11 1 times vector_stmt costs 16 in body
.VEC_FMADDSUB (_8, _10, _13) 1 times vector_stmt costs 12 in body
_21 = PHI <_16(5), _19(3)> 1 times vector_stmt costs 12 in body
node 0x4dc9d00 1 times vec_construct costs 4 in prologue
_21 1 times vector_store costs 12 in body
_12 - _13 1 times vec_to_scalar costs 4 in epilogue
_14 + _15 1 times vec_to_scalar costs 4 in epilogue
REALPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
REALPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
REALPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
IMAGPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
IMAGPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
IMAGPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
IMAGPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
REALPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
gcc.target/i386/pr116979.c:15:10: note: Cost model analysis for part in loop 0:
  Vector cost: 156
  Scalar cost: 48

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 9ad95104ec7..e7ef943d743 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -8740,6 +8740,11 @@ next_lane:
           if ((*life)[i])
             continue;
         }
+      else
+        {
+          (*life)[i] = true;
+          continue;
+        }

       /* Count scalar stmts only once.  */
       if (gimple_visited_p (orig_stmt))
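
The totals in the cost dump can be checked by hand. A small sketch in C that tallies the lines of the "Cost model analysis" output; the struct and function names are made up for this note, and the counts and unit costs are read off the dump.

```c
#include <stddef.h>

struct cost_entry {
    const char *kind;   /* cost kind as printed in the dump */
    int count;          /* number of dump lines of that kind */
    int unit_cost;      /* cost printed on each of those lines */
};

/* Vector-side entries. */
static const struct cost_entry vector_costs[] = {
    { "vec_perm",        3,  4 },
    { "unaligned_load",  4, 12 },
    { "vector_stmt",     1, 16 },   /* _9 * _11 */
    { "vector_stmt",     2, 12 },   /* .VEC_FMADDSUB and the PHI */
    { "vec_construct",   1,  4 },
    { "vector_store",    1, 12 },
    { "vec_to_scalar",  10,  4 },
};

/* Scalar-side entries. */
static const struct cost_entry scalar_costs[] = {
    { "scalar_stmt", 4, 12 },
};

/* Sum count * unit_cost over all entries. */
int total_cost(const struct cost_entry *entries, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += entries[i].count * entries[i].unit_cost;
    return total;
}
```

The vector entries sum to 156 and the scalar entries to 48, matching the two totals printed at the end of the dump.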

[Bug tree-optimization/118032] Bootstrap slowdown for risc-v

2024-12-15 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118032

--- Comment #3 from Jeffrey A. Law  ---
Could be.  I've got the 4G model and a swap partition on an m.2 drive rather
than the silly (and insanely slow) mmc card.  If Mark's got the 2G version or
is using the MMC card, it'd be much more sensitive to memory usage.

I'll also note that over the last 3 weeks my emulated native builds have
sped up dramatically; my money would be on the genemit changes.  They landed
at the right time (last week of November) and were purposefully designed to
speed builds up.

[Bug rtl-optimization/117964] duplicate computed gotos will happily duplicate blocks with 9189 successors

2024-12-15 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117964

--- Comment #8 from rguenther at suse dot de  ---
On Tue, 10 Dec 2024, segher at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117964
> 
> --- Comment #7 from Segher Boessenkool  ---
> When maybe_duplicate_computed_goto is asked to duplicate a block with 9189
> successors, it damn well should!  If that is a bad idea for the case at hand,
> just do not call maybe_duplicate_computed_goto on such a block!
> 
> maybe_duplicate_computed_goto should never ever decide to know better than
> its caller.  That way insanity lies.

"maybe_" suggests it's not all that clear, I understand the intent is
more like duplicate_computed_goto_if_possible (as opposed to
_if_profitable_and_possible).

But sure, limiting in the caller is sensible as well - it might even
apply an overall cost (rather than a per PRED duplication cost).

But as said the main issue is the successors are "artificial" - there's
only a single real successor - it's just not known statically.  So finding
a better representation for the CFG to solve the time/memory issue is
better than not duplicating the gotos.
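
To illustrate what "artificial" successors means, a minimal sketch in C using the GNU labels-as-values extension (so GCC or Clang only). The opcode names and the `run` function are made up; the test case in this PR has thousands of labels, not three. Every `goto *` may reach any label whose address is taken, so in the CFG each such jump gets an edge to all of them, even though exactly one is taken at run time.

```c
#include <assert.h>

enum { OP_INC, OP_DEC, OP_HALT, OP_COUNT };

/* Tiny threaded interpreter: each handler ends in its own indirect jump. */
int run(const unsigned char *code)
{
    static void *const dispatch[OP_COUNT] = { &&op_inc, &&op_dec, &&op_halt };
    int acc = 0;

/* Fetch the next opcode and jump to its handler. */
#define NEXT() do { assert(*code < OP_COUNT); goto *dispatch[*code++]; } while (0)

    NEXT();
op_inc:
    acc++;
    NEXT();
op_dec:
    acc--;
    NEXT();
op_halt:
    return acc;
#undef NEXT
}
```

With N labels and one indirect jump per handler, duplicating the jump into every handler yields on the order of N * N edges, which is where the time and memory go.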

[Bug fortran/84674] [12/13/14 Regression] Derived type name change makes a program segfault, removing non_overridable helps

2024-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84674

--- Comment #19 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:1572e634dec4a09593f68645939b5b5043de8de6

commit r14-11091-g1572e634dec4a09593f68645939b5b5043de8de6
Author: Paul Thomas 
Date:   Sun Dec 15 14:48:59 2024 +

Fortran: Fix non_overridable typebound proc problems [PR84674/117730].

2024-12-15  Paul Thomas  

gcc/fortran/ChangeLog

PR fortran/117730
PR fortran/84674
* class.cc (add_proc_comp): If the present typebound procedure
component is abstract, unconditionally check the replacement.
Only reject a non_overridable if it has no overridden procedure
and the component is already present in the vtype.

gcc/testsuite/ChangeLog

PR fortran/117730
* gfortran.dg/pr117730_a.f90: New test.
* gfortran.dg/pr117730_b.f90: New test.

PR fortran/84674
* gfortran.dg/pr84674.f90: New test.

[Bug fortran/117730] Wrong code with non_overridable typebound procedure

2024-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117730

--- Comment #9 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:1572e634dec4a09593f68645939b5b5043de8de6

commit r14-11091-g1572e634dec4a09593f68645939b5b5043de8de6
Author: Paul Thomas 
Date:   Sun Dec 15 14:48:59 2024 +

Fortran: Fix non_overridable typebound proc problems [PR84674/117730].

2024-12-15  Paul Thomas  

gcc/fortran/ChangeLog

PR fortran/117730
PR fortran/84674
* class.cc (add_proc_comp): If the present typebound procedure
component is abstract, unconditionally check the replacement.
Only reject a non_overridable if it has no overridden procedure
and the component is already present in the vtype.

gcc/testsuite/ChangeLog

PR fortran/117730
* gfortran.dg/pr117730_a.f90: New test.
* gfortran.dg/pr117730_b.f90: New test.

PR fortran/84674
* gfortran.dg/pr84674.f90: New test.

[Bug fortran/117730] Wrong code with non_overridable typebound procedure

2024-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117730

--- Comment #10 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:e33257cab75c1f8a07ea8d5c829b8aec7069683e

commit r13-9254-ge33257cab75c1f8a07ea8d5c829b8aec7069683e
Author: Paul Thomas 
Date:   Sun Dec 15 14:48:59 2024 +

Fortran: Fix non_overridable typebound proc problems [PR84674/117730].

2024-12-15  Paul Thomas  

gcc/fortran/ChangeLog

PR fortran/117730
PR fortran/84674
* class.cc (add_proc_comp): If the present typebound procedure
component is abstract, unconditionally check the replacement.
Only reject a non_overridable if it has no overridden procedure
and the component is already present in the vtype.

gcc/testsuite/ChangeLog

PR fortran/117730
* gfortran.dg/pr117730_a.f90: New test.
* gfortran.dg/pr117730_b.f90: New test.

PR fortran/84674
* gfortran.dg/pr84674.f90: New test.

(cherry picked from commit 1572e634dec4a09593f68645939b5b5043de8de6)

[Bug fortran/84674] [12/13/14 Regression] Derived type name change makes a program segfault, removing non_overridable helps

2024-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84674

--- Comment #20 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:e33257cab75c1f8a07ea8d5c829b8aec7069683e

commit r13-9254-ge33257cab75c1f8a07ea8d5c829b8aec7069683e
Author: Paul Thomas 
Date:   Sun Dec 15 14:48:59 2024 +

Fortran: Fix non_overridable typebound proc problems [PR84674/117730].

2024-12-15  Paul Thomas  

gcc/fortran/ChangeLog

PR fortran/117730
PR fortran/84674
* class.cc (add_proc_comp): If the present typebound procedure
component is abstract, unconditionally check the replacement.
Only reject a non_overridable if it has no overridden procedure
and the component is already present in the vtype.

gcc/testsuite/ChangeLog

PR fortran/117730
* gfortran.dg/pr117730_a.f90: New test.
* gfortran.dg/pr117730_b.f90: New test.

PR fortran/84674
* gfortran.dg/pr84674.f90: New test.

(cherry picked from commit 1572e634dec4a09593f68645939b5b5043de8de6)

[Bug fortran/84674] [12/13/14 Regression] Derived type name change makes a program segfault, removing non_overridable helps

2024-12-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84674

Paul Thomas  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #21 from Paul Thomas  ---
Fixed on all affected branches.

Thank you for the report.

Paul

[Bug tree-optimization/118032] Bootstrap slowdown for risc-v

2024-12-15 Thread mark at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118032

--- Comment #4 from Mark Wielaard  ---
The slowdown is on the pioneer box, which has 64 cores and 128GB ram.
https://builder.sourceware.org/buildbot/#/builders/gcc-full-fedora-riscv
See the build times tab on that page.

It used to do builds in 4 to 5 hours, slowly increasing over the last month
(bootstrap, make -j64 in ~2.5 hours, plus tests, make -j64 check in ~1.5 hours).
But it recently jumped to ~9 hours
(bootstrap, make -j64 in ~7.5 hours, plus tests, make -j64 check in ~1.5 hours).
And in the last day it kept timing out after trying to build insn-attrtab.cc
for more than 2.5 hours.

[Bug tree-optimization/118032] Bootstrap slowdown for risc-v

2024-12-15 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118032

--- Comment #5 from Jeffrey A. Law  ---
Oh, on the pioneer. 

Hard to even guess.  Given how under-powered each core is, if you don't keep
them all busy it's going to be bad...  So I'd be looking for a serialization
problem in the build.

Or your pcie might be fried.  I'd check your system logs and see if any
diagnostics were issued, particularly for whatever drives you've got.  Ours
would take absurdly long times for various operations presumably due to disk
retries right before the whole thing would lose its mind.

Basically the pioneer boxes are junk.

[Bug c++/118049] New: conflicting global module declaration

2024-12-15 Thread furyusss at yahoo dot fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118049

Bug ID: 118049
   Summary: conflicting global module declaration
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: furyusss at yahoo dot fr
  Target Milestone: ---

Created attachment 59872
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59872&action=edit
3 source files

I get the following error:

```
In file included from /home/installdir/include/c++/15.0.0/ostream:45,
 from /home/installdir/include/c++/15.0.0/bits/unique_ptr.h:43,
 from /home/installdir/include/c++/15.0.0/bits/std_thread.h:45,
 from /home/installdir/include/c++/15.0.0/stop_token:40,
 from /home/installdir/include/c++/15.0.0/thread:44,
 from module2.cpp:3,
of module module2, imported at module2_impl.cpp:6:
/home/installdir/include/c++/15.0.0/format:2709:7: error: conflicting global
module declaration ‘auto
std::__format::_Sink_iter@module2<_CharT>::_M_reserve(std::size_t) const [with
_CharT = char; std::size_t = long unsigned int]’
 2709 |   _M_reserve(size_t __n) const
  |   ^~
In file included from /home/installdir/include/c++/15.0.0/ostream:45,
 from /home/installdir/include/c++/15.0.0/bits/unique_ptr.h:43,
 from /home/installdir/include/c++/15.0.0/memory:80,
 from module1.cpp:4,
of module module1, imported at module2.cpp:8,
of module module2, imported at module2_impl.cpp:6:
/home/installdir/include/c++/15.0.0/format:2709:7: note: existing declaration
‘auto std::__format::_Sink_iter@module2<_CharT>::_M_reserve(std::size_t) const
[with _CharT = char; std::size_t = long unsigned int]’
 2709 |   _M_reserve(size_t __n) const
  |   ^~
```

With `gcc version 14.2.0 (Ubuntu 14.2.0-4ubuntu2~24.04)`. I also tried the
trunk (7b5599dbd75fe1ee7d861d4cfc6ea655a126bef3).

I attached the 3 source files.

The commands I ran are:

```
g++ -O0 -g -std=c++23 -E -x c++ module2_impl.cpp -MT module2_impl.cpp.o.ddi -MD
-MF module2_impl.cpp.o.ddi.d -fmodules-ts -fdeps-file=module2_impl.cpp.o.ddi
-fdeps-target=module2_impl.cpp.o -fdeps-format=p1689r5 -o
module2_impl.cpp.o.ddi.i
g++ -O0 -g -std=c++23 -E -x c++ module1.cpp -MT module1.cpp.o.ddi -MD -MF
module1.cpp.o.ddi.d -fmodules-ts -fdeps-file=module1.cpp.o.ddi
-fdeps-target=module1.cpp.o -fdeps-format=p1689r5 -o module1.cpp.o.ddi.i 
g++ -O0 -g -std=c++23 -MD -MT module1.cpp.o -MF module1.cpp.o.d -fmodules-ts
-MD -fdeps-format=p1689r5 -x c++ -o module1.cpp.o -c module1.cpp
g++ -O0 -g -std=c++23 -E -x c++ module2.cpp -MT module2.cpp.o.ddi -MD -MF
module2.cpp.o.ddi.d -fmodules-ts -fdeps-file=module2.cpp.o.ddi
-fdeps-target=module2.cpp.o -fdeps-format=p1689r5 -o module2.cpp.o.ddi.i
g++ -O0 -g -std=c++23 -MD -MT module2.cpp.o -MF module2.cpp.o.d -fmodules-ts
-MD -fdeps-format=p1689r5 -x c++ -o module2.cpp.o -c module2.cpp
g++ -O0 -g -std=c++23 -MD -MT module2_impl.cpp.o -MF module2_impl.cpp.o.d
-fmodules-ts -MD -fdeps-format=p1689r5 -x c++ -o module2_impl.cpp.o -c
module2_impl.cpp
```

[Bug tree-optimization/118025] [15 Regression] gcc.dg/field-merge-9.c FAILs

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118025

Richard Biener  changed:

   What|Removed |Added

Summary|gcc.dg/field-merge-9.c FAILs |[15 Regression] gcc.dg/field-merge-9.c FAILs

--- Comment #1 from Richard Biener  ---
Oddly, the test has two variants depending on byte order.  I'd have expected
two tests instead, with the optimization scan guarded by big/little
endianness.

[Bug c++/118047] [12/13/14/15 Regression] Incorrect list initialization of vector of struct of array of struct of enum since r12-7803-gf0530882d99abc

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118047

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug target/116979] [12/13/14/15 regression] fma not always used in complex product

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116979

--- Comment #22 from Richard Biener  ---
(In reply to Richard Biener from comment #21)
> (In reply to Richard Biener from comment #20)
> > (In reply to Jakub Jelinek from comment #17)
> > > Not marking as fixed for GCC 15 yet, as there is the scalar cost computation
> > > issue unresolved.
> > 
> > I will look into this part.
> 
> So this is because all load lanes and the FMADDSUB lanes are "live" and we
> try to be conservative with respect to costing.  Given we cannot be sure
> to be able to use lane-extraction to code-generate "live" (because of
> scheduling issues), the intent was to simply not cost the scalar stmts
> for the whole "live" subgraph.  The latter doesn't seem to work because
> of the check in vect_bb_slp_scalar_cost:
> 
>   /* If there is a non-vectorized use of the defs then the scalar
>  stmt is kept live in which case we do not account it or any
>  required defs in the SLP children in the scalar cost.  This
>  way we make the vectorization more costly when compared to
>  the scalar cost.  */
>   if (!STMT_VINFO_LIVE_P (stmt_info))
> 
> which keeps live[i] == false if STMT_VINFO_LIVE_P (stmt_info).  Fixing this
> would make the issue only worse - what we fail to realize is that
> 
> note: node 0x4dc9ac0 (max_nunits=2, refcnt=1) vector(2) float
> note: op template: _8 = REALPART_EXPR f>;
> note:   [l] stmt 0 _8 = REALPART_EXPR f>;
> note:   [l] stmt 1 _8 = REALPART_EXPR f>;
> note:   load permutation { 0 0 }
> 
> should be better handled as extern { _8, _8 } rather than costed as
> vector load + permute (cost of 4, compared to 12 + 4).  But then the
> conservative handling of "live" would cause a scalar cost of zero.
> 
> The vector construction from the __mulsc3 result also fails to consume
> the REAL/IMAGPART_EXPRs, but I guess that is harder to fix.
> 
> That said, the proper solution is to make "live" handling less conservative
> and compute scheduling constraints so we know whether extracted lanes can
> substitute for the original scalar ones (uses are not after the vector def).
> 
> Fixing the above conservativeness without further changes results in the
> following
> 
> gcc.target/i386/pr116979.c:15:10: note: Cost model analysis:
> _21 1 times scalar_stmt costs 12 in body
> _22 1 times scalar_stmt costs 12 in body
> _21 = PHI <_16(5), _19(3)> 1 times scalar_stmt costs 12 in body
> _22 = PHI <_17(5), _20(3)> 1 times scalar_stmt costs 12 in body
> REALPART_EXPR f> 1 times vec_perm costs 4 in body
> REALPART_EXPR f> 1 times unaligned_load (misalign -1) costs 12 in body
> REALPART_EXPR f> 1 times unaligned_load (misalign -1) costs 12 in body
> IMAGPART_EXPR f> 1 times vec_perm costs 4 in body
> IMAGPART_EXPR f> 1 times unaligned_load (misalign -1) costs 12 in body
> IMAGPART_EXPR f> 1 times vec_perm costs 4 in body
> IMAGPART_EXPR f> 1 times unaligned_load (misalign -1) costs 12 in body
> _9 * _11 1 times vector_stmt costs 16 in body
> .VEC_FMADDSUB (_8, _10, _13) 1 times vector_stmt costs 12 in body
> _21 = PHI <_16(5), _19(3)> 1 times vector_stmt costs 12 in body
> node 0x4dc9d00 1 times vec_construct costs 4 in prologue
> _21 1 times vector_store costs 12 in body
> _12 - _13 1 times vec_to_scalar costs 4 in epilogue
> _14 + _15 1 times vec_to_scalar costs 4 in epilogue
> REALPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
> REALPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
> REALPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
> IMAGPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
> IMAGPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
> IMAGPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
> IMAGPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
> REALPART_EXPR f> 1 times vec_to_scalar costs 4 in epilogue
> gcc.target/i386/pr116979.c:15:10: note: Cost model analysis for part in loop 0:
>   Vector cost: 156
>   Scalar cost: 48
> 
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 9ad95104ec7..e7ef943d743 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -8740,6 +8740,11 @@ next_lane:
>   if ((*life)[i])
> continue;
> }
> +  else
> +   {
> + (*life)[i] = true;
> + continue;
> +   }
>  
>/* Count scalar stmts only once.  */
>if (gimple_visited_p (orig_stmt))

Hmm, no - it seems to work as intended (and we're not that conservative
anymore).  Instead the issue seems to be we're not finding all scalar
stmts to cost because we only consider SLP_TREE_SCALAR_STMTS.  We consider
intermediate pattern covered scalar stmts for finding whether any of those
are still live after vectorization (and not covered by live lane extraction)
but do not consider them for scalar costing.  We have corresponding code
in vect_bb_slp_mark_live_stmts (vec_slp_has_scalar_use) that does this
for the purpose of STMT_VINFO_LIVE_P computation, but costing lacks

[Bug tree-optimization/118028] A better vectorized reduction across multi-level loop-nest

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118028

Richard Biener  changed:

   What|Removed |Added

Version|unknown |15.0

--- Comment #2 from Richard Biener  ---
We're usually handling this as a double-reduction, aka vectorizing the outer
loop.  But

t.c:6:5: note:   Analyze phi: sum_20 = PHI 
t.c:6:5: note:   reduction path: sum_14 sum_20 
t.c:6:5: note:   reduction: detected reduction
t.c:6:5: note:   Detected vectorizable nested cycle.

and

t.c:6:5: note:   grouped access in outer loop.
t.c:6:5: missed:   not vectorized: complicated access pattern.
t.c:10:21: missed:   not vectorized: complicated access pattern.
t.c:6:5: missed:  bad data access.

so there's work to be done to enable outer loop vectorization for this.

[Bug tree-optimization/118032] Bootstrap slowdown for risc-v

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118032

Richard Biener  changed:

   What|Removed |Added

Version|unknown |15.0
   Keywords||build
 Target||riscv

--- Comment #2 from Richard Biener  ---
Memory use? (aka swap)

[Bug modula2/118045] [15 Regression] libm2iso.so.20.0.0 contains an unresolvable reference to symbol casin (and 171 other similar warnings)

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118045

Richard Biener  changed:

   What|Removed |Added

   Keywords||build
   Target Milestone|--- |15.0

[Bug tree-optimization/118046] [15 Regression] wrong code at -O{2, 3} on x86_64-linux-gnu since r15-6173-ge8febb641415fd

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118046

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 117874, which changed state.

Bug 117874 Summary: [15 Regression] 17% regression for 433.milc on Zen4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117874

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/117874] [15 Regression] 17% regression for 433.milc on Zen4

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117874

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from Richard Biener  ---
Confirmed fixed.

[Bug rtl-optimization/117095] [13/14/15 Regression] Wrong code since r13-5103-g7c9f20fcfdc2d8

2024-12-15 Thread stefansf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117095

--- Comment #8 from Stefan Schulze Frielinghaus  ---
Bootstrap and regtest are successful on s390.

[Bug tree-optimization/117979] [12/13/14/15 Regression] ICE on x86_64-linux-gnu: in verify_loop_structure, at cfgloop.cc:1742 at -Os and above and returns twice since r12-5301

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117979

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
I will have a look.

[Bug middle-end/118012] [avr] Expensive code (bit extract + extend + neg + and) instead of bit test

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118012

--- Comment #13 from Richard Biener  ---
I think Georg-Johann is right that the patterns were introduced for
optimization and not canonicalization.  Yes, every pattern is also
canonicalization, but the patterns transforming COND_EXPRs to straight-line
code almost exclusively rely on the full expression being either visible in
GENERIC already or for phiopt helping.

For those involving operations where no generic inline expansion fallback is
present (or where it might be more expensive), a simple way would be to
avoid generating > word_mode operations that could be "expensive" (Georg-Johann
mentioned multiplication, but negation might also be a candidate; bit
operations might be fine).

On RTL we do target costing for if-conversion sequences; PHI-OPT doesn't
reject any simplified sequence late, but we're also lacking an easy way
to cost "before" vs. "after" (genmatch does not get you a set of all stmts
involved in the "before" match, and the "after" result might contain
references into the "before" sequence).

It might be sensible to have a target hook for GIMPLE if-conversion that we
could invoke to reject the replacement sequence, or alternatively a hook
that tells us "operation expensive" for a tree-code and mode combination.
I could imagine some cases involving __int128_t might be worse on x86-64 as
well.

Note this wouldn't fix the patterns triggering on GENERIC, and I'd not
invoke the hook from the pattern itself.

For specific cases improving RTL expansion is also possible - the argument
would be that of course the user might have written the "bad" code in the
first place and if more profitable, a branchy form should be used for code
generation.
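
To make the two shapes concrete, a minimal sketch in C. The function names and the exact source form are assumptions for illustration, not the test case from this PR. The first function is the branchy form a user might write, which can be implemented with a single bit test; the second is the straight-line sequence named in the bug title (bit extract, extend, negate, and), which is costly when the result is wider than a machine word, as on AVR.

```c
#include <assert.h>
#include <stdint.h>

/* Branchy form: one bit test decides between value and 0. */
uint32_t select_branchy(uint8_t flags, unsigned bit, uint32_t value)
{
    assert(bit < 8);
    return (flags & (1u << bit)) ? value : 0;
}

/* Straight-line form: extract the bit, widen it, negate it into an
   all-ones or all-zeros mask, then AND with the value. */
uint32_t select_branchless(uint8_t flags, unsigned bit, uint32_t value)
{
    assert(bit < 8);
    uint32_t b = (uint32_t)(flags >> bit) & 1u;  /* bit extract + extend */
    uint32_t mask = (uint32_t)0 - b;             /* neg: 0 or 0xFFFFFFFF */
    return value & mask;                         /* and */
}
```

Both functions return the same result for every input; the question in this PR is only which form is cheaper on a given target.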

[Bug target/116979] [12/13/14/15 regression] fma not always used in complex product

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116979

--- Comment #19 from Richard Biener  ---
(In reply to Uroš Bizjak from comment #18)
> (In reply to Jakub Jelinek from comment #17)
> > Not marking as fixed for GCC 15 yet, as there is the scalar cost computation
> > issue unresolved.
> There is also an issue with how the final result for SFmode is constructed.
> It can be seen when compiled with -ffast-math:
> 
> vmovshdup   %xmm0, %xmm4
> vmovss  %xmm0, -8(%rsp)
> vmovss  %xmm4, -4(%rsp)
> vmovq   -8(%rsp), %xmm0
> 
> This will result in store forwarding stall.

That looks like a backend issue?  A missed vec_init pattern?

[Bug target/116979] [12/13/14/15 regression] fma not always used in complex product

2024-12-15 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116979

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #20 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #17)
> Not marking as fixed for GCC 15 yet, as there is the scalar cost computation
> issue unresolved.

I will look into this part.

[Bug fortran/117897] [13/14/15 Regression] Bug in gfortran compiled windows run time with the latest release (14.2.0)

2024-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117897

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:a87bf1d20a37bb69c9fa6d2211ffd963aa69240d

commit r15-6260-ga87bf1d20a37bb69c9fa6d2211ffd963aa69240d
Author: Paul Thomas 
Date:   Sun Dec 15 10:42:34 2024 +

Fortran: Pointer fcn results must not be finalized [PR117897]

2024-12-15  Paul Thomas  

gcc/fortran
PR fortran/117897
* trans-expr.cc (gfc_trans_assignment_1): RHS pointer function
results must not be finalized.

gcc/testsuite/
PR fortran/117897
* gfortran.dg/finalize_59.f90: New test.

[Bug fortran/117897] [13/14 Regression] Bug in gfortran compiled windows run time with the latest release (14.2.0)

2024-12-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117897

Paul Thomas  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[13/14/15 Regression] Bug   |[13/14 Regression] Bug in
                   |in gfortran compiled        |gfortran compiled windows
                   |windows run time  with the  |run time  with the latest
                   |latest release (14.2.0)     |release (14.2.0)

--- Comment #6 from Paul Thomas  ---
The bug was so absolutely wrong and the fix simple and non-invasive enough that
I have pushed the fix without review. I will inform the list later on today.

Paul

[Bug c/26154] [12/13/14/15 Regression] OpenMP extensions to the C language is not documented or documented in the wrong spot

2024-12-15 Thread burnus at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26154

--- Comment #38 from Tobias Burnus  ---
Regarding gfortran, besides generic manual updates there, I wonder whether
https://gcc.gnu.org/onlinedocs/gfortran/OpenMP-Modules-OMP_005fLIB-and-OMP_005fLIB_005fKINDS.html
should be moved to libgomp.texi, documenting the named constants for
Fortran there and adding the C/C++ data types/enums as well.

[Bug libstdc++/60621] std::vector::emplace_back generates massively more code than push_back

2024-12-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60621

Jan Hubicka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |mjambor at suse dot cz
   Last reconfirmed|                            |2024-12-15
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960

--- Comment #9 from Jan Hubicka  ---
With recent changes to std::string (including not yet reviewed
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671599.html)
and std::vector we now get:

jh@ryzen3:~> ~/trunk-install2/bin/g++ -O2 -std=c++11 empl2.C  ; size   a.out
   text    data     bss     dec     hex filename
   3792     656       8    4456    1168 a.out
jh@ryzen3:~> ~/trunk-install2/bin/g++ -O2 -std=c++11 empl2.C -DEMPLACE_BACK ; size   a.out
   text    data     bss     dec     hex filename
   5095     680       8    5783    1697 a.out

Note that the text size also includes EH tables. With emplace_back we now get:

int main ()
{
  void * D.46214;
  struct S * const vs$8;
  struct S * const vs$D40641$_M_impl$D39953$_M_start;
  ptrdiff_t __dif;
  const char * c;
  const char * b;
  const char * a;
  struct vector vs;
  int _7;
  long int _15;
  void * _66;

   [local count: 1073741824]:
  MEM[(struct _Vector_impl_data *)&vs] ={v} {CLOBBER(bob)};
  MEM[(struct _Vector_impl_data *)&vs]._M_end_of_storage = 0B;
  a = "a";
  b = "b";
  c = "c";
  MEM  [(struct vector *)&vs] = { 0, 0 };
  std::vector::_M_realloc_append
(&vs, &a, &b, &c);

   [local count: 1073741824]:
  vs$D40641$_M_impl$D39953$_M_start_143 = MEM  [(struct
vector *)&vs];
  vs$8_144 = MEM  [(struct vector *)&vs + 8B];
  _15 = vs$8_144 - vs$D40641$_M_impl$D39953$_M_start_143;
  __dif_16 = _15 /[ex] 96;
  _7 = (int) __dif_16;
  std::vector::~vector (&vs);
  vs ={v} {CLOBBER(eos)};
  a ={v} {CLOBBER(eos)};
  b ={v} {CLOBBER(eos)};
  c ={v} {CLOBBER(eos)};
  return _7;

   [count: 0]:
:
  std::vector::~vector (&vs);
  _66 = __builtin_eh_pointer (2);
  __builtin_unwind_resume (_66);

}

So _M_realloc_append is offlined and quite large, since it constructs strings and
we do not know that the strings fit into the local buffer.

Without emplace back everything gets inlined.  The main difference is that here
the construction happens in main().

Now inlining is limited since we know that main is called once.  Modifying the
testcase:

jh@ryzen3:~> cat empl2.C
#include <string>
#include <vector>

struct S {
#ifdef USE_CHAR
S(const char*a, const char*b, const char*c)
#else
S(std::string const&a, std::string const&b, std::string const &c)
#endif
: a(a), b(b), c(c) {}
std::string a, b, c;
};

int main2() {
std::vector<S> vs;
#ifdef USE_STRING
std::string a("a"),b("b"),c("c");
#else
char const* a = "a", *b = "b", *c = "c";
#endif
#ifdef EMPLACE_BACK
vs.emplace_back(a, b, c);
#elif defined(EMPLACE_BACK_NOTHROW)
vs.emplace_back(std::string(a), std::string(b),
std::string(c));
#else
vs.push_back(S{a, b, c});
#endif
return vs.size();
}

int main()
{
return main2();
}

I get:
int main2 ()
{
   [local count: 1073741824]:
  return 1;

}   
int main ()
{
   [local count: 1073741824]:
  return 1;

}

which is as small as it can get :)
With emplace_back we only get everything inlined and optimized if --param
max-inline-insns-auto=160 is used. The default is 15 for -O2 and 30 for -O3.

Inline summary is:
IPA function summary for void std::vector<_Tp,
_Alloc>::_M_realloc_append(_Args&& ...) [with _Args = {const char*&, const
char*&, const char*&}; _Tp = S; _Alloc = std::allocator]/760 inlinable
  global time: 840.049387
  self size:   81
  global size: 309
  min size:   288
  self stack:  123
  global stack:123
  estimated growth:4
size:167.50, time:388.798296
size:3.00, time:2.00,  executed if:(not inlined)
size:3.00, time:3.00,  executed if:(op0 not sra candidate) && (not
inlined)
size:3.00, time:3.00,  executed if:(op0 not sra candidate)
size:0.50, time:0.50,  executed if:(op3 not sra candidate) && (not
inlined)
size:0.50, time:0.50,  executed if:(op3 not sra candidate)
size:0.50, time:0.50,  executed if:(op2 not sra candidate) && (not
inlined)
size:0.50, time:0.50,  executed if:(op2 not sra candidate)
size:0.50, time:0.50,  executed if:(op1 not sra candidate) && (not
inlined)
size:0.50, time:0.50,  executed if:(op1 not sra candidate)
size:0.50, time:0.50,  executed if:(op0 not sra candidate), 
nonconst if:(op0[ref offset: 64] changed) && (op0 not sra candidate)
size:0.50, time:0.50,  executed 

[Bug tree-optimization/86701] Optimize strlen called on std::string c_str()

2024-12-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86701

Jan Hubicka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #5 from Jan Hubicka  ---
I think this is a valid testcase showing that we can't optimize it:

#include <string>
#include <cstdio>
#include <cstring>
int
main()
{
std::string str="";
str.push_back (0);
printf ("%i %i\n", str.length(), strlen (str.c_str()));
return 0;
}

We could perhaps infer the value range of strlen to be at most
str.end()-str.begin().
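
As a small illustration of why only a bound can be inferred, the sketch below rebuilds the string from the testcase above and checks the inequality. The helper names `make_embedded_nul` and `strlen_bounded_by_length` are made up for this example.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Builds the string from the testcase: empty, then one embedded NUL.
std::string make_embedded_nul()
{
  std::string str = "";
  str.push_back(0);
  return str;
}

// True when strlen of the C string lies within [0, s.length()].
bool strlen_bounded_by_length(const std::string &s)
{
  return std::strlen(s.c_str()) <= s.length();
}
```

For this string, length() is 1 while strlen stops at the embedded NUL, so the two cannot simply be treated as equal.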

[Bug fortran/110626] [13/14/15 regression] Duplicated finalization in derived

2024-12-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110626

Paul Thomas  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
                 CC|                            |pault at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-12-15

--- Comment #3 from Paul Thomas  ---
Hello Javier,

For some reason this bug slipped by me. I do apologise.

I can confirm that there is a bug, which comes about from the use of a
temporary for the defined assignment of the component. That of itself wouldn't
matter, if the temporary were assigned to at the right time. This is what the
intermediate code looks like:
  o1.inner.val = 15;
  _F.DA0 = o2;                          // Assignment temporary here
  {
    struct array00_b desc.15;

    desc.15.dtype = {.elem_len=4, .version=0, .rank=0, .type=5};
    desc.15.data = (void * restrict) &o2;
    desc.15.span = (integer(kind=8)) desc.15.dtype.elem_len;
    __final_testmod_B (&desc.15, 4, 0); // 'o2' finalized before assignment
  }
  o2 = o1;                              // assignment of 'o1' to 'o2'
  {                                     // temporary assign should be here
    struct __class_testmod_A_t class.16;

    class.16._vptr = (struct __vtype_testmod_A * {ref-all}) &__vtab_testmod_A;
    class.16._data = &_F.DA0.inner;
    copy (&class.16, &o1.inner);        // defined assignment of 'o1%inner'
  }                                     // generates second finalization.
  o2.inner = _F.DA0.inner;              // set the resulting 'o2%inner'

For (pre-)historical reasons the assignment is not being done as required by
the standard, i.e. component by component, but instead does only the
defined assignments by component.

I will think about this.

Paul

[Bug tree-optimization/118032] Bootstrap slowdown for risc-v

2024-12-15 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118032

--- Comment #7 from Andreas Schwab  ---
20240920:  2d 09:32:07
20240927:  2d 09:49:40
20241004:  2d 10:05:11
20241020:  2d 09:50:24
20241025:  2d 10:01:37
20241101:  2d 10:36:27
20241108:  2d 11:34:26
20241115:  2d 10:59:49
20241130:  2d 11:50:58
20241206:  2d 12:19:19
20241213:  2d 20:08:16
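
For a rough sense of scale, the figures above can be converted to seconds and compared. This sketch assumes each entry reads as days, hours, minutes and seconds of bootstrap wall time; the helper names are invented for the example.

```cpp
#include <cassert>

// Converts "Nd hh:mm:ss" components to seconds.
constexpr long to_seconds(long d, long h, long m, long s)
{
  return ((d * 24 + h) * 60 + m) * 60 + s;
}

// Relative slowdown of `after` against `before`, in percent.
double slowdown_percent(long before, long after)
{
  return 100.0 * (after - before) / before;
}
```

Under that reading, 20240920 comes to 207127 s and 20241213 to 245296 s, a slowdown of about 18.4% overall, with about 13% of it between 20241206 and 20241213.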

[Bug ada/118051] New: gnatprove indicates error

2024-12-15 Thread 00120260a at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118051

Bug ID: 118051
   Summary: gnatprove indicates error
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 00120260a at gmail dot com
CC: dkm at gcc dot gnu.org
  Target Milestone: ---

+===GNAT BUG DETECTED==+
| 1.0 (spark) GCC error:   |
| File /home/drm/Documents/Administration/Ada/books/sourcecode ~   |
| Building High Integrity Applications with
SPARK/chapter07/build/gnatprove/messages_wrapper__compute_fletcher_checksum.mlw,
marked by (* ERROR: *)(...):
This expression has type int, but is expected to have type bv.BV16.t.
   |
| No source file position information available|
| Compiling /home/drm/Documents/Administration/Ada/books/sourcecode ~ Building
High Integrity Applications with SPARK/chapter07/messages_wrapper.ads|
| Please submit a bug report; see https://gcc.gnu.org/bugs/ .  |
| Use a subject line meaningful to you and us to track the bug.|
| Include the entire contents of this bug box in the report.   |
| Include the exact command that you entered.  |
| Also include sources listed below.   |
+==+

The code follows. I do not know what causes it. It compiles fine, but
gnatprove produces the bug box above. I assume this is the place to bring that up.

pragma SPARK_Mode(On);
pragma Ada_2022;
with Interfaces.C;
package Messages_Wrapper is
   type Checksum_Type is mod 2**16;
   function Compute_Checksum (Data : in String) return Checksum_Type
 with Pre => Data'Length /= 0;
private
   use Interfaces.C;  -- needed for visibility in precondition
   function Compute_Fletcher_Checksum
   (Buffer : in char_array;
Size   : in size_t) return unsigned_short
  with
 Global=> null,
 Import=> True,
 Convention=> C,
 Pre   => Size = Buffer'Length and Buffer'Length in
unsigned_short,
 External_Name => "compute_fletcher_checksum";

   function Compute_Checksum (Data : in String) return Checksum_Type is
  ((declare Buffer: constant Interfaces.C.Char_array := Interfaces.C.To_C
(Item => Data, Append_Nul => False); begin
Checksum_Type(Compute_Fletcher_Checksum (Buffer, Buffer'Length))));
end Messages_Wrapper;

[Bug ada/118052] New: gnatproves bugs, nothing more indicated.

2024-12-15 Thread 00120260a at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118052

Bug ID: 118052
   Summary: gnatproves bugs, nothing more indicated.
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 00120260a at gmail dot com
CC: dkm at gcc dot gnu.org
  Target Milestone: ---

+===GNAT BUG DETECTED==+
| 1.0 (spark) GCC error:   |
| File /home/drm/Documents/Administration/Ada/books/sourcecode ~   |
| Building High Integrity Applications with
SPARK/chapter07/build/gnatprove/messages_wrapper__compute_fletcher_checksum.mlw,
marked by (* ERROR: *)(...):
This expression has type int, but is expected to have type bv.BV16.t.
   |
| No source file position information available|
| Compiling /home/drm/Documents/Administration/Ada/books/sourcecode ~ Building
High Integrity Applications with SPARK/chapter07/messages_wrapper.ads|
| Please submit a bug report; see https://gcc.gnu.org/bugs/ .  |
| Use a subject line meaningful to you and us to track the bug.|
| Include the entire contents of this bug box in the report.   |
| Include the exact command that you entered.  |
| Also include sources listed below.   |
+==+

The code follows. I do not know what causes it. It compiles fine, but
gnatprove produces the bug box above.

pragma SPARK_Mode(On);
pragma Ada_2022;
with Interfaces.C;
package Messages_Wrapper is
   type Checksum_Type is mod 2**16;
   function Compute_Checksum (Data : in String) return Checksum_Type
 with Pre => Data'Length /= 0;
private
   use Interfaces.C;  -- needed for visibility in precondition
   function Compute_Fletcher_Checksum
   (Buffer : in char_array;
Size   : in size_t) return unsigned_short
  with
 Global=> null,
 Import=> True,
 Convention=> C,
 Pre   => Size = Buffer'Length and Buffer'Length in
unsigned_short,
 External_Name => "compute_fletcher_checksum";

   function Compute_Checksum (Data : in String) return Checksum_Type is
  ((declare Buffer: constant Interfaces.C.Char_array := Interfaces.C.To_C
(Item => Data, Append_Nul => False); begin
Checksum_Type(Compute_Fletcher_Checksum (Buffer, Buffer'Length))));
end Messages_Wrapper;

[Bug c/117178] -Wunterminated-string-initialization should ignore trailing NUL byte for nonstring char arrays

2024-12-15 Thread alx at kernel dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117178

--- Comment #20 from Alejandro Colomar  ---
Hi Kees,

(In reply to Kees Cook from comment #19)
> Created attachment 59874 [details]
> RFC for ignoring NUL byte with nonstring attribute
> 
> Here's an RFC patch for allowing the NUL char truncation when the nonstring
> attribute is present.

I would s/exactly truncates/truncates/ in the diagnostic message.

Cheers,
Alex

[Bug target/118018] FAIL: gcc.c-torture/execute/nestfunc-5.c -O1 execution test

2024-12-15 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118018

John David Anglin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from John David Anglin  ---
Fixed on trunk.

[Bug other/118050] New: [15 Regression] timevar.cc:163:18: error: 'CLOCK_MONOTONIC' was not declared in this scope

2024-12-15 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118050

Bug ID: 118050
   Summary: [15 Regression] timevar.cc:163:18: error:
'CLOCK_MONOTONIC' was not declared in this scope
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: danglin at gcc dot gnu.org
  Target Milestone: ---
  Host: hppa64-hp-hpux11.11
Target: hppa64-hp-hpux11.11
 Build: hppa64-hp-hpux11.11

g++ -std=c++14 -fno-PIE -c -g -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-error=narrowing -Wwrite-strings
-Wcast-qual -Wno-format -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -DHAVE_CONFIG_H -fno-PIE -I. -I. -I../../gcc/gcc
-I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include
-I../../gcc/gcc/../libcody -I/opt/gnu64/gcc/gmp/include
-I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/dpd
-I../libdecnumber -I../../gcc/gcc/../libbacktrace -I/opt/gnu64/gcc/gmp/include
-o timevar.o -MT timevar.o -MMD -MP -MF ./.deps/timevar.TPo
../../gcc/gcc/timevar.cc
../../gcc/gcc/timevar.cc: In function 'void get_time(timevar_time_def*)':
../../gcc/gcc/timevar.cc:163:18: error: 'CLOCK_MONOTONIC' was not declared in this scope
  163 |   clock_gettime (CLOCK_MONOTONIC, &ts);
      |                  ^~~~~~~~~~~~~~~
make[3]: *** [Makefile:1208: timevar.o] Error 1

CLOCK_MONOTONIC is not supported.  Only CLOCK_REALTIME is supported on
hpux11.11.

This occurs in following code:

#ifdef HAVE_CLOCK_GETTIME
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  now->wall = ts.tv_sec * 10 + ts.tv_nsec;
  return;
#define HAVE_WALL_TIME 1
#endif

I proposed the following:

diff --git a/gcc/timevar.cc b/gcc/timevar.cc
index 48d0c72cbdf..21fd65d2f89 100644
--- a/gcc/timevar.cc
+++ b/gcc/timevar.cc
@@ -160,7 +160,11 @@ get_time (struct timevar_time_def *now)

 #ifdef HAVE_CLOCK_GETTIME
   struct timespec ts;
+#if _POSIX_TIMERS > 0 && defined(_POSIX_MONOTONIC_CLOCK)
   clock_gettime (CLOCK_MONOTONIC, &ts);
+#else
+  clock_gettime (CLOCK_REALTIME, &ts);
+#endif
   now->wall = ts.tv_sec * 10 + ts.tv_nsec;
   return;
 #define HAVE_WALL_TIME 1

Another possibility is a configure check.
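
For reference, the same fallback idea as a standalone sketch outside of timevar.cc. The function name `wall_time_ns` and the nanosecond scaling are assumptions made for this example; only the POSIX feature-test macros and clock_gettime itself are taken as given.

```cpp
#include <cassert>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

// Returns wall time in nanoseconds, preferring the monotonic clock when the
// platform advertises it and falling back to CLOCK_REALTIME otherwise.
// Returns 0 if clock_gettime itself fails.
uint64_t wall_time_ns()
{
  struct timespec ts;
#if defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0 \
    && defined(_POSIX_MONOTONIC_CLOCK)
  const clockid_t id = CLOCK_MONOTONIC;
#else
  const clockid_t id = CLOCK_REALTIME;
#endif
  if (clock_gettime(id, &ts) != 0)
    return 0;
  // Assumption: the caller wants nanoseconds, i.e. seconds scaled by 1e9.
  return (uint64_t) ts.tv_sec * UINT64_C(1000000000) + (uint64_t) ts.tv_nsec;
}
```

A configure check would move the decision to build time and could also catch hosts where the macro is defined but the clock is unavailable at run time.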

[Bug libstdc++/109941] [feat req] Add an option to mark STL types as nodiscard

2024-12-15 Thread arthur.j.odwyer at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109941

Arthur O'Dwyer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |arthur.j.odwyer at gmail dot com

--- Comment #5 from Arthur O'Dwyer  ---
https://quuxplusone.github.io/blog/2024/12/08/should-expected-be-nodiscard/
https://github.com/microsoft/STL/pull/5174

As of this past Friday, Microsoft STL's std::expected is finally marked
nodiscard.
LLVM/Clang's own `llvm::Expected` type has been marked nodiscard since 2016.
I encourage libstdc++ to follow suit, for `expected` specifically.

(Personally I'm less sure that nodiscard belongs at all on `error_code` or
`errc`; Microsoft doesn't mark those yet. And on the exception hierarchy, which
Microsoft also marked last Friday, I think it's harmless and in keeping with
Microsoft's aggressive marking strategy, but perhaps not super helpful either.)
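
To show what class-level marking buys, here is a minimal sketch with a made-up type. `ParseResult` and `parse_digit` are hypothetical stand-ins, not `std::expected` or anything from libstdc++.

```cpp
#include <cassert>

// Hypothetical result type, marked nodiscard at class level so that every
// function returning it by value inherits the diagnostic.
struct [[nodiscard]] ParseResult
{
  bool ok;
  int value;
};

// Parses one decimal digit; returns ok == false for anything else.
ParseResult parse_digit(char c)
{
  if (c >= '0' && c <= '9')
    return ParseResult{true, c - '0'};
  return ParseResult{false, 0};
}
```

With this, a bare `parse_digit('7');` statement is expected to draw a warning, while `ParseResult r = parse_digit('7');` stays quiet.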

[Bug target/118017] [15 Regression] ICE: maximum number of generated reload insns per insn achieved (90) with -Og -frounding-math -mno-80387 -mno-mmx

2024-12-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118017

Uroš Bizjak  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #5 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #4)

> Please note that TImode and TDmode are tieable on x86_64 targets, so LRA
> should simply consider all registers as TImode:
I can't find anything wrong or missing in the target-dependent files; let's ask
an RA expert.

[Bug libstdc++/109941] [feat req] Add an option to mark STL types as nodiscard

2024-12-15 Thread roi.jacobson1 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109941

--- Comment #6 from Roy Jacobson  ---
Also worth mentioning that clang-tidy diagnoses this under
bugprone-unused-return-value since May 23.

[Bug c++/118053] New: [14/15 Regression] Only -Ox -std=c++2x internal compiler error: in cxx_eval_indirect_ref, at cp/constexpr.cc:5954

2024-12-15 Thread yjwoo14 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118053

Bug ID: 118053
   Summary: [14/15 Regression] Only -Ox -std=c++2x internal
compiler error: in cxx_eval_indirect_ref, at
cp/constexpr.cc:5954
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yjwoo14 at gmail dot com
  Target Milestone: ---

Reproducer: https://godbolt.org/z/3WMo4PPMP

The ICE occurs only with an -Ox option together with -std=c++20 (or c++23) on
gcc-14 or gcc-15.


```cpp
#include <iostream>
#include <vector>

template <typename Funct> void run(Funct funct = Funct()) { funct(1); }

std::vector<int> runner() {
  try {
    std::vector<int> vec = {1};
    run([&](const auto &num) { vec.back() = num; });
    return vec;
  } catch (...) {
  }
  return {};
}

int main() {
auto vec = runner();
std::cout << vec.back() << std::endl;
}

```

[Bug c++/97094] Compiling big std::unordered_map became slower

2024-12-15 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97094

Jan Hubicka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-12-15
             Status|UNCONFIRMED                 |NEW
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #4 from Jan Hubicka  ---
We seem to spend most of the time sorting PHI edges:

#0  0x0231f835 in mergesort (in=,
c=0x7fffd530, n=24, out=0x14a29790 "\236\204", tmp=) at
../../gcc/sort.cc:230
#1  0x0231f7be in mergesort (in=0x14a29688 "\277\204",
c=0x7fffd530, n=47, out=0x14a29688 "\277\204", tmp=) at
../../gcc/sort.cc:210
#2  0x0231f7d6 in mergesort (in=0x14a29688 "\277\204",
c=0x7fffd530, n=94, out=0x14a2a250 "\212\202", tmp=) at
../../gcc/sort.cc:212
#3  0x0231f7be in mergesort (in=0x14a29398 "\035\205",
c=0x7fffd530, n=188, out=0x14a29f60 "\004\204", tmp=) at
../../gcc/sort.cc:210
#4  0x0231f7d6 in mergesort (in=0x14a29398 "\035\205",
c=0x7fffd530, n=377, out=0x14a29398 "\035\205", tmp=) at
../../gcc/sort.cc:212
#5  0x0231f7be in mergesort (in=0x14a287d0 "\226\206",
c=0x7fffd530, n=754, out=0x14a287d0 "\226\206", tmp=) at
../../gcc/sort.cc:210
#6  0x0231f7d6 in mergesort (in=0x14a287d0 "\226\206",
c=0x7fffd530, n=1508, out=0x4710f50 "\316z", tmp=) at
../../gcc/sort.cc:212
#7  0x0231f7be in mergesort (in=0x14a258b0 "z\214",
c=0x7fffd530, n=3016, out=0x470e030 "\352t", tmp=) at
../../gcc/sort.cc:210
#8  0x0231f7be in mergesort (in=0x14a1fa70 "B\230",
c=0x7fffd530, n=6032, out=0x47081f0 "\353t", tmp=) at
../../gcc/sort.cc:210
#9  0x0231f7d6 in mergesort (in=0x14a1fa70 "B\230",
c=0x7fffd530, n=12065, out=0x14a1fa70 "B\230", tmp=) at
../../gcc/sort.cc:212
#10 0x0231f7be in mergesort (in=0x14a08168 , c=0x7fffd530, n=24130, out=0x14a08168 , tmp=)
at ../../gcc/sort.cc:210
#11 0x0231f7be in mergesort (in=0x149d8f60 "\001",
c=0x7fffd530, n=48259, out=0x149d8f60 "\001", tmp=) at
../../gcc/sort.cc:210
#12 0x0231fc00 in gcc_qsort (vbase=0x149d8f60, n=48259, size=, cmp=) at ../../gcc/sort.cc:268
#13 0x0103ceb0 in prune_unused_phi_nodes (phis=0x13984d28,
kills=, uses=0x5727910) at ../../gcc/tree-into-ssa.cc:819
#14 insert_phi_nodes_for (var=var@entry=0x7fffd1c2e6c0,
phi_insertion_points=phi_insertion_points@entry=0x13984d28,
update_p=update_p@entry=true) at ../../gcc/tree-into-ssa.cc:975
#15 0x0103d901 in insert_updated_phi_nodes_for (var=0x7fffd1c2e6c0,
dfs=dfs@entry=0x15eef880, update_flags=update_flags@entry=2048) at
../../gcc/tree-into-ssa.cc:3286
#16 0x0103e114 in update_ssa (update_flags=update_flags@entry=2048) at
../../gcc/tree-into-ssa.cc:3558
#17 0x0120feca in execute_update_addresses_taken () at
../../gcc/tree-ssa.cc:2292
#18 0x00e9d5bd in execute_function_todo (fn=0x73b45600,
data=) at ../../gcc/passes.cc:2079
#19 0x00e9dba8 in execute_todo (flags=2132064) at
../../gcc/passes.cc:2155

It has 48259 parameters.

Later a lot of time goes to
#0  0x00a7f306 in alloc_page (order=18) at ../../gcc/ggc-page.cc:786
#1  ggc_internal_alloc (size=size@entry=262136, f=f@entry=0x0, s=s@entry=0,
n=n@entry=1) at ../../gcc/ggc-page.cc:1295
#2  0x010775c3 in ggc_internal_alloc (s=262136) at ../../gcc/ggc.h:136
#3  allocate_phi_node (len=) at ../../gcc/tree-phinodes.cc:119
#4  resize_phi_node (phi=, len=) at
../../gcc/tree-phinodes.cc:258
#5  reserve_phi_args_for_new_edge (bb=) at
../../gcc/tree-phinodes.cc:295
#6  0x01011d13 in cleanup_empty_eh_merge_phis (new_bb=0x7fffe6546780,
old_bb=old_bb@entry=0x7fffe6546720, old_bb_out=old_bb_out@entry=0x7fffe3b7cb10, 
change_region=change_region@entry=true) at ../../gcc/tree-eh.cc:4507
#7  0x01012124 in cleanup_empty_eh (lp=0x7fffeb2a0050) at
../../gcc/tree-eh.cc:4756

/* Do a post-order traversal of the EH region tree.  Examine each
   post_landing_pad block and see if we can eliminate it as empty.  */

static bool
cleanup_all_empty_eh (void)
{ 
  bool changed = false;
  eh_landing_pad lp;
  int i;

  /* The post-order traversal may lead to quadraticness in the redirection
 of incoming EH edges from inner LPs, so first try to walk the region
 tree from inner to outer LPs in order to eliminate these edges.  */
  for (i = vec_safe_length (cfun->eh->lp_array) - 1; i >= 1; --i)
{ 
  lp = (*cfun->eh->lp_array)[i];
  if (lp)
changed |= cleanup_empty_eh (lp);
}

  /* Now do the post-order traversal to eliminate outer empty LPs.  */
  for (i = 1; vec_safe_iterate (cfun->eh->lp_array, i, &lp); ++i)
if (lp)
  changed |= cleanup_empty_eh (lp);

  return changed;
}

So it seems that some logic to kill the quadraticness is already present, but
not good enough.

[Bug fortran/117643] F_C_STRING from F23 is missing

2024-12-15 Thread kargls at comcast dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117643

kargls at comcast dot net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #59830|0                           |1
        is obsolete|                            |

--- Comment #5 from kargls at comcast dot net ---
Created attachment 59873
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59873&action=edit
New diff

The attached diff is a WIP and, unfortunately, where I will stop.  I cannot
resolve the issue of an uninitialized variable.  Consider this simple
program:

   program foo

   use iso_c_binding, only : c_null_char, c_char, f_c_string

   implicit none

   logical, volatile :: asis
   integer i
   character(len=6, kind=c_char), volatile :: s1
   character(len=:, kind=c_char), allocatable :: ss2

   ss2 = f_c_string(s1, .true.)
   i = len_trim(s1) + 1
   if (len(ss2) /= i) stop 3

   end program foo

with the attached patch I get

gfcx -o z -fdump-tree-original -fno-realloc-lhs f_c_string.f90

(note -fno-realloc-lhs simply isolates the problems)

character(kind=1) str.0[7]; // <-- This 7 is tlen.1
integer(kind=8) tlen.1; // <-- obviously, tlen.1
character(kind=1) * tstr.2;
character(kind=1) * pstr.3;
void * restrict D.4787;
integer(kind=8) D.4788;
integer(kind=8) D.4789;
void * D.4790;
void * D.4791;

D.4788 = tlen.1 + 1;  // <-- Whoops, tlen.1 is unset!


gfcx -o z -fdump-tree-original f_c_string.f90
(this shows the real issue and the allocation of 64T of memory!)

character(kind=1) str.0[7]; // <-- This 7 is tlen.1
integer(kind=8) tlen.1; // <-- obviously, tlen.1
character(kind=1) * tstr.2;
character(kind=1) * pstr.3;
void * restrict D.4787;
integer(kind=8) D.4788;
integer(kind=8) D.4789;
void * D.4790;
void * D.4791;
//
// The following is reallocation on assignment.  Notice tlen.1 is unset!
//
if (ss2 != 0B) goto L.1;
ss2 = (character(kind=1)[1:.ss2] *)
   __builtin_malloc (MAX_EXPR <(sizetype) (tlen.1 + 1), 1>);
goto L.2;
L.1:;
if (tlen.1 + 1 == .ss2) goto L.2;
ss2 = (character(kind=1)[1:.ss2] *)
   __builtin_realloc ((void *) ss2, MAX_EXPR <(sizetype) (tlen.1 + 1), 1>);
L.2:;
.ss2 = tlen.1 + 1;


If someone smarter than I wants to pick up the pieces, I would much appreciate
it.

[Bug middle-end/118050] [15 Regression] timevar.cc:163:18: error: 'CLOCK_MONOTONIC' was not declared in this scope

2024-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118050

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |build
   Last reconfirmed|                            |2024-12-15
          Component|other                       |middle-end
           Severity|normal                      |blocker
   Target Milestone|---                         |15.0

--- Comment #1 from Andrew Pinski  ---
.

[Bug c++/98935] [coroutines] co_await on statement expressions causes ICE

2024-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98935

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Iain D Sandoe :

https://gcc.gnu.org/g:bd8c7e71f516bae29a5a9e517b266141458f3977

commit r15-6263-gbd8c7e71f516bae29a5a9e517b266141458f3977
Author: Iain Sandoe 
Date:   Fri Nov 1 23:30:58 2024 +

c++, coroutines:Ensure bind exprs are visited once [PR98935].

Recent changes in the C++ FE and the coroutines implementation have
exposed a latent issue in which a bind expression containing a var
that we need to capture in the coroutine state gets visited twice.
This causes an ICE (from a checking assert).  Fixed by adding a pset
to the relevant tree walk.  Exit the callback early when the tree is
not a BIND_EXPR.

PR c++/98935

gcc/cp/ChangeLog:

* coroutines.cc (register_local_var_uses): Add a pset to the
tree walk to avoid visiting the same BIND_EXPR twice.  Make
an early exit for cases that the callback does not apply.
(cp_coroutine_transform::apply_transforms): Add a pset to the
tree walk for register_local_vars.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr98935.C: New test.

Signed-off-by: Iain Sandoe 
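
The general technique described in the commit message, a walk that carries a pointer set so that shared subtrees are visited once, can be sketched in isolation as follows. `Node` and `walk_once` are hypothetical and unrelated to the actual walk_tree callback or the pset type used in coroutines.cc.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical tree node; children may be shared between parents.
struct Node
{
  std::vector<Node *> kids;
  int visits = 0;
};

// Pre-order walk that skips nodes already present in `seen`.
void walk_once(Node *n, std::unordered_set<Node *> &seen)
{
  if (!n || !seen.insert(n).second)
    return;
  ++n->visits;
  for (Node *k : n->kids)
    walk_once(k, seen);
}
```

Without the set, a node reachable through two parents would be processed twice, which is the kind of double visit the checking assert caught.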

[Bug c/117178] -Wunterminated-string-initialization should ignore trailing NUL byte for nonstring char arrays

2024-12-15 Thread kees at outflux dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117178

--- Comment #19 from Kees Cook  ---
Created attachment 59874
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59874&action=edit
RFC for ignoring NUL byte with nonstring attribute

Here's an RFC patch for allowing the NUL char truncation when the nonstring
attribute is present. I haven't added (or updated) test cases yet. Does this
look like the right direction? (It also respects c++-compat.) Initial testing
with the Linux kernel looks good.

[Bug target/118017] [15 Regression] ICE: maximum number of generated reload insns per insn achieved (90) with -Og -frounding-math -mno-80387 -mno-mmx

2024-12-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118017

--- Comment #3 from Uroš Bizjak  ---
It looks to me that reload is trying to handle the following sequence from
_.322r.ira dump:

(insn 32 31 35 2 (set (subreg:TI (reg:TD 99 [ _2 ]) 0)
(reg:TI 20 xmm0)) "pr118017.c":14:24 94 {*movti_internal}
 (expr_list:REG_DEAD (reg:TI 20 xmm0)
(nil)))
...

(insn 41 40 42 2 (set (mem/c:TI (plus:DI (reg/f:DI 19 frame)
(const_int -240 [0xff10])) [0  S16 A128])
(subreg:TI (reg:TD 99 [ _2 ]) 0)) "pr118017.c":14:20 94
{*movti_internal}
 (nil))

Choking in LRA with the following _.323r.reload emergency dump:

(insn 32 31 301 2 (set (reg:TI 281)
(reg:TI 20 xmm0)) "pr118017.c":14:24 94 {*movti_internal}
 (expr_list:REG_DEAD (reg:TI 20 xmm0)
(nil)))
(insn 301 32 279 2 (set (subreg:TI (reg:TD 267 [orig:99 _2 ] [99]) 0)
(reg:TI 281)) "pr118017.c":14:24 94 {*movti_internal}
 (nil))
(insn 279 301 35 2 (set (subreg:TI (reg:TD 99 [ _2 ]) 0)
(subreg:TI (reg:TD 267 [orig:99 _2 ] [99]) 0)) "pr118017.c":14:24 94
{*movti_internal}
 (expr_list:REG_UNUSED (reg:TD 99 [ _2 ])
(nil)))

...

(insn 275 40 302 2 (set (reg:TI 282)
(subreg:TI (reg:TD 267 [orig:99 _2 ] [99]) 0)) "pr118017.c":14:20 94
{*movti_internal}
 (expr_list:REG_DEAD (reg:TD 267 [orig:99 _2 ] [99])
(nil)))
(insn 302 275 303 2 (set (reg:TI 283)
(reg:TI 282)) "pr118017.c":14:20 94 {*movti_internal}
 (nil))
(insn 303 302 304 2 (set (reg:TI 284)
(reg:TI 283)) "pr118017.c":14:20 94 {*movti_internal}
 (nil))

...

(insn 392 391 225 2 (set (subreg:TI (reg:TD 264 [orig:99 _2 ] [99]) 0)
(reg:TI 372)) "pr118017.c":14:20 94 {*movti_internal}
 (nil))
(insn 225 392 300 2 (set (reg:TI 280)
(subreg:TI (reg:TD 264 [orig:99 _2 ] [99]) 0)) "pr118017.c":14:20 94
{*movti_internal}
 (nil))
(insn 300 225 41 2 (set (reg:TI 221)
(reg:TI 280)) "pr118017.c":14:20 94 {*movti_internal}
 (nil))
(insn 41 300 274 2 (set (mem/c:TI (plus:DI (reg/f:DI 19 frame)
(const_int -288 [0xfee0])) [0  S16 A128])
(reg:TI 221)) "pr118017.c":14:20 94 {*movti_internal}
 (expr_list:REG_DEAD (reg:TI 221)
(nil)))

Please note the sequence of instructions from (insn 302) to (insn 392), which
kills the compilation by hitting the maximum of 90 reload insns per insn.

[Bug target/118017] [15 Regression] ICE: maximum number of generated reload insns per insn achieved (90) with -Og -frounding-math -mno-80387 -mno-mmx

2024-12-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118017

--- Comment #4 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #3)
> It looks to me that reload is trying to handle the following sequence from
> _.322r.ira dump:
> 
> (insn 32 31 35 2 (set (subreg:TI (reg:TD 99 [ _2 ]) 0)
> (reg:TI 20 xmm0)) "pr118017.c":14:24 94 {*movti_internal}
>  (expr_list:REG_DEAD (reg:TI 20 xmm0)
> (nil)))
> ...
> 
> (insn 41 40 42 2 (set (mem/c:TI (plus:DI (reg/f:DI 19 frame)
> (const_int -240 [0xff10])) [0  S16 A128])
> (subreg:TI (reg:TD 99 [ _2 ]) 0)) "pr118017.c":14:20 94
> {*movti_internal}
>  (nil))

Please note that TImode and TDmode are tieable on x86_64 targets, so LRA should
simply consider all registers as TImode:

#define VALID_SSE_REG_MODE(MODE)\
  ((MODE) == V1TImode || (MODE) == TImode   \
   || (MODE) == V4SFmode || (MODE) == V4SImode  \
   || (MODE) == SFmode || (MODE) == SImode  \
   || (MODE) == TFmode || (MODE) == TDmode)

and in ix86_modes_tieable_p:

  /* If MODE2 is only appropriate for an SSE register, then tie with
     any other mode acceptable to SSE registers.  */
  ...
  if (GET_MODE_SIZE (mode2) == 16
      && ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode2))
    return (GET_MODE_SIZE (mode1) == 16
            && ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode1));
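
As a rough illustration of the 16-byte branch quoted above, here is a toy
model. It is only a sketch: the Mode enum, mode_size and tieable_16_byte are
hypothetical names, not GCC's machine_mode API, and it assumes that
ix86_hard_regno_mode_ok (FIRST_SSE_REG, mode) can be approximated by the
VALID_SSE_REG_MODE list shown earlier.

```cpp
// Toy model only; none of these names exist in GCC.
enum class Mode { HI, SI, SF, DI, TI, TD, TF, V4SF, V4SI, V1TI };

// Size in bytes of each modelled mode.
constexpr int mode_size(Mode m) {
  return m == Mode::HI ? 2
       : (m == Mode::SI || m == Mode::SF) ? 4
       : m == Mode::DI ? 8
       : 16;  // TI, TD, TF, V4SF, V4SI, V1TI
}

// Same membership list as the quoted VALID_SSE_REG_MODE macro.
constexpr bool valid_sse_reg_mode(Mode m) {
  return m == Mode::V1TI || m == Mode::TI
      || m == Mode::V4SF || m == Mode::V4SI
      || m == Mode::SF || m == Mode::SI
      || m == Mode::TF || m == Mode::TD;
}

// Models only the quoted 16-byte branch; the real function has other
// branches that this sketch ignores.
constexpr bool tieable_16_byte(Mode mode1, Mode mode2) {
  return mode_size(mode2) == 16 && valid_sse_reg_mode(mode2)
      && mode_size(mode1) == 16 && valid_sse_reg_mode(mode1);
}

static_assert(tieable_16_byte(Mode::TI, Mode::TD), "TI and TD tie");
static_assert(!tieable_16_byte(Mode::HI, Mode::TI), "HI is not 16 bytes");
```

Under these assumptions TImode and TDmode come out as tieable, which is the
property the comment relies on.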

[Bug target/118018] FAIL: gcc.c-torture/execute/nestfunc-5.c -O1 execution test

2024-12-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118018

--- Comment #2 from GCC Commits  ---
The master branch has been updated by John David Anglin :

https://gcc.gnu.org/g:3e7ae868fa057a808448a5ab081d2ad30ad80bab

commit r15-6269-g3e7ae868fa057a808448a5ab081d2ad30ad80bab
Author: John David Anglin 
Date:   Sun Dec 15 17:18:40 2024 -0500

hppa: Implement TARGET_FRAME_POINTER_REQUIRED

If a function receives nonlocal gotos, it needs to save the frame
pointer in the argument save area.  This ensures that LRA sets
frame_pointer_needed when it saves arguments in the save area.

2024-12-15  John David Anglin  

gcc/ChangeLog:

PR target/118018
* config/pa/pa.cc (pa_frame_pointer_required): Declare and
implement.
(TARGET_FRAME_POINTER_REQUIRED): Define.

[Bug c++/118054] New: GCC allows catch-by-value using trivial move constructor

2024-12-15 Thread hstong at ca dot ibm.com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118054

Bug ID: 118054
   Summary: GCC allows catch-by-value using trivial move
constructor
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: accepts-invalid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hstong at ca dot ibm.com
  Target Milestone: ---

GCC allows catch-by-copy for classes with deleted copy constructors but trivial
move constructors.

Online compiler: https://godbolt.org/z/5cx354qvM

### SOURCE ()
struct Base {
  Base() = default;
  Base(const Base &) = delete;
  Base(Base &&) = default;
};
struct A : Base {
  A() = default;
  A(const A &) : Base() {}
  A(A &&) = default;
};
void bar(), foo(Base *);
int main() {
  try {
    bar();
  } catch (Base b) {
    foo(&b);
  }
}


### COMPILE COMMAND
g++ -fsyntax-only -std=c++11 -xc++ -


### ACTUAL COMPILE OUTPUT
(clean compile)


### EXPECTED COMPILE OUTPUT
: In function 'int main()':
:15:17: error: use of deleted function 'Base::Base(const Base&)'
:3:3: note: declared here
:15:17: note: use '-fdiagnostics-all-candidates' to display considered
candidates


### COMPILER VERSION INFO (g++ -v)
Using built-in specs.
COLLECT_GCC=/opt/wandbox/gcc-head/bin/g++
COLLECT_LTO_WRAPPER=/opt/wandbox/gcc-head/libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../source/configure --prefix=/opt/wandbox/gcc-head
--enable-languages=c,c++ --disable-multilib --without-ppl --without-cloog-ppl
--enable-checking=release --disable-nls --enable-lto
LDFLAGS=-Wl,-rpath,/opt/wandbox/gcc-head/lib,-rpath,/opt/wandbox/gcc-head/lib64,-rpath,/opt/wandbox/gcc-head/lib32
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20241214 (experimental) (GCC)
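
For contrast, a minimal sketch of a handler that needs no copy of Base at
all: catching by reference binds directly to the Base subobject of the thrown
A. The body of bar() and the helper catches_by_reference() are hypothetical
additions for illustration; the report only declares bar().

```cpp
struct Base {
  Base() = default;
  Base(const Base &) = delete;  // copying a Base is forbidden
  Base(Base &&) = default;
};
struct A : Base {
  A() = default;
  A(const A &) : Base() {}      // A itself remains copyable
  A(A &&) = default;
};

// Hypothetical definition; the report leaves bar() undefined.
void bar() { throw A(); }

// Returns true when the exception reaches the reference handler.
bool catches_by_reference() {
  try {
    bar();
  } catch (const Base &) {      // no Base object is copy-initialized here
    return true;
  }
  return false;
}
```

This form should be accepted by any conforming compiler, since the deleted
Base copy constructor is never selected.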

[Bug c++/103048] [C++17+] Mandatory copy elision used for catch-by-value

2024-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103048

Andrew Pinski  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |jens.maurer at gmx dot net

--- Comment #3 from Andrew Pinski  ---
*** Bug 106812 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/118055] [15 Regression] gcc.dg/tree-ssa/pr83403-1.c and -2 for CRIS and m68k since r15-6097-gee2f19b0937b5e

2024-12-15 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055

--- Comment #3 from Hongtao Liu  ---
> 
> Is it perhaps that the test is brittle; mostly target-specific despite being
> at the tree-level and that instead the scan-test should be a specific
> known-matching target list?

The testcase detects the load/store motion optimization, which relies on loop
unrolling. My commit adjusted the unroll heuristic to prevent some "bad" (for
performance) unrolling, and that breaks the testcase on some targets.

For this testcase itself, it may be necessary (on some targets) to add --param
max-completely-peeled-insns=300 to ensure that unrolling occurs.

From a performance perspective, the targets may need to fine-tune the
max-completely-peeled-insns parameter according to benchmarks.

[Bug rtl-optimization/118042] ICE: maximum number of generated reload insns per insn achieved (90) with -O1 -fexpensive-optimizations -favoid-store-forwarding -mavx10.1 -mprefer-vector-width=128 --par

2024-12-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118042

Uroš Bizjak  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
     Ever confirmed|0                             |1
   Last reconfirmed|                              |2024-12-16
             Status|UNCONFIRMED                   |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

--- Comment #2 from Uroš Bizjak  ---
Created attachment 59875
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59875&action=edit
Proposed patch

TImode (and DImode on 32-bit targets) should be tied with other integer modes.
We allow HImode values in XMM registers, so on 64-bit targets TImode can be
tied with HImode in both integer and XMM registers. This will allow a register
in

(insn 228 230 227 2 (set (reg:TI 345)
(subreg:TI (reg:HI 389) 0)) "/app/example.cpp":6:5 94 {*movti_internal}
 (expr_list:REG_DEAD (reg:HI 389)
(nil)))

to be accessed directly.

[Bug target/118042] ICE: maximum number of generated reload insns per insn achieved (90) with -O1 -fexpensive-optimizations -favoid-store-forwarding -mavx10.1 -mprefer-vector-width=128 --param=store-f

2024-12-15 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118042

Uroš Bizjak  changed:

               What|Removed                       |Added
----------------------------------------------------------------------------
          Component|rtl-optimization              |target

--- Comment #3 from Uroš Bizjak  ---
Target issue.

[Bug middle-end/97094] Compiling big std::unordered_map became slower

2024-12-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97094

--- Comment #5 from Andrew Pinski  ---
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480#c20 too.

[Bug tree-optimization/118032] Bootstrap slowdown for risc-v

2024-12-15 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118032

--- Comment #6 from Andreas Schwab  ---
Here are the bootstrap times on the HiFive Unleashed (all languages):

20240920:  3d 09:32:07
20240927:  3d 09:49:40
20241004:  3d 10:05:11
20241020:  3d 09:50:24
20241025:  3d 10:01:37
20241101:  3d 10:36:27
20241108:  3d 11:34:26
20241115:  3d 10:59:49
20241130:  3d 11:50:58
20241206:  3d 12:19:19
20241213:  3d 20:08:16
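
To put the last jump in perspective, here is a small sketch that converts two
of the listed durations to seconds and compares them. The helper to_seconds
and the variable names are made up for this illustration; only the input
figures come from the table above.

```cpp
// Convert a "Nd hh:mm:ss" duration to seconds.
constexpr long to_seconds(int days, int hours, int minutes, int seconds) {
  return ((days * 24L + hours) * 60 + minutes) * 60 + seconds;
}

constexpr long before = to_seconds(3, 12, 19, 19);  // 20241206
constexpr long after = to_seconds(3, 20, 8, 16);    // 20241213
constexpr long delta = after - before;              // extra seconds

// Relative slowdown between the two snapshots, in percent.
constexpr double slowdown_percent = 100.0 * delta / before;

static_assert(delta == 28137, "7h 48m 57s longer");
static_assert(slowdown_percent > 9.2 && slowdown_percent < 9.3,
              "roughly a 9.3% slowdown");
```

So the 20241213 bootstrap took about 7h 49m longer than the one a week
earlier, an increase of roughly 9.3%.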

[Bug tree-optimization/118055] [15 Regression] gcc.dg/tree-ssa/pr83403-1.c and -2 for CRIS and m68k since r15-6097-gee2f19b0937b5e

2024-12-15 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055

--- Comment #1 from Hongtao Liu  ---
I explained in the thread.
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671289.html

-
BTW arm ci reported 2 regressed testcase so I added
* gcc.dg/tree-ssa/pr83403-1.c: Add --param max-completely-peeled-insns=300 for
arm*-*-*.
* gcc.dg/tree-ssa/pr83403-2.c: Ditto.

For 32-bit arm, there're more stmts in the innermost loop,
and removal of the reduction prevents completely unrolling for them.
For aarch64, it looks fine.
-


So the fix for cris-elf/m68k could be similar to the one for arm.

[Bug c++/118056] New: ICE: tree code ‘template_type_parm’ is not supported in LTO streams

2024-12-15 Thread iamanonymous.cs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118056

Bug ID: 118056
   Summary: ICE: tree code ‘template_type_parm’ is not supported
in LTO streams
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: iamanonymous.cs at gmail dot com
  Target Milestone: ---

***
The compiler produces an internal error during IPA pass: modref when compiling
the provided code with the specified options. 
The issue can also be reproduced on Compiler Explorer.

***
OS and Platform:
# uname -a
Linux ubuntu 4.15.0-213-generic #224-Ubuntu SMP Mon Jun 19 13:30:12 UTC 2023
x86_64 x86_64 x86_64 GNU/Linux
***
# g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/root/gdbtest/gcc/gcc-241212/libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/root/gdbtest/gcc/gcc-241212
--enable-languages=c,c++ --disable-multilib --disable-bootstrap
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20241212 (experimental) (GCC)
***
Program:
#cat code_0.cpp
#include 
#include 

void operate_one(auto&&) {}

void operate_multi(auto&&...args)
{
[&](std::index_sequence)
{
(::operate_one(args...[idx]), ...);
}(std::make_index_sequence{});
}

int main()
{
::operate_multi(1, 2, 3);
}
***
Command Lines:
g++ code_0.cpp  -funroll-loops -flto  -Wall -Wextra  -fno-strict-aliasing 
-fwrapv -g -fsanitize=address  -c -o code_0.o


 == 

code_0.cpp:4:18: warning: use of ‘auto’ in parameter declaration only available
with ‘-std=c++20’ or ‘-fconcepts’ [-Wc++20-extensions]
4 | void operate_one(auto&&) {}
  |  ^~~~
code_0.cpp:6:20: warning: use of ‘auto’ in parameter declaration only available
with ‘-std=c++20’ or ‘-fconcepts’ [-Wc++20-extensions]
6 | void operate_multi(auto&&...args)
  |^~~~
code_0.cpp: In lambda function:
code_0.cpp:10:28: warning: pack indexing only available with ‘-std=c++2c’ or
‘-std=gnu++2c’ [-Wc++26-extensions]
   10 | (::operate_one(args...[idx]), ...);
  |^~~
during IPA pass: modref
code_0.cpp: At top level:
code_0.cpp:17:1: internal compiler error: tree code ‘template_type_parm’ is not
supported in LTO streams
   17 | }
  | ^
0x29bfcdf internal_error(char const*, ...)
../../gcc/gcc/diagnostic-global-context.cc:517
0x129f8d8 DFS::DFS(output_block*, tree_node*, bool, bool, bool)
../../gcc/gcc/lto-streamer-out.cc:911
0x12a0def lto_output_tree(output_block*, tree_node*, bool, bool)
../../gcc/gcc/lto-streamer-out.cc:1863
0x129779e write_global_stream
../../gcc/gcc/lto-streamer-out.cc:2879
0x12a3dd8 lto_output_decl_state_streams(output_block*, lto_out_decl_state*)
../../gcc/gcc/lto-streamer-out.cc:2926
0x12a3dd8 produce_asm_for_decls()
../../gcc/gcc/lto-streamer-out.cc:3338
0x133bd1a write_lto
../../gcc/gcc/passes.cc:2795
0x133bd1a ipa_write_summaries_1
../../gcc/gcc/passes.cc:2858
0x133bd1a ipa_write_summaries()
../../gcc/gcc/passes.cc:2914
0xf2d4a9 ipa_passes
../../gcc/gcc/cgraphunit.cc:2274
0xf2d4a9 symbol_table::compile()
../../gcc/gcc/cgraphunit.cc:2349
0xf30247 symbol_table::compile()
../../gcc/gcc/cgraphunit.cc:2327
0xf30247 symbol_table::finalize_compilation_unit()
../../gcc/gcc/cgraphunit.cc:2601
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

***

Also ICE on trunk, compiler explorer:https://godbolt.org/z/5fPGTeEMK

***

[Bug tree-optimization/118055] [15 Regression] gcc.dg/tree-ssa/pr83403-1.c and -2 for CRIS and m68k since r15-6097-gee2f19b0937b5e

2024-12-15 Thread hp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118055

--- Comment #4 from Hans-Peter Nilsson  ---
(In reply to Hongtao Liu from comment #3)
> > 
> > Is it perhaps that the test is brittle; mostly target-specific despite being
> > at the tree-level and that instead the scan-test should be a specific
> > known-matching target list?
> 
> The testcase detects the load/store motion optimization, which relies on
> loop unrolling. My commit adjusted the unroll heuristic to prevent some
> "bad" (for performance) unrolling, and that breaks the testcase on some
> targets.
> 
> For this testcase itself, it may be necessary (on some targets) to add
> --param max-completely-peeled-insns=300 to ensure that unrolling occurs.
> 
> From a performance perspective, the targets may need to fine-tune the
> max-completely-peeled-insns parameter according to benchmarks.

I'll take that as a "yes" to my question. :)

And "yes" seems correct; I made a quick analysis myself: the "number of insns"
compared here quickly boils down to testing RTL-level target specifics, like
MOVE_MAX.  Note that the test involves moving a lot of 64-bit entities.

So, this is mostly a difference between "32-bit" and "64-bit" targets, and the
target specifier should probably reflect this rather than be an accumulated
list of targets.  There are exceptions to the mapping MOVE_MAX=4 => "32-bit",
MOVE_MAX=8 => "64-bit", such as pru-elf (which I noticed did *not* regress in
the posted results), which is "32-bit" but has #define MOVE_MAX 8.
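
To make the MOVE_MAX point concrete, here is a toy cost model. It is an
assumption for illustration only, not GCC's actual insn estimate, and the
function names are hypothetical: it simply charges one move per MOVE_MAX-sized
piece of each entity.

```cpp
// Moves needed for one entity of entity_bytes when a single move
// transfers at most move_max bytes (ceiling division).
constexpr int moves_per_entity(int entity_bytes, int move_max) {
  return (entity_bytes + move_max - 1) / move_max;
}

// Total moves for a loop body that copies `entities` such objects.
constexpr int copy_moves(int entities, int entity_bytes, int move_max) {
  return entities * moves_per_entity(entity_bytes, move_max);
}

// Ten 64-bit entities: twice as many moves with MOVE_MAX 4 as with 8.
static_assert(copy_moves(10, 8, 4) == 20, "MOVE_MAX=4");
static_assert(copy_moves(10, 8, 8) == 10, "MOVE_MAX=8");
```

In this simplified picture a target with MOVE_MAX 4 sees about twice the
estimated size for the same loop body, which would be enough to push it over
an unrolling limit that a MOVE_MAX 8 target stays under.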