[Bug target/116625] [15 regression] regressions on arm-eabi since r15-1619-g3b9b8d6cfdf593

2024-12-04 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116625

--- Comment #5 from Thiago Jung Bauermann ---
This bug has been fixed by Torbjörn SVENSSON's changes in
r15-4548-ga79ca49b5ce0ad.

[Bug target/117917] Hangs on avx512

2024-12-04 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117917

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
   Keywords||diagnostic
 Resolution|--- |INVALID

--- Comment #2 from Xi Ruoyao  ---
Yes, it finishes.  With the messages redirected to /dev/null it takes 20 seconds
for me.

Not a bug.  This is one of the reasons we say

The compiler driver processes source code, invokes other programs
such as the assembler and linker and generates the output result,
which may be assembly code or machine code.  Compiling untrusted
sources can result in arbitrary code execution and unconstrained
resource consumption in the compiler. As a result, compilation of
such code should be done inside a sandboxed environment to ensure
that it does not compromise the host environment.

in SECURITY.md.

Use -fmax-errors= and/or pipe stderr to an external program that truncates it
if you want to limit the amount of output.  But even with the limit you still
need a sandbox when compiling untrusted source code.
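
The "external program" mentioned above only has to keep the first few lines
of the diagnostics and drop the rest.  A minimal sketch of that filtering
step, assuming the diagnostics have already been captured as text; the name
first_lines and its parameters are hypothetical and not part of GCC:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Keep only the first max_lines lines of text; everything after that is
// discarded.  Hypothetical helper for post-processing captured diagnostics.
std::string first_lines(const std::string &text, std::size_t max_lines)
{
  std::istringstream in(text);
  std::string line, kept;
  // Check the limit first so no line beyond the limit is consumed.
  for (std::size_t n = 0; n < max_lines && std::getline(in, line); ++n)
    kept += line + '\n';
  return kept;
}
```

This only bounds the amount of output that is kept.  It does not bound
compile time or memory use, which is why the sandbox is still needed.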

[Bug tree-optimization/117919] New: [14/15 Regression] ICE: in propagate, at gimple-ssa-sccopy.cc:625 with -O -fno-tree-forwprop -fnon-call-exceptions --param=early-inlining-insns=192

2024-12-04 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117919

Bug ID: 117919
   Summary: [14/15 Regression] ICE: in propagate, at
gimple-ssa-sccopy.cc:625 with -O -fno-tree-forwprop
-fnon-call-exceptions --param=early-inlining-insns=192
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 59792
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59792&action=edit
auto-reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O -fno-tree-forwprop -fnon-call-exceptions
--param=early-inlining-insns=192 -std=c++20 strgen.cpp.ii
during GIMPLE pass: sccopy
strgen.cpp.ii: In function 'int main()':
strgen.cpp.ii:48:1: internal compiler error: in propagate, at
gimple-ssa-sccopy.cc:625
   48 | }
  | ^
0x30d83a0 internal_error(char const*, ...)
/repo/gcc-trunk/gcc/diagnostic-global-context.cc:517
0xfa5c25 fancy_abort(char const*, int, char const*)
/repo/gcc-trunk/gcc/diagnostic.cc:1696
0xf65c94 scc_copy_prop::propagate()
/repo/gcc-trunk/gcc/gimple-ssa-sccopy.cc:625
0x2eaf31d execute
/repo/gcc-trunk/gcc/gimple-ssa-sccopy.cc:681
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-20241204084034-r15-5922-ga0ac8fa55a4749-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-20241204084034-r15-5922-ga0ac8fa55a4749-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20241204 (experimental) (GCC)

[Bug c/117917] New: Hangs on avx512

2024-12-04 Thread xieym3 at zohomail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117917

Bug ID: 117917
   Summary: Hangs on avx512
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xieym3 at zohomail dot com
  Target Milestone: ---

$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=/data/xieym/exp/gcc/test_data/gcc-latest-install/bin/gcc
COLLECT_LTO_WRAPPER=/data/xieym/exp/gcc/test_data/gcc-latest-install/libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /data/xieym/exp/gcc/test_data/gcc-latest-src/configure
--enable-coverage --enable-checking --disable-multilib --disable-shared
--disable-bootstrap --enable-languages=c,c++
--prefix=/data/xieym/exp/gcc/test_data/gcc-latest-install
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20241203 (experimental) (GCC)
(fuzz4all) ➜  hangs git:(main) ✗ cat crash_20241204_073058_00b5.c
#include 
int main(int argc, char **argv) {
  size_t i;
  int *p =
(
void
*
)
calloc
(
4
*
1000
*
1000
*
[
(
(
(
int
)
)
;
for
(
i
=
0
;
i
<
4
*
1000
*
1000
*
1000
;
i
++
)
{
*p = increment(*p);
p += 7;
}
return 0;
}
#include 
int main(int argc, char **argv) {
  volatile int i = 7;
  volatile int j = 11;
  int k;
  for (k = 0; k < 1000; k++) {
_mm_empty();
i -= j;
  }
}
$ gcc-trunk -x c -std=c2x -c crash_20241204_073058_00b5.c -o /dev/nul
# It has been running for a minute and still hasn’t finished.

[Bug target/117878] RISC-V: ICE when build spec17 526.blender_r with -O3 -march=rv64gcv_zvl256b

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Pan Li :

https://gcc.gnu.org/g:fb64a7b0e1d7488e6e3ae96af8d97fd2226b6d21

commit r15-5917-gfb64a7b0e1d7488e6e3ae96af8d97fd2226b6d21
Author: Pan Li 
Date:   Wed Dec 4 13:53:52 2024 +0800

RISC-V: Add assert for insn operand out of range access [PR117878][NFC]

According to the initial analysis of PR117878, the ICE comes from
the out-of-range operand access for recog_data.operand[].  Thus, add
one assert here to expose this explicitly.

PR target/117878

gcc/ChangeLog:

* config/riscv/riscv-v.cc (vlmax_avl_type_p): Add assert for
out of range access.
(nonvlmax_avl_type_p): Ditto.

Signed-off-by: Pan Li 
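
The idea of the commit above, checking an index before reading a fixed-size
operand table, can be sketched outside of GCC as follows.  This is not the
code from riscv-v.cc: OperandTable, kMaxOperands (and its value 30),
operand_index_ok and read_operand are hypothetical names used only for
illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a fixed-size operand table such as
// recog_data.operand[]; the capacity 30 is an illustrative assumption.
constexpr std::size_t kMaxOperands = 30;

struct OperandTable {
  int operand[kMaxOperands];
  std::size_t n_operands;   // number of valid entries
};

// True when operand[index] is a valid entry of a table with n_operands entries.
constexpr bool operand_index_ok(std::size_t index, std::size_t n_operands)
{
  return index < n_operands && n_operands <= kMaxOperands;
}

// Checked read: with assertions enabled, an out-of-range index stops the
// program here instead of silently reading an unrelated value.
int read_operand(const OperandTable &t, std::size_t index)
{
  assert(operand_index_ok(index, t.n_operands));
  return t.operand[index];
}
```

The assert changes no behavior for valid input, which matches the [NFC] tag
of the commit: it only makes the failure explicit at the point of the bad
access.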

[Bug lto/114542] -flto=4294967296 is treated the same as -flto=0 and -flto=4294967297 is treated the same as -flto=1 instead of being invalid options

2024-12-04 Thread heiko at hexco dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114542

--- Comment #3 from Heiko Eißfeldt  ---
FYI: the GNUmake bug seems to be fixed now.
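
For reference, the values in the bug title are 2^32 and 2^32 + 1, so the
reported behavior is what one gets when a 64-bit value is narrowed to 32 bits.
Whether the driver performs exactly this narrowing is an assumption here.  The
sketch below only shows the arithmetic and one way to reject such input;
parse_jobs_truncating and parse_jobs_strict are hypothetical names:

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Naive parse: convert to (at least) 64 bits, then silently narrow to 32 bits.
std::uint32_t parse_jobs_truncating(const std::string &s)
{
  return static_cast<std::uint32_t>(std::strtoull(s.c_str(), nullptr, 10));
}

// Strict parse: reject empty input, a leading sign or space, trailing
// garbage, and any value that does not fit into 32 bits.
bool parse_jobs_strict(const std::string &s, std::uint32_t &out)
{
  if (s.empty() || s[0] < '0' || s[0] > '9')
    return false;
  errno = 0;
  char *end = nullptr;
  const unsigned long long v = std::strtoull(s.c_str(), &end, 10);
  if (errno == ERANGE || *end != '\0' || v > UINT32_MAX)
    return false;
  out = static_cast<std::uint32_t>(v);
  return true;
}
```

With the naive version "4294967296" becomes 0 and "4294967297" becomes 1,
while the strict version rejects both.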

[Bug modula2/117904] cc1gm2 ICE when compiling a const built from VAL and SIZE

2024-12-04 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117904

--- Comment #3 from Gaius Mulley  ---
Created attachment 59780
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59780&action=edit
Proposed fix

Proposed fix which converts the increment to the type of the start and end
expressions of a for loop.

[Bug target/117908] New: [RX] Add support for v2 ISA

2024-12-04 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117908

Bug ID: 117908
   Summary: [RX] Add support for v2 ISA
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: olegendo at gcc dot gnu.org
  Target Milestone: ---

Support for the RXv2 and RXv3 ISAs was added to binutils a long time ago, but
the compiler still lacks any support.

A patch to add RXv2 to the compiler was posted quite a while ago:
https://gcc.gnu.org/legacy-ml/gcc-patches/2015-12/msg01135.html

There is a bit of a conflict because I've added some atomic operations to rx.md
and the patch also adds some atomic functions in a new file sync.md.  It'll
need to be sorted out.  (I wasn't aware of the existence of the patch when I
added the atomics)

[Bug modula2/117371] [14/15 Regression] type incompatibility between ‘INTEGER’ and ‘CARDINAL’

2024-12-04 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117371

Gaius Mulley  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #11 from Gaius Mulley  ---
Closing now that gcc-15 and gcc-14 have had the patch applied.

[Bug target/117908] [RX] Add support for v2 ISA

2024-12-04 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117908

Oleg Endo  changed:

   What|Removed |Added

   Last reconfirmed||2024-12-04
 Ever confirmed|0   |1
 CC||ysato at users dot sourceforge.jp
 Status|UNCONFIRMED |NEW
 Target||rx

[Bug modula2/117660] Errors referring to variables of type array could display full declaration

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117660

--- Comment #5 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Gaius Mulley:

https://gcc.gnu.org/g:5902ea4a341419d243725e7a52800e297159ff9d

commit r14-11060-g5902ea4a341419d243725e7a52800e297159ff9d
Author: Gaius Mulley 
Date:   Wed Dec 4 09:03:36 2024 +

[PATCH] PR modula2/117660: Errors referring to variables of type array
could display full declaration

This patch ensures that the tokens defining the full declaration of an
ARRAY type are stored in the symbol table and used during production of
error messages.

gcc/m2/ChangeLog:

PR modula2/117660
* gm2-compiler/P2Build.bnf (ArrayType): Update tok with the
composite token produced during array type declaration.
* gm2-compiler/P2SymBuild.mod (EndBuildArray): Create the
combinedtok and store it into the symbol table.
Also ensure combinedtok is pushed to the quad stack.
(BuildFieldArray): Preserve typetok.
* gm2-compiler/SymbolTable.def (PutArray): Rename parameters.
* gm2-compiler/SymbolTable.mod (PutArray): Rename parameters.

gcc/testsuite/ChangeLog:

PR modula2/117660
* gm2/iso/fail/arraymismatch.mod: New test.

(cherry picked from commit ab7abf1db09519a92f4a02af30ed6b834264c45e)

Signed-off-by: Gaius Mulley 

[Bug tree-optimization/116083] [14/15 Regression] Re-surfacing SLP vectorization slowness for gcc.dg/pr87985.c

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116083

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:27b444a41a6afab8020183c4c5d3361e21635031

commit r15-5918-g27b444a41a6afab8020183c4c5d3361e21635031
Author: Richard Biener 
Date:   Tue Dec 3 14:37:21 2024 +0100

tree-optimization/116083 - SLP discovery slowness

One large constant factor of SLP discovery is figuring the vector
type for each individual lane of each node.  That should be redundant
since the structural comparison of stmts should ensure they end up
the same so the following computes them only once per node rather
than for each lane.

This cuts the compile-time of the testcase in half.

PR tree-optimization/116083
* tree-vect-slp.cc (vect_build_slp_tree_1): Compute vector
type and max_nunits only once.  Remove check for matching
vector type of each lane and replace it with matching check
for LHS type.
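
The shape of this change, running an expensive query once per node and
comparing only a cheap property per lane, can be illustrated with a toy model.
Nothing below is taken from tree-vect-slp.cc: Lane, lookup_vector_bits,
g_lookups and both check functions are hypothetical, and the mapping from
scalar to vector width is arbitrary.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in: one lane is one scalar statement of a node.
struct Lane { int scalar_bits; };

// Counts how often the expensive query runs, so the saving is observable.
static std::size_t g_lookups = 0;

// Placeholder for an expensive "which vector type fits this scalar?" query.
int lookup_vector_bits(int scalar_bits)
{
  ++g_lookups;
  return scalar_bits * 4;   // arbitrary illustrative mapping
}

// Before: the query runs for every lane and the results are compared.
bool lanes_match_per_lane(const std::vector<Lane> &lanes)
{
  if (lanes.empty()) return true;
  const int first = lookup_vector_bits(lanes[0].scalar_bits);
  for (std::size_t i = 1; i < lanes.size(); ++i)
    if (lookup_vector_bits(lanes[i].scalar_bits) != first) return false;
  return true;
}

// After: the query runs once; each further lane only has its cheap scalar
// type compared against the first lane.
bool lanes_match_once(const std::vector<Lane> &lanes, int &vector_bits)
{
  if (lanes.empty()) return true;
  vector_bits = lookup_vector_bits(lanes[0].scalar_bits);
  for (std::size_t i = 1; i < lanes.size(); ++i)
    if (lanes[i].scalar_bits != lanes[0].scalar_bits) return false;
  return true;
}
```

For a node with 8 matching lanes the first variant performs 8 lookups and the
second performs 1, which is the kind of constant-factor saving the commit
message describes.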

[Bug target/113948] Switch rx to LRA

2024-12-04 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113948

--- Comment #2 from Oleg Endo  ---
The -mlra option has been around on RX for a while.  I've been using GCC 8 as
the production compiler on a larger firmware.

Adding -mlra to the build makes it get stuck during linking (using LTO); it
looks like it ends up in an infinite loop somewhere.

[Bug target/117908] [RX] Add support for v2 ISA

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117908

Richard Biener  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug tree-optimization/116083] [14/15 Regression] Re-surfacing SLP vectorization slowness for gcc.dg/pr87985.c

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116083

--- Comment #6 from Richard Biener  ---
The last change gets us to

 tree DSE   :   0.23 ( 16%)  2048  (  0%)
 tree slp vectorization :   0.65 ( 47%)  2494k ( 15%)

There's another pending improvement.  The quadratic behavior with respect to
SLP discovery depth and number of lanes has been present forever.  The
difference is that the 13 branch succeeds with the full-lanes SLP discovery by
immediately building the store fed from scalars: the call makes the first
lane mismatch, indicating a fatal failure, because GCC 13 still has the "bug"
of treating CFN_LAST as internal_fn_p, while GCC 14 corrects this mistake to
support OpenMP SIMD and other vectorizable calls.

A more meaningful testcase would still run into the quadraticness even
with older GCC.

[Bug libgomp/109664] Deadlocks with gomp_fatal called from libgomp/plugins/

2024-12-04 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109664

Thomas Schwinge  changed:

   What|Removed |Added

   Last reconfirmed||2024-12-04
 CC||tschwinge at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Keywords||openacc, openmp

--- Comment #2 from Thomas Schwinge  ---
Wild idea: switch libgomp to C++, and use 'throw' and other C++ idioms for
proper error propagation and unlocking as part of that?
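
To make the idea concrete: with a scope-bound lock and an exception, the lock
is released during stack unwinding, before any error handler runs.  A minimal
sketch, not libgomp code; device_lock, plugin_operation and run_and_recover
are hypothetical names:

```cpp
#include <mutex>
#include <stdexcept>

std::mutex device_lock;   // hypothetical lock guarding plugin state

// Hypothetical plugin entry point.  On failure it throws instead of calling
// a fatal-error routine while the lock is still held.
void plugin_operation(bool fail)
{
  std::lock_guard<std::mutex> guard(device_lock);
  if (fail)
    throw std::runtime_error("device operation failed");
}

// Caller: by the time the handler runs, the guard's destructor has already
// released the lock, so the handler can safely take it again.
bool run_and_recover(bool fail)
{
  try {
    plugin_operation(fail);
    return true;
  } catch (const std::runtime_error &) {
    std::lock_guard<std::mutex> guard(device_lock);  // would deadlock if still held
    return false;
  }
}
```

Calling run_and_recover(true) returns false without deadlocking, and the lock
is free again afterwards.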

[Bug tree-optimization/116463] [15 Regression] complex multiply vectorizer detection failures after r15-3087-gb07f8a301158e5

2024-12-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116463

--- Comment #32 from Tamar Christina  ---
(In reply to Richard Biener from comment #31)
> (In reply to Tamar Christina from comment #29)
> > (In reply to Tamar Christina from comment #27)
> > > > > 
> > > > > We DO already impose any order on them, but the other operand is 
> > > > > oddodd, so
> > > > > the overall order ends up being oddodd because any known permute 
> > > > > overrides
> > > > > unknown ones.
> > > > 
> > > > So what's the desired outcome?  I guess PERM_UNKNOWN?  I guess it's
> > > > the "other operand" of an add?  What's the (bad) effect of classifying
> > > > it as ODDODD (optimistically)?
> > > > 
> > > > > So the question is, can we not follow externals in a constructor to 
> > > > > figure
> > > > > out if how they are used they all read from the same base and in 
> > > > > which order?
> > > > 
> > > > I don't see how it makes sense to do this.  For the above example, 
> > > > what's
> > > > the testcase exhibiting this (and on which arch)?
> > > 
> > > I've been working on a fix from a different angle for this, which also
> > > covers another GCC 14 regression that went unnoticed. I'll post after
> > > regressions finish.
> > 
> > So I've formalized the handling of TOP a bit better.  Which gets it to
> > recognize it again, however, it will be dropped as it's not profitable.
> > 
> > The reason it's not profitable is the canonicalization issue mentioned
> > above.  This has split the imaginary and real nodes into different
> > computations.
> >
> > So no matter what you do in the SLP tree, the attached digraph won't make
> > the loads of _5 linear.  Are you ok with me trying that Richi?
> 
> I can't make sense of that graph - the node feeding the store seems to have
> wrong scalar stmts?
> 
> What's the testcase for this (and on what arch?).
> 

void fms_elemconjsnd(_Complex TYPE a[restrict N], _Complex TYPE b,
                     _Complex TYPE c[restrict N]) {
  for (int i = 0; i < N; i++)
    c[i] -= a[i] * ~b;
}

compiled with -Ofast -march=armv8.3-a

#define TYPE double
#define I 1.0i
#define N 200
void fms180snd (_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
                _Complex TYPE c[restrict N])
{
  for (int i=0; i < N; i++)
    c[i] -= a[i] * (b[i] * I * I);
}

void fms180snd_1 (_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
                  _Complex TYPE c[restrict N])
{
  _Complex TYPE t = I;
  for (int i=0; i < N; i++)
    c[i] -= a[i] * (b[i] * t * t);
}

is another one, where the two functions compute the same thing, but the first
one is matched and the second one isn't.

> But yes, the loads of *5 won't get linear here, but at least the
> permute node feeding the complex-add-rot270 can be elided, eventually
> even making the external _53, b$real_11 match the other with different
> order (though we don't model that, cost-wise).

But without the loads getting linearized the match will never work, as
multi-lane SLP will be immediately cancelled because it assumes load-lanes is
cheaper (it's not, but load-lanes doesn't get costed), and that's why there's
a load permute optimization step after complex pattern matching.

The point is however, that no permute is needed. *not even for the loads*.

GCC 13 generated:

fms_elemconjsnd:
        fneg    d1, d1
        mov     x2, 0
        dup     v4.2d, v0.d[0]
        dup     v3.2d, v1.d[0]
.L2:
        ldr     q1, [x0, x2]
        ldr     q0, [x1, x2]
        fmul    v2.2d, v3.2d, v1.2d
        fcadd   v0.2d, v0.2d, v2.2d, #270
        fmls    v0.2d, v4.2d, v1.2d
        str     q0, [x1, x2]
        add     x2, x2, 16
        cmp     x2, 3200
        bne     .L2
        ret

which was the optimal sequence.

[Bug tree-optimization/116083] [14 Regression] Re-surfacing SLP vectorization slowness for gcc.dg/pr87985.c

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116083

Richard Biener  changed:

   What|Removed |Added

Summary|[14/15 Regression]  |[14 Regression]
   |Re-surfacing SLP|Re-surfacing SLP
   |vectorization slowness for  |vectorization slowness for
   |gcc.dg/pr87985.c|gcc.dg/pr87985.c

--- Comment #8 from Richard Biener  ---
I'd say mitigated good enough for trunk.

[Bug c++/117826] [15 Regression] ICE: tree check: expected tree that contains 'decl minimal' structure, have 'tree_list' in hash, at tree.h:5958

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117826

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:205591919214cb5610e9c3b2394f05a3cfaa7f68

commit r15-5919-g205591919214cb5610e9c3b2394f05a3cfaa7f68
Author: Jakub Jelinek 
Date:   Wed Dec 4 10:54:41 2024 +0100

c++: Fix up erroneous template error recovery ICE [PR117826]

The testcase in the PR (which can't be easily reduced and is
way too large and has way too many errors) results in an ICE,
because the erroneous_templates hash_map holds trees of erroneous
templates across ggc_collect and some of the templates in there
could be removed, so the later lookup can crash on comparison of
already freed and reused trees.

The following patch makes the hash_map GTY((cache)) marked.
The cp-tree.h changes before the erroneous_template declaration
are needed to make gengtype happy, it didn't like using
directive nor using a template-id as a template parameter.

It is marked cache because if a decl would be solely referenced from
the erroneous_templates hash_map, then nothing would look it up.

2024-12-04  Jakub Jelinek  

PR c++/117826
* cp-tree.h (struct decl_location_traits): New type.
(erroneous_templates_t): Change using into typedef.
(erroneous_templates): Add GTY((cache)).
* error.cc (cp_adjust_diagnostic_info): Use
hash_map_safe_get_or_insert rather than
hash_map_safe_get_or_insert for erroneous_templates.

[Bug libgcc/78804] [RX] -m64bit-doubles does not work

2024-12-04 Thread olegendo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78804

Oleg Endo  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #20 from Oleg Endo  ---
I've been using -m64bit-doubles on GCC 8 with the patch from comment #19 for
several years now.  I guess this can be closed as fixed.

[Bug tree-optimization/116083] [14/15 Regression] Re-surfacing SLP vectorization slowness for gcc.dg/pr87985.c

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116083

Richard Biener  changed:

   What|Removed |Added

   Keywords|needs-bisection |

--- Comment #7 from Richard Biener  ---
Thus for the testcase exposed by r14-4406-g6dc44436301143

[Bug tree-optimization/117875] [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875

Richard Biener  changed:

   What|Removed |Added

 CC||aldyh at gcc dot gnu.org,
   ||amacleod at redhat dot com

--- Comment #4 from Richard Biener  ---
Note the branch and exit condition are nicely present in the IL but indeed
there doesn't seem to be any infrastructure for special-casing a known constant
number of iterations and there's no cleanup done before more loop opts mess
things up.
I'm also not sure we'd handle this case with VRP's relation support;
niter analysis tries to simplify computed nonzero conditions, but just
with its weak simplify_using_initial_conditions which isn't even able
to compute M_9(D) < k_24 (aka the nonzero condition) as true.  We don't
simplify the computed niter expression itself.  The niter expression
is (unsigned int) M_9(D) - (unsigned int) k_24.

I'm not sure what ranger has at its disposal for us here when trying to
simplify GENERIC expressions; we could possibly try to compute its
range and, if that's a singleton, use that?

 [local count: 105119324]:
if (M_9(D) > 1)
  goto ; [64.00%]
else
  goto ; [36.00%]



 [local count: 611603345]:  // first loop
# k_15 = PHI 
...
k_12 = k_15 + 1;
if (k_12 < M_9(D))
  goto ; [89.00%] // latch
else
  goto ; [11.00%]

 [local count: 67276368]:
# k_29 = PHI 
if (M_9(D) >= k_29)
  goto ; [99.95%]
else
  goto ; [0.05%]

 [local count: 105119324]:
# k_24 = PHI <1(13), k_29(16)>   // 1 is from loop-around branch

 [local count: 343854870]:  // second loop
# k_17 = PHI 
...
k_23 = k_17 + 1;
if (M_9(D) >= k_23)

[Bug tree-optimization/115494] [14/15 Regression] wrong code at -O{2,3} on x86_64-linux-gnu since r14-3485

2024-12-04 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115494

Mikael Morin  changed:

   What|Removed |Added

 CC||mikael at gcc dot gnu.org

--- Comment #12 from Mikael Morin  ---
Created attachment 59781
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59781&action=edit
Possible patch

I have had a look at this PR and can propose the attached patch that seems to
fix the issue.  It does so by creating new variables during phi translation of
expressions whose variables are not defined in the current block.  I suppose
there could be an alternative fix that would avoid the creation of new
variables, and just reset the flow sensitive information of existing variables
when using them outside of their definition scope.  I didn't try it though, and
I won't have time to study this further, so I'm posting the patch so that it's
not lost.

[Bug tree-optimization/117875] [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875

--- Comment #5 from Richard Biener  ---
I tried get_range_query (cfun)->range_of_expr (vr, niter->niter, stmt) with
(unsigned int) M_9(D) - (unsigned int) k_24 and an enabled ranger
but that indeed returns [irange] unsigned int [0, 1024][4294966273, +INF]
and not a singleton as expected.
It seems to look for ranges of M_9 and k_24 and fold_range with the
minus op rather than trying to use relations to simplify the subtraction.
The ranges of the first and 2nd op are [1, 1025] and [1, 1024] respectively,
basically [1, INF] for our purpose (the constant array bound bounds them).
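
The two-piece result follows directly from subtracting the operand ranges in
modulo-2^32 arithmetic without using the relation between the operands.  The
sketch below is a simplified model of that interval subtraction, not ranger's
implementation; Range, Diff and sub_ranges are hypothetical names:

```cpp
#include <cstdint>

// Closed interval [lo, hi] of unsigned 32-bit values.
struct Range { std::uint32_t lo, hi; };

// Result of a - b for all a in A and b in B, computed modulo 2^32.
// If the mathematical difference can be both negative and non-negative,
// the result splits into a low piece and a wrapped high piece.
struct Diff { bool split; Range low; Range high; };

Diff sub_ranges(Range a, Range b)
{
  const std::int64_t two32 = std::int64_t(1) << 32;
  const std::int64_t min_diff = std::int64_t(a.lo) - std::int64_t(b.hi);
  const std::int64_t max_diff = std::int64_t(a.hi) - std::int64_t(b.lo);
  Diff d{};
  if (min_diff >= 0) {               // never wraps: one piece
    d.low = {std::uint32_t(min_diff), std::uint32_t(max_diff)};
  } else if (max_diff < 0) {         // always wraps: one piece
    d.low = {std::uint32_t(min_diff + two32), std::uint32_t(max_diff + two32)};
  } else {                           // straddles zero: two pieces
    d.split = true;
    d.low = {0u, std::uint32_t(max_diff)};
    d.high = {std::uint32_t(min_diff + two32), UINT32_MAX};
  }
  return d;
}
```

For the operand ranges [1, 1025] and [1, 1024] the smallest difference is
1 - 1024 = -1023, which wraps to 2^32 - 1023 = 4294966273, and the largest is
1025 - 1 = 1024.  That gives the pieces [0, 1024] and [4294966273, +INF]
quoted above.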

We'd also miss a way to inject niter->assumptions and niter->may_be_zero
as conditions known true for the purpose of simplifying (in this case
those don't add anything).

[Bug c++/117615] [12/13/14/15 Regression] constexpr failure static_cast of Derived virtual Pointer to Member function since r6-4014-gdcdbc004d531b4

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117615

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Simon Martin :

https://gcc.gnu.org/g:72a2380a306a1c3883cb7e4f99253522bc265af0

commit r15-5920-g72a2380a306a1c3883cb7e4f99253522bc265af0
Author: Simon Martin 
Date:   Tue Dec 3 14:30:43 2024 +0100

c++: Don't reject pointer to virtual method during constant evaluation
[PR117615]

We currently reject the following valid code:

=== cut here ===
struct Base {
virtual void doit (int v) const {}
};
struct Derived : Base {
void doit (int v) const {}
};
using fn_t = void (Base::*)(int) const;
struct Helper {
fn_t mFn;
constexpr Helper (auto && fn) : mFn(static_cast<fn_t>(fn)) {}
};
void foo () {
constexpr Helper h (&Derived::doit);
}
=== cut here ===

The problem is that since r6-4014-gdcdbc004d531b4, &Derived::doit is
represented with an expression with type pointer to method and using an
INTEGER_CST (here 1), and that cxx_eval_constant_expression rejects any
such expression with a non-null INTEGER_CST.

This patch uses the same strategy as r12-4491-gf45610a45236e9 (fix for
PR c++/102786), and simply lets such expressions go through.

PR c++/117615

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Don't reject
INTEGER_CSTs with type POINTER_TYPE to METHOD_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-virtual22.C: New test.

[Bug c++/102786] [c++20] virtual pmf sometimes rejected as not a constant

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102786

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Simon Martin :

https://gcc.gnu.org/g:72a2380a306a1c3883cb7e4f99253522bc265af0

commit r15-5920-g72a2380a306a1c3883cb7e4f99253522bc265af0
Author: Simon Martin 
Date:   Tue Dec 3 14:30:43 2024 +0100

c++: Don't reject pointer to virtual method during constant evaluation
[PR117615]

We currently reject the following valid code:

=== cut here ===
struct Base {
virtual void doit (int v) const {}
};
struct Derived : Base {
void doit (int v) const {}
};
using fn_t = void (Base::*)(int) const;
struct Helper {
fn_t mFn;
constexpr Helper (auto && fn) : mFn(static_cast<fn_t>(fn)) {}
};
void foo () {
constexpr Helper h (&Derived::doit);
}
=== cut here ===

The problem is that since r6-4014-gdcdbc004d531b4, &Derived::doit is
represented with an expression with type pointer to method and using an
INTEGER_CST (here 1), and that cxx_eval_constant_expression rejects any
such expression with a non-null INTEGER_CST.

This patch uses the same strategy as r12-4491-gf45610a45236e9 (fix for
PR c++/102786), and simply lets such expressions go through.

PR c++/117615

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Don't reject
INTEGER_CSTs with type POINTER_TYPE to METHOD_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-virtual22.C: New test.

[Bug rtl-optimization/117360] [15 regression] ext-dce.cc:573:15: runtime error: shift exponent 127 is too large for 64-bit type 'long long unsigned int'

2024-12-04 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117360

--- Comment #8 from Martin Jambor  ---
I can confirm that our UBSAN bootstrap+testsuite buildbot run passed all tests
and is nicely green again. Thanks!

[Bug rtl-optimization/117360] [15 regression] ext-dce.cc:573:15: runtime error: shift exponent 127 is too large for 64-bit type 'long long unsigned int'

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117360

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Jakub Jelinek  ---
Fixed then.

[Bug other/63426] [meta-bug] Issues found with -fsanitize=undefined

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426
Bug 63426 depends on bug 117360, which changed state.

Bug 117360 Summary: [15 regression] ext-dce.cc:573:15: runtime error: shift 
exponent 127 is too large for 64-bit type 'long long unsigned int'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117360

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug c/117909] New: gcc fails to save registers with no_caller_saved_registers

2024-12-04 Thread izaberina at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117909

Bug ID: 117909
   Summary: gcc fails to save registers with
no_caller_saved_registers
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: izaberina at gmail dot com
  Target Milestone: ---

c version https://godbolt.org/z/z3v1W3njb
c++ version https://godbolt.org/z/T3cns6rPe

i need countbar to save rdi, and any registers used by __tls_get_addr

somehow the definition being above or below seems to affect the result

is this miscompiled or am i doing things wrong?

[Bug bootstrap/117893] [15 regression] space before -O0 in CFLAGS incorrectly removed

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117893

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Andreas Schwab :

https://gcc.gnu.org/g:1783b2030905567d1b89ccc5e674699619f159e5

commit r15-5921-g1783b2030905567d1b89ccc5e674699619f159e5
Author: Andreas Schwab 
Date:   Wed Dec 4 10:59:38 2024 +0100

gcc/configure: Properly remove -O flags from C[XX]FLAGS

PR bootstrap/117893
* configure.ac: Use shell loop to remove -O flags.
* configure: Regenerate.
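
The actual fix is a shell loop in configure.ac.  The same word-by-word
approach can be sketched in C++ to show why it keeps the separators between
the remaining flags intact; strip_opt_flags is a hypothetical name and this is
not the configure code:

```cpp
#include <sstream>
#include <string>

// Remove every whitespace-separated word that starts with "-O" and re-join
// the remaining words with single spaces.
std::string strip_opt_flags(const std::string &flags)
{
  std::istringstream in(flags);
  std::string word, result;
  while (in >> word) {
    if (word.compare(0, 2, "-O") == 0)
      continue;                       // drop -O0, -O2, -Os, ...
    if (!result.empty())
      result += ' ';                  // exactly one separator between words
    result += word;
  }
  return result;
}
```

Because whole words are removed and the rest is re-joined, "-g -O0 -Wall"
becomes "-g -Wall" rather than two flags glued together.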

[Bug bootstrap/117893] [15 regression] space before -O0 in CFLAGS incorrectly removed

2024-12-04 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117893

Andreas Schwab  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Andreas Schwab  ---
Fixed.

[Bug rtl-optimization/117910] New: [avr][lra] Wrong code with -mlra in cmpdi-1.c

2024-12-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117910

Bug ID: 117910
   Summary: [avr][lra] Wrong code with -mlra in cmpdi-1.c
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 59782
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59782&action=edit
Reduced C test case from cmpdi-1.c

There is yet another fail with -mlra with the attached cmpdi-1.c that produces
wrong code.

$avr-gcc -mmcu=atmega128 -dumpbase "" -save-temps -dp -Os -o cmpdi-1.elf -mlra
-fverbose-asm cmpdi-1.c

Notice the following load / stores involving Y+12 and X+12 in cmpdi-1.s:

   std Y+13,r25  ;  %sfp, _51;  685 [c=4 l=2]  *movhi/3
   std Y+12,r24  ;  %sfp, _51
...
   std Y+12,r25  ;  %sfp, tmp162 ;  588 [c=4 l=2]  *movhi/3
   std Y+11,r24  ;  %sfp, tmp162
...
   std Y+11,r24  ;  %sfp, arg1   ;  615 [c=4 l=1]  movqi_insn/2
...
   ldd r16,Y+11  ; , %sfp;  640 [c=4 l=1]  movqi_insn/3
...
   std Y+12,r9   ;  %sfp, res;  439 [c=4 l=2]  *movhi/3
   std Y+11,r8   ;  %sfp, res
...
   ldd r24,Y+12  ;  _51, %sfp;  435 [c=8 l=2]  *movhi/2
   ldd r25,Y+13  ;  _51, %sfp

So it seems LRA is trampling some values / is using wrong offsets.

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls
--with-dwarf2 --with-gnu-as --with-gnu-ld --with-long-double=64
--enable-languages=c,c++
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20241203 (experimental) (GCC)

[Bug c/117909] gcc fails to save registers with no_caller_saved_registers

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117909

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org

--- Comment #1 from Sam James  ---
Created attachment 59783
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59783&action=edit
foo.c

Please always include godbolt link sources as attachments too. I'll do it here.

[Bug c/117909] gcc fails to save registers with no_caller_saved_registers

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117909

--- Comment #2 from Sam James  ---
Created attachment 59784
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59784&action=edit
foo.cxx

[Bug libstdc++/62169] map iterators under _GLIBCXX_DEBUG diverge

2024-12-04 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62169

--- Comment #9 from Jonathan Wakely  ---
I was looking at this again in the context of
https://cplusplus.github.io/LWG/issue3578

If we ever wanted to make the debug mode iterators SCARY (or if the standard
ever required it) then I think we would need to enable implicit conversions
between distinct safe iterator types, constrained on them belonging to
compatible container types.

That way std::set::iterator and std::set>::iterator would
still be distinct types, but would be _mostly_ interchangeable. Not everything
would work though, e.g. anything working with pointers or references to the
iterator types would still be incompatible.

[Bug c/117911] New: Segmentation fault with '-O3 -fno-inline-functions-called-once -fno-inline-small-functions -fno-ipa-cp -fno-ipa-modref -fno-ipa-pure-const'

2024-12-04 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117911

Bug ID: 117911
   Summary: Segmentation fault with '-O3
-fno-inline-functions-called-once
-fno-inline-small-functions -fno-ipa-cp
-fno-ipa-modref -fno-ipa-pure-const'
   Product: gcc
   Version: 13.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 19373742 at buaa dot edu.cn
  Target Milestone: ---

Created attachment 59785
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59785&action=edit
The preprocessed file

***
OS and Platform:
Ubuntu 22.04.3 LTS
***
gcc version:
$ gcc -v
Using built-in specs.
COLLECT_GCC=/gcc-releases/gcc-13/bin/gcc
COLLECT_LTO_WRAPPER=/gcc-releases/gcc-13/libexec/gcc/x86_64-pc-linux-gnu/13.3.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/gcc-releases/gcc-13 --disable-multilib
--enable-laguages=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.3.1 20241129 (GCC)  
***
Command Lines:
$ gcc -I ~/csmith/include/csmith-2.3.0 -O3 -fno-inline-functions-called-once
-fno-inline-small-functions -fno-ipa-cp -fno-ipa-modref -fno-ipa-pure-const a.c
-o ec
$ ./ec
Segmentation fault (core dumped)

$ gcc -I ~/csmith/include/csmith-2.3.0 a.c -o nec
$ ./nec
checksum = 94C3FC73
***
My Cvise encountered a bug during the reduction. I will try to provide the
reduced test case as soon as possible.

[Bug c/117911] Segmentation fault with '-O3 -fno-inline-functions-called-once -fno-inline-small-functions -fno-ipa-cp -fno-ipa-modref -fno-ipa-pure-const'

2024-12-04 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117911

--- Comment #1 from CTC <19373742 at buaa dot edu.cn> ---
Created attachment 59786
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59786&action=edit
The compiler output

[Bug rtl-optimization/117910] [avr][lra] Wrong code with -mlra in cmpdi-1.c

2024-12-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117910

--- Comment #1 from Georg-Johann Lay  ---
(In reply to Georg-Johann Lay from comment #0)
> ...involving Y+12 and X+12 in cmpdi-1.s
Meant "Y+12 and Y+11".

[Bug c/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

Sam James  changed:

   What|Removed |Added

 CC||kees at outflux dot net,
   ||qing.zhao at oracle dot com
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=101941

--- Comment #1 from Sam James  ---
Please attach preprocessed source for trick-broadcast.c as well as the full
command line used (make V=1 will help there).

[Bug c/117911] Segmentation fault with '-O3 -fno-inline-functions-called-once -fno-inline-small-functions -fno-ipa-cp -fno-ipa-modref -fno-ipa-pure-const'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117911

--- Comment #3 from Sam James  ---
Fails with -fno-strict-aliasing too.

With Valgrind when (maybe) miscompiled:
```
==164909== Conditional jump or move depends on uninitialised value(s)
==164909==at 0x10A5E2: realsmith_proxy_j8M9c (in /tmp/a)
==164909==by 0x10A6A8: func_2.isra.0 (in /tmp/a)
==164909==by 0x10CB56: func_1.isra.0 (in /tmp/a)
==164909==by 0x108375: main (in /tmp/a)
==164909==
==164909== Conditional jump or move depends on uninitialised value(s)
==164909==at 0x10A614: realsmith_proxy_j8M9c (in /tmp/a)
==164909==by 0x10A6A8: func_2.isra.0 (in /tmp/a)
==164909==by 0x10CB56: func_1.isra.0 (in /tmp/a)
==164909==by 0x108375: main (in /tmp/a)
==164909==
==164909== Conditional jump or move depends on uninitialised value(s)
==164909==at 0x10A60F: realsmith_proxy_j8M9c (in /tmp/a)
==164909==by 0x10A6A8: func_2.isra.0 (in /tmp/a)
==164909==by 0x10CB56: func_1.isra.0 (in /tmp/a)
==164909==by 0x108375: main (in /tmp/a)
```

With GCC trunk asan+ubsan with -O3 -fno-stack-protector -fno-strict-aliasing
(so in theory not miscompiled):
```
==165062==ERROR: AddressSanitizer: stack-use-after-return on address
0x71feb7101020 at pc 0x5f8e5041002d bp 0x7ffc14e9c610 sp 0x7ffc14e9c600
READ of size 4 at 0x71feb7101020 thread T0
#0 0x5f8e5041002c in main (/tmp/a+0xa02c)
#1 0x75feb9205746 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
#2 0x75feb92057f6 in __libc_start_main_impl ../csu/libc-start.c:360
#3 0x5f8e50411f20 in _start (/tmp/a+0xbf20)

Address 0x71feb7101020 is located in stack of thread T0 at offset 32 in frame
#0 0x5f8e5041c70f in func_14.constprop.0.isra.0 (/tmp/a+0x1670f)

  This frame has 14 object(s):
[32, 36) 'l_985' (line 3338) <== Memory access at offset 32 is inside this
variable
[48, 52) 'p_0_EnthG' (line )
[64, 68) 'p_1_Unkxz' (line )
[80, 88) 'l_1048' (line 3343)
[112, 120) 'l_1160' (line 3344)
[144, 152) 'l_1016' (line 3385)
[176, 186) 'proxy_hzikI'
[208, 220) 'l_1046' (line 3408)
[240, 264) 'l_1068' (line 3386)
[304, 356) 'proxy_J0MGc'
[400, 456) 'l_986' (line 3358)
[496, 556) 'proxy_1ZFFb'
[592, 672) 'l_1208' (line 3346)
[704, 2648) 'l_1209' (line 3345)
HINT: this may be a false positive if your program uses some custom stack
unwind mechanism, swapcontext or vfork
  (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-return (/tmp/a+0xa02c) in main
Shadow bytes around the buggy address:
  0x71feb7100d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x71feb7100e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x71feb7100e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x71feb7100f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x71feb7100f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x71feb7101000: f5 f5 f5 f5[f5]f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x71feb7101080: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x71feb7101100: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x71feb7101180: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x71feb7101200: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x71feb7101280: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
```

[Bug c/117911] Segmentation fault with '-O3 -fno-inline-functions-called-once -fno-inline-small-functions -fno-ipa-cp -fno-ipa-modref -fno-ipa-pure-const'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117911

Sam James  changed:

   What|Removed |Added

  Known to fail||12.4.1, 15.0
  Known to work||13.3.1

--- Comment #2 from Sam James  ---
Needs -fno-stack-protector. For me, 11.5.0/12.4.1 (tip)/trunk fail, while
13.3.1 (tip) works.

[Bug middle-end/64242] Longjmp expansion incorrect

2024-12-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242

Georg-Johann Lay  changed:

   What|Removed |Added

   Keywords||wrong-code
   Last reconfirmed|2018-11-29 00:00:00 |2024-12-4

--- Comment #42 from Georg-Johann Lay  ---
The pr64242.c test case also FAILs on avr with v14.2 and current trunk (future
v15).

In the .expand dump, there is basically this:

;; __builtin_longjmp (&buf, 1);

(insn 13 12 14 (set (reg:HI 47)
(mem:HI (plus:HI (reg/f:HI 38 virtual-stack-vars)
(const_int 2 [0x2])) [4  S2 A8])) "pr64242.c":12:3 -1
 (nil))
...
(insn 16 15 17 (set (reg/f:HI 28 r28)
(mem:HI (reg/f:HI 38 virtual-stack-vars) [4  S2 A8])) "pr64242.c":12:3
-1
 (nil))
...
(insn 19 18 20 (set (reg/f:HI 32 __SP_L__)
(mem:HI (plus:HI (reg/f:HI 38 virtual-stack-vars)
(const_int 4 [0x4])) [4  S2 A8])) "pr64242.c":12:3 -1
 (nil))

Insn 13 is loading the jump address and is correct.

Insn 16 is setting the frame-pointer r28:HI to buf[0].

Insn 19 is reading SP from the frame buf[4], but the frame-pointer has already
been changed, hence SP will contain garbage.
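
The ordering problem can be modelled outside the compiler.  What follows is
only a toy sketch in C, not the avr expander: struct machine, longjmp_buggy,
longjmp_fixed and make_machine are hypothetical names, the buffer is indexed
in words rather than bytes, and loading through temporaries is just one
conceivable way to avoid the hazard.  It assumes, as in the dump above, that
the buffer is addressed relative to the frame pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine: three registers plus a small word-addressed memory.  */
struct machine
{
  uint16_t fp, sp, pc;
  uint16_t mem[16];
};

/* Buffer layout relative to the frame pointer (word offsets, hypothetical):
   [0] saved frame pointer, [1] jump address, [2] saved stack pointer.  */
enum { BUF_FP = 0, BUF_PC = 1, BUF_SP = 2 };

/* Mirrors the order of insns 13, 16, 19: FP is overwritten before the
   last load that still uses it as a base address.  */
void
longjmp_buggy (struct machine *m)
{
  m->pc = m->mem[m->fp + BUF_PC];
  m->fp = m->mem[m->fp + BUF_FP];
  m->sp = m->mem[m->fp + BUF_SP];   /* Base is already the new FP.  */
}

/* Loads everything through the old FP first, then commits.  */
void
longjmp_fixed (struct machine *m)
{
  uint16_t new_pc = m->mem[m->fp + BUF_PC];
  uint16_t new_fp = m->mem[m->fp + BUF_FP];
  uint16_t new_sp = m->mem[m->fp + BUF_SP];
  m->pc = new_pc;
  m->fp = new_fp;
  m->sp = new_sp;
}

/* Sets up a machine whose buffer sits at FP = 4 and holds FP = 10,
   PC = 0x123, SP = 12; the word at mem[10 + BUF_SP] is unrelated data.  */
struct machine
make_machine (void)
{
  struct machine m = { .fp = 4, .sp = 7, .pc = 0 };
  m.mem[4 + BUF_FP] = 10;
  m.mem[4 + BUF_PC] = 0x123;
  m.mem[4 + BUF_SP] = 12;
  m.mem[10 + BUF_SP] = 0xdead;
  assert (m.fp + BUF_SP < 16);
  return m;
}
```

In this model the buggy order leaves SP holding whatever happens to sit at
the new frame pointer plus the SP offset, while the fixed order restores the
saved value.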

[Bug c/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #3 from Sam James  ---
I'll give it a go now.

[Bug tree-optimization/116463] [15 Regression] complex multiply vectorizer detection failures after r15-3087-gb07f8a301158e5

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116463

--- Comment #33 from Richard Biener  ---
I see.  Note it's SLP discovery's association code that figures out an SLP
graph; disabling this ends up with single-lane (store-lanes) from the start.
The association that "succeeds" first wins, and it's an unfortunate one (for
SLP pattern detection).

The thing is that the re-association greedily figures the best operand
order as well.  We start with

t.c:3:21: note:   pre-sorted chains of plus_expr
plus_expr _19 plus_expr _27 minus_expr _26 
plus_expr _18 minus_expr _29 minus_expr _28

and if we'd start with

plus_expr _19 plus_expr _27 minus_expr _26 
plus_expr _18 minus_expr _28 minus_expr _29 

instead we get the desired SLP pattern match, but store-lanes is still
preferred it seems (not sure how we got away with no store-lanes in GCC 13).
We could simply refuse to override the SLP graph with load/store-lanes
when patterns were found:

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 3892e1be3f2..4fb57a76f85 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -5064,7 +5065,7 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size,
  to cancel SLP when this applied to all instances in a loop but now
  we decide this per SLP instance.  It's important to do this only
  after SLP pattern recognition.  */
-  if (is_a <loop_vec_info> (vinfo))
+  if (!pattern_found && is_a <loop_vec_info> (vinfo))
 FOR_EACH_VEC_ELT (LOOP_VINFO_SLP_INSTANCES (vinfo), i, instance)
   if (SLP_INSTANCE_KIND (instance) == slp_inst_kind_store
  && !SLP_INSTANCE_TREE (instance)->ldst_lanes)

when starting with the swapped ops above we then get the desired code
again.  I've hacked that in with

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 3892e1be3f2..4fb57a76f85 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2275,6 +2275,7 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
  /* 1. pre-sort according to def_type and operation.  */
  for (unsigned lane = 0; lane < group_size; ++lane)
chains[lane].stablesort (dt_sort_cmp, vinfo);
+ std::swap (chains[1][2], chains[1][1]);
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,

it happens that in this specific case the optimal operand order matches
stmt order so the following produces that - but I'm not positively sure
that's always good (though the 'stablesort' also tries to not disturb
order - but in this case it's the DFS order collecting the scalar ops).
In reality there's not enough info on the op or its definition to locally
decide a better order for future pattern matching.

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 3892e1be3f2..f21e8b909ff 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -1684,7 +1684,12 @@ dt_sort_cmp (const void *op1_, const void *op2_, void *)
   auto *op2 = (const chain_op_t *) op2_;
   if (op1->dt != op2->dt)
 return (int)op1->dt - (int)op2->dt;
-  return (int)op1->code - (int)op2->code;
+  if ((int)op1->code != (int)op2->code)
+return (int)op1->code - (int)op2->code;
+  if (TREE_CODE (op1->op) == SSA_NAME && TREE_CODE (op2->op) == SSA_NAME)
+return (gimple_uid (SSA_NAME_DEF_STMT (op1->op))
+   - gimple_uid (SSA_NAME_DEF_STMT (op2->op)));
+  return 0;
 }

 /* Linearize the associatable expression chain at START with the


That said, I don't have a good idea on how to make this work better, not
even after re-doing SLP discovery.  Maybe SLP patterns need to work on the
initial single-lane SLP graph?  But then we'd have to find lane-matches
on two unconnected SLP sub-graphs which complicates the pattern matching
part.

We basically form SLP nodes from two sets of (two lanes) plus/minus ops
(three each) but we of course try to avoid SLP build of all 3! permutations
possible and stop at the first one that succeeds.
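
The tie-break that the last patch above adds to dt_sort_cmp can be sketched
in isolation.  This is a simplified model, not the vectorizer code: struct
chain_op, enum op_code, chain_op_cmp and sort_chain are hypothetical names,
and def_uid stands in for the uid of the defining statement.  Unlike the
patch, the sketch compares the uids instead of subtracting them, so unsigned
wrap-around cannot flip the sign.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the operation codes; only their relative
   order matters here.  */
enum op_code { OP_PLUS = 0, OP_MINUS = 1 };

/* Simplified model of a chain entry: definition type, operation code and
   the position of the defining statement.  */
struct chain_op
{
  int dt;
  enum op_code code;
  unsigned def_uid;
};

/* Orders by definition type, then operation code, then statement order.  */
static int
chain_op_cmp (const void *a_, const void *b_)
{
  const struct chain_op *a = a_;
  const struct chain_op *b = b_;
  if (a->dt != b->dt)
    return a->dt - b->dt;
  if (a->code != b->code)
    return (int) a->code - (int) b->code;
  /* Tie-break on statement order, written as a comparison rather than a
     subtraction.  */
  return (a->def_uid > b->def_uid) - (a->def_uid < b->def_uid);
}

/* Sorts one lane's chain in place.  */
void
sort_chain (struct chain_op *ops, size_t n)
{
  assert (ops != NULL || n == 0);
  qsort (ops, n, sizeof *ops, chain_op_cmp);
}
```

Feeding it the second lane from the dump (plus _18, minus _29, minus _28,
with the SSA version numbers used as stand-in uids) yields plus _18,
minus _28, minus _29.  Because entries with distinct uids never compare
equal, it does not matter here that qsort is not a stable sort.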

[Bug c/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #4 from Sam James  ---
Reproduced.

[Bug c/117912] New: Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread laoar.shao at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

Bug ID: 117912
   Summary: Linux Kernel 6.13-rc1 Build Failure: 'Detected write
beyond size of object'
   Product: gcc
   Version: 14.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: laoar.shao at gmail dot com
  Target Milestone: ---

Created attachment 59787
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59787&action=edit
kernel config

GCC version: 14.2.0
Linux Kernel Version: 6.13-rc1 [0]
Kernel Config: Attached.

We recently encountered the following build failure with the latest Linux
Kernel:


  CC  kernel/time/tick-broadcast.o
In file included from
/work/build/trace/nobackup/linux-test.git/include/linux/string.h:390,
 from
/work/build/trace/nobackup/linux-test.git/include/linux/bitmap.h:13,
 from
/work/build/trace/nobackup/linux-test.git/include/linux/cpumask.h:12,
 from
/work/build/trace/nobackup/linux-test.git/include/linux/smp.h:13,
 from
/work/build/trace/nobackup/linux-test.git/include/linux/lockdep.h:14,
 from
/work/build/trace/nobackup/linux-test.git/include/linux/spinlock.h:63,
 from
/work/build/trace/nobackup/linux-test.git/include/linux/wait.h:9,
 from
/work/build/trace/nobackup/linux-test.git/include/linux/wait_bit.h:8,
 from
/work/build/trace/nobackup/linux-test.git/include/linux/fs.h:6,
 from
/work/build/trace/nobackup/linux-test.git/kernel/auditsc.c:37:
In function ‘sized_strscpy’,
inlined from ‘__audit_ptrace’ at
/work/build/trace/nobackup/linux-test.git/kernel/auditsc.c:2732:2:
/work/build/trace/nobackup/linux-test.git/include/linux/fortify-string.h:293:17:
error: call to ‘__write_overflow’ declared with attribute error: detected write
beyond size of object (1st parameter)
  293 | __write_overflow();
  | ^~
  CC  arch/x86/kernel/tracepoint.o
In function ‘sized_strscpy’,
inlined from ‘audit_signal_info_syscall’ at
/work/build/trace/nobackup/linux-test.git/kernel/auditsc.c:2759:3:
/work/build/trace/nobackup/linux-test.git/include/linux/fortify-string.h:293:17:
error: call to ‘__write_overflow’ declared with attribute error: detected write
beyond size of object (1st parameter)
  293 | __write_overflow();
  | ^~
  AR  drivers/nvmem/built-in.a
make[4]: ***
[/work/build/trace/nobackup/linux-test.git/scripts/Makefile.build:229:
kernel/auditsc.o] Error 1

It seems there may be an issue with GCC. After searching the GCC bug tracker, I
found some similar reports, such as Bug 101941 [1]. However, since Bug 101941
was resolved in GCC-12.1, this appears to be a new or related issue.

It’s worth noting that this problem can be temporarily addressed with a code
adjustment [2], which might help in investigating the root cause more
effectively.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ 
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101941  
[2] https://lore.kernel.org/audit/20241203060350.69472-1-laoar.s...@gmail.com/

[Bug tree-optimization/116463] [15 Regression] complex multiply vectorizer detection failures after r15-3087-gb07f8a301158e5

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116463

--- Comment #34 from Richard Biener  ---
Possibly first computing a lattice val for each SSA name whether its origin is
a "real" or a "imag" component of a complex load could get us meta but even
then the individual sorting which determines the initial association to SLP
nodes would be only possible to adjust "cross-lane" (and to what?  I guess
combine real+imag parts?  Hopefully of the same entity).  Into vect we get with

  _19 = REALPART_EXPR <*_3>;
  _18 = IMAGPART_EXPR <*_3>;
  _5 = a_14(D) + _2;
  _23 = REALPART_EXPR <*_5>; // real
  _24 = IMAGPART_EXPR <*_5>; // imag
  _26 = b$real_11 * _23; // real?
  _27 = _24 * _53; // imag?
  _28 = _23 * _53;  // mixed?  but fed into imag
  _29 = b$real_11 * _24; // mixed?
  _7 = _18 - _28;  // mixed? or imag?
  _22 = _27 - _26;  // mixed?
  _32 = _19 + _22;  // mixed?  or real?
  _33 = _7 - _29; // mixed?  but fed into real?
  REALPART_EXPR <*_3> = _32;
  IMAGPART_EXPR <*_3> = _33;

so not sure if that will help.  That we'd like to have full load groups
is unfortunately only visible a node deeper.  We could also fill a lattice
with group IDs but we'd need to know parts to identify lane duplicates vs.
full groups.  It's also a lot of hassle for not much gain and very special
cases?

[Bug tree-optimization/115494] [14/15 Regression] wrong code at -O{2,3} on x86_64-linux-gnu since r14-3485

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115494

--- Comment #13 from Richard Biener  ---
(In reply to Mikael Morin from comment #12)
> Created attachment 59781 [details]
> Possible patch
> 
> I have had a look at this PR and can propose the attached patch that seems
> to fix the issue.  It does so by creating new variables during phi
> translation of expression whose variables are not defined in the current
> block.  I suppose there could be an alternative fix that would avoid the
> creation of new variables, and just reset the flow sensitive information of
> existing variables when using them outside of their definition scope.  I
> didn't try it though, and I won't have time to study this further, so I'm
> posting the patch so that it's not lost.

The patch basically papers over the issue that we try to do PHI translation
on the expression representation to keep those "valid" but take a detour
through valueization.  Translating the expression operand newnary->op[i] we
do

pre_expr leader, result;
unsigned int op_val_id = VN_INFO (newnary->op[i])->value_id;
leader = find_leader_in_sets (op_val_id, set1, set2);
result = phi_translate (dest, leader, set1, set2, e);
if (result && result != leader)
  /* If op has a leader in the sets we translate make
 sure to use the value of the translated expression.
 We might need a new representative for that.  */
  newnary->op[i] = get_representative_for (result, pred);
else if (!result)
  return NULL;

changed |= newnary->op[i] != nary->op[i];

so we turn the op into a value and then look for a leader in our set
of expressions to translate (but not necessarily preferring nary->op[i]
itself).  But we only skip get_representative_for if phi-translation
did anything - the patch always assigns a new representative.
To that extent the original newnary->op[i] was already "bad" (but 
as indicated in comment#8 I believed that was all OK?).

That said, I know the transition to making expressions first class is done
only half-way.  Expressions in the sets should always be "representatives"
at the respective program point to be eligible for expression simplification.

One could also try to attack the issue by making sure all expressions for
the same value we'd insert for would behave the same before re-using the
value for the inserted PHI.  But somehow that also feels odd.

[Bug translation/90160] missing quote in diagnostic

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90160

--- Comment #10 from GCC Commits  ---
The master branch has been updated by David Malcolm :

https://gcc.gnu.org/g:a0ac8fa55a4749979faa56a9e8bd389252edc89f

commit r15-5922-ga0ac8fa55a4749979faa56a9e8bd389252edc89f
Author: David Malcolm 
Date:   Wed Dec 4 08:40:34 2024 -0500

arm: use quotes when referring to command-line options [PR90160]

gcc/ChangeLog:
PR translation/90160
* config/arm/arm.cc (arm_option_check_internal): Use quotes in
messages that refer to command-line options.  Tweak wording.

Signed-off-by: David Malcolm 

[Bug c/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #6 from Sam James  ---
I'm reducing it.

[Bug c++/117855] [14/15 Regression] ICE with deduction guides, default template arguments and inherited deduction guides

2024-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117855

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
  Known to work||13.3.0

--- Comment #8 from Richard Biener  ---
GCC 14.1 and 14.2 reject the testcase due to PR116276 instead.  While the ICE
is new on the branch, the fact that we erroneously rejected it before is an
argument against making this P1.

[Bug c/107980] va_start does not warn about an arbitrary number of arguments in C2x mode

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107980

--- Comment #19 from Jakub Jelinek  ---
Created attachment 59789
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59789&action=edit
gcc15-pr107980.patch

In that case here is an untested patch.
I've changed the macro to
#define va_start(...) __builtin_c23_va_start(__VA_ARGS__)
rather than
#define va_start(v, ...) __builtin_c23_va_start(v __VA_OPT__(,) __VA_ARGS__)
so that we diagnose even
va_start (ap,);
Is it a problem that va_start() won't be diagnosed during preprocessing and
only during compilation?
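
The difference between the two macro shapes can be seen by stringifying
their expansions.  This is only a sketch: FORM_ALL and FORM_OPT are
hypothetical stand-ins for the two candidate definitions, "builtin" is a
plain token that is never called, and the code needs a preprocessor that
implements __VA_OPT__.

```c
#include <assert.h>
#include <stddef.h>

/* Two-level stringification so the argument is macro-expanded first.  */
#define STR_(...) #__VA_ARGS__
#define STR(...) STR_(__VA_ARGS__)

/* Stand-ins for the two va_start shapes discussed above.  */
#define FORM_ALL(...) builtin(__VA_ARGS__)
#define FORM_OPT(v, ...) builtin(v __VA_OPT__(,) __VA_ARGS__)

/* Expansions for a call with a stray trailing comma ...  */
const char *const all_trailing = STR(FORM_ALL(ap,));
const char *const opt_trailing = STR(FORM_OPT(ap,));
/* ... and for an ordinary two-argument call.  */
const char *const all_two = STR(FORM_ALL(ap, last));
const char *const opt_two = STR(FORM_OPT(ap, last));

/* Counts the commas that survived macro expansion.  */
size_t
count_commas (const char *s)
{
  assert (s != NULL);
  size_t n = 0;
  for (; *s; ++s)
    n += (*s == ',');
  return n;
}
```

With the pass-everything form the stray comma in (ap,) reaches the builtin,
so the compiler proper can complain about it.  With the __VA_OPT__ form the
comma is dropped during preprocessing and the call looks like a plain
one-argument use.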

[Bug rtl-optimization/117248] gcc/libgcc/libgcc2.h:232:25: internal compiler error: Arithmetic exception

2024-12-04 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117248

--- Comment #10 from Vladimir Makarov  ---
(In reply to John David Anglin from comment #7)
>
> Compile command:
>  /home/dave/gnu/gcc/objdir/./prev-gcc/cc1plus -fpreprocessed
> tree-vect-slp.ii -quiet -dumpbase tree-vect-slp.cc -dumpbase-ext .cc -g -O2
> -Wextra -Wall -Wno-error=narrowing -Wwrite-strings -Wcast-qual
> -Wsuggest-attribute=format -Wconditionally-supported
> -Woverloaded-virtual=2 -Wpedantic -Wno-long-long -Wno-variadic-macros
> -Wno-overlength-strings -Werror -version -fno-checking -fno-exceptions
> -fno-rtti -fasynchronous-unwind-tables -fno-PIE -o
> tree-vect-slp.s

Sorry, I tried to reproduce the bug but failed.  With yesterday's trunk, I have
before LRA:

 2981: %r26:SI=r171:SI
 2982: %r25:SI=r104:SI
 2983: {%r29:SI=udiv(%r26:SI,%r25:SI);clobber r951:SI;clobber r952:SI;clobber
%r26:SI;clobber %r25:SI;clobber %r31:SI;}
  REG_DEAD %r26:SI
  REG_DEAD %r25:SI
  REG_UNUSED r952:SI
  REG_UNUSED r951:SI
  REG_UNUSED %r31:SI
  REG_UNUSED %r26:SI
  REG_UNUSED %r25:SI
  REG_EQUAL udiv(r171:SI,r104:SI)
 2985: %r26:SI=r171:SI
 5697: r1590:SI=%r29:SI
  REG_DEAD %r29:SI
 2986: %r25:SI=r104:SI
 2984: r121:SI=r1590:SI
  REG_DEAD r1590:SI
  REG_EQUAL udiv(r171:SI,r104:SI)
 2987: {%r29:SI=umod(%r26:SI,%r25:SI);clobber r956:SI;clobber r955:SI;clobber
%r26:SI;clobber %r25:SI;clobber %r31:SI;}

and after LRA:

 2981: %r26:SI=%r3:SI
 6560: %r22:SI=%r30:SI-0xe0
 2982: %r25:SI=[%r22:SI]
 2983: {%r29:SI=udiv(%r26:SI,%r25:SI);clobber %r1:SI;clobber %r28:SI;clobber
%r26:SI;clobber %r25:SI;clobber %r31:SI;}
  REG_EQUAL udiv(%r3:SI,[%r30:SI-0xe0])
 2985: %r26:SI=%r3:SI
 2986: %r25:SI=[%r22:SI]
 2984: %r6:SI=%r29:SI
  REG_EQUAL udiv(%r3:SI,[%r30:SI-0xe0])
 2987: {%r29:SI=umod(%r26:SI,%r25:SI);clobber %r1:SI;clobber %r28:SI;clobber
%r26:SI;clobber %r25:SI;clobber %r31:SI;}
  REG_EQUAL umod(%r3:SI,[%r30:SI-0xe0])

Insn 2986 still exists (absent in your RTL code).  If you give me the exact
commit hash, I could try to reproduce it again.

[Bug c/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #5 from Sam James  ---
Created attachment 59788
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59788&action=edit
auditsc.i.xz

```
$ gcc -c kernel/auditsc.i -std=gnu17 -Os
In file included from ./arch/x86/include/asm/linkage.h:6,
 from ./include/linux/linkage.h:8,
 from ./include/linux/fs.h:5,
 from kernel/auditsc.c:37:
./arch/x86/include/asm/ibt.h:77:1: warning: ‘nocf_check’ attribute ignored. Use
‘-fcf-protection’ option to enable it [-Wattributes]
   77 | extern __noendbr u64 ibt_save(bool disable);
  | ^~
./arch/x86/include/asm/ibt.h:78:1: warning: ‘nocf_check’ attribute ignored. Use
‘-fcf-protection’ option to enable it [-Wattributes]
   78 | extern __noendbr void ibt_restore(u64 save);
  | ^~
In file included from ./include/linux/string.h:389,
 from ./include/linux/bitmap.h:13,
 from ./include/linux/cpumask.h:12,
 from ./include/linux/smp.h:13,
 from ./include/linux/lockdep.h:14,
 from ./include/linux/spinlock.h:63,
 from ./include/linux/wait.h:9,
 from ./include/linux/wait_bit.h:8,
 from ./include/linux/fs.h:6:
In function ‘sized_strscpy’,
inlined from ‘__audit_ptrace’ at kernel/auditsc.c:2732:2:
./include/linux/fortify-string.h:293:17: error: call to ‘__write_overflow’
declared with attribute error: detected write beyond size of object (1st
parameter)
  293 | __write_overflow();
  | ^~
In function ‘sized_strscpy’,
inlined from ‘audit_signal_info_syscall’ at kernel/auditsc.c:2759:3:
./include/linux/fortify-string.h:293:17: error: call to ‘__write_overflow’
declared with attribute error: detected write beyond size of object (1st
parameter)
  293 | __write_overflow();
  | ^~
```

[Bug c++/117813] [14 Regression] GCC14 + -fsanitize=undefined + -Os + recursive_directory_iterator results in undefined reference since r14-5979-g99d114c15523e0

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117813

--- Comment #11 from Jakub Jelinek  ---
E.g.
int baz ();
void qux (int *);
template <int N>
struct S {
#define P qux (v);
#define Q P P P P P P P P P P
#define R Q Q Q Q Q Q Q Q Q Q
  constexpr S () { if consteval { } else { int v[baz ()]; R R R R R R R R } }
};
//extern template struct S<0>;
constexpr S<0> a;
S<0> b;
S<0> c;
S<0> d;
compiled with g++ 12 -Os -std=c++23 -fno-ipa-cp -fno-ipa-sra
only exported _ZN1SILi0EEC4Ev.
With -Os -std=c++23 -fno-ipa-cp -fno-ipa-sra
only exported _ZN1SILi0EEC1Ev.
Though, admittedly with
template struct S<0>; at the end it also exports
_ZN1SILi0EEC2Ev and _ZN1SILi0EEC1Ev.
Now, if one in some other TU uses the same with extern template struct S<0>;
seems C1 is called in all cases.  So maybe just arranging for the extern
template cases
that one never references C4 might be good enough.

[Bug tree-optimization/116463] [15 Regression] complex multiply vectorizer detection failures after r15-3087-gb07f8a301158e5

2024-12-04 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116463

--- Comment #35 from Tamar Christina  ---
(In reply to Richard Biener from comment #34)
> Possibly first computing a lattice val for each SSA name whether its origin
> is a "real" or a "imag" component of a complex load could get us meta but
> even then the individual sorting which determines the initial association to
> SLP nodes would be only possible to adjust "cross-lane" (and to what?  I
> guess combine real+imag parts?  Hopefully of the same entity).  Into vect we
> get with
> 
>   _19 = REALPART_EXPR <*_3>;
>   _18 = IMAGPART_EXPR <*_3>;
>   _5 = a_14(D) + _2;
>   _23 = REALPART_EXPR <*_5>; // real
>   _24 = IMAGPART_EXPR <*_5>; // imag
>   _26 = b$real_11 * _23; // real?
>   _27 = _24 * _53; // imag?
>   _28 = _23 * _53;  // mixed?  but fed into imag
>   _29 = b$real_11 * _24; // mixed?
>   _7 = _18 - _28;  // mixed? or imag?
>   _22 = _27 - _26;  // mixed?
>   _32 = _19 + _22;  // mixed?  or real?
>   _33 = _7 - _29; // mixed?  but fed into real?
>   REALPART_EXPR <*_3> = _32;
>   IMAGPART_EXPR <*_3> = _33;
> 
> so not sure if that will help.  That we'd like to have full load groups
> is unfortunately only visible a node deeper.  We could also fill a lattice
> with group IDs but we'd need to know parts to identify lane duplicates vs.
> full groups.  It's also a lot of hassle for not much gain and very special
> cases?

That should help, because all it's after is that the final loads be permuted.
The reason I'm keen to fix this is because it's not that niche. complex-add due
to the operation being just +/- with a permute is by far the most common one.

Not recognizing this is e.g. 10% on fotonik in SPECCPU2017 FP, which is also a
regression I'm trying to fix.

I can try to reduce a testcase for that to see if maybe that specific one is
easier to fix.  I'm just wondering if we can't do better in the future, e.g.
LLVM recognizes both fms180snd cases for instance.

If it's easier, I could see if we can just have another pattern to discover
fmul + fcadd?

Could maybe work and fix the SPEC regression... need to make a testcase

[Bug tree-optimization/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

Sam James  changed:

   What|Removed |Added

  Component|c   |tree-optimization

--- Comment #7 from Sam James  ---
At -O2, 4.8.1 fails and 4.7.4 works for:
```
long sized_strscpy_size;
struct audit_context *audit_context();
struct lsm_prop {
} security_task_getlsmprop_obj();

void __write_overflow() __attribute__((
__error__("detected write beyond size of object (1st parameter)")));

struct audit_context {
  int target_pid;
  struct lsm_prop target_ref;
  char target_comm[];
} __audit_ptrace() {
  struct audit_context *context = audit_context();
  security_task_getlsmprop_obj(&context->target_ref);
  unsigned long p_size = __builtin_object_size(context->target_comm, 1);
  if (p_size < sized_strscpy_size)
__write_overflow();
}
```

Clang accepts it. s/_dynamic/ to test with older GCC.

[Bug tree-optimization/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #8 from Sam James  ---
I think the testcase needs refining, though -- it's really right for
sized_strscpy_size (happens w/o target_comm it being a FAM too).

[Bug tree-optimization/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #9 from Jakub Jelinek  ---
That is completely unreadable.
Slightly cleaned up:

long s;
struct A *foo (void);
struct B {};
struct B bar (struct B *);
void baz (void) __attribute__((__error__ ("baz")));
struct A { int a; struct B b; char c[]; };
void
qux (void)
{
  struct A *p = foo ();
  bar (&p->b);
  unsigned long q = __builtin_object_size (p->c, 1);
  if (q < s)
    baz ();
}

I think the error started with r0-118806-g17742d62a2438144b6235.
The early_objsz pass doesn't do anything special here, because it sees it is
used on a flexible array member (and even char c[24]; wouldn't do anything, as
it is still considered a flexible-like array member, only having some other
field after that would cause MIN_EXPR <__builtin_object_size (p->c, 1), 24>).
But then fre1 comes and changes
   _1 = &p_8->b;
   bar (_1);
-  _2 = &p_8->c;
-  q_10 = __builtin_object_size (_2, 1);
+  q_10 = __builtin_object_size (_1, 1);
(i.e. value numbers &p_8->b and &p_8->c the same, p->b is zero sized structure,
so &p->b == &p->c and for GIMPLE most pointer conversions are useless).

Similar testcase could be
struct S { int a; int b[2]; int c; };
struct T { int a; int b[24]; int c; };
union U { struct S s; struct T t; };
void bar (int *);
void baz (union U *);

__SIZE_TYPE__
foo (union U *p)
{
  bar (&p->s.b[0]);
  baz (p); /* Assume this changes current member from p->s to p->t.  */
  return __builtin_object_size (&p->t.b[0], 1);
}
although SCCVN doesn't value number &p->t.b[0] the same as &p->s.b[0] for some
reason.

The only fix I have in mind right now is simply treat __bos and __bdos 1 and 3
modes as 0 and 2 during late objsz.  Though it surely will lead to some
regressions in security coverage, e.g. if early inlining doesn't figure out
what field (or fields) certain pointer points to (or could point to) and it is
only late inlining that allows us to see it.
Or a targeted fix just for this case, if during late objsz we see __bos/__bdos
1 or 3 on zero sized field and there is a non-zero sized one or flexible array
member or flexible-like array member at the same address, conservatively assume
the larger.
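
The address coincidence described above can be checked by hand. The following is a minimal sketch, assuming GNU C (an empty struct is an extension there and has size 0); the function names are hypothetical and only illustrate why a subobject-mode query on &p->b and one on &p->c refer to the same address while bounding very different fields:

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the reduced testcase.  An empty struct is a GNU C
   extension with size 0; this is not valid ISO C.  */
struct B {};
struct A { int a; struct B b; char c[]; };

/* Nonzero when the zero-sized field b and the flexible array member c
   start at the same byte offset, i.e. &p->b and &p->c are equal as
   addresses.  */
int
fields_share_address (void)
{
  return offsetof (struct A, b) == offsetof (struct A, c);
}

/* Size of the zero-sized field, which is the bound a subobject-mode
   query would use if it were handed &p->b instead of &p->c.  */
size_t
zero_sized_field_size (void)
{
  return sizeof (struct B);
}

/* Convenience check combining both observations.  */
void
check_layout (void)
{
  assert (fields_share_address ());
  assert (zero_sized_field_size () == 0);
}
```

Because both fields share an address, value numbering them the same is harmless for ordinary code, but it changes which field a later mode 1 or 3 query is measured against.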

[Bug tree-optimization/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #11 from Jakub Jelinek  ---
Note, in the second testcase in the above comment, if I change it to
struct S { int a; int b[2]; int c; };
struct T { int a; int b[24]; int c; };
union U { struct S s; struct T t; };
void bar (int *);
void baz (union U *);

__SIZE_TYPE__
foo (union U *p)
{
  bar (&p->s.b[0]);
  baz (p); /* Assume this changes current member from p->s to p->t.  */
  bar (&p->t.b[0]);
  return __builtin_object_size (&p->t.b[0], 1);
}
so that we see VN of the different pointers, fre3 actually optimizes that
   _1 = &p_3(D)->s.b[0];
   bar (_1);
   baz (p_3(D));
-  _2 = &p_3(D)->t.b[0];
-  bar (_2);
+  bar (_1);
so that optimization clearly must be one of the
(cfun->curr_properties & PROP_objsz)
guarded ones, while what happens on the first testcase happens regardless of
that condition.
So yet another fix might be guard that optimization with (cfun->curr_properties
& PROP_objsz) too.

[Bug tree-optimization/117912] [12/13/14/15 regression] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object' since r0-118806-g17742d62a2438144b6235

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

Sam James  changed:

   What|Removed |Added

Summary|Linux Kernel 6.13-rc1 Build |[12/13/14/15 regression]
   |Failure: 'Detected write|Linux Kernel 6.13-rc1 Build
   |beyond size of object'  |Failure: 'Detected write
   ||beyond size of object'
   ||since
   ||r0-118806-g17742d62a2438144
   ||b6235
   Target Milestone|--- |12.5
  Known to fail||4.8.1
  Known to work||4.7.4
   Keywords||missed-optimization,
   ||rejects-valid

[Bug c++/117913] New: destroying delete operator should have implicit expection speciification

2024-12-04 Thread alisdairm at me dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117913

Bug ID: 117913
   Summary: destroying delete operator should have implicit
expection speciification
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: alisdairm at me dot com
  Target Milestone: ---

According to [except.spec] 14.5p9, "A deallocation function (6.7.6.5.3) with no
explicit noexcept-specifier has a non-throwing exception specification."

According to the xref (6.7.6.5.3), destroying delete is a deallocation
function, so should have an implicit `noexcept(true)` where no exception
specification is provided.

The following test program verifies all permutations, and gcc fails for only
the case of the implicit exception specification.

Note that when there is a destroying delete, the destructor for the class is
*not* called as part of the delete expression, so the throwing destructor in
the example is there to confirm other behavior that GCC handles correctly.


#include <cstdio>
#include <new>

struct Implicit {
    ~Implicit() noexcept(false) {}

    void operator delete(Implicit*, std::destroying_delete_t) {
        std::puts("destroyed implicit");
    }
};

struct Explicit {
    ~Explicit() noexcept(false) {}

    void operator delete(Explicit*, std::destroying_delete_t) noexcept {
        std::puts("destroyed explicit");
    }
};

struct Undefined {
    ~Undefined() noexcept(false) {}

    void operator delete(Undefined*, std::destroying_delete_t) noexcept(false)
    {
        std::puts("destroyed UB");
        throw 42;
    }
};

Implicit * pn = nullptr;
static_assert( noexcept(delete(pn)));

Explicit * qn = nullptr;
static_assert( noexcept(delete(qn)));

Undefined * un = nullptr;
static_assert(!noexcept(delete(un)));

int main() {
    Implicit *p = new Implicit();
    delete p;

    Explicit *q = new Explicit();
    delete q;

    try {
        Undefined *u = new Undefined();
        delete u;
    }
    catch (...) {
        std::puts("undefined behavior");
    }
}

[Bug c++/117913] destroying delete operator should have implicit expection speciification

2024-12-04 Thread alisdairm at me dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117913

--- Comment #1 from Alisdair Meredith  ---
Sorry, I made no effort to verify how far back this bug goes, but I expect it
has been an issue ever since destroying delete was first implemented.

[Bug tree-optimization/117912] [12/13/14/15 regression] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object' since r0-118806-g17742d62a2438144b6235

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

Jakub Jelinek  changed:

   What|Removed |Added

   Keywords|rejects-valid   |wrong-code

--- Comment #12 from Jakub Jelinek  ---
rejects-valid seems inappropriate here, sure, with the error attribute one can
error on all kinds of things and so most missed optimizations could be that way
rejects-valid as well.
I think it is wrong-code, __builtin_object_size (x, 1) is supposed to return
the minimum object size, so returning something smaller (like here 0) than the
minimum (which isn't really known, because we don't see what allocated it and
it is flexible array member) is wrong-code, returning something larger is
perhaps missed optimization (but we document that it can always give up and
return something more conservative up to all ones).
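
The "give up" values mentioned above are the documented fallbacks of the builtin when nothing is known about the object. A small sketch, assuming GCC or Clang; the variable and function names are made up for illustration, and the volatile pointer is only there so the compiler cannot see what is pointed to:

```c
#include <assert.h>
#include <stddef.h>

/* Reading through a volatile pointer keeps the compiler from knowing
   which object the pointer refers to.  The pointer is never
   dereferenced here.  */
void *volatile opaque_ptr;

/* Modes 0 and 1: with no information the builtin returns (size_t) -1,
   i.e. "all ones".  */
size_t
unknown_size_mode1 (void)
{
  void *p = opaque_ptr;
  return __builtin_object_size (p, 1);
}

/* Modes 2 and 3: with no information the builtin returns 0.  */
size_t
unknown_size_mode3 (void)
{
  void *p = opaque_ptr;
  return __builtin_object_size (p, 3);
}

/* Convenience check of both fallbacks.  */
void
check_fallbacks (void)
{
  assert (unknown_size_mode1 () == (size_t) -1);
  assert (unknown_size_mode3 () == 0);
}
```

Seen this way, a mode 1 result of 0 for a flexible array member of unknown allocation is on the wrong side of the actual size, which is the wrong-code argument made in the comment.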

[Bug c++/117913] destroying delete operator should have implicit expection speciification

2024-12-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117913

--- Comment #2 from Andrew Pinski  ---
GCC, clang and MSVC all say the 1st static_assert fails.
GCC passes the 2nd while clang and MSVC say it fails.

[Bug c++/117913] destroying delete operator should have implicit expection speciification

2024-12-04 Thread alisdairm at me dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117913

--- Comment #3 from Alisdair Meredith  ---
Clang and MSVC have bigger bugs that I am filing bug reports on shortly!

EDG gets this correct and passes all parts of the test, including the
"expected" undefined behavior:
https://godbolt.org/z/EG3EP5EW4

[Bug c++/117913] destroying delete operator should have implicit expection speciification

2024-12-04 Thread alisdairm at me dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117913

--- Comment #4 from Alisdair Meredith  ---
In this case, clang and MSVC are not even considering the destroying delete
within the noexcept operator within a static_assert --- I am not sure at what
point that breaks down though.  The runtime tests demonstrate that the correct
destroying delete is called at runtime.

[Bug fortran/117901] [ 15 regression] class_transformational_1.f90 with -O3 and -fcheck=bounds gives ICE in make_ssa_name_fn

2024-12-04 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117901

Paul Thomas  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |pault at gcc dot gnu.org
Summary|class_transformational_1.f9 |[ 15 regression]
   |0 with -O3 and  |class_transformational_1.f9
   |-fcheck=bounds gives ICE in |0 with -O3 and
   |make_ssa_name_fn|-fcheck=bounds gives ICE in
   ||make_ssa_name_fn

--- Comment #3 from Paul Thomas  ---
I am trying to fix this at source.

However,

diff --git a/gcc/testsuite/gfortran.dg/class_transformational_1.f90 b/gcc/testsuite/gfortran.dg/class_transformational_1.f90
index 77ec24a43c0..3e64f5d91e5 100644
--- a/gcc/testsuite/gfortran.dg/class_transformational_1.f90
+++ b/gcc/testsuite/gfortran.dg/class_transformational_1.f90
@@ -169,7 +169,7 @@ contains
   end

   subroutine unlimited_rebar (arg)
-class(*) :: arg(:)
+class(*), allocatable :: arg(:)  ! Not having this allocatable => pr117901
 call class_bar (arg)
   end


removes the problem. If I don't get to the source by the end of the day, I will
apply the patch as a temporary measure.

Paul

[Bug c++/114844] A trivial but noexcept(false) destructor is incorrectly considered non-throwing

2024-12-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114844

--- Comment #3 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #2)
> https://cplusplus.github.io/CWG/issues/2886.html
> 
> The DR report is still have not been accepted here ...

`[Accepted as a DR at the June, 2024 meeting.]` It is now.

[Bug c/107980] va_start does not warn about an arbitrary number of arguments in C2x mode

2024-12-04 Thread jsm28 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107980

--- Comment #20 from Joseph S. Myers  ---
I think it's fine for such QoI diagnostics to happen during compilation.

Note that there is no requirement for [] or {} to be balanced in the additional
arguments, only ().

[Bug fortran/116261] [15 regression] gfortran.dg/sizeof_6.f90 -O1 timeout since r15-2739-g4cb07a38233

2024-12-04 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116261

--- Comment #13 from Paul Thomas  ---
Created attachment 59790
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59790&action=edit
Fix for this PR

[Bug fortran/116261] [15 regression] gfortran.dg/sizeof_6.f90 -O1 timeout since r15-2739-g4cb07a38233

2024-12-04 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116261

Paul Thomas  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #14 from Paul Thomas  ---
This restarted with my updated patch for pr102689. The new attachment fixes it
and will be applied tonight.

Paul

[Bug tree-optimization/117875] [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native

2024-12-04 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875

--- Comment #6 from Andrew Macleod  ---
(In reply to Richard Biener from comment #5)
> I tried get_range_query (cfun)->range_of_expr (vr, niter->niter, stmt) with
> (unsigned int) M_9(D) - (unsigned int) k_24 and an enabled ranger
> but that indeed returns [irange] unsigned int [0, 1024][4294966273, +INF]
> and not a singleton as expected.
> It seems to look for ranges of M_9 and k_24 and fold_range with the
> minus op rather than trying to use relations to simplify the subtraction.
> The ranges of the first and 2nd op are [1, 1025] and [1, 1024] respectively,
> basically [1, INF] for our purpose (the constant array bound bounds them).
> 
> We'd also miss a way to inject niter->assumptions and niter->may_be_zero
> as conditions known true for the purpose of simplifying (in this case
> those don't add anything).

It will always use ranges, and utilize any known relation in addition to that.
range-op.cc :  minus_op1_op2_relation_effect () is used for that purpose, but
it can only adjust something to a known subrange, ie  if m_9 > k_24, then it
will limit whatever the calculated range is to  [1, +INF], so we can trim out
any negatives.

What singleton are you expecting?  (and where?)  I hacked a VRP pass right
after loop interchange and after the loop I see:

=== BB 16 
M_9(D)  [irange] int [2, 1024]
Relational : (k_12 >= M_9(D))
 [local count: 67276368]:
goto ; [100.00%]

but I'm not sure how we use relations to come up with a singleton constant?
The loop backedge has:

=== BB 8 
k_12[irange] int [2, 1024]
Relational : (M_9(D) > k_15)
Relational : (k_12 < M_9(D))
 [local count: 544326978]:
goto ; [100.00%]

so we always know also M_9 > either k, I'm just not sure how that helps  OR are
we talking about the second loop?  I'm having difficulty seeing where I might
find a singleton?
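
The range quoted from comment #5 can be reproduced by hand with plain interval arithmetic. This is only a sketch of the arithmetic, not ranger's implementation; the struct and function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Closed interval [lo, hi] over integers that fit in int64_t.  */
struct interval { int64_t lo, hi; };

/* Interval subtraction: the smallest difference is a.lo - b.hi and the
   largest is a.hi - b.lo.  */
struct interval
interval_sub (struct interval a, struct interval b)
{
  struct interval r = { a.lo - b.hi, a.hi - b.lo };
  return r;
}

/* Reduce a value modulo 2^32, as an unsigned int subtraction would.  */
uint32_t
wrap_u32 (int64_t v)
{
  return (uint32_t) v;
}

/* What a known relation "minuend > subtrahend" allows: drop everything
   below 1 from the computed difference.  */
struct interval
trim_to_positive (struct interval r)
{
  if (r.lo < 1)
    r.lo = 1;
  return r;
}

/* M_9 in [1, 1025] minus k_24 in [1, 1024], as in the thread.  */
struct interval
example_difference (void)
{
  struct interval m = { 1, 1025 };
  struct interval k = { 1, 1024 };
  return interval_sub (m, k);
}
```

The raw difference is [-1023, 1024]; wrapping the negative part to unsigned gives the [0, 1024][4294966273, +INF] pair, and trimming with the relation leaves [1, 1024], which is still a range rather than a singleton.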

[Bug c++/117845] [14/15 Regression] ICE in pass eh after error with -fsanitize=address

2024-12-04 Thread simartin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117845

Simon Martin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |simartin at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #3 from Simon Martin  ---
Working on this one.

[Bug tree-optimization/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #10 from Sam James  ---
Yes, I deliberately hadn't tidied it up as I was still working on it. Anyway,
thanks.

[Bug c/117912] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object'

2024-12-04 Thread laoar.shao at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #2 from Yafang Shao  ---
Hello Sam,

Unfortunately, I don’t have access to GCC 14.0 or newer versions at the moment.
I’m currently using an older version, GCC 11.3, which is also able to reproduce
this issue.

Could you please try the following steps to reproduce the problem with the
latest GCC?

1. Clone the linux kernel src code 

  git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 

2. Copy the attached configuration file to the Linux source directory:

  cp config linux/.config

3. Navigate to the Linux source directory and build the kernel:

  make -j64  

4. You should encounter the error during the build process.


If it’s inconvenient for you to test this, I will work on building the latest
version of GCC. However, that might take some time to set up.

[Bug tree-optimization/117912] [12/13/14/15 regression] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object' since r0-118806-g17742d62a2438144b6235

2024-12-04 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

Sam James  changed:

   What|Removed |Added

   Last reconfirmed||2024-12-04
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

[Bug rtl-optimization/117248] gcc/libgcc/libgcc2.h:232:25: internal compiler error: Arithmetic exception

2024-12-04 Thread danglin at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117248

--- Comment #11 from John David Anglin  ---
LRA is not yet enabled by default on hppa.  To enable, use "-mlra" option
or hack pa.opt to enable by default:

mlra
Target Var(pa_lra_p) Init(1)
Use LRA instead of reload (transitional).

On linux, the problem occurs as of this commit:

commit 4b09e2c67ef593db171b0755b46378964421782b
Author: Vladimir N. Makarov 
Date:   Mon Nov 25 16:09:00 2024 -0500

I will try to replicate on x86_64.

[Bug rtl-optimization/117816] ICE: in rtl_verify_bb_insns, at cfgrtl.cc:2837: flow control insn inside a basic block with -O -favoid-store-forwarding -fnon-call-exceptions -fno-forward-propagate -fins

2024-12-04 Thread konstantinos.eleftheriou at vrull dot eu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117816

--- Comment #1 from Konstantinos Eleftheriou  ---
The cause of this is that we are not handling the case that we have a
REG_EH_REGION note on an instruction in the store-load sequence. Thus, we are
inserting instructions after it, leading to it no longer being the last
instruction in the block (as it should be). A possible solution would be to
reject the transformation when we have instructions that might throw exceptions
in the sequence.

[Bug rtl-optimization/117248] gcc/libgcc/libgcc2.h:232:25: internal compiler error: Arithmetic exception

2024-12-04 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117248

--- Comment #12 from Vladimir Makarov  ---
(In reply to John David Anglin from comment #11)
> LRA is not yet enabled by default on hppa.  To enable, use "-mlra" option
> or hack pa.opt to enable by default:
> 
> mlra
> Target Var(pa_lra_p) Init(1)
> Use LRA instead of reload (transitional).
> 
> On linux, the problem occurs as of this commit:
> 
> 
I see.  Thank you.  I've reproduced it with using -mlra.

[Bug tree-optimization/117912] [12/13/14/15 regression] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object' since r0-118806-g17742d62a2438144b6235

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #13 from Jakub Jelinek  ---
From what I can see, this VN is purely from vn_reference_eq and it just skips
over the COMPONENT_REFs after taking into account their vro?->off (but I do not yet
understand why the similar case with unions isn't handled the same).
So if we want to disallow value numbering them the same before PROP_objsz in
problematic cases which lead to same offset through different access path, we'd
need to do it somewhere before for (; vr1->operands.iterate (i, &vro1); i++).
And the problematic cases for objsz are either zero sized FIELD_DECLs (one of
them but not both), or I guess as well starting offset of one field vs. ending
offset of another field, say if we have
struct S { char a[24]; char b[24]; char c; };
and struct S *p, then &p->a[24] might value number the same as &p->b[0] or vice
versa, but __builtin_object_size (, 1) in one case should be 0 and in the other
case should be 24 (minimum with actual object size).
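
The end-of-one-field versus start-of-the-next case can be illustrated with a short sketch. The names below are hypothetical, and the expected subobject sizes are computed by hand with sizeof rather than by calling the builtin:

```c
#include <assert.h>
#include <stddef.h>

struct S { char a[24]; char b[24]; char c; };

/* A sample object so the addresses below are real.  */
struct S sample;

/* Nonzero when one past the end of a is the same address as the start
   of b.  Forming &sample.a[24] is fine; it is never dereferenced.  */
int
end_of_a_is_start_of_b (void)
{
  return (char *) &sample.a[24] == (char *) &sample.b[0];
}

/* Bytes left in subobject a when starting at index 24: none.  */
size_t
remaining_in_a (void)
{
  return sizeof sample.a - 24;
}

/* Bytes left in subobject b when starting at index 0: all of it.  */
size_t
remaining_in_b (void)
{
  return sizeof sample.b - 0;
}

/* Same address, different subobject bounds.  */
void
check_adjacent_fields (void)
{
  assert (end_of_a_is_start_of_b ());
  assert (remaining_in_a () == 0);
  assert (remaining_in_b () == 24);
}
```

Since the two pointers compare equal but carry different subobject bounds, treating them as one value before the object size pass loses exactly the information modes 1 and 3 depend on.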

[Bug ipa/117892] [15 Regression] ICE on valid code at -O1 and above on x86_64-linux-gnu: in single_succ_edge, at basic-block.h:332 since r15-5336-gcee7d080d5c2a5

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117892

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |hubicka at gcc dot gnu.org
 Status|NEW |ASSIGNED
 CC||jakub at gcc dot gnu.org

[Bug target/84211] [avr] Perform a post-reload register optimization pass

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84211

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Georg-Johann Lay :

https://gcc.gnu.org/g:2b75fe3708f062a8bbb432d4b0002a7a94149ab3

commit r15-5924-g2b75fe3708f062a8bbb432d4b0002a7a94149ab3
Author: Georg-Johann Lay 
Date:   Wed Dec 4 16:08:15 2024 +0100

AVR: ad target/84211 - Fix dumping INSN_UID for null insn.

gcc/
PR target/84211
* config/avr/avr-passes.cc (insninfo_t) : Preset to 0.
(run_find_plies) [hamm=0, dump_file]: Don't print INSN_UID
for a null m_insn.

[Bug c++/117516] [12/13/14/15 Regression] compile time hog figuring out has flexarrays since r6-5791-g7e9a3ad30076ad

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117516

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Same just using the preprocessor:
struct S10 { int a, b; };
#define A(m, n) struct S##m { struct S##n a, b; };
#define B(m) A(m##1, m##0) A(m##2, m##1) A(m##3, m##2) A(m##4, m##3) \
 A(m##5, m##4) A(m##6, m##5) A(m##7, m##6) A(m##8, m##7)
B(1)
#define S20 S18
B(2)
#define S30 S28
B(3)

int
main ()
{
  struct S38 s;
  __builtin_memset (&s, 0, sizeof (s));
}

I must say I don't really understand the recursion.
If we have say
struct A { int a; char b[]; int c; };
struct B { struct A a; struct A b[]; struct A c; };
struct C { struct B a; struct B b[]; struct B c; };
C c;
then we want to diagnose just 3 errors, the flex array in the middle of each
struct.
But that can be done when completing each of the structs.  I can understand
recursion for the checks whether the last FIELD_DECL has a type which ends with
a flex array, but I don't see why it should recurse on other FIELD_DECLs for
more than one level (so say when finalizing C, check whether B ends with a flex
array (for a field's purposes), see the problem with the b[] flex array and be
ok with flex array at the end of B for field c because nothing follows that).

[Bug rtl-optimization/114729] RISC-V SPEC2017 507.cactu excessive spillls with -fschedule-insns

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114729

--- Comment #22 from GCC Commits  ---
The master branch has been updated by Vineet Gupta :

https://gcc.gnu.org/g:7bef3482f27ce13ba7e6c4f43943f28a49e63a40

commit r15-5925-g7bef3482f27ce13ba7e6c4f43943f28a49e63a40
Author: Vineet Gupta 
Date:   Wed Dec 4 10:42:37 2024 -0800

sched1: parameterize pressure scheduling spilling aggressiveness [PR/114729]

sched1 computes ECC (Excess Change Cost) for each insn, which represents
the register pressure attributed to the insn.
Currently the pressure sensitive scheduling algorithm deliberately ignores
negative ECC values (pressure reduction), making them 0 (neutral), leading
to more spills. This happens due to the assumption that the compiler has
a reasonably accurate processor pipeline scheduling model and thus tries
to aggressively fill pipeline bubbles with spill slots.

This however might not be true, as the model might not be available for
certain uarches or even applicable especially for modern out-of-order
cores.

The existing heuristic induces spill frenzy on RISC-V, noticeably so on
SPEC2017 507.Cactu. If insn scheduling is disabled completely, the
total dynamic icounts for this workload are reduced in half from
~2.5 trillion insns to ~1.3 (w/ -fno-schedule-insns).

This patch adds --param=cycle-accurate-model={0,1} to gate the spill
behavior.

 - The default (1) preserves existing spill behavior.

 - targets/uarches sensitive to spilling can override the param to (0)
   to get the reverse effect. RISC-V backend does so too.

The actual perf numbers are very promising.

(1) On RISC-V BPI-F3 in-order CPU, -Ofast -march=rv64gcv_zba_zbb_zbs:

  Before:
  ------
  Performance counter stats for './cactusBSSN_r_base.rivos spec_ref.par':

       4,917,712.97 msec task-clock:u         #    1.000 CPUs utilized
              5,314      context-switches:u   #    1.081 /sec
                  3      cpu-migrations:u     #    0.001 /sec
            204,784      page-faults:u        #   41.642 /sec
  7,868,291,222,513      cycles:u             #    1.600 GHz
  2,615,069,866,153      instructions:u       #    0.33  insn per cycle
     10,799,381,890      branches:u           #    2.196 M/sec
         15,714,572      branch-misses:u      #    0.15% of all branches

  After:
  -----
  Performance counter stats for './cactusBSSN_r_base.rivos spec_ref.par':

       4,552,979.58 msec task-clock:u         #    0.998 CPUs utilized
            205,020      context-switches:u   #   45.030 /sec
                  2      cpu-migrations:u     #    0.000 /sec
            204,221      page-faults:u        #   44.854 /sec
  7,285,176,204,764      cycles:u (7.4% faster)         #    1.600 GHz
  2,145,284,345,397      instructions:u (17.96% fewer)  #    0.29  insn per cycle
     10,799,382,011      branches:u           #    2.372 M/sec
         16,235,628      branch-misses:u      #    0.15% of all branches

(2) Wilco reported 20% perf gains on aarch64 Neoverse V2 runs.

gcc/ChangeLog:
PR target/11472
* params.opt (--param=cycle-accurate-model=): New opt.
* doc/invoke.texi (cycle-accurate-model): Document.
* haifa-sched.cc (model_excess_group_cost): Return negative
delta if param_cycle_accurate_model is 0.
(model_excess_cost): Ceil negative baseECC to 0 only if
param_cycle_accurate_model is 1.
Dump the actual ECC value.
* config/riscv/riscv.cc (riscv_option_override): Set param
to 0.

gcc/testsuite/ChangeLog:
PR target/114729
* gcc.target/riscv/riscv.exp: Enable new tests to build.
* gcc.target/riscv/sched1-spills/spill1.cpp: Add new test.

Signed-off-by: Vineet Gupta 
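
The gating described in the commit message can be pictured with a toy model. This is a sketch under assumptions, not the code in haifa-sched.cc: the function name is hypothetical and its second argument merely stands in for the value of --param=cycle-accurate-model:

```c
#include <assert.h>

/* Toy model of the gate.  With the model enabled (1, the default),
   a negative excess cost (pressure reduction) is raised to 0 as
   before.  With it disabled (0), the negative value is kept so the
   scheduler can prefer insns that reduce register pressure.  */
int
effective_excess_cost (int ecc, int cycle_accurate_model)
{
  if (cycle_accurate_model && ecc < 0)
    return 0;
  return ecc;
}

/* Convenience check of the four interesting combinations.  */
void
check_gate (void)
{
  assert (effective_excess_cost (-3, 1) == 0);
  assert (effective_excess_cost (-3, 0) == -3);
  assert (effective_excess_cost (5, 1) == 5);
  assert (effective_excess_cost (5, 0) == 5);
}
```

Positive costs are unaffected by the parameter; only the treatment of pressure-reducing insns changes.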

[Bug libstdc++/11472] [3.4 regression] abs not found if is included

2024-12-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11472

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Vineet Gupta :

https://gcc.gnu.org/g:7bef3482f27ce13ba7e6c4f43943f28a49e63a40

commit r15-5925-g7bef3482f27ce13ba7e6c4f43943f28a49e63a40
Author: Vineet Gupta 
Date:   Wed Dec 4 10:42:37 2024 -0800


[Bug target/106265] RISC-V SPEC2017 507.cactu code bloat due to address generation

2024-12-04 Thread vineetg at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106265

--- Comment #13 from Vineet Gupta  ---
(In reply to Vineet Gupta from comment #12)
> Two years hence and we are a little wiser.
> 
> The root-cause of spills is sched1
> [PR/114729](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114729).

With above PR partially fixed we are down to 2,145,284,345,397 (from
2,564,736,063,742)

[Bug other/117914] New: [reload][avr] In function '__objc_add_class_to_hash class-i.c:2162:1: error: insn does not satisfy its constraints:

2024-12-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117914

Bug ID: 117914
   Summary: [reload][avr] In function '__objc_add_class_to_hash
class-i.c:2162:1: error: insn does not satisfy its
constraints:
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 59791
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59791&action=edit
precompiled C test case

$ avr-gcc-15  class-i.c -S -Os -mmcu=avrtiny -w -da

class-i.c: In function '__objc_add_class_to_hash':
class-i.c:2162:1: error: insn does not satisfy its constraints:
 2162 | }
  | ^
(insn 46 153 154 8 (set (reg:SI 24 r24 [orig:52 _10 ] [52])
(ashift:SI (reg:SI 30 r30)
(const_int 16 [0x10]))) "class-i.c":2153:141 494 {ashlsi3}
 (nil))
during RTL pass: postreload
class-i.c:2162:1: internal compiler error: in extract_constrain_insn, at
recog.cc:2770
0x7f2ed6359d8f __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58


reg30:SI is obviously invalid as a register, because the last GPR is reg31.QI,
and the maximal mode which reg30 can hold is HImode.

In .asmcons we have:

(insn 46 44 47 8 (set (reg:SI 52 [ _10 ])
(ashift:SI (subreg:SI (reg:HI 50 [ class_number.6_8 ]) 0)
(const_int 16 [0x10]))) "class-i.c":2153:141 494 {ashlsi3}
 (nil))

which in .postreload has been turned into:

(insn 46 153 154 8 (set (reg:SI 24 r24 [orig:52 _10 ] [52])
(ashift:SI (reg:SI 30 r30)
(const_int 16 [0x10]))) "class-i.c":2153:141 494 {ashlsi3}
 (nil))

So the paradoxical subreg (subreg:SI (reg:HI 50 [ class_number.6_8 ]) 0) is
turned into an invalid hard register.

FYI, the ICE goes away with -mlra.

[Bug libfortran/117857] libgfortran on powerpc-darwin8 doesn't compile: `-Wint-conversion` in `stream_ttyname`

2024-12-04 Thread glex.spb at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117857

--- Comment #14 from Gleb Mazovetskiy  ---
In macports, a lot of packages seem to simply pass `-D__DARWIN_UNIX03=1`:

https://github.com/search?q=repo%3Amacports%2Fmacports-ports+D__DARWIN_UNIX03&type=code

[Bug libfortran/117857] libgfortran on powerpc-darwin8 doesn't compile: `-Wint-conversion` in `stream_ttyname`

2024-12-04 Thread glex.spb at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117857

--- Comment #15 from Gleb Mazovetskiy  ---
Also, there is only one use of `-D_APPLE_C_SOURCE` in macports, for
the tmux package.

https://github.com/macports/macports-ports/blob/ad9460ffa5b5ff8ce8e7b4e3dd01cf7db07dac8c/sysutils/tmux/Portfile#L80-L83

[Bug c++/117516] [12/13/14/15 Regression] compile time hog figuring out has flexarrays since r6-5791-g7e9a3ad30076ad

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117516

--- Comment #3 from Jakub Jelinek  ---
Though, I wonder about unions.  For those, all members are last, so if one has
struct S { union U u; int v; };
and union U has 2 union members with 2 union members each ... like in this
example, and some of the ultimate structures have a flexible array member, maybe
this would still have the same complexity.  So perhaps we need a flag whether
RECORD_TYPE/UNION_TYPE, or at least the latter, has a flexible array member at
the end (or a hash_map mapping to the field etc.).
Anyway, I haven't yet read the whole patch in detail to see what exactly it is
doing and why.
Even for base classes, I'd hope those were finalized first.  And even for
virtual bases they were...

[Bug sanitizer/117716] [15 regression] ASAN broken on riscv64

2024-12-04 Thread schwab--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117716

Andreas Schwab  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Andreas Schwab  ---
Fixed by r15-5645-gc84a8a274af316

[Bug tree-optimization/117912] [12/13/14/15 regression] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object' since r0-118806-g17742d62a2438144b6235

2024-12-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #14 from Jakub Jelinek  ---
struct S { int a; int b[24]; int c[24]; int d; };
void bar (int *);

__SIZE_TYPE__
foo (struct S *p)
{
  bar (&p->b[24]);
  bar (&p->c[0]);
  return __builtin_object_size (&p->c[0], 1);
}

is another miscompiled testcase, should return 96, but returns 0.
Regressed with the same commit.
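
For reference, the expected value of 96 can be checked from the layout alone. The
sketch below is not part of the testcase; the function names are invented for
illustration, and the figure 96 assumes a 4-byte int. It also shows that
&p->b[24] (one past the end of b) and &p->c[0] are the same address, which may
be relevant to why the two calls interact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct S { int a; int b[24]; int c[24]; int d; };

/* Size of the c subobject, which is what mode 1 of __builtin_object_size
   is expected to report for &p->c[0]: 24 * sizeof(int).  */
size_t expected_c_size(void)
{
    return sizeof(((struct S *)0)->c);
}

/* True when one-past-the-end of b has the same offset as the start of c.  */
bool b_end_is_c_start(void)
{
    return offsetof(struct S, b) + sizeof(((struct S *)0)->b)
           == offsetof(struct S, c);
}
```

Both members are int arrays, so there is no padding between them and
b_end_is_c_start() holds.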

[Bug tree-optimization/117912] [12/13/14/15 regression] Linux Kernel 6.13-rc1 Build Failure: 'Detected write beyond size of object' since r0-118806-g17742d62a2438144b6235

2024-12-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117912

--- Comment #15 from Andrew Pinski  ---
(In reply to Jakub Jelinek from comment #14)
> struct S { int a; int b[24]; int c[24]; int d; };
> void bar (int *);
> 
> __SIZE_TYPE__
> foo (struct S *p)
> {
>   bar (&p->b[24]);
>   bar (&p->c[0]);
>   return __builtin_object_size (&p->c[0], 1);
> }
> 
> is another miscompiled testcase, should return 96, but returns 0.
> Regressed with the same commit.

Sounds exactly like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109968#c4 .

[Bug tree-optimization/117915] New: ICE on valid code at -O{s,2,3} with "-fno-tree-copy-prop -fno-tree-vrp" on x86_64-linux-gnu: tree check: expected ssa_name, have integer_cst in ifcombine_mark_ssa_n

2024-12-04 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117915

Bug ID: 117915
   Summary: ICE on valid code at -O{s,2,3} with
"-fno-tree-copy-prop -fno-tree-vrp" on
x86_64-linux-gnu: tree check: expected ssa_name, have
integer_cst in ifcombine_mark_ssa_name, at
tree-ssa-ifcombine.cc:478
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

It appears to be a recent regression as it doesn't reproduce with 14.2 and
earlier.

Compiler Explorer: https://godbolt.org/z/nrdcez6GP


[533] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk
--enable-sanitizers --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20241204 (experimental) (GCC) 
[534] % 
[534] % gcctk -O3 -fno-tree-copy-prop -fno-tree-vrp small.c
during GIMPLE pass: ifcombine
small.c: In function ‘main’:
small.c:3:5: internal compiler error: tree check: expected ssa_name, have integer_cst in ifcombine_mark_ssa_name, at tree-ssa-ifcombine.cc:478
3 | int main() {
  | ^~~~
0x26d34e5 internal_error(char const*, ...)
../../gcc-trunk/gcc/diagnostic-global-context.cc:517
0x913ad7 tree_check_failed(tree_node const*, char const*, int, char const*, ...)
../../gcc-trunk/gcc/tree.cc:9038
0x8d013b tree_check(tree_node*, char const*, int, char const*, tree_code)
../../gcc-trunk/gcc/tree.h:3665
0x8d013b ifcombine_mark_ssa_name
../../gcc-trunk/gcc/tree-ssa-ifcombine.cc:478
0x8d013b ifcombine_mark_ssa_name
../../gcc-trunk/gcc/tree-ssa-ifcombine.cc:476
0x1380242 ifcombine_replace_cond
../../gcc-trunk/gcc/tree-ssa-ifcombine.cc:648
0x1381b60 ifcombine_ifandif
../../gcc-trunk/gcc/tree-ssa-ifcombine.cc:1007
0x138261c tree_ssa_ifcombine_bb_1
../../gcc-trunk/gcc/tree-ssa-ifcombine.cc:1095
0x1382ada tree_ssa_ifcombine_bb
../../gcc-trunk/gcc/tree-ssa-ifcombine.cc:1195
0x1382ada execute
../../gcc-trunk/gcc/tree-ssa-ifcombine.cc:1388
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
[535] % 
[535] % cat small.c
unsigned a;
int b, c;
int main() {
  a = a & b || (c || b) | a;
  return 0;
}
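
As a reading aid only: under C operator precedence (& binds tighter than |,
which binds tighter than ||), the assignment in small.c groups as shown in the
sketch below. The function names are invented for illustration and the
self-check covers only a small range of values; it says nothing about the cause
of the ICE.

```c
#include <assert.h>

/* The right-hand side from small.c, unchanged.  */
unsigned original(unsigned a, int b, int c)
{
    return a & b || (c || b) | a;
}

/* The same expression with the implicit grouping written out.  */
unsigned parenthesized(unsigned a, int b, int c)
{
    return (a & b) || ((unsigned)(c || b) | a);
}

/* Compare both forms over a small range of inputs.  */
void check_small_range(void)
{
    for (unsigned a = 0; a < 4; a++)
        for (int b = -1; b < 3; b++)
            for (int c = -1; c < 3; c++)
                assert(original(a, b, c) == parenthesized(a, b, c));
}
```

Either way the result is the int value 0 or 1 produced by ||, converted to
unsigned on assignment to a.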

[Bug tree-optimization/117915] [15 Regression] ICE on valid code at -O{s,2,3} with "-fno-tree-copy-prop -fno-tree-vrp" on x86_64-linux-gnu: tree check: expected ssa_name, have integer_cst in ifcombine

2024-12-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117915

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |15.0
   Keywords||ice-on-valid-code
Version|unknown |15.0
Summary|ICE on valid code at|[15 Regression] ICE on
   |-O{s,2,3} with  |valid code at -O{s,2,3}
   |"-fno-tree-copy-prop|with "-fno-tree-copy-prop
   |-fno-tree-vrp" on   |-fno-tree-vrp" on
   |x86_64-linux-gnu: tree  |x86_64-linux-gnu: tree
   |check: expected ssa_name,   |check: expected ssa_name,
   |have integer_cst in |have integer_cst in
   |ifcombine_mark_ssa_name, at |ifcombine_mark_ssa_name, at
   |tree-ssa-ifcombine.cc:478   |tree-ssa-ifcombine.cc:478
 CC||pinskia at gcc dot gnu.org

[Bug tree-optimization/117915] [15 Regression] ICE on valid code at -O{s,2,3} with "-fno-tree-copy-prop -fno-tree-vrp" on x86_64-linux-gnu: tree check: expected ssa_name, have integer_cst in ifcombine

2024-12-04 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117915

Andrew Pinski  changed:

   What|Removed |Added

   Last reconfirmed||2024-12-04
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
  # RANGE [irange] int [0, 1] MASK 0x1 VALUE 0x0
  # iftmp.4_10 = PHI <0(4)>
  iftmp.7_6 = (unsigned intD.9) iftmp.4_10;
  _7 = a.1_1 | iftmp.7_6;
  if (_7 != 0)


Confirmed.

[Bug c++/117887] [12/13/14/15 regression] ICE when building qtwebengine-6.8.1 (add_extra_args, at cp/pt.cc:13682) since r11-3261-gb28b621ac67bee

2024-12-04 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117887

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org

[Bug target/116623] [15 regression] regressions on arm-linux-gnueabihf since r15-1619-g3b9b8d6cfdf593

2024-12-04 Thread thiago.bauermann at linaro dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116623

--- Comment #2 from Thiago Jung Bauermann  ---
(In reply to GCC Commits from comment #1)
> The master branch has been updated by Torbjorn Svensson :
> 
> https://gcc.gnu.org/g:e152a734337a06ed085c2e6700f21cda9ca7ad17
> 
> commit r15-4966-ge152a734337a06ed085c2e6700f21cda9ca7ad17
> Author: Torbjörn SVENSSON 
> Date:   Sat Oct 19 18:08:01 2024 +0200
> 
> testsuite: arm: Relax register selection [PR116623]

Thank you! I confirmed that the dlstp-compile-asm-2.c failures have been fixed.

I also checked the bfloat16_scalar_*.c failures, and they're still present.
