[Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de

2024-07-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |15.0
   Keywords||needs-bisection

--- Comment #2 from Richard Biener  ---
The gfortran.dg/vect/vect-8.f90 testcase is incredibly bad because it has so
many loops that are or are not vectorized.  It should ideally be split up.

But I think the blame is incorrect, the test uses
-fno-tree-loop-distribute-patterns and thus isn't affected by the rev in
question.

As Andrew says the fix for the other regression is trivial, I'm leaving that
to ARM folks as an exercise.

[Bug target/116013] Missed optimization opportunity with andn involving consts

2024-07-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116013

Richard Biener  changed:

   What|Removed |Added

  Component|middle-end  |target
 Target|x86_64  |x86_64-*-*

--- Comment #2 from Richard Biener  ---
On the gimple level this is about canonicalization.  Now that we have an andn
optab RTL expansion/insn selection should do the trick but I wonder why
combine doesn't - that looks like a target bug or a simplify-rtx issue
(again a missed canonicalization rule?)

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

Richard Biener  changed:

   What|Removed |Added

   Severity|normal  |enhancement

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-22 Thread rusty at rustcorp dot com.au via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #81 from rusty at rustcorp dot com.au ---
(In reply to Jakub Jelinek from comment #76)
> (void) casts not quieting the warning was an intentional request when the
> warning has been added, I really don't think it is a good idea to change
> that.

Indeed, but it wasn't called "dont_ignore_realloc", it seemed far more generic,
and there wasn't an alternative for a long time.  Once GLIBC started using it
(such as for write()), it became an ongoing thorn in the side of many users :(
The escalation continued with GNUlib implementing a suppression macro.

(I'm not picking on the GNU project fighting itself here, but it's a clear case
showing the problem).

> Perhaps you can ask
> glibc to recategorize some of the declarations to use [[nodiscard]] instead
> of __attribute__((__warn_unused_result__))
...
> E.g. ignoring return value of realloc is pretty much always a bad idea and
> just (void) realloc (...); is something that shouldn't be supported.

Indeed!  I think this is the solution (once [[nodiscard]] is old enough for
projects to migrate).  Not many function returns are as clearly required as
realloc...
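
For readers following the thread, here is a minimal GNU C sketch of the
behaviour under discussion.  The names try_write, IGNORE_RESULT, demo and
check_demo are hypothetical and not taken from glibc or Gnulib; the macro
only illustrates the general idea of a suppression helper (assign the
result to a variable so that it counts as used), not Gnulib's actual
implementation.

```c
#include <assert.h>

/* Hypothetical function whose return value callers should inspect.  */
__attribute__((warn_unused_result))
int try_write(int fail)
{
  return fail ? -1 : 0;
}

/* Hypothetical helper: storing the result in a variable counts as a use,
   so no warning is issued, unlike a plain (void) cast.  */
#define IGNORE_RESULT(expr) \
  do { __typeof__ (expr) ignored_ = (expr); (void) ignored_; } while (0)

int demo(void)
{
  /* (void) try_write (1);  would still warn with GCC, as discussed above.  */
  IGNORE_RESULT (try_write (1));
  return try_write (0);
}

/* Small self-check for the sketch.  */
void check_demo(void)
{
  assert (demo () == 0);
}
```

Calling check_demo () from a test driver exercises both paths; compiling
with the commented-out (void) line enabled should reproduce the warning
this PR is about.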

[Bug rtl-optimization/116028] New: [15 regression] gcc.dg/pr10474.c test failure

2024-07-22 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116028

Bug ID: 116028
   Summary: [15 regression] gcc.dg/pr10474.c test failure
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jskumari at gcc dot gnu.org
  Target Milestone: ---

The test fails with r15-1619-g3b9b8d6cfdf593.

FAIL: gcc.dg/pr10474.c scan-rtl-dump pro_and_epilogue "Performing
shrink-wrapping"

The testcase fails to shrink wrap and hence the failure.

[Bug rtl-optimization/116028] [15 regression] gcc.dg/pr10474.c test failure

2024-07-22 Thread jskumari at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116028

Surya Kumari Jangala  changed:

   What|Removed |Added

   Last reconfirmed||2024-07-22
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

[Bug rtl-optimization/116028] [15 regression] gcc.dg/pr10474.c test failure since r15-1619-g3b9b8d6cfdf593

2024-07-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116028

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org
Summary|[15 regression] |[15 regression]
   |gcc.dg/pr10474.c test   |gcc.dg/pr10474.c test
   |failure |failure since
   ||r15-1619-g3b9b8d6cfdf593
   Keywords||testsuite-fail
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115673

--- Comment #1 from Sam James  ---
Yeah, I mentioned it when filing PR115673, but I wasn't sure if they were all
the same cause so didn't want to file a bunch without knowing.

[Bug rtl-optimization/98289] [8/9/10 Regression] [x86] Suboptimal optimization of stack usage when function call does not occur

2024-07-22 Thread amylaar at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98289

--- Comment #7 from Jorn Wolfgang Rennecke  ---
Created attachment 58719
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58719&action=edit
patch to fix internal compiler errors in shrink-wrap.cc on EDGE_CROSSING edges

I'm currently using this patch.
The only issue that remains for gcc.dg/torture/pr98289.c is that before
prologue/epilogue threading, there are four separate crossing jumps to labels
at the same location as lab.  Without shrink wrapping, this gets cleaned up by
try_forward_edges (called from try_optimize_cfg, called from cleanup_cfg,
called from pass_jump2::execute).  With shrink wrapping, the cleanup doesn't
happen or has no effect, so the clutter of all these crossing jumps in the
non-cold partition defeats the purpose of the hot-cold partitioning.

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
So, what would be the prototype of the builtin?
Would it be type-generic for both arguments, i.e. effectively
void __builtin_set_counted_by (...);
which would just verify that 2 arguments are passed, the first one is some
flexible array member with counted_by argument and the second argument has some
type implicitly convertible to the type of the counted_by member?

[Bug sanitizer/115793] signed integer overflow check missing at optimization levels -O2, -O3, and -Os

2024-07-22 Thread bic60176 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115793

--- Comment #5 from Bi6c  ---
gcc-trunk is also not reporting signed integer overflow at -O2, -O3, and -Os
(https://godbolt.org/z/8xnq1bo7s).

[Bug fortran/115997] Findloc does not find the result of a function with a deferred-length character return value

2024-07-22 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115997

--- Comment #3 from anlauf at gcc dot gnu.org ---
The original issue was reported for findloc, but did occur with several other
intrinsics accepting deferred-length character.  If you edit the bugzilla
search to include resolved issues, you will also see the mentioned one.

You have several options to go beyond 13.1:
- try upgrading to 13.3
- wait for the upcoming 14.2 release (expected soon)

[Bug rtl-optimization/115565] [11/12/13/14/15 Regression] CSE: Comparison incorrectly evaluated as constant causing optimization to produce wrong code

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115565

--- Comment #6 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Maciej W. Rozycki:

https://gcc.gnu.org/g:323d010fa5d433e6eb5ec5124544f19fb4b4eee6

commit r14-10485-g323d010fa5d433e6eb5ec5124544f19fb4b4eee6
Author: Maciej W. Rozycki 
Date:   Sat Jun 29 23:26:55 2024 +0100

[PR115565] cse: Don't use a valid regno for non-register in comparison_qty

Use INT_MIN rather than -1 in `comparison_qty' where a comparison is not
with a register, because the value of -1 is actually a valid reference
to register 0 in the case where it has not been assigned a quantity.

Using -1 makes `REG_QTY (REGNO (folded_arg1)) == ent->comparison_qty'
comparison in `fold_rtx' to incorrectly trigger in rare circumstances
and return true for a memory reference, making CSE consider a comparison
operation to evaluate to a constant expression and consequently make the
resulting code incorrectly execute or fail to execute conditional
blocks.

This has caused a miscompilation of rwlock.c from LinuxThreads for the
`alpha-linux-gnu' target, where `rwlock->__rw_writer != thread_self ()'
expression (where `thread_self' returns the thread pointer via a PALcode
call) has been decided to be always true (with `ent->comparison_qty'
using -1 for a reference to `rwlock->__rw_writer', while register 0
holding the thread pointer retrieved by `thread_self') and code for the
false case has been optimized away where it mustn't have, causing
program lockups.

The issue has been observed as a regression from commit 08a692679fb8
("Undefined cse.c behaviour causes 3.4 regression on HPUX"), and up to
commit 932ad4d9b550 ("Make CSE path following use the CFG"), where CSE
has been restructured sufficiently for the issue not to trigger with the
original reproducer anymore.  However the original bug remains and can
trigger, because `comparison_qty' will still be assigned -1 for a memory
reference and the `reg_qty' member of a `cse_reg_info_table' entry will
still be assigned -1 for register 0 where the entry has not been
assigned a quantity, e.g. at initialization.

Use INT_MIN then as noted above, so that the value remains negative, for
consistency with the REGNO_QTY_VALID_P macro (even though not used on
`comparison_qty'), and then so that it should not ever match a valid
negated register number, fixing the regression with commit 08a692679fb8.

gcc/
PR rtl-optimization/115565
* cse.cc (record_jump_cond): Use INT_MIN rather than -1 for
`comparison_qty' if !REG_P.

(cherry picked from commit 69bc5fb97dc3fada81869e00fa65d39f7def6acf)
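
The sentinel collision described in the commit message can be sketched in
a few lines of C.  This is a simplified model, not the code in cse.cc: it
assumes, as the message implies, that an unassigned register N holds the
negative placeholder -N - 1, and the names NREGS, reg_qty, init_reg_qty,
qty_matches and check_sentinels are made up for the illustration.

```c
#include <assert.h>
#include <limits.h>

#define NREGS 4

/* Simplified model: an unassigned register N holds the placeholder
   -N - 1, so register 0 starts out as -1.  */
static int reg_qty[NREGS];

void init_reg_qty(void)
{
  for (int i = 0; i < NREGS; i++)
    reg_qty[i] = -i - 1;
}

/* Nonzero when the recorded comparison quantity equals the quantity of
   register REGNO, modelling the equality test quoted above.  */
int qty_matches(int regno, int comparison_qty)
{
  return reg_qty[regno] == comparison_qty;
}

/* Small self-check for the sketch.  */
void check_sentinels(void)
{
  init_reg_qty ();
  /* With -1 as the "not a register" marker, register 0 matches by accident.  */
  assert (qty_matches (0, -1));
  /* With INT_MIN as the marker, none of the modelled registers match.  */
  for (int i = 0; i < NREGS; i++)
    assert (!qty_matches (i, INT_MIN));
}
```

In this model the false match only goes away because INT_MIN lies outside
the range of placeholders that realistic register numbers can produce,
while still being negative.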

[Bug rtl-optimization/115565] [11/12/13/14/15 Regression] CSE: Comparison incorrectly evaluated as constant causing optimization to produce wrong code

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115565

--- Comment #7 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Maciej W. Rozycki:

https://gcc.gnu.org/g:4ce7c81212c7819dfe6dbbe2399220fb12da6d71

commit r13-8932-g4ce7c81212c7819dfe6dbbe2399220fb12da6d71
Author: Maciej W. Rozycki 
Date:   Sat Jun 29 23:26:55 2024 +0100

[PR115565] cse: Don't use a valid regno for non-register in comparison_qty

Use INT_MIN rather than -1 in `comparison_qty' where a comparison is not
with a register, because the value of -1 is actually a valid reference
to register 0 in the case where it has not been assigned a quantity.

Using -1 makes `REG_QTY (REGNO (folded_arg1)) == ent->comparison_qty'
comparison in `fold_rtx' to incorrectly trigger in rare circumstances
and return true for a memory reference, making CSE consider a comparison
operation to evaluate to a constant expression and consequently make the
resulting code incorrectly execute or fail to execute conditional
blocks.

This has caused a miscompilation of rwlock.c from LinuxThreads for the
`alpha-linux-gnu' target, where `rwlock->__rw_writer != thread_self ()'
expression (where `thread_self' returns the thread pointer via a PALcode
call) has been decided to be always true (with `ent->comparison_qty'
using -1 for a reference to `rwlock->__rw_writer', while register 0
holding the thread pointer retrieved by `thread_self') and code for the
false case has been optimized away where it mustn't have, causing
program lockups.

The issue has been observed as a regression from commit 08a692679fb8
("Undefined cse.c behaviour causes 3.4 regression on HPUX"), and up to
commit 932ad4d9b550 ("Make CSE path following use the CFG"), where CSE
has been restructured sufficiently for the issue not to trigger with the
original reproducer anymore.  However the original bug remains and can
trigger, because `comparison_qty' will still be assigned -1 for a memory
reference and the `reg_qty' member of a `cse_reg_info_table' entry will
still be assigned -1 for register 0 where the entry has not been
assigned a quantity, e.g. at initialization.

Use INT_MIN then as noted above, so that the value remains negative, for
consistency with the REGNO_QTY_VALID_P macro (even though not used on
`comparison_qty'), and then so that it should not ever match a valid
negated register number, fixing the regression with commit 08a692679fb8.

gcc/
PR rtl-optimization/115565
* cse.cc (record_jump_cond): Use INT_MIN rather than -1 for
`comparison_qty' if !REG_P.

(cherry picked from commit 69bc5fb97dc3fada81869e00fa65d39f7def6acf)

[Bug rtl-optimization/115565] [11/12/13/14/15 Regression] CSE: Comparison incorrectly evaluated as constant causing optimization to produce wrong code

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115565

--- Comment #8 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Maciej W. Rozycki:

https://gcc.gnu.org/g:8d8f804b18e4a38671957b3e4c239ef625506317

commit r12-10633-g8d8f804b18e4a38671957b3e4c239ef625506317
Author: Maciej W. Rozycki 
Date:   Sat Jun 29 23:26:55 2024 +0100

[PR115565] cse: Don't use a valid regno for non-register in comparison_qty

Use INT_MIN rather than -1 in `comparison_qty' where a comparison is not
with a register, because the value of -1 is actually a valid reference
to register 0 in the case where it has not been assigned a quantity.

Using -1 makes `REG_QTY (REGNO (folded_arg1)) == ent->comparison_qty'
comparison in `fold_rtx' to incorrectly trigger in rare circumstances
and return true for a memory reference, making CSE consider a comparison
operation to evaluate to a constant expression and consequently make the
resulting code incorrectly execute or fail to execute conditional
blocks.

This has caused a miscompilation of rwlock.c from LinuxThreads for the
`alpha-linux-gnu' target, where `rwlock->__rw_writer != thread_self ()'
expression (where `thread_self' returns the thread pointer via a PALcode
call) has been decided to be always true (with `ent->comparison_qty'
using -1 for a reference to `rwlock->__rw_writer', while register 0
holding the thread pointer retrieved by `thread_self') and code for the
false case has been optimized away where it mustn't have, causing
program lockups.

The issue has been observed as a regression from commit 08a692679fb8
("Undefined cse.c behaviour causes 3.4 regression on HPUX"), and up to
commit 932ad4d9b550 ("Make CSE path following use the CFG"), where CSE
has been restructured sufficiently for the issue not to trigger with the
original reproducer anymore.  However the original bug remains and can
trigger, because `comparison_qty' will still be assigned -1 for a memory
reference and the `reg_qty' member of a `cse_reg_info_table' entry will
still be assigned -1 for register 0 where the entry has not been
assigned a quantity, e.g. at initialization.

Use INT_MIN then as noted above, so that the value remains negative, for
consistency with the REGNO_QTY_VALID_P macro (even though not used on
`comparison_qty'), and then so that it should not ever match a valid
negated register number, fixing the regression with commit 08a692679fb8.

gcc/
PR rtl-optimization/115565
* cse.cc (record_jump_cond): Use INT_MIN rather than -1 for
`comparison_qty' if !REG_P.

(cherry picked from commit 69bc5fb97dc3fada81869e00fa65d39f7def6acf)

[Bug tree-optimization/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3

2024-07-22 Thread artemiy at synopsys dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

--- Comment #5 from Artemiy Volkov  ---
Hi Andrew, thank you for the breakdown.  For i1() (the case applicable to the
initial bug report) something like this seems to fix the issue:

diff --git a/gcc/match.pd b/gcc/match.pd
index cf359b0ec0f..8ab6d47e278 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8773,2 +8773,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

+/* Transform comparisons of the form C1 - X CMP C2 to X - C1 CMP -C2.  */
+(for cmp (lt le gt ge eq ne)
+ rcmp (gt ge lt le eq ne)
+  (simplify
+   (cmp (minus INTEGER_CST@0 @1) INTEGER_CST@2)
+   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1)))
+ (rcmp (minus @1 @0) (negate @2)))))
+
 /* Canonicalizations of BIT_FIELD_REFs.  */

Would it make sense for this ticket to be assigned to me so I could refine and
post the above patch as well as tackle i2() and i3() (should those be extracted
to a separate PR or is it fine to fix all three under this PR)?
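
A quick way to sanity-check the proposed rewrite (C1 - X CMP C2 becoming
X - C1 RCMP -C2) is to compare both forms exhaustively over a small range,
where neither the subtraction nor the negation can overflow.  The sketch
below does that in plain C; forms_agree and check_forms are hypothetical
names, and the check says nothing about the overflow corner cases a real
match.pd pattern would have to consider.

```c
#include <assert.h>

/* Returns 1 if, for all c1, x, c2 in [lo, hi], each comparison of
   (c1 - x) against c2 agrees with the swapped comparison of (x - c1)
   against -c2.  The range must be small enough to avoid overflow.  */
int forms_agree(int lo, int hi)
{
  for (int c1 = lo; c1 <= hi; c1++)
    for (int x = lo; x <= hi; x++)
      for (int c2 = lo; c2 <= hi; c2++)
        {
          int a = c1 - x;   /* left-hand side before the rewrite */
          int b = x - c1;   /* left-hand side after the rewrite */
          int n = -c2;      /* negated constant */
          if ((a < c2) != (b > n)
              || (a <= c2) != (b >= n)
              || (a > c2) != (b < n)
              || (a >= c2) != (b <= n)
              || (a == c2) != (b == n)
              || (a != c2) != (b != n))
            return 0;
        }
  return 1;
}

/* Small self-check for the sketch.  */
void check_forms(void)
{
  assert (forms_agree (-20, 20));
}
```

The pairing of operators mirrors the (lt le gt ge eq ne) and
(gt ge lt le eq ne) lists in the patch: the ordering comparisons swap
direction, while eq and ne are unchanged.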

C preprocessor bug

2024-07-22 Thread Ovidiu Panait

Hi,

When processing large header files, the C preprocessor reports the error on
the wrong line.


This is 100% reproducible on my side with gcc mainline.

Reproducer:

 # Build
mkdir build; cd build
../configure --host=x86_64-pc-linux-gnu --target=x86_64-wrs-linux 
--enable-languages=c --disable-multilib --disable-libstdcxx-pch 
--disable-libsanitizer --disable-libssp --disable-libquadmath 
--disable-libquadmath-support --disable-libgomp --disable-libvtv 
--disable-bootstrap

make all-host -j$(nproc)
mkdir install; make install-host DESTDIR=$(realpath ./install) -j$(nproc)


 # Generate testcase
mkdir testcase; cd testcase
cat > main.c < header1.h
for i in {1..327676}; do
    echo "extern int header1_begins;" >> header1.h
done
cat >> header1.h < header2.h


 # Test
../install/usr/local/bin/x86_64-wrs-linux-gcc main.c
In file included from main.c:4:
header1.h:327677:20: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or 
‘__attribute__’ before ‘ends’

327677 | #include "header2.h"
   |    ^


The line where the error is reported ("327677") is bogus; the actual
error is on the next line ("327678") in header1.h:


$ cat -n header1.h | tail -n3
327676    extern int header1_begins;
327677    #include "header2.h"
327678    extern int header1 ends;


Ovidiu


[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:af792f0226e479b165a49de5e8f9e1d16a4b26c0

commit r15-2191-gaf792f0226e479b165a49de5e8f9e1d16a4b26c0
Author: Tamar Christina 
Date:   Mon Jul 22 10:26:14 2024 +0100

middle-end: Implement conditional store vectorizer pattern [PR115531]

This adds a conditional store optimization for the vectorizer as a pattern.
The vectorizer already supports modifying memory accesses because of the
pattern based gather/scatter recognition.

Doing it in the vectorizer allows us to still keep the ability to vectorize
such loops for architectures that don't have MASK_STORE support, whereas doing
this in ifcvt makes us commit to MASK_STORE.

Concretely for this loop:

void foo1 (char *restrict a, int *restrict b, int *restrict c, int n,
           int stride)
{
  if (stride <= 1)
    return;

  for (int i = 0; i < n; i++)
    {
      int res = c[i];
      int t = b[i+stride];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}

today we generate:

.L3:
        ld1b    z29.s, p7/z, [x0, x5]
        ld1w    z31.s, p7/z, [x2, x5, lsl 2]
        ld1w    z30.s, p7/z, [x1, x5, lsl 2]
        cmpne   p15.b, p6/z, z29.b, #0
        sel     z30.s, p15, z30.s, z31.s
        st1w    z30.s, p7, [x2, x5, lsl 2]
        add     x5, x5, x4
        whilelo p7.s, w5, w3
        b.any   .L3

which in gimple is:

  vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
  vect_t_20.12_74 = .MASK_LOAD (vectp.10_72, 32B, loop_mask_67);
  vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
  mask__34.16_79 = vect__9.15_77 != { 0, ... };
  vect_res_11.17_80 = VEC_COND_EXPR ;
  .MASK_STORE (vectp_c.18_81, 32B, loop_mask_67, vect_res_11.17_80);

A MASK_STORE is already conditional, so there's no need to perform the load of
the old values and the VEC_COND_EXPR.  This patch makes it so we generate:

  vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
  vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
  mask__34.16_79 = vect__9.15_77 != { 0, ... };
  .MASK_STORE (vectp_c.18_81, 32B, mask__34.16_79, vect_res_18.9_68);

which generates:

.L3:
        ld1b    z30.s, p7/z, [x0, x5]
        ld1w    z31.s, p7/z, [x1, x5, lsl 2]
        cmpne   p7.b, p7/z, z30.b, #0
        st1w    z31.s, p7, [x2, x5, lsl 2]
        add     x5, x5, x4
        whilelo p7.s, w5, w3
        b.any   .L3

gcc/ChangeLog:

PR tree-optimization/115531
* tree-vect-patterns.cc (vect_cond_store_pattern_same_ref): New.
(vect_recog_cond_store_pattern): New.
(vect_vect_recog_func_ptrs): Use it.
* target.def (conditional_operation_is_expensive): New.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Document it.
* targhooks.cc (default_conditional_operation_is_expensive): New.
* targhooks.h (default_conditional_operation_is_expensive): New.
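
The idea behind the pattern can be seen at the scalar level: loading the
old value, selecting, and storing unconditionally leaves memory in the
same state as storing only when the condition holds.  The following C
sketch compares the two forms on a small input; it is a scalar analogue
under that assumption, not the vectorizer code, and the function names
select_then_store, conditional_store and stores_agree are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Form from the commit message: load old value, select, always store.  */
void select_then_store(const char *a, const int *b, int *c, int n, int stride)
{
  for (int i = 0; i < n; i++)
    {
      int res = c[i];
      int t = b[i + stride];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}

/* Equivalent form: store only where the condition holds, which is what a
   masked store expresses for a whole vector at once.  */
void conditional_store(const char *a, const int *b, int *c, int n, int stride)
{
  for (int i = 0; i < n; i++)
    if (a[i] != 0)
      c[i] = b[i + stride];
}

/* Returns 1 if both forms leave the destination array in the same state.  */
int stores_agree(void)
{
  const char a[4] = { 1, 0, 1, 0 };
  const int b[6] = { 10, 11, 12, 13, 14, 15 };
  int c1[4] = { 100, 101, 102, 103 };
  int c2[4] = { 100, 101, 102, 103 };

  select_then_store (a, b, c1, 4, 2);
  conditional_store (a, b, c2, 4, 2);
  assert (c1[0] == 12 && c1[1] == 101 && c1[2] == 14 && c1[3] == 103);
  return memcmp (c1, c2, sizeof c1) == 0;
}
```

In the vectorized code the per-element condition becomes the mask operand
of the store, which is why the extra load of the old values and the
select are no longer needed.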

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:0c5c0c959c2e592b84739f19ca771fa69eb8dfee

commit r15-2192-g0c5c0c959c2e592b84739f19ca771fa69eb8dfee
Author: Tamar Christina 
Date:   Mon Jul 22 10:28:19 2024 +0100

AArch64: implement TARGET_VECTORIZE_CONDITIONAL_OPERATION_IS_EXPENSIVE [PR115531].

This implements the new target hook indicating that for AArch64 when possible
we prefer masked operations for any type vs doing LOAD + SELECT or
SELECT + STORE.

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/115531
* config/aarch64/aarch64.cc
(aarch64_conditional_operation_is_expensive): New.
(TARGET_VECTORIZE_CONDITIONAL_OPERATION_IS_EXPENSIVE): New.

gcc/testsuite/ChangeLog:

PR tree-optimization/115531
* gcc.dg/vect/vect-conditional_store_1.c: New test.
* gcc.dg/vect/vect-conditional_store_2.c: New test.
* gcc.dg/vect/vect-conditional_store_3.c: New test.
* gcc.dg/vect/vect-conditional_store_4.c: New test.

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-07-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 115531, which changed state.

Bug 115531 Summary: vectorizer generates inefficient code for masked 
conditional update loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug fortran/88624] [Coarray] Rejects allocatable coarray passed as a dummy argument

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88624

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Andre Vehreschild :

https://gcc.gnu.org/g:9d650e97cb76e4ea3b5d060e4a4cef38fc58

commit r15-2193-g9d650e97cb76e4ea3b5d060e4a4cef38fc58
Author: Andre Vehreschild 
Date:   Thu Jul 11 10:07:12 2024 +0200

Fix Rejects allocatable coarray passed as a dummy argument [88624]

Coarray parameters of procedures/functions need to be dereffed, because
they are references to the descriptor but the routine expected the
descriptor directly.

PR fortran/88624

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Treat
pointers/references (e.g. from parameters) correctly by derefing
them.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/dummy_1.f90: Add calling function through
function.
* gfortran.dg/pr88624.f90: New test.

[Bug tree-optimization/115531] vectorizer generates inefficient code for masked conditional update loops

2024-07-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531

Tamar Christina  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Tamar Christina  ---
Fixed on trunk

[Bug middle-end/45215] Tree-optimization misses a trick with bit tests

2024-07-22 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215

--- Comment #4 from rguenther at suse dot de  ---
On Fri, 19 Jul 2024, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45215
> 
> Andrew Pinski  changed:
> 
>What|Removed |Added
> 
>  CC||pinskia at gcc dot gnu.org
>   Component|tree-optimization   |middle-end
>  Status|NEW |ASSIGNED
> 
> --- Comment #3 from Andrew Pinski  ---
>   _1 = t_3(D) & 256;
>   if (_1 != 0)
> goto ; [1.04%]
>   else
> goto ; [98.96%]
> 
>[local count: 1062574912]:
> 
>[local count: 1073741824]:
>   # _2 = PHI <-26(2), 0(3)>
> 
> 
> So the trick here is that 256 is `0x1<<8` so we want to shift that bit up to
> the sign bit and then arithmetic shift down to get 0/-1 and then and with -26.

So .BIT_SPLAT (t_3(D), 8) & -26, there's nothing special in x86 to help
.BIT_SPLAT though, back-to-back shift might be throughput constrained.
I think x86 can do -1 vs. 0 set from flags of the and though.

I'm not sure whether two shifts and an AND are a good way to recover an
optimal non-branch insn sequence later?
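
For reference, the sequences being compared can be written out in C for a
32-bit value.  This is only a sketch of the arithmetic, with hypothetical
function names; two_shifts relies on GCC's documented behaviour for
converting out-of-range values to a signed type and for right-shifting
negative values, neither of which ISO C guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* The branchy original: -26 if bit 8 is set, else 0.  */
int32_t branchy(uint32_t t)
{
  return (t & 256) ? -26 : 0;
}

/* Shift bit 8 up to the sign bit (left by 23), shift arithmetically back
   down by 31 to get 0 or -1, then AND with -26.  */
int32_t two_shifts(uint32_t t)
{
  int32_t splat = (int32_t) (t << 23) >> 31;
  return splat & -26;
}

/* Portable variant: isolate the bit and negate it to get 0 or -1.  */
int32_t negate_bit(uint32_t t)
{
  int32_t splat = -(int32_t) ((t >> 8) & 1);
  return splat & -26;
}

/* Small self-check: all three forms agree on a range of inputs.  */
void check_bit_splat(void)
{
  for (uint32_t t = 0; t < 2048; t++)
    {
      assert (two_shifts (t) == branchy (t));
      assert (negate_bit (t) == branchy (t));
    }
  assert (two_shifts (0xffffffffu) == -26);
  assert (negate_bit (0xfffffeffu) == 0);
}
```

Whether either branch-free form is a good canonical shape for later insn
selection is exactly the open question in the comment above.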

[Bug libstdc++/115907] Libstdc++ and GCC itself should avoid glibc above 2.34 dependency

2024-07-22 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115907

--- Comment #58 from Jonathan Wakely  ---
(In reply to cqwrteur from comment #43)
> If GNU folks continue f things up, I can guarantee
> you everyone will move to LLVM

You keep saying this, but you're still here. Feel free to leave any time.

[Bug target/89270] [12/13 regression] AVR ICE: verify_gimple failed

2024-07-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89270

--- Comment #17 from Richard Biener  ---
(In reply to Georg-Johann Lay from comment #16)
> (In reply to Richard Biener from comment #14)
> > Fixed on trunk so far.  Joseph correctly mentioned that iff AVR would define
> > __int24 using INT_N in avr-modes.def the issue would have been mitigated as
> > well
> > (that's a comparatively "modern" way of registering additional integer
> > types).
> > 
> > So it's really also a target issue.
> 
> INT_N isn't even mentioned in the internals documentation.
> 
> And are you saying that FRACTIONAL_INT_MODE is bogus or broken by design?

No, INT_N is just a convenient way to add new builtin types that are
accessible as [u]intN_t to the user.  Those happened to be considered
already for the conversions.

Re: C preprocessor bug

2024-07-22 Thread Jonathan Wakely via Gcc-bugs

On 22/07/24 12:24 +0300, Ovidiu Panait wrote:
> Hi,
>
> When processing large header files, the C preprocessor reports error
> on the wrong line.

This mailing list is for automated emails from our bug tracker, not for
reporting bugs. Emails sent directly to this list will not get tracked
as bugs, and will generally be ignored.

To report a bug please follow the instructions at
https://gcc.gnu.org/bugs/ - thanks.




[Bug fortran/87477] [meta-bug] [F03] issues concerning the ASSOCIATE statement

2024-07-22 Thread vehre at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87477

--- Comment #9 from Andre Vehreschild  ---
(In reply to Paul Thomas from comment #8)
> Hi Andre,
> 
> Two of the remaining dependencies are associated with (excuse the pun)
> coarrays. PR102973 is probably spurious and is marked as waiting.
> 
> Could you take a look, please?
> 
> Thanks
> 
> Paul

Hi Paul,

yes, I'll take a look right away.

- Andre

[Bug sanitizer/115793] signed integer overflow check missing at optimization levels -O2, -O3, and -Os

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115793

--- Comment #6 from Jakub Jelinek  ---
This bug report is based on the unwarranted assumption that UBSAN reports all UB
even at higher optimization levels.  It doesn't, that is part of the tradeoff
between code speed and amount of reported issues.  We don't report all the UB
in clearly dead code even at -O0, here VRP simply figures out that the
multiplication result would be
  # RANGE [irange] int [-INF, +INF] MASK 0xe441 VALUE 0x8d9f133a
  _2 = .UBSAN_CHECK_MUL (56506, 42049);
and because that result is only used in (_2 & 65534) == 0 comparison, that
comparison is folded to 0 and so the multiplication is optimized away.
With e.g. -O2 -fsanitize=undefined, one generally gets diagnosed UB that will
still happen in the program, which won't be DCEd.
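
The arithmetic behind this can be checked outside the compiler.  The
sketch below uses GCC's __builtin_mul_overflow to obtain the wrapped
product of the two constants and shows that the masked comparison against
zero is false, which is consistent with the fold described above.  The
function names mul_overflows and check_fold are hypothetical, and the
sketch does not model how VRP tracks known bits.

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero if a * b overflows int; *wrapped receives the low
   32 bits of the mathematical product.  */
int mul_overflows(int a, int b, uint32_t *wrapped)
{
  int result;
  int overflowed = __builtin_mul_overflow (a, b, &result);
  *wrapped = (uint32_t) result;
  return overflowed;
}

/* Small self-check using the constants from the dump above.  */
void check_fold(void)
{
  uint32_t wrapped;

  /* 56506 * 42049 does not fit in a 32-bit int.  */
  assert (mul_overflows (56506, 42049, &wrapped));
  /* The bits tested by the comparison are not all zero, so
     (_2 & 65534) == 0 evaluates to false.  */
  assert ((wrapped & 65534) != 0);
}
```

Once the comparison result is known, nothing uses the product any more,
so the checked multiplication can be removed as dead code.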

[Bug libstdc++/115907] Libstdc++ and GCC itself should avoid glibc above 2.34 dependency

2024-07-22 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115907

--- Comment #59 from cqwrteur  ---
You think it is not a reality? Android, FreeBSD, Wasm and even Windows have
already moved to LLVM. GCC does not even support Android any more. glibc on
Linux is the only reason people still stay with GCC. But now LLVM is making
a libc. We will finally see everyone move to LLVM.

[Bug libstdc++/115907] Libstdc++ and GCC itself should avoid glibc above 2.34 dependency

2024-07-22 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115907

--- Comment #60 from cqwrteur  ---
I am here? Have you even checked my repository? I have been working on an LLVM
backend for my projects for nearly a year. At this point I wouldn't even be
shocked by Microsoft giving up MSVC and moving to LLVM.

[Bug target/116021] Ada build on Darwin: gen_il-main: Symbol not found: ___builtin_nested_func_ptr_created

2024-07-22 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116021

--- Comment #8 from Iain Sandoe  ---
(In reply to Eric Gallager from comment #7)
> Well ok, could someone send me a binary x86_64 build of GCC for darwin20
> with Ada support that they can bootstrap with successfully, then, so that I
> can get back to bootstrapping, too? Either that, or send me the files that
> gen_il-main generates...

Eric:

To the best of my knowledge every release of GCC after 4.6 (when we fixed
powerpc-darwin9) should bootstrap correctly on all Darwin archs supported by
upstream (i.e. not including Arm64 yet).

There can be (sometimes extended) periods where trunk (or even branches) are
broken for some/all Darwin - since there's not many folks fixing it - but
x86_64 is not currently broken anywhere AFAIK.

=

Bringing up Ada on a new platform version - the devil is in the details:

AFAIK you have a copy of my gcc-7.5-darwin19 toolchain?
This _is_ sufficient to build a new bootstrap compiler on Darwin20 including
Ada.

the following should work - for 11.5, 12.4, 13.3, 14.2 and trunk ..

$ uname -v
Darwin Kernel Version 20.6.0: Thu Jul  6 22:12:47 PDT 2023;
root:xnu-7195.141.49.702.12~1/RELEASE_X86_64

1. start a shell with just the normal OS PATH
2. you need to have texinfo-6.7 or similar ahead of the OS version (which is
not new enough to support trunk).
3. My PATH looks like:
PATH=/opt/iains/x86_64-apple-darwin20/gcc-build-tools/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/iains/x86_64-apple-darwin19/gcc-7-5-toolchain/bin

The first entry has texinfo-6.7 and dejagnu.

4. $  gnatmake --version
GNATMAKE 7.5.0

there is no other GCC or gnatmake in my PATH - but remember that Xcode will
claim 'gcc/g++' is 'clang/++'.

5. configure:

/src-local/gcc-master/configure
--prefix=/opt/iains/x86_64-apple-darwin20/gcc-15-0-0
--build=x86_64-apple-darwin20
--with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk
--disable-libstdcxx-pch --enable-languages=all CC=x86_64-apple-darwin19-gcc
CXX=x86_64-apple-darwin19-g++

NOTE1: we _have_ to put CC and CXX because otherwise we run into problems
because of the claiming of gcc/g++ as above.

NOTE2: x86_64-apple-darwin19-gcc <<- this MUST match the version of gnatmake
(there's only one in the PATH so that should be OK)

NOTE3: the --disable-libstdcxx-pch should be irrelevant

NOTE4: There might _still_ be places in the Ada build where "gnatmake" is used
literally - instead of GNATMAKE_FOR_ so it is very important to make sure
that the NOTE2 is observed.

6. make -jN .. 

7. $ ./gcc/xgcc --version
xgcc (GCC) 15.0.0 20240721 (experimental) [master revision
r15-2183-g58b78cf068b3] (Sunday AM trunk)



Works For Me as I have repeatedly said - you need to examine carefully what you
are doing differently - if there's a real bug I'd like to fix it - but I cannot
see one at present.



11.5 might be a good one to build since that also gives you a D compiler to
bootstrap D on gcc-12+

I just built the darwin branch released over the weekend...

configure: /src-local/gcc-git-11/configure
--prefix=/opt/iains/x86_64-apple-darwin20/gcc-11-5-darwin
--build=x86_64-apple-darwin20
--with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk
--disable-libstdcxx-pch --enable-languages=all CC=x86_64-apple-darwin19-gcc
CXX=x86_64-apple-darwin19-g++

$ ./gcc/xgcc --version
xgcc (GCC) 11.5.0



[Bug rtl-optimization/116009] [15 regression] ICE when building cython-3.0.10 on arm64 for Python 3.12 (insert_def_after, at rtl-ssa/accesses.cc:622)

2024-07-22 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116009

Richard Sandiford  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rsandifo at gcc dot gnu.org

--- Comment #6 from Richard Sandiford  ---
Testing a patch.

[Bug rtl-optimization/115876] [15 regression] ext-dce.cc has ubsan issues; shifting negative values

2024-07-22 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115876

--- Comment #11 from Martin Jambor  ---
Our weekend ubsan bootstrap and test (of revision
r15-2173-ge0d997e913f811) still reported failures when compiling
testcase gfortran.dg/ieee/large_1.f90 (at -O2 and higher).

[Bug target/116029] New: Linux kernel doesn't build with gcc 11.5.0

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116029

Bug ID: 116029
   Summary: Linux kernel doesn't build with gcc 11.5.0
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

./cc1.r11-11539 -quiet -mlittle-endian -Wall -Wundef -Werror=strict-prototypes
-Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE
-Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type
-Wno-format-security -std=gnu11 -Wno-psabi -mabi=lp64
-fno-asynchronous-unwind-tables -fno-unwind-tables -mbranch-protection=none
-fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation
-Wno-format-overflow -Wno-address-of-packed-member -O2
-fno-allow-store-data-races -Wframe-larger-than=2048 -fstack-protector-strong
-Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable
-fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-stack-clash-protection
-g -fpatchable-function-entry=2 -fno-inline-functions-called-once -Wvla
-Wno-pointer-sign -Wcast-function-type -Wno-stringop-truncation
-Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -Wno-array-bounds
-Wno-alloc-size-larger-than -Wimplicit-fallthrough=5 -fno-strict-overflow
-fno-stack-check -fconserve-stack -Werror=date-time
-Werror=incompatible-pointer-types -Werror=designated-init
-Wno-packed-not-aligned -mstack-protector-guard=sysreg
-mstack-protector-guard-reg=sp_el0 -mstack-protector-guard-offset=2144 -Wextra
-Wunused -Wmissing-prototypes -Wmissing-declarations -Wmissing-include-dirs
-Wold-style-definition -Wmissing-format-attribute -Wunused-but-set-variable
-Wunused-const-variable -Wstringop-truncation -Wpacked-not-aligned
-Wno-unused-parameter -Wno-type-limits -Wno-sign-compare
-Wno-missing-field-initializers -Wno-override-init -Wframe-larger-than=3072
-fsanitize=kernel-address -fasan-shadow-offset=0xdfff8000 --param
asan-globals=1 --param asan-instrumentation-with-call-threshold=1 --param
asan-instrument-allocas=1 --param asan-stack=1 display_mode_core.i -nostdinc
-g0
compiles without warnings, while
./cc1.r11-11540 -quiet -mlittle-endian -Wall -Wundef -Werror=strict-prototypes
-Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE
-Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type
-Wno-format-security -std=gnu11 -Wno-psabi -mabi=lp64
-fno-asynchronous-unwind-tables -fno-unwind-tables -mbranch-protection=none
-fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation
-Wno-format-overflow -Wno-address-of-packed-member -O2
-fno-allow-store-data-races -Wframe-larger-than=2048 -fstack-protector-strong
-Wno-main -Wno-unused-but-set-variable -Wno-unused-const-variable
-fno-omit-frame-pointer -fno-optimize-sibling-calls -fno-stack-clash-protection
-g -fpatchable-function-entry=2 -fno-inline-functions-called-once -Wvla
-Wno-pointer-sign -Wcast-function-type -Wno-stringop-truncation
-Wno-stringop-overflow -Wno-restrict -Wno-maybe-uninitialized -Wno-array-bounds
-Wno-alloc-size-larger-than -Wimplicit-fallthrough=5 -fno-strict-overflow
-fno-stack-check -fconserve-stack -Werror=date-time
-Werror=incompatible-pointer-types -Werror=designated-init
-Wno-packed-not-aligned -mstack-protector-guard=sysreg
-mstack-protector-guard-reg=sp_el0 -mstack-protector-guard-offset=2144 -Wextra
-Wunused -Wmissing-prototypes -Wmissing-declarations -Wmissing-include-dirs
-Wold-style-definition -Wmissing-format-attribute -Wunused-but-set-variable
-Wunused-const-variable -Wstringop-truncation -Wpacked-not-aligned
-Wno-unused-parameter -Wno-type-limits -Wno-sign-compare
-Wno-missing-field-initializers -Wno-override-init -Wframe-larger-than=3072
-fsanitize=kernel-address -fasan-shadow-offset=0xdfff8000 --param
asan-globals=1 --param asan-instrumentation-with-call-threshold=1 --param
asan-instrument-allocas=1 --param asan-stack=1 display_mode_core.i -nostdinc
-g0 -fdump-tree-optimized -fdump-rtl-expand -fdump-rtl-final
drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In function
‘dml_prefetch_check’:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6707:1:
warning: the frame size of 3632 bytes is larger than 3072 bytes
[-Wframe-larger-than=]

[Bug target/116029] Linux kernel doesn't build with gcc 11.5.0

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116029

Jakub Jelinek  changed:

   What|Removed |Added

 CC||ktkachov at gcc dot gnu.org
 Target||aarch64-linux

--- Comment #1 from Jakub Jelinek  ---
(insn/f:TI 17679 1114 18278 (set (reg/f:DI 31 sp)
(plus:DI (reg/f:DI 31 sp)
(const_int -1856 [0xf8c0])))
"drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c":6266:1 125
{*adddi3_aarch64}
 (nil))
in r11-11539 compiled case in that function, while
(insn/f:TI 17679 1114 18278 (set (reg/f:DI 31 sp)
(plus:DI (reg/f:DI 31 sp)
(const_int -1856 [0xf8c0])))
"drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c":6266:1 125
{*adddi3_aarch64}
 (nil))
in r11-11540.
I can reproduce both in native build
--enable-bootstrap --enable-host-pie --enable-host-bind-now
--enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-shared --enable-threads=posix --enable-checking=release
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-gcc-major-version-only --enable-plugin --enable-initfini-array
--without-isl --enable-multilib --with-linker-hash-style=gnu
--enable-gnu-indirect-function --build=aarch64-redhat-linux
--with-build-config=bootstrap-lto --enable-link-serialization=1
and in a x86_64-linux to aarch64-linux cross:
--target aarch64-linux --disable-bootstrap --enable-languages=c,c++,fortran

[Bug c++/110171] [[nodiscard]] of await_resume ignored when discarding result of co_await expression

2024-07-22 Thread arsen at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110171

--- Comment #2 from Arsen Arsenović  ---
no - it is because convert_to_void does not know how to warn about discarded
co_awaits, and it does not get re-invoked when we expand co_awaits

[Bug target/116029] Linux kernel doesn't build with gcc 11.5.0

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116029

--- Comment #2 from Jakub Jelinek  ---
First differences are in the veclower21 dump in several functions.

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-22 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #82 from Segher Boessenkool  ---
(In reply to rusty from comment #81)
> Not many function returns are as clearly required as realloc...

Then they shouldn't use warn_unused_result!  The documentation of that is
very very clear: both about what it does, and about what situations it is
meant for.  People who want something else should *use* something else!

(I have no opinion about what that should be called, or what shape it
should take, I have no interest in warnings like that, I can't imagine ever
using such a warning or marking up code for it.  The existing w_u_r is very
obviously very useful though, as it is, and should not be weakened in any
way).
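
For readers following the thread, a minimal sketch of the attribute being
discussed. The function names and bodies are hypothetical; only the attribute
itself and the (void)-cast behaviour named in the bug title come from this PR.

```c
#include <assert.h>

/* Hypothetical function.  The attribute asks the compiler to warn when a
   caller ignores the return value.  */
__attribute__((warn_unused_result))
int must_check(int v)
{
    return v * 2;
}

/* Storing and using the result avoids the warning.  Per this bug's title,
   writing `(void) must_check(21);` does not quiet it in GCC.  */
int use_result(void)
{
    int r = must_check(21);
    assert(r == 42);
    return r;
}
```

Calling `use_result()` compiles without the warning because the value is
stored and used.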

[Bug target/116029] Linux kernel doesn't build with gcc 11.5.0

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116029

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |WONTFIX
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Jakub Jelinek  ---
So, the problem is that with the addition of TARGET_CPU_grace,
TARGET_CPU_generic is now 64, which causes problems, because of
return &all_cores[TARGET_CPU_DEFAULT & 0x3f];
or
#define TARGET_CPU_DEFAULT \
 (TARGET_CPU_generic | (AARCH64_CPU_DEFAULT_FLAGS << 6))

As pointed out by Sam James, this was fixed in GCC 12 with
r12-8060-g5522dec054cb940fe83661b96249aa12c54c1d77
I tested that a backport of this, or even a variant of that backport with
#define TARGET_CPU_NBITS 8
replaced with
#define TARGET_CPU_NBITS 7
fixes this.
Unfortunately, this can't be fixed anymore for 11.5 as the branch is closed.
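
The masking problem described in this comment can be sketched in isolation.
This is only an illustration: pack_default and unpack_cpu are hypothetical
helpers that mirror the shape of the macro and the all_cores index quoted
above; they are not GCC code.

```c
#include <assert.h>

/* Pack a CPU index and a flags value into one integer, reserving the low
   `nbits` bits for the index (hypothetical helper).  */
unsigned pack_default(unsigned cpu, unsigned flags, unsigned nbits)
{
    return cpu | (flags << nbits);
}

/* Recover the CPU index by masking off the low `nbits` bits.  With
   nbits == 6 the mask is 0x3f, as in the expression quoted above.  */
unsigned unpack_cpu(unsigned packed, unsigned nbits)
{
    assert(nbits > 0 && nbits < 32);
    return packed & ((1u << nbits) - 1u);
}
```

With a 6-bit field, index 64 does not fit: unpack_cpu(pack_default(64, 1, 6), 6)
yields 0, so the lookup selects the first table entry. With 7 bits the same
round trip yields 64.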

[Bug fortran/85510] [12/13/14/15 Regression][Coarray] Linking error when accessing a coindexed variable inside an associate block

2024-07-22 Thread vehre at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85510

Andre Vehreschild  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |vehre at gcc dot gnu.org
 Status|NEW |ASSIGNED
 CC||vehre at gcc dot gnu.org

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

qinzhao at gcc dot gnu.org changed:

   What|Removed |Added

 CC||qinzhao at gcc dot gnu.org

--- Comment #3 from qinzhao at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #2)
> So, what would be the prototype of the builtin?
> Would it be type-generic for both arguments, i.e. effectively
> void __builtin_set_counted_by (...);
> which would just verify that 2 arguments are passed, the first one is some
> flexible array member with counted_by argument and the second argument has
> some type implicitly convertible to the type of the counted_by member?

Is the prototype of this builtin good enough:

void __builtin_set_counted_by (ptr->FAM, const_exp_with_int_type)

i.e., the first argument should be a FAM array reference (in which ptr is a
pointer to the object that includes the FAM), and the second argument is a
constant expression with integer type.

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

--- Comment #4 from Jakub Jelinek  ---
That is not a prototype.  Prototype is what is the C or C++ function type of
the builtin.  Neither ptr->FAM nor const_exp_with_int_type are valid C types.
There is no reason why the second argument should be const, it can be anything
convertible to whatever type size_t has, including say for C++ classes with
operator long (), _Bool/bool, enumerators, floating point expressions, ...
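
As context for the discussion, a sketch of what has to be written by hand
today, assuming a compiler that may or may not support the counted_by
attribute. struct packet, packet_alloc and packet_put are hypothetical names,
and the proposed builtin is deliberately not used since it does not exist yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use counted_by only where the compiler reports support for it.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(m)
#endif

struct packet {
    size_t count;                  /* number of elements in data[] */
    int data[] COUNTED_BY(count);  /* flexible array member */
};

/* Allocate room for n elements and set the count by naming the member
   directly, which is the step the requested builtin would make generic.  */
struct packet *packet_alloc(size_t n)
{
    struct packet *p = malloc(sizeof *p + n * sizeof p->data[0]);
    if (p)
        p->count = n;
    return p;
}

/* Store a value, checking the index against the recorded count.  */
void packet_put(struct packet *p, size_t i, int v)
{
    assert(p != NULL && i < p->count);
    p->data[i] = v;
}
```

The caller of packet_alloc has to know that the counter is called count; a
type-generic builtin taking p->data would remove that coupling.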

[Bug c++/104981] [coroutines] Internal compiler error when promise object's constructor takes a base class of the object parameter

2024-07-22 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104981

Patrick Palka  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #6 from Patrick Palka  ---
Since I already had a small patch nearly done, I finished and posted it at
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657920.html

[Bug target/116021] Ada build on Darwin: gen_il-main: Symbol not found: ___builtin_nested_func_ptr_created

2024-07-22 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116021

--- Comment #9 from Eric Gallager  ---
Ah, looking at gcc/ada/gcc-interface/Makefile.in, perhaps the issue is that I
need to set GNATLINK in my environment, too, besides just GNATMAKE and
GNATBIND... perhaps the issue was arising due to having had a version mismatch
previously...

[Bug c++/115897] [14/15 Regression] vector_size attribute on alias template has no effect when used in a dependent variable template-id

2024-07-22 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115897

--- Comment #9 from Patrick Palka  ---
Looks like we also need to consider dependent attributes when stripping
non-template aliases.  Patch posted at
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657706.html, I wonder if it
fully handles your use case?

[Bug tree-optimization/116023] Failure to optimize (x+x)*(y+y) to (x*y)*4 when intermediate result is cast to larger type

2024-07-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116023

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-07-22
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
it's related to the int expression x + x having undefined behavior on overflow
and reassoc refusing to work on such types (and mixed precision types in
general).
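
A small sketch of the two forms, assuming the expression shape suggested by
the bug title (int sums widened before the multiply); the actual testcase is
not shown here, and the function names are hypothetical. The range checks
keep the int additions free of overflow, which is the case in which the two
forms agree.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True when v + v cannot overflow int.  */
bool doubling_is_safe(int v)
{
    return v >= INT_MIN / 2 && v <= INT_MAX / 2;
}

/* Assumed original form: add in int, then widen and multiply.  */
long long widened_product(int x, int y)
{
    assert(doubling_is_safe(x) && doubling_is_safe(y));
    return (long long)(x + x) * (long long)(y + y);
}

/* Reassociated form: multiply in the wide type, then scale by 4.  */
long long reassociated_product(int x, int y)
{
    assert(doubling_is_safe(x) && doubling_is_safe(y));
    return (long long)x * (long long)y * 4;
}
```

For inputs that pass doubling_is_safe, both functions return the same value.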

[Bug tree-optimization/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3

2024-07-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

--- Comment #6 from Richard Biener  ---
(In reply to Artemiy Volkov from comment #5)
> Hi Andrew, thank you for the breakdown.  For i1() (the case applicable to
> the initial bug report) something like this seems to fix the issue:
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index cf359b0ec0f..8ab6d47e278 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8773,2 +8773,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  
> +/* Transform comparisons of the form C1 - X CMP C2 to X - C1 CMP -C2.  */
> +(for cmp (lt le gt ge eq ne)
> + rcmp (gt ge lt le eq ne)
> +  (simplify
> +   (cmp (minus INTEGER_CST@0 @1) INTEGER_CST@2)
> +   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1)))
> + (rcmp (minus @1 @0) (negate @2)
> +
>  /* Canonicalizations of BIT_FIELD_REFs.  */
> 
> Would it make sense for this ticket to be assigned to me so I could refine
> and post the above patch as well as tackle i2() and i3() (should those be
> extracted to a separate PR or is it fine to fix all three under this PR)?

I don't think this is correct for types with undefined behavior on overflow
because you can't negate INT_MIN.
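
The objection can be illustrated with a sketch that evaluates both sides in a
wider type, assuming long long is wider than int. The function names are
hypothetical and this is not the match.pd logic itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* C1 - X < C2, computed in long long so the sketch itself has no UB.  */
bool original_lt(int c1, int x, int c2)
{
    return (long long)c1 - (long long)x < (long long)c2;
}

/* X - C1 > -C2, again computed in long long.  */
bool transformed_gt(int c1, int x, int c2)
{
    return (long long)x - (long long)c1 > -(long long)c2;
}

/* -C2 only fits back into int when C2 is not INT_MIN.  */
bool negated_constant_fits_int(int c2)
{
    return c2 != INT_MIN;
}

/* Negate a constant in int, refusing the one value that would overflow.  */
int negate_constant(int c2)
{
    assert(negated_constant_fits_int(c2));
    return -c2;
}
```

In the wide type the two comparisons agree, but the transformed form needs the
constant -C2 in the original type, and for C2 == INT_MIN that value does not
exist.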

[Bug target/116021] Ada build on Darwin: gen_il-main: Symbol not found: ___builtin_nested_func_ptr_created

2024-07-22 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116021

--- Comment #10 from Iain Sandoe  ---
(In reply to Eric Gallager from comment #9)
> Ah, looking at gcc/ada/gcc-interface/Makefile.in, perhaps the issue is that
> I need to set GNATLINK in my environment, too, besides just GNATMAKE and
> GNATBIND... perhaps the issue was arising due to having had a version
> mismatch previously...

please try the simple case first [see NOTE4 above] - once you can repeat that -
adding more configure stuff can be done ... if you only have one, consistent
GCC bootstrap compiler in your path it should just work (modulo the issue with
Xcode claiming gcc/g++)

[Bug rtl-optimization/116028] [15 regression] gcc.dg/pr10474.c test failure since r15-1619-g3b9b8d6cfdf593

2024-07-22 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116028

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |15.0

[Bug lto/114501] [12/13/14/15 Regression] ICE during lto streaming

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114501

Jan Hubicka  changed:

   What|Removed |Added

Summary|[12/13/14/15 Regression]|[12/13/14/15 Regression]
   |ICE during modref with LTO  |ICE during lto streaming
 CC||hubicka at gcc dot gnu.org
  Component|ipa |lto

--- Comment #11 from Jan Hubicka  ---
Note that this is not modref related - it is just the last pass run before
streaming. We miss some free lang data I guess. Will take a look.

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

--- Comment #5 from qinzhao at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #4)
> That is not a prototype.  Prototype is what is the C or C++ function type of
> the builtin.  Neither ptr->FAM nor const_exp_with_int_type are valid C types.
> There is no reason why the second argument should be const, it can be
> anything convertible to whatever type size_t has, including say for C++
> classes with operator long (), _Bool/bool, enumerators, floating point
> expressions, ...

From GCC doc:
(https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fcall_005fwith_005fstatic_005fchain)

Built-in Function: type __builtin_call_with_static_chain (call_exp,
pointer_exp) The call_exp expression must be a function call, and the
pointer_exp expression must be a pointer.

If we allow the second argument to be variable, I am fine.

Then how about:

void __builtin_set_counted_by (FAM_exp, exp)

The FAM_exp expression must be a flexible array member reference. The second
argument must be an expression that can be converted to the type of the
flexible array member.

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

--- Comment #6 from Jakub Jelinek  ---
That is a bad example, __builtin_call_with_static_chain is not a builtin
function, but a keyword.

[Bug target/116030] New: ICE "could not split insn" in final_scan_insn_1, at final.cc on power pc

2024-07-22 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116030

Bug ID: 116030
   Summary: ICE "could not split insn" in final_scan_insn_1, at
final.cc on power pc
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pheeck at gcc dot gnu.org
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: ppc64le-linux-gnu

Compiling the gcc.dg/dfp/int128-3.c GCC testsuite file with a power pc
crosscompiler like this

ppc64le-linux-gnu-gcc
/home/worker/buildworker/tiber-option-juggler/build/gcc/testsuite/gcc.dg/dfp/int128-3.c
-Os -fno-forward-propagate -ftrivial-auto-var-init=zero

results in an ICE

/home/worker/buildworker/tiber-option-juggler/build/gcc/testsuite/gcc.dg/dfp/int128-3.c:
In function ‘main’:
/home/worker/buildworker/tiber-option-juggler/build/gcc/testsuite/gcc.dg/dfp/int128-3.c:81:1:
error: could not split insn
   81 | }
  | ^
(insn:TI 346 359 25 (set (mem/v/c:V4SI (reg:DI 9 9 [330]) [0 MEM 
[(void *)&u128]+0 S16 A128])
(const_vector:V4SI [
(const_int 0 [0]) repeated x4
]))
"/home/worker/buildworker/tiber-option-juggler/build/gcc/testsuite/gcc.dg/dfp/int128-3.c":34:23
1371 {vsx_stxvd2x4_le_const_v4si}
 (expr_list:REG_DEAD (reg:DI 9 9 [330])
(nil)))
during RTL pass: final
/home/worker/buildworker/tiber-option-juggler/build/gcc/testsuite/gcc.dg/dfp/int128-3.c:81:1:
internal compiler error: in final_scan_insn_1, at final.cc:2807
0x188f93e internal_error(char const*, ...)
   
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/gcc/diagnostic-global-context.cc:491
0x682957 fancy_abort(char const*, int, char const*)
   
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/gcc/diagnostic.cc:1725
0x660a17 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
   
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/gcc/rtl-error.cc:108
0x6520ff final_scan_insn_1
   
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/gcc/final.cc:2807
0x8fb968 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
   
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/gcc/final.cc:2886
0x8fbbf5 final_1
   
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/gcc/final.cc:1977
0x8fc3e6 rest_of_handle_final
   
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/gcc/final.cc:4239
0x8fc3e6 execute
   
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/gcc/final.cc:4317
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.


Compiler configuration:

Using built-in specs.
COLLECT_GCC=/home/worker/cross/bin/ppc64le-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/home/worker/cross/libexec/gcc/ppc64le-linux-gnu/15.0.0/lto-wrapper
Target: ppc64le-linux-gnu
Configured with:
/home/worker/buildworker/tiber-gcc-trunk-ppc64le/build/configure
--enable-languages=c,c++,fortran --disable-bootstrap --disable-libsanitizer
--disable-multilib --enable-checking=release --prefix=/home/worker/cross
--target=ppc64le-linux-gnu --with-as=/usr/bin/powerpc64le-suse-linux-as
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20240721 (experimental)
838999bb23303edc14e96b6034cd837fa4454cfd (GCC)

[Bug target/116030] ICE "could not split insn" in final_scan_insn_1, at final.cc on power pc

2024-07-22 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116030

Filip Kastl  changed:

   What|Removed |Added

   Target Milestone|--- |15.0

[Bug ipa/115033] [12/13/14/15 Regression] Incorrect optimization of by-reference closure fields by fre1 pass since r12-5113-gd70ef65692fced

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115033

--- Comment #18 from Jan Hubicka  ---
modref_eaf_analysis::analyze_ssa_name misinterprets EAF flags.  If a
dereferenced parameter is passed (to map_iterator in the testcase) it can be
returned indirectly, which in turn makes it escape into the next function call.

I am testing:

diff --git a/gcc/ipa-modref.cc b/gcc/ipa-modref.cc
index a5adce8ea39..a4e3cc34b4d 100644
--- a/gcc/ipa-modref.cc
+++ b/gcc/ipa-modref.cc
@@ -2571,8 +2571,7 @@ modref_eaf_analysis::analyze_ssa_name (tree name, bool
deferred)
int call_flags = deref_flags
(gimple_call_arg_flags (call, i), ignore_stores);
if (!ignore_retval && !(call_flags & EAF_UNUSED)
-   && !(call_flags & EAF_NOT_RETURNED_DIRECTLY)
-   && !(call_flags & EAF_NOT_RETURNED_INDIRECTLY))
+   && !(call_flags & (EAF_NOT_RETURNED_DIRECTLY ||
EAF_NOT_RETURNED_INDIRECTLY)))
  merge_call_lhs_flags (call, i, name, false, true);
if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
  m_lattice[index].merge_direct_load ();

[Bug rtl-optimization/115877] [15 Regression] wrong code at -Os (missing zero extension)

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115877

--- Comment #11 from GCC Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:88d16194d0c8a6bdc2896c8944bfbf3e6038c9d2

commit r15-2196-g88d16194d0c8a6bdc2896c8944bfbf3e6038c9d2
Author: Jeff Law 
Date:   Mon Jul 22 08:45:10 2024 -0600

[NFC][PR rtl-optimization/115877] Avoid setting irrelevant bit groups as
live in ext-dce

Another patch to refine liveness computations.  This should be NFC and is
designed to help debugging.

In simplest terms the patch avoids setting bit groups outside the size of a
pseudo as live.  Consider a HImode pseudo, bits 16..63 for such a pseudo don't
really have meaning, yet we often set bit groups related to bits 16..63 on in
the liveness bitmaps.

This makes debugging harder than it needs to be by simply having larger bitmaps
to verify when walking through the code in a debugger.

This has been bootstrapped and regression tested on x86_64.  It's also been
tested on the crosses in my tester without regressions.

Pushing to the trunk,

PR rtl-optimization/115877
gcc/
* ext-dce.cc (group_limit): New function.
(mark_reg_live): Likewise.
(ext_dce_process_sets): Use new functions.
(ext_dce_process_uses): Likewise.
(ext_dce_init): Likewise.
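
A rough sketch of the idea in the commit message, under a stated assumption:
liveness is tracked in four bit groups covering bits 0..7, 8..15, 16..31 and
32..63. That grouping is an assumption made for illustration and is not
spelled out in the message; the function names are hypothetical and are not
the ones added by the patch.

```c
#include <assert.h>

/* Number of groups that carry meaning for a value `bits` wide, assuming
   groups cover bits 0..7, 8..15, 16..31 and 32..63.  */
unsigned groups_for_width(unsigned bits)
{
    assert(bits > 0 && bits <= 64);
    if (bits <= 8)
        return 1;
    if (bits <= 16)
        return 2;
    if (bits <= 32)
        return 3;
    return 4;
}

/* Bitmask with one bit per meaningful group.  */
unsigned live_mask_for_width(unsigned bits)
{
    return (1u << groups_for_width(bits)) - 1u;
}
```

Under this assumption a 16-bit (HImode-sized) pseudo only ever needs the two
low group bits, so the upper two never have to appear in the bitmaps.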

[Bug ipa/115033] [12/13/14/15 Regression] Incorrect optimization of by-reference closure fields by fre1 pass since r12-5113-gd70ef65692fced

2024-07-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115033

--- Comment #19 from Andrew Pinski  ---
(In reply to Jan Hubicka from comment #18)
> modref_eaf_analysis::analyze_ssa_name misinterprets EAF flags.  If
> dereferenced
> parameter is passed (to map_iterator in the testcase) it can be returned
> indirectly which in turn makes it to escape into the next function call.
> 
> I am testing:
> 
> diff --git a/gcc/ipa-modref.cc b/gcc/ipa-modref.cc
> index a5adce8ea39..a4e3cc34b4d 100644
> --- a/gcc/ipa-modref.cc
> +++ b/gcc/ipa-modref.cc
> @@ -2571,8 +2571,7 @@ modref_eaf_analysis::analyze_ssa_name (tree name, bool
> deferred)
> int call_flags = deref_flags
> (gimple_call_arg_flags (call, i), ignore_stores);
> if (!ignore_retval && !(call_flags & EAF_UNUSED)
> -   && !(call_flags & EAF_NOT_RETURNED_DIRECTLY)
> -   && !(call_flags & EAF_NOT_RETURNED_INDIRECTLY))
> +   && !(call_flags & (EAF_NOT_RETURNED_DIRECTLY ||
> EAF_NOT_RETURNED_INDIRECTLY)))

`||` looks wrong, I suspect it should be `|`.

>   merge_call_lhs_flags (call, i, name, false, true);
> if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
>   m_lattice[index].merge_direct_load ();
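
The difference between the two operators can be shown with a small sketch.
The flag values below are hypothetical stand-ins; the real EAF_* constants are
defined in GCC and may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration only.  */
enum {
    NOT_RETURNED_DIRECTLY   = 1 << 1,
    NOT_RETURNED_INDIRECTLY = 1 << 2
};

/* Pre-patch condition: neither flag is set.  */
bool neither_set_two_tests(int flags)
{
    return !(flags & NOT_RETURNED_DIRECTLY)
           && !(flags & NOT_RETURNED_INDIRECTLY);
}

/* Equivalent single test, building the mask with bitwise OR.  */
bool neither_set_bitwise(int flags)
{
    return !(flags & (NOT_RETURNED_DIRECTLY | NOT_RETURNED_INDIRECTLY));
}

/* Logical OR collapses the mask to 1, so only bit 0 is tested.  */
bool neither_set_logical(int flags)
{
    return !(flags & (NOT_RETURNED_DIRECTLY || NOT_RETURNED_INDIRECTLY));
}

/* The two-test and bitwise forms agree for any flags value.  */
void check_equivalence(int flags)
{
    assert(neither_set_two_tests(flags) == neither_set_bitwise(flags));
}
```

With these values, a flags word that has only NOT_RETURNED_DIRECTLY set is
rejected by the first two forms but accepted by the `||` form.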

[Bug c++/109867] -Wswitch-default reports missing default in coroutine

2024-07-22 Thread arsen at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109867

--- Comment #3 from Arsen Arsenović  ---
(In reply to Arsen Arsenović from comment #2)
> this corresponds to the four switches emitted for the coroutine
> implementation after morphing these fns into coroutine functions.  the other
> cases are unreachable except by corruption of the frame, perhaps we should
> emit calls to either __builtin_unreachable or (IMO, better) some diagnostic
> hook (perhaps the best of both worlds by emitting a call to a UBsan hook or
> somesuch).
> 
> anyway, the diagnostic is unactionable by the user and hence bad anyway

ah, never mind, we emit traps in the default case (which is fine), meaning we
have a default.  the reason the warning happens is that we don't use
finish_case_label, so the analysis later fails to find the default label since
we never registered it properly

[Bug tree-optimization/114207] [12/13/14/15 Regression] modref gets confused by vecotorized code ` -O3 -fno-tree-forwprop` since r12-5439

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114207

--- Comment #5 from Jan Hubicka  ---
The offset gets lost in ipa-prop.cc

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 7d7cb3835d2..99ebd6229ec 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -1370,9 +1370,9 @@ unadjusted_ptr_and_unit_offset (tree op, tree *ret,
poly_int64 *offset_ret)
 {
   if (TREE_CODE (op) == ADDR_EXPR)
{
- poly_int64 extra_offset = 0;
+ poly_int64 extra_offset;
  tree base = get_addr_base_and_unit_offset (TREE_OPERAND (op, 0),
-&offset);
+&extra_offset);
  if (!base)
{
  base = get_base_address (TREE_OPERAND (op, 0));

here offset is the offset being tracked and get_addr_base_and_unit_offset is
intended to initialize extra_offset which is later added to offset.

In the testcase the pointer is first offset by +4 and later by -4, which
combines to 0.
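
The failure mode described here can be sketched outside GCC. Everything below
is hypothetical: step_offset stands in for a helper that overwrites its output
argument, and the two walkers only mirror the description in this comment, not
the real control flow of unadjusted_ptr_and_unit_offset.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in helper: overwrites *out with the offset of one step.  */
void step_offset(long step, long *out)
{
    *out = step;
}

/* Buggy shape: the helper writes into the tracked offset, and the local
   extra offset (still 0) is what gets added.  */
long walk_buggy(const long *steps, size_t n)
{
    long offset = 0;
    for (size_t i = 0; i < n; i++) {
        long extra = 0;
        step_offset(steps[i], &offset);  /* clobbers the running total */
        offset += extra;
    }
    return offset;
}

/* Fixed shape: the helper fills the local, which is then accumulated.  */
long walk_fixed(const long *steps, size_t n)
{
    assert(steps != NULL || n == 0);
    long offset = 0;
    for (size_t i = 0; i < n; i++) {
        long extra;
        step_offset(steps[i], &extra);
        offset += extra;
    }
    return offset;
}
```

For the step sequence +4, -4 the fixed walker returns 0 while the buggy one
returns -4, because the earlier +4 is lost.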

[Bug rtl-optimization/116009] [15 regression] ICE when building cython-3.0.10 on arm64 for Python 3.12 (insert_def_after, at rtl-ssa/accesses.cc:622)

2024-07-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116009

--- Comment #7 from Sam James  ---
I have a sparc testcase too but I won't bother spending time on that unless you
want it.

[Bug rtl-optimization/116009] [15 regression] ICE when building cython-3.0.10 on arm64 for Python 3.12 (insert_def_after, at rtl-ssa/accesses.cc:622)

2024-07-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116009

--- Comment #8 from Sam James  ---
(mpfr)

[Bug sanitizer/116031] New: signed integer overflow check at optimization level -O3

2024-07-22 Thread bic60176 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116031

Bug ID: 116031
   Summary: signed integer overflow check at optimization level
-O3
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bic60176 at gmail dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
  Target Milestone: ---

Created attachment 58720
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58720&action=edit
testcase

OS: Ubuntu 22.04.3 LTS
We found a case where UBSAN does not report a signed integer overflow when
compiling with gcc-13.2.0 and gcc-14.1.0 at optimization level -O3.

$ ../compiler-builds/gcc-13.2.0_build/bin/gcc -fsanitize=undefined
-fsanitize=address -g -lgcc_s -O3 testcase.c -o exec

$ timeout 1s ./exec 2>exec.err

$ ../compiler-builds/gcc-12.3.0_build/bin/gcc -fsanitize=undefined
-fsanitize=address -g -lgcc_s -O3 testcase.c -o exec

$ timeout 1s ./exec 2>exec.err
testcase.c:22:15: runtime error: signed integer overflow: 754946112 +
1912123656 cannot be represented in type 'int'

I wonder if gcc-13 and gcc-14 have optimized out the part that causes the
signed integer overflow.

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

--- Comment #7 from qinzhao at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #6)
> That is a bad example, __builtin_call_with_static_chain is not a builtin
> function, but a keyword.

I am a little confused here: this function is clearly listed as a built-in
function provided by GCC, so why is it not a builtin function? Can you explain
a little bit? Would it be convenient for you to provide a good example for
reference? Thanks.

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

--- Comment #8 from Jakub Jelinek  ---
It doesn't matter how it is documented, what matters is how it is implemented.
E.g. can you do (__builtin_call_with_static_chain) (fn, ptr)?
Or __typeof (__builtin_call_with_static_chain)?
Regular builtins are what is defined in builtins.def.

[Bug target/115969] [15 regression] ICE when building clang-16.0.6 on arm64 (output_operand: invalid expression as operand)

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115969

--- Comment #9 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:ebde0cc101a3b26bc8c188e0d2f79b649bacc43a

commit r15-2197-gebde0cc101a3b26bc8c188e0d2f79b649bacc43a
Author: Richard Sandiford 
Date:   Mon Jul 22 16:42:15 2024 +0100

aarch64: Tighten aarch64_simd_mem_operand_p [PR115969]

aarch64_simd_mem_operand_p checked for a memory with a POST_INC
or REG address, but it didn't check what kind of register was
being used.  This meant that it allowed DImode FPRs as well as GPRs.

I wondered about rewriting it to use aarch64_classify_address,
but this one-line fix seemed simpler.  The structure then mirrors
the existing early exit in aarch64_classify_address itself:

  /* On LE, for AdvSIMD, don't support anything other than POST_INC or
     REG addressing.  */
  if (advsimd_struct_p
      && TARGET_SIMD
      && !BYTES_BIG_ENDIAN
      && (code != POST_INC && code != REG))
    return false;

gcc/
PR target/115969
* config/aarch64/aarch64.cc (aarch64_simd_mem_operand_p): Require
the operand to be a legitimate memory_operand.

gcc/testsuite/
PR target/115969
* gcc.target/aarch64/pr115969.c: New test.

[Bug rtl-optimization/116009] [15 regression] ICE when building cython-3.0.10 on arm64 for Python 3.12 (insert_def_after, at rtl-ssa/accesses.cc:622)

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116009

--- Comment #9 from GCC Commits  ---
The trunk branch has been updated by Richard Sandiford :

https://gcc.gnu.org/g:34f33ea801563e2eabb348e8d3e9344a91abfd48

commit r15-2199-g34f33ea801563e2eabb348e8d3e9344a91abfd48
Author: Richard Sandiford 
Date:   Mon Jul 22 16:42:16 2024 +0100

rtl-ssa: Avoid using a stale splay tree root [PR116009]

In the fix for PR115928, I'd failed to notice that "root" was used
later in the function, so needed to be updated.

gcc/
PR rtl-optimization/116009
* rtl-ssa/accesses.cc (function_info::add_def): Set the root
local variable after removing the old clobber group.

gcc/testsuite/
PR rtl-optimization/116009
* gcc.c-torture/compile/pr116009.c: New test.
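
The general hazard of a stale cached root can be sketched as follows. This is a deliberately simplified illustration: it uses a singly linked list rather than a splay tree, and all names (node, group, remove_first and the two first_key_* functions) are hypothetical, not taken from rtl-ssa.

```c
#include <assert.h>
#include <stddef.h>

struct node { int key; struct node *next; };
struct group { struct node *root; };

/* Remove the first node; the container's root changes as a side effect.  */
static void remove_first (struct group *g)
{
  assert (g != NULL && g->root != NULL);
  g->root = g->root->next;
}

/* Keeps using the pointer read before the removal.  */
static int first_key_stale (struct group *g)
{
  struct node *root = g->root;
  remove_first (g);
  return root->key;     /* still refers to the removed node */
}

/* Re-reads the root after the removal.  */
static int first_key_fresh (struct group *g)
{
  struct node *root = g->root;
  remove_first (g);
  root = g->root;       /* refresh the local copy */
  assert (root != NULL);
  return root->key;
}
```

The stale variant keeps returning the key of the node that was just removed, which is the kind of mismatch that later code cannot recover from.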

[Bug sanitizer/116031] signed integer overflow check at optimization level -O3

2024-07-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116031

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Andrew Pinski  ---
  int32_t s;
  int32_t *n = &s;
  uint32_t o = 1912123656;
  c(*n >> (*l + (int32_t)o >= 0));

First off, s is uninitialized, so it being 0 is valid; 0 shifted by any value
is still 0, so the overflowing add is optimized away.
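
A minimal sketch of why the addition does not matter once the shifted value is 0. The function name and parameters are hypothetical; the expression only mirrors the shape of the one quoted above, and the assert keeps the sketch itself free of overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the quoted expression: the shift count is the 0-or-1
   result of a comparison on a sum.  */
static int32_t shift_by_flag (int32_t s, int32_t l, int32_t o)
{
  /* Keep the sketch well defined: the sum must fit in int32_t.  */
  assert ((int64_t) l + o <= INT32_MAX && (int64_t) l + o >= INT32_MIN);
  return s >> (l + o >= 0);
}
```

When s is 0 the result is 0 for every l and o, so a compiler that may assume s == 0 has no need to evaluate the sum at all.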

[Bug target/115969] [15 regression] ICE when building clang-16.0.6 on arm64 (output_operand: invalid expression as operand)

2024-07-22 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115969

Richard Sandiford  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Richard Sandiford  ---
Fixed.  The patch could be backported to release branches if the same error
shows up in other circumstances, but as Andrew says, the problem seems to have
been latent since the port was added.

[Bug rtl-optimization/116009] [15 regression] ICE when building cython-3.0.10 on arm64 for Python 3.12 (insert_def_after, at rtl-ssa/accesses.cc:622)

2024-07-22 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116009

Richard Sandiford  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #10 from Richard Sandiford  ---
Fixed.

[Bug sanitizer/115793] signed integer overflow check missing at optimization levels -O2, -O3, and -Os

2024-07-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115793

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #7 from Andrew Pinski  ---
.

[Bug middle-end/115277] [13/14/15 regression] ICF needs to match loop bound estimates

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:0d19fbc7b0760ce665fa6a88cd40cfa0311358d7

commit r15-2200-g0d19fbc7b0760ce665fa6a88cd40cfa0311358d7
Author: Jan Hubicka 
Date:   Mon Jul 22 18:01:57 2024 +0200

Compare loop bounds in ipa-icf

Hi,
this testcase shows another problem with missing comparators for metadata
in ICF. With value ranges available to loop optimizations during early
opts we can estimate the number of iterations based on a guarding condition
that can be split away by the fnsplit pass. This patch disables ICF when
the number of iterations does not match.

Bootstrapped/regtested x86_64-linux, will commit it shortly

gcc/ChangeLog:

PR ipa/115277
* ipa-icf-gimple.cc (func_checker::compare_loops): compare loop
bounds.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr115277.c: New test.
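
The idea of treating loop bounds as part of the comparison can be sketched like this. It is a sketch under assumptions: struct loop_summary and loop_bounds_match are hypothetical names, and the real check in func_checker::compare_loops works on GCC's own loop structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a compiler remembers about one loop.  */
struct loop_summary
{
  bool has_upper_bound;
  long upper_bound;     /* meaningful only when has_upper_bound is true */
};

/* Two loops may be treated as identical only if their recorded bounds
   agree; otherwise merging the functions could carry a bound over to
   code for which it does not hold.  */
static bool loop_bounds_match (const struct loop_summary *a,
                               const struct loop_summary *b)
{
  assert (a != NULL && b != NULL);
  if (a->has_upper_bound != b->has_upper_bound)
    return false;
  if (a->has_upper_bound && a->upper_bound != b->upper_bound)
    return false;
  return true;
}
```

Two function bodies that look the same but carry different iteration estimates would then be kept apart instead of being folded into one.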

[Bug ipa/114207] [12/13/14/15 Regression] modref gets confused by vectorized code `-O3 -fno-tree-forwprop` since r12-5439

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114207

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:391f46f10b0586c074014de82efe76787739bb0c

commit r15-2201-g391f46f10b0586c074014de82efe76787739bb0c
Author: Jan Hubicka 
Date:   Mon Jul 22 18:05:26 2024 +0200

Fix accounting of offsets in unadjusted_ptr_and_unit_offset

unadjusted_ptr_and_unit_offset accidentally throws away the offset computed by
get_addr_base_and_unit_offset. Instead of passing extra_offset it passes
offset.

PR ipa/114207

gcc/ChangeLog:

* ipa-prop.cc (unadjusted_ptr_and_unit_offset): Fix accounting of
offsets in ADDR_EXPR.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr114207.c: New test.

[Bug c++/116020] Incorrect treatment of (this void) parameter

2024-07-22 Thread fchelnokov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116020

--- Comment #1 from Fedor Chelnokov  ---
Another problematic example is as follows:

struct A {
    static void f();
};

void foo() {
    A::f(); //ok
}

void A::f(this void) {}

int main() {
    A::f(); //error in GCC after A::f definition
}

Here the definition of a declared no-argument member function with (this void)
is accepted but makes this function no longer callable. Again, Clang has no
issue with it. Online demo: https://gcc.godbolt.org/z/vKfzo7fe9

[Bug ipa/115033] [12/13/14/15 Regression] Incorrect optimization of by-reference closure fields by fre1 pass since r12-5113-gd70ef65692fced

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115033

--- Comment #20 from GCC Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:cf8ffc58aad3127031c229a75cc4b99c8ace25e0

commit r15-2202-gcf8ffc58aad3127031c229a75cc4b99c8ace25e0
Author: Jan Hubicka 
Date:   Mon Jul 22 18:08:08 2024 +0200

Fix modref_eaf_analysis::analyze_ssa_name handling of values dereferenced
to function call parameters

modref_eaf_analysis::analyze_ssa_name misinterprets EAF flags.  If a
dereferenced parameter is passed (to map_iterator in the testcase) it can be
returned indirectly, which in turn makes it escape into the next function call.

PR ipa/115033

gcc/ChangeLog:

* ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Fix checking of
EAF flags when analysing values dereferenced as function parameters.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr115033.c: New test.

[Bug ipa/113291] [14/15 Regression] compilation never (?) finishes with recursive always_inline functions at -O and above since r14-2172

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113291

--- Comment #11 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:9a7d668fc58f817027ec7f9fa7e20a6dce08bddb

commit r14-10486-g9a7d668fc58f817027ec7f9fa7e20a6dce08bddb
Author: Jan Hubicka 
Date:   Tue May 14 12:58:56 2024 +0200

Reduce recursive inlining of always_inline functions

this patch tames down the inliner on (multiply) self-recursive always_inline
functions.  While we already have caps on recursive inlining, the testcase
combines the early inliner and late inliner to get a very wide recursive
inlining tree.  The basic idea is to ignore DISREGARD_INLINE_LIMITS when
deciding on inlining self recursive functions (so we cut on function being
large) and clear the flag once it is detected.

I did not include the testcase since it still produces a lot of code and would
slow down testing.  It also outputs many inlining failed messages, which is not
very nice, but it is hard to detect self recursion cycles in full generality
when indirect calls and other tricks may happen.

gcc/ChangeLog:

PR ipa/113291

* ipa-inline.cc (enum can_inline_edge_by_limits_flags): New enum.
(can_inline_edge_by_limits_p): Take flags instead of multiple
bools; add flag for forcing inline limits.
(can_early_inline_edge_p): Update.
(want_inline_self_recursive_call_p): Update; use FORCE_LIMITS mode.
(check_callers): Update.
(update_caller_keys): Update.
(update_callee_keys): Update.
(recursive_inlining): Update.
(add_new_edges_to_heap): Update.
(speculation_useful_p): Update.
(inline_small_functions): Clear DECL_DISREGARD_INLINE_LIMITS on
self recursion.
(flatten_function): Update.
(inline_to_all_callers_1): Update.

(cherry picked from commit 1ec49897253e093e1ef6261eb104ac0c111bac83)

[Bug middle-end/115277] [13/14/15 regression] ICF needs to match loop bound estimates

2024-07-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277

Sam James  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-07-22
 Ever confirmed|0   |1

--- Comment #5 from Sam James  ---
honza, should that be in execute/ instead?

[Bug c++/105475] coroutines: ICE in coerce_template_parms, at cp/pt.cc:9183

2024-07-22 Thread arsen at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105475

--- Comment #3 from Arsen Arsenović  ---
ah, seems that we're missing handling of error_mark_node in a few places while
processing a coroutine, causing the middle-end to be confused later.  I'll
leave that for later.

[Bug rtl-optimization/115877] [15 Regression] wrong code at -Os (missing zero extension)

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115877

--- Comment #12 from GCC Commits  ---
The master branch has been updated by Jeff Law :

https://gcc.gnu.org/g:ab7c0aed52054976d0b5e12c52e82239d4277b98

commit r15-2203-gab7c0aed52054976d0b5e12c52e82239d4277b98
Author: Jeff Law 
Date:   Mon Jul 22 10:11:57 2024 -0600

[4/n][PR rtl-optimization/115877] Correct SUBREG handling in a destination

If we encounter something during SET handling that we can not handle, the safe
thing to do is to ignore the destination and continue the loop.

We've actually been trying to do slightly better with SUBREG destinations by
iterating into SUBREG_REG.  It turns out that wasn't working as expected.

The problem is once we "continue" we lose the state that we were inside the SET
and thus we ended up ignoring the destination completely rather than tracking
the SUBREG_REG object.  This could be fixed by restarting SET processing, but I
just don't see this as all that important to handle.  So rather than leave the
code as-is, not working per design, I'm twiddling it to use the common 'skip
subrtxs and continue' idiom used elsewhere.

This is a prerequisite for another patch in this series.  Specifically I have a
patch that explicitly tracks if we skipped a destination rather than trying to
imply it from the state of LIVE_TMP.  So this is probably NFC right now, but
that's a short-lived NFC.

Bootstrapped and regression tested on x86 and also run as part of a larger kit
on the crosses in my tester.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (ext_dce_process_sets): More correctly handle SUBREG
destinations.

[Bug c/116016] enhancement: add __builtin_set_counted_by(P->FAM, COUNT) or equivalent

2024-07-22 Thread qinzhao at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

--- Comment #9 from qinzhao at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #8)
> It doesn't matter how it is documented, what matters is how it is
> implemented.
> E.g. can you do (__builtin_call_with_static_chain) (fn, ptr)?
> Or __typeof (__builtin_call_with_static_chain)?
> Regular builtins are what is defined in builtins.def.

Okay, I guess what you meant is the following:

There are two categories of builtin functions provided by GCC to the
user, as documented in the GCC documentation:
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fcall_005fwith_005fstatic_005fchain

Category A: a C/C++ language extension (RID_BUILTIN), which is implemented in
the language FE by the parser. For example, __builtin_call_with_static_chain
and __builtin_has_attribute both belong to this category;

Category B: a regular builtin that is defined in builtins.def, which is
implemented in the middle-end. For example, __builtin_object_size and
__builtin_strcmp_eq belong to this category;

From my understanding, the new __builtin_set_counted_by could be implemented
either in the C FE as a C language extension, or in the middle-end as a regular
builtin. It just depends on what user interface we are planning to provide to
the user.

[Bug ipa/111613] [12/13/14/15 Regression] Bit field stores can be incorrectly optimized away when -fstore-merging is in effect since r12-5383-g22c242342e38eb

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111613

--- Comment #7 from Jan Hubicka  ---
I suppose there is not much to do about past noread flags. I do not see how
optimization can invalidate other properties, so I am testing the following:

diff --git a/gcc/ipa-modref.cc b/gcc/ipa-modref.cc
index f994388a96a..53a2e35133d 100644
--- a/gcc/ipa-modref.cc
+++ b/gcc/ipa-modref.cc
@@ -3004,6 +3004,9 @@ analyze_parms (modref_summary *summary, modref_summary_lto *summary_lto,
                 (past, ecf_flags,
                  VOID_TYPE_P (TREE_TYPE
                      (TREE_TYPE (current_function_decl))));
+         /* Store merging can produce reads when combining together multiple
+            bitfields.  See PR111613.  */
+         past &= ~(EAF_NO_DIRECT_READ | EAF_NO_INDIRECT_READ);
          if (dump_file && (flags | past) != flags && !(flags & EAF_UNUSED))
            {
              fprintf (dump_file,
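
The masking step in the patch above can be sketched in isolation. The bit values and names below are hypothetical and chosen only for the illustration; GCC's real EAF_* constants are defined in its own headers and may differ.

```c
#include <assert.h>

/* Hypothetical flag bits standing in for GCC's EAF_* constants.  */
enum
{
  SK_NO_DIRECT_READ    = 1 << 0,
  SK_NO_INDIRECT_READ  = 1 << 1,
  SK_NO_DIRECT_CLOBBER = 1 << 2
};

/* Drop the two "no read" bits from previously computed flags and keep
   every other bit unchanged.  */
static int drop_noread (int past)
{
  assert (past >= 0);
  return past & ~(SK_NO_DIRECT_READ | SK_NO_INDIRECT_READ);
}
```

Only the two read-related bits are cleared, which matches the stated intent that other properties from earlier analysis remain trusted.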

[Bug middle-end/115277] [13/14/15 regression] ICF needs to match loop bound estimates

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277

--- Comment #6 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:c5397d343ff1365fcebcf3ebabe140608874aac3

commit r14-10487-gc5397d343ff1365fcebcf3ebabe140608874aac3
Author: Jan Hubicka 
Date:   Mon Jul 22 18:01:57 2024 +0200

Compare loop bounds in ipa-icf

Hi,
this testcase shows another problem with missing comparators for metadata
in ICF. With value ranges available to loop optimizations during early
opts we can estimate the number of iterations based on a guarding condition
that can be split away by the fnsplit pass. This patch disables ICF when
the number of iterations does not match.

Bootstrapped/regtested x86_64-linux, will commit it shortly

gcc/ChangeLog:

PR ipa/115277
* ipa-icf-gimple.cc (func_checker::compare_loops): compare loop
bounds.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr115277.c: New test.

(cherry picked from commit 0d19fbc7b0760ce665fa6a88cd40cfa0311358d7)

[Bug ipa/114207] [12/13/14/15 Regression] modref gets confused by vectorized code `-O3 -fno-tree-forwprop` since r12-5439

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114207

--- Comment #7 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:f2e98084792821c3849074867d5b007c49028854

commit r14-10488-gf2e98084792821c3849074867d5b007c49028854
Author: Jan Hubicka 
Date:   Mon Jul 22 18:05:26 2024 +0200

Fix accounting of offsets in unadjusted_ptr_and_unit_offset

unadjusted_ptr_and_unit_offset accidentally throws away the offset computed by
get_addr_base_and_unit_offset. Instead of passing extra_offset it passes
offset.

PR ipa/114207

gcc/ChangeLog:

* ipa-prop.cc (unadjusted_ptr_and_unit_offset): Fix accounting of
offsets in ADDR_EXPR.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr114207.c: New test.

(cherry picked from commit 391f46f10b0586c074014de82efe76787739bb0c)

[Bug middle-end/115277] [13 regression] ICF needs to match loop bound estimates

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277

Jan Hubicka  changed:

   What|Removed |Added

Summary|[13/14/15 regression] ICF   |[13 regression] ICF needs
   |needs to match loop bound   |to match loop bound
   |estimates   |estimates

--- Comment #7 from Jan Hubicka  ---
Fixed on 14/15 so far

[Bug ipa/115033] [12/13/14/15 Regression] Incorrect optimization of by-reference closure fields by fre1 pass since r12-5113-gd70ef65692fced

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115033

--- Comment #21 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:27ef3a0779e551ca116c56c431436c8d2191b253

commit r14-10489-g27ef3a0779e551ca116c56c431436c8d2191b253
Author: Jan Hubicka 
Date:   Mon Jul 22 18:08:08 2024 +0200

Fix modref_eaf_analysis::analyze_ssa_name handling of values dereferenced
to function call parameters

modref_eaf_analysis::analyze_ssa_name misinterprets EAF flags.  If a
dereferenced parameter is passed (to map_iterator in the testcase) it can be
returned indirectly, which in turn makes it escape into the next function call.

PR ipa/115033

gcc/ChangeLog:

* ipa-modref.cc (modref_eaf_analysis::analyze_ssa_name): Fix checking of
EAF flags when analysing values dereferenced as function parameters.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr115033.c: New test.

(cherry picked from commit cf8ffc58aad3127031c229a75cc4b99c8ace25e0)

[Bug ipa/113291] [14/15 Regression] compilation never (?) finishes with recursive always_inline functions at -O and above since r14-2172

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113291

Jan Hubicka  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from Jan Hubicka  ---
Fixed.

[Bug ipa/115033] [12/13 Regression] Incorrect optimization of by-reference closure fields by fre1 pass since r12-5113-gd70ef65692fced

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115033

Jan Hubicka  changed:

   What|Removed |Added

Summary|[12/13/14/15 Regression]|[12/13 Regression]
   |Incorrect optimization of   |Incorrect optimization of
   |by-reference closure fields |by-reference closure fields
   |by fre1 pass since  |by fre1 pass since
   |r12-5113-gd70ef65692fced|r12-5113-gd70ef65692fced

--- Comment #22 from Jan Hubicka  ---
Fixed on 14/15 so far

[Bug ipa/114207] [12/13 Regression] modref gets confused by vectorized code `-O3 -fno-tree-forwprop` since r12-5439

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114207

Jan Hubicka  changed:

   What|Removed |Added

Summary|[12/13/14/15 Regression]|[12/13 Regression] modref
   |modref gets confused by |gets confused by vectorized
   |vectorized code `-O3|code `-O3
   |-fno-tree-forwprop` since   |-fno-tree-forwprop` since
   |r12-5439|r12-5439

--- Comment #8 from Jan Hubicka  ---
Fixed on 14/15 so far

[Bug target/116032] New: [12/13/14/15 Regression] gcc.target/arm/pr40457-2.c produces larger code for armv7ve+neon

2024-07-22 Thread azoff at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116032

Bug ID: 116032
   Summary: [12/13/14/15 Regression] gcc.target/arm/pr40457-2.c
produces larger code for armv7ve+neon
   Product: gcc
   Version: 13.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: azoff at gcc dot gnu.org
  Target Milestone: ---

In test case gcc.target/arm/pr40457-2.c, scan-assembler "strd|stm" fails on
-march=armv7ve+neon as it emits a vst1 instruction with a literal pool.

Below assembly was generated using: arm-none-eabi-gcc
gcc/testsuite/gcc.target/arm/pr40457-2.c -mthumb -march=armv7ve+neon
-mfloat-abi=hard -O2 -S -o -

With r12-4239-g50e20ee6e40:
.arch armv7-a
.arch_extension virt
.arch_extension idiv
.arch_extension sec
.arch_extension mp
.fpu neon
.eabi_attribute 28, 1
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 1
.eabi_attribute 30, 2
.eabi_attribute 34, 1
.eabi_attribute 18, 4
.file   "pr40457-2.c"
.text
.align  1
.align  2
.global foo
.syntax unified
.thumb
.thumb_func
.type   foo, %function
foo:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
movs    r2, #1
movs    r3, #0
strd    r2, r3, [r0]
bx  lr
.size   foo, .-foo

.ident  "GCC: (r12-4239-g50e20ee6e40) 12.0.0 20211008 (experimental)"



With r12-4240-g2b8453c401b:
.arch armv7-a
.arch_extension virt
.arch_extension idiv
.arch_extension sec
.arch_extension mp
.fpu neon
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 1
.eabi_attribute 30, 2
.eabi_attribute 34, 1
.eabi_attribute 18, 4
.file   "pr40457-2.c"
.text
.align  1
.align  2
.global foo
.syntax unified
.thumb
.thumb_func
.type   foo, %function
foo:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
vldr    d16, .L3
vst1.32 {d16}, [r0]
bx  lr
.L4:
.align  3
.L3:
.word   1
.word   0
.size   foo, .-foo

.ident  "GCC: (r12-4240-g2b8453c401b) 12.0.0 20211008 (experimental)"


I sent a patch that added the "vst1" instruction to the allowed list in the
test case in https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657772.html,
but it was instead suggested to log a ticket for reviewing the SLP cost model
for arm.

[Bug ipa/111613] [12/13/14/15 Regression] Bit field stores can be incorrectly optimized away when -fstore-merging is in effect since r12-5383-g22c242342e38eb

2024-07-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111613

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Jan Hubicka :

https://gcc.gnu.org/g:14074773350ffed7efdebbc553adf0f23b572e87

commit r15-2205-g14074773350ffed7efdebbc553adf0f23b572e87
Author: Jan Hubicka 
Date:   Mon Jul 22 19:00:39 2024 +0200

Fix modref's interaction with store merging

Hi,
this patch fixes wrong code in case store-merging introduces a load of a
function parameter that was previously write-only (which happens for
bitfields).  Without this, the whole store-merged area is considered to be
killed.

PR ipa/111613

gcc/ChangeLog:

* ipa-modref.cc (analyze_parms): Do not preserve EAF_NO_DIRECT_READ and
EAF_NO_INDIRECT_READ from past flags.

gcc/testsuite/ChangeLog:

* gcc.c-torture/pr111613.c: New test.

[Bug ipa/111613] [12/13 Regression] Bit field stores can be incorrectly optimized away when -fstore-merging is in effect since r12-5383-g22c242342e38eb

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111613

Jan Hubicka  changed:

   What|Removed |Added

Summary|[12/13/14/15 Regression]|[12/13 Regression] Bit
   |Bit field stores can be |field stores can be
   |incorrectly optimized away  |incorrectly optimized away
   |when -fstore-merging is in  |when -fstore-merging is in
   |effect since|effect since
   |r12-5383-g22c242342e38eb|r12-5383-g22c242342e38eb

--- Comment #9 from Jan Hubicka  ---
Fixed on 14/15

[Bug ipa/113907] [12/13 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-07-22 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907

Jan Hubicka  changed:

   What|Removed |Added

Summary|[12/13/14/15 regression]|[12/13 regression] ICU
   |ICU miscompiled on x86  |miscompiled on x86 since
   |since   |r14-5109-ga291237b628f41
   |r14-5109-ga291237b628f41|

--- Comment #82 from Jan Hubicka  ---
All wrong code issues I know of are now fixed on 14/15

[Bug target/115086] bic is not used when the non-not part is a constant

2024-07-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115086

--- Comment #6 from Andrew Pinski  ---
Note andc optab was added with r15-1890-gf379596e0ba99d .

[Bug gcov-profile/83355] autofdo g++.dg/bprob/g++-bprob-1.C FAILS with ICE

2024-07-22 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83355

Andi Kleen  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Andi Kleen  ---
Long fixed in tree

[Bug target/116033] New: [14/15] RISC-V: -march=rv64gv_xtheadmemidx generates illegal vse8.v insn

2024-07-22 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116033

Bug ID: 116033
   Summary: [14/15] RISC-V: -march=rv64gv_xtheadmemidx generates
illegal vse8.v insn
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Testcase:
char arr_3[20][20];
void init() {
  for (int i_0 = 0; i_0 < 20; ++i_0)
    for (int i_1 = 0; i_0 < 20; ++i_0)
      for (int i_1 = 0; i_1 < 20; ++i_0)
        for (int i_1 = 0; i_1 < 20; ++i_1)
          arr_3[i_0][i_1] = i_1;
}

Command:
> /scratch/tc-testing/tc-compiler-fuzz-trunk/build-gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -march=rv64gv_xtheadmemidx -mabi=lp64d driver.c -c -O3
/scratch/tmp/cc4KI0Jh.s: Assembler messages:
/scratch/tmp/cc4KI0Jh.s:27: Error: illegal operands `vse8.v v1,(a5),10,1'
/scratch/tmp/cc4KI0Jh.s:28: Error: illegal operands `vse8.v v5,(a1),10,1'
/scratch/tmp/cc4KI0Jh.s:29: Error: illegal operands `vse8.v v4,(a2),10,1'
/scratch/tmp/cc4KI0Jh.s:30: Error: illegal operands `vse8.v v3,(a3),10,1'
/scratch/tmp/cc4KI0Jh.s:31: Error: illegal operands `vse8.v v2,(a4),10,1'

Generated asm:
init:
vsetivli        zero,4,e8,mf4,ta,ma
vid.v   v1
lui a5,%hi(.LANCHOR0)
vadd.vi v5,v1,4
vadd.vi v4,v1,8
vadd.vi v3,v1,12
addi    a5,a5,%lo(.LANCHOR0)
li  a4,16
vadd.vx v2,v1,a4
addi    a1,a5,4
addi    a2,a5,8
addi    a3,a5,12
addi    a4,a5,16
.L2:
vse8.v  v1,(a5),10,1
vse8.v  v5,(a1),10,1
vse8.v  v4,(a2),10,1
vse8.v  v3,(a3),10,1
vse8.v  v2,(a4),10,1
j   .L2
arr_3:
.zero   400

Godbolt:
https://godbolt.org/z/qGqxrTr3K

Using -march=rv64gcv generates the expected syntax of vse8.v v1,(a5):
https://godbolt.org/z/4Pda4fc6E

Found via fuzzer.

[Bug target/114189] Target implements obsolete vcond{,u,eq} expanders

2024-07-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114189

--- Comment #11 from Andrew Pinski  ---
(In reply to Richard Biener from comment #4)
> aarch64 reports just
> 
> FAIL: gcc.target/aarch64/if-compare_2.c check-function-bodies bar1
> FAIL: gcc.target/aarch64/if-compare_2.c check-function-bodies bar2

I think these 2 are now fixed after the fixes for PR 115659.
There are also missing andc/iorc patterns, for which I am testing a change.
I am doing it as part of the change to use andc/iorc for scalars, since that
makes it possible to use them for vectors too.

[Bug tree-optimization/116034] New: wrong code with memcpy() from _Complex unsigned short at -fno-strict-aliasing -O1 and above

2024-07-22 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116034

Bug ID: 116034
   Summary: wrong code with memcpy() from _Complex unsigned short
at -fno-strict-aliasing -O1 and above
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 58721
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58721&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc -O1 -fno-strict-aliasing testcase.c
$ ./a.out
Aborted

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r15-2206-20240722194717-g6f81b7fa799-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/15.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r15-2206-20240722194717-g6f81b7fa799-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20240722 (experimental) (GCC) 

This is also failing on aarch64-unknown-linux-gnu and
powerpc64le-unknown-linux-gnu, but it's OK on mips64el-unknown-linux-gnuabi64
and riscv64-unknown-linux-gnu.

[Bug tree-optimization/116034] wrong code with memcpy() from _Complex unsigned short at -fno-strict-aliasing -O1 and above

2024-07-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116034

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org

--- Comment #1 from Sam James  ---
On godbolt, 6.5 looks OK, and 7.1 starts to fail for amd64.

[Bug target/116035] New: [14/15] RISC-V: -march=rv64g_xtheadmemidx_zba generates illegal lwu insn

2024-07-22 Thread patrick at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116035

          Bug ID: 116035
         Summary: [14/15] RISC-V: -march=rv64g_xtheadmemidx_zba
                  generates illegal lwu insn
         Product: gcc
         Version: 15.0
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: target
        Assignee: unassigned at gcc dot gnu.org
        Reporter: patrick at rivosinc dot com
Target Milestone: ---

Testcase:
void a(long);
unsigned b[11];
void c() {
  for (int d = 0; d < 11; ++d)
    a(b[d]);
}

Generated asm:
c:
addi    sp,sp,-32
sd  s0,16(sp)
lui s0,%hi(.LANCHOR0)
addi    s0,s0,%lo(.LANCHOR0)
sd  s1,8(sp)
sd  ra,24(sp)
addi    s1,s0,44
.L2:
lwu a0,(s0),4,0
call    a
bne s0,s1,.L2
ld  ra,24(sp)
ld  s0,16(sp)
ld  s1,8(sp)
addi    sp,sp,32
jr  ra
b:
.zero   44

Command/backtrace:
> /scratch/tc-testing/tc-compiler-fuzz-trunk/build-gcv/bin/riscv64-unknown-linux-gnu-gcc
>  -march=rv64g_xtheadmemidx_zba -mabi=lp64d -mabi=lp64d driver.c -c -O3
/scratch/tmp/ccJK0u5H.s: Assembler messages:
/scratch/tmp/ccJK0u5H.s:25: Error: illegal operands `lwu a0,(s0),4,0'

Godbolt: https://godbolt.org/z/874e85n9r

Without xtheadmemidx: https://godbolt.org/z/4Eo4Wf6h3

Found via fuzzer.

Very likely related to: PR 116033

[Bug middle-end/115277] [13 regression] ICF needs to match loop bound estimates

2024-07-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277

Sam James  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |hubicka at gcc dot gnu.org
 Status|NEW |ASSIGNED
  Known to work||14.1.1, 15.0

[Bug ipa/115033] [12/13 Regression] Incorrect optimization of by-reference closure fields by fre1 pass since r12-5113-gd70ef65692fced

2024-07-22 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115033

Sam James  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |hubicka at gcc dot gnu.org
  Known to fail|15.0|
 Status|NEW |ASSIGNED
  Known to work||14.1.1, 15.0

[Bug tree-optimization/116034] wrong code with memcpy() from _Complex unsigned short at -fno-strict-aliasing -O1 and above

2024-07-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116034

--- Comment #2 from Andrew Pinski  ---
Folding statement: _1 = &c + 1;
Queued stmt for removal.  Folds to: &MEM <__complex__ short unsigned int>
[(void *)&c + 1B]
Folding statement: _3 = MEM  [(char * {ref-all})_1];
Folded into: _3 = MEM  [(char * {ref-all})&c + 1B];

That is ok, but then we change it into:
  _3 = IMAGPART_EXPR ;
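
A minimal sketch of why that last rewrite changes the value (this is not the
attached testcase). It assumes the usual layout, in which the real part sits at
byte offset 0 and the imaginary part follows at offset sizeof(unsigned short),
so a load that starts at byte offset 1 still begins inside the real part and
cannot be equivalent to reading the imaginary part. The helper names
complex_bytes and offset1_is_real_part are hypothetical, and
_Complex unsigned short is a GNU C extension.

```c
#include <string.h>

/* Hypothetical helper: copy the object representation of a
   _Complex unsigned short built from (re, im) into out[], which must
   have room for sizeof(_Complex unsigned short) bytes.  */
void complex_bytes(unsigned short re, unsigned short im, unsigned char *out)
{
    _Complex unsigned short c = 0;
    __real__ c = re;   /* GNU C accessors for the two parts */
    __imag__ c = im;
    memcpy(out, &c, sizeof c);   /* whole-object copy from offset 0 */
}

/* Return 1 if the byte at offset 1 is one of the two bytes of the real
   part.  Which of the two it is depends on endianness, so both are
   accepted.  */
int offset1_is_real_part(void)
{
    unsigned char bytes[sizeof(_Complex unsigned short)];
    complex_bytes(0x1122, 0x3344, bytes);
    return bytes[1] == 0x11 || bytes[1] == 0x22;
}
```

Under that layout assumption, bytes[1] comes from the real part and the
imaginary part only starts at bytes[2], so a load at &c + 1B is not an
IMAGPART_EXPR of c.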
