[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #27 from Tamar Christina  ---
Created attachment 57538
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538&action=edit
proposed1.patch

proposed patch, this gets the gathers and scatters back. doing regression run.

[Bug target/114107] poor vectorization at -O3 when dealing with arrays of different multiplicity, good with -O2

2024-02-26 Thread nathanael.schaeffer at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107

--- Comment #12 from N Schaeffer  ---
I found the "offending" option, and it seems to be indeed a cost-model problem
as Andrew Pinski said:

good code is generated by:

   gcc -O2 -ftree-vectorize -march=skylake   (since gcc 6.1)
   gcc -O1 -ftree-vectorize -march=skylake   (since gcc 8.1)
   gcc -O3 -fvect-cost-model=very-cheap -march=skylake   (with gcc 13.1+)

bad code is generated otherwise, and in particular:

   gcc -O2 -march=skylake  (does not vectorize)
   gcc -O3 -march=skylake  (bad vectorization with so many permutations)
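
For reference, the kind of loop under discussion can be sketched as follows. This is an assumed reconstruction, not the reporter's test case: the name rescale_x4 matches the function quoted later in the thread, but the body (four consecutive data elements scaled by one factor element) is a guess based on the bug title.

```c
/* Hypothetical sketch: each factor[i] scales four consecutive elements
   of data, so the two arrays advance at different rates.  */
void rescale_x4(double *restrict data, const double *restrict factor, int n)
{
    for (int i = 0; i < n; i++) {
        double f = factor[i];      /* one scalar load, reused four times */
        data[4 * i + 0] *= f;
        data[4 * i + 1] *= f;
        data[4 * i + 2] *= f;
        data[4 * i + 3] *= f;
    }
}
```

Compiling such a file with each of the command lines above and adding -S is one way to compare the generated code.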

[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7

2024-02-26 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441

--- Comment #28 from rguenther at suse dot de  ---
On Mon, 26 Feb 2024, tnfchris at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
> 
> --- Comment #27 from Tamar Christina  ---
> Created attachment 57538
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538&action=edit
> proposed1.patch
> 
> proposed patch, this gets the gathers and scatters back. doing regression run.

I don't think this will fly.

[Bug driver/114082] Guidelines for options are empty

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114082

--- Comment #4 from Richard Biener  ---
Can we simply comment out the entire section?

[Bug tree-optimization/114086] Boolean switches could have a lot better codegen, possibly utilizing bit-vectors

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114086

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-02-26

[Bug tree-optimization/114090] [13/14 Regression] forwprop -fwrapv miscompilation

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114090

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:24aa051af7c59f37ec45aea754b48b97d210ea6d

commit r14-9175-g24aa051af7c59f37ec45aea754b48b97d210ea6d
Author: Jakub Jelinek 
Date:   Mon Feb 26 10:08:45 2024 +0100

match.pd: Guard 2 simplifications on integral TYPE_OVERFLOW_UNDEFINED
[PR114090]

These 2 patterns are incorrect on floating point, or for -fwrapv, or
for -ftrapv, or the first one for unsigned types (the second one is
mathematically correct, but we ought to just fold that to 0 instead).

So, the following patch properly guards this.

I think we don't need && !TYPE_OVERFLOW_SANITIZED (type) because
in both simplifications there would be UB before and after on
signed integer minimum.

2024-02-26  Jakub Jelinek  

PR tree-optimization/114090
* match.pd ((x >= 0 ? x : 0) + (x <= 0 ? -x : 0) -> abs x):
Restrict pattern to ANY_INTEGRAL_TYPE_P and TYPE_OVERFLOW_UNDEFINED
types.
((x <= 0 ? -x : 0) -> max(-x, 0)): Likewise.

* gcc.dg/pr114090.c: New test.
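
To see why the second pattern needs the TYPE_OVERFLOW_UNDEFINED guard, here is a small sketch that emulates -fwrapv negation in portable C. The helper names (wrap_neg, original_form, folded_form) are made up for illustration and are not part of match.pd; the conversion back to int32_t relies on GCC's modular behaviour for out-of-range values.

```c
#include <stdint.h>

/* Two's-complement negation as -fwrapv would compute it, done in
   unsigned arithmetic so the sketch itself has no undefined behaviour.  */
int32_t wrap_neg(int32_t x)
{
    return (int32_t)((uint32_t)0 - (uint32_t)x);
}

/* The source expression: x <= 0 ? -x : 0.  */
int32_t original_form(int32_t x)
{
    return x <= 0 ? wrap_neg(x) : 0;
}

/* The simplified expression: max(-x, 0).  */
int32_t folded_form(int32_t x)
{
    int32_t n = wrap_neg(x);
    return n > 0 ? n : 0;
}
```

With wrapping semantics the two forms agree for ordinary inputs but differ at the minimum value: original_form(INT32_MIN) yields INT32_MIN because the negation wraps, while folded_form(INT32_MIN) yields 0.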

[Bug middle-end/114084] ICE: SIGSEGV: infinite recursion in fold_build2_loc / fold_binary_loc with _BitInt(127)

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114084

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:f9d2a95be5680e04f53141c2675798b06d23f409

commit r14-9174-gf9d2a95be5680e04f53141c2675798b06d23f409
Author: Jakub Jelinek 
Date:   Mon Feb 26 10:07:39 2024 +0100

fold-const: Avoid infinite recursion in +-*&|^minmax reassociation
[PR114084]

In the following testcase we infinitely recurse during BIT_IOR_EXPR
reassociation.
One operand is (unsigned _BitInt(31)) a << 4 and another operand
2147483647 >> 1 | 80 where both the right shift and the | 80
trees have TREE_CONSTANT set, but weren't folded because of delayed
folding, where some foldings are apparently done even in that case
unfortunately.
Now, the fold_binary_loc reassociation code splits both operands into
variable part, minus variable part, constant part, minus constant part,
literal part and minus literal parts, to prevent infinite recursion
punts if there are just 2 parts altogether from the 2 operands and then
goes on with reassociation, merges first the corresponding parts from both
operands and then some further merges.
The problem with the above expressions is that we get 3 different objects,
var0 (the left shift), con1 (the right shift) and lit1 (80), so the
infinite recursion prevention doesn't trigger, and we eventually merge con1
with lit1, which effectively reconstructs the original op1 and then associate
that with var0 which is original op0, and associate_trees for that case
calls fold_binary.  There are some casts involved there too (the T typedef
type and the underlying _BitInt type which are stripped with STRIP_NOPS).

The following patch attempts to prevent this infinite recursion by tracking
the origin (if certain var comes from nothing - 0, op0 - 1, op1 - 2 or both
- 3) and propagates it through all the associate_trees calls which merge the
vars.
If near the end we'd try to merge what comes solely from op0 with what
comes solely from op1 (or vice versa), the patch punts, because then it isn't
any kind of reassociation between the two operands, if anything it should be
handled when folding the suboperands.

2024-02-26  Jakub Jelinek  

PR middle-end/114084
* fold-const.cc (fold_binary_loc): Avoid the final associate_trees
if all subtrees of var0 come from one of the op0 or op1 operands
and all subtrees of con0 come from the other one.  Don't clear
variables which are never used afterwards.

* gcc.dg/bitint-94.c: New test.
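
The origin tracking described in the commit message can be sketched in isolation. This is only an illustration of the idea, not the code in fold-const.cc: the enum and both function names are hypothetical, and only the 0/1/2/3 numbering is taken from the message.

```c
/* Hypothetical origin bits, following the numbering in the commit message.  */
enum origin {
    ORIGIN_NONE = 0,   /* part is empty */
    ORIGIN_OP0  = 1,   /* part comes solely from op0 */
    ORIGIN_OP1  = 2,   /* part comes solely from op1 */
    ORIGIN_BOTH = 3    /* part mixes subtrees of both operands */
};

/* Merging two parts yields a part whose origin is the union of both.  */
int merge_origin(int a, int b)
{
    return a | b;
}

/* Punt when one side comes solely from op0 and the other solely from op1:
   combining them would just rebuild the original expression.  */
int should_punt(int lhs, int rhs)
{
    return (lhs == ORIGIN_OP0 && rhs == ORIGIN_OP1)
        || (lhs == ORIGIN_OP1 && rhs == ORIGIN_OP0);
}
```

For the testcase above, var0 would carry ORIGIN_OP0 while con1 and lit1 both carry ORIGIN_OP1; merging con1 with lit1 stays at ORIGIN_OP1, so should_punt reports that the final merge with var0 ought to be skipped.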

[Bug tree-optimization/114107] poor vectorization at -O3 when dealing with arrays of different multiplicity, good with -O2

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 Blocks||53947
  Component|target  |tree-optimization
   Last reconfirmed||2024-02-26

--- Comment #13 from Richard Biener  ---
Note that we fail to SLP vectorize this (at -O3 we unroll the inner loop):

t.c:4:20: note:   ==> examining statement: _34 = *_33;
t.c:4:20: missed:   peeling for gaps insufficient for access
t.c:5:51: missed:   not vectorized: relevant stmt not supported: _34 = *_33;
t.c:4:20: note:   removing SLP instance operations starting from: *_29 = _35;
t.c:4:20: missed:  unsupported SLP instances

which is because 'factor[i]' is treated as vector load

t.c:4:20: note:   node 0x687f730 (max_nunits=4, refcnt=2) const vector(4) double
t.c:4:20: note:   op template: _34 = *_33;
t.c:4:20: note: stmt 0 _34 = *_33;
t.c:4:20: note: stmt 1 _34 = *_33;
t.c:4:20: note: stmt 2 _34 = *_33;
t.c:4:20: note: stmt 3 _34 = *_33;
t.c:4:20: note: load permutation { 0 0 0 0 }

and we don't anticipate we can do this with a load-and-splat (I'm not sure
we'd eventually do that even).

I think we might have a duplicate bugreport for this issue.

Note with GCC 13 we refuse to SLP because

t.c:4:20: missed:   Build SLP failed: not grouped load _35 = *_34;

You can help GCC by doing

void rescale_x4(double* __restrict data, const double * __restrict factor, int n)
{
for (int i=0; i [...]

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug tree-optimization/114090] [13 Regression] forwprop -fwrapv miscompilation

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114090

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[13/14 Regression] forwprop |[13 Regression] forwprop
   |-fwrapv miscompilation  |-fwrapv miscompilation

--- Comment #8 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug middle-end/114084] ICE: SIGSEGV: infinite recursion in fold_build2_loc / fold_binary_loc with _BitInt(127)

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114084

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |RESOLVED

--- Comment #8 from Jakub Jelinek  ---
Fixed.

[Bug tree-optimization/114074] [11/12/13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r8-343

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114074

Richard Biener  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114052
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Keywords||wrong-code

--- Comment #5 from Richard Biener  ---
Possibly related to the other bug showing issues with
infer_loop_bounds_from_signedness.  OTOH:

Analyzing # of iterations of loop 1
  exit condition -2 <= [-2, + , -2](no_overflow)
  bounds on difference of bases: 0 ... 0
  result:
# of iterations 1, bounded by 1
Loop 1 iterates 1 times.
Loop 1 iterates at most 0 times.
Loop 1 likely iterates at most 0 times.
Analyzing # of iterations of loop 1
  exit condition -2 <= [-2, + , -2](no_overflow)
  bounds on difference of bases: 0 ... 0
  result:
# of iterations 1, bounded by 1
Removed pointless exit: if (_4 >= -2)

we incorrectly (looking at the IL) determine the exit will be taken in the
first iteration somehow.  Not sure where that other upper bound comes from,
but we have it zero upon entry of the pass already.

CDDCE has

Induction variable (int) -1 + 2 * iteration does not wrap in statement _1 = ~a.4_18;
 in loop 1.
Statement _1 = ~a.4_18;
 is executed at most 1 (bounded by 1) + 1 times in loop 1.
Induction variable (int) -2147480647 + -6002(OVF) * iteration does not wrap in statement _2 = _1 * 2147480647;
 in loop 1.
Statement _2 = _1 * 2147480647;
 is executed at most 0(OVF) (bounded by 0) + 1 times in loop 1.

ranges look somewhat odd (_4 starts at -4 but the merge PHI a.4_18 only at -2),
but not necessarily wrong.  So this might also be a SCEV issue computing
that odd IV for _2.
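
The arithmetic behind the two induction variables in the dump can be checked with a short sketch. It makes no claim about where the bug lies; it only recomputes the numbers, assuming a starts at 0 and decreases by 2 per iteration as the dump suggests. All function names are hypothetical.

```c
#include <stdint.h>

/* Exact value of _2 = ~a * 2147480647 on a given iteration, computed in
   64 bits so nothing overflows.  */
int64_t exact_2(int iter)
{
    int64_t a = -2 * (int64_t)iter;
    int64_t not_a = -1 - a;            /* ~a == -1 - a */
    return not_a * 2147480647;
}

/* The evolution reported in the dump: -2147480647 + -6002 * iteration.  */
int64_t chrec_2(int iter)
{
    return -2147480647LL + -6002LL * iter;
}

/* Reduce a 64-bit value modulo 2^32 and reinterpret it as a signed 32-bit
   value (modular conversion, as GCC implements it).  */
int32_t wrap32(int64_t v)
{
    return (int32_t)(uint32_t)(uint64_t)v;
}
```

The step 2 * 2147480647 does not fit in 32 bits and wraps to -6002, which is where the (OVF) step comes from. The two sequences agree only modulo 2^32: exact_2(1) is 2147480647, whereas chrec_2(1) is -2147486649, which lies outside the int range.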

[Bug ipa/61159] __builtin_constant_p gives incorrect results with aliases

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61159

--- Comment #8 from GCC Commits  ---
The master branch has been updated by Rainer Orth :

https://gcc.gnu.org/g:a25d7d1385087e0f43574064db45f1bc7d52f400

commit r14-9176-ga25d7d1385087e0f43574064db45f1bc7d52f400
Author: Rainer Orth 
Date:   Mon Feb 26 10:42:04 2024 +0100

testsuite: xfail gcc.c-torture/compile/pr61159.c on Solaris/x86 with as
[PR61159]

gcc.c-torture/compile/pr61159.c currently FAILs on 32 and 64-bit
Solaris/x86 with the native assembler:

FAIL: gcc.c-torture/compile/pr61159.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/pr61159.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/pr61159.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/pr61159.c   -O2 -flto  (test for excess errors)
FAIL: gcc.c-torture/compile/pr61159.c   -O2 -flto -flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/compile/pr61159.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr61159.c   -Os  (test for excess errors)

Excess errors:
Assembler: pr61159.c
"/var/tmp//ccRtFPva.s", line 5 : Cannot set a weak symbol to a common symbol

This is a bug/limitation in the native assembler.  Given that this
hasn't seen fixes for a long time, this patch xfails the test.

Tested on i386-pc-solaris2.11 (as and gas) and x86_64-pc-linux-gnu.

2024-02-24  Rainer Orth  

gcc/testsuite:
PR ipa/61159
* gcc.c-torture/compile/pr61159.c: xfail on Solaris/x86 with as.

[Bug ipa/61159] __builtin_constant_p gives incorrect results with aliases

2024-02-26 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61159

Rainer Orth  changed:

   What|Removed |Added

   Assignee|hubicka at gcc dot gnu.org |ro at gcc dot gnu.org
   Target Milestone|--- |14.0
URL||https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646514.html
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from Rainer Orth  ---
Test xfail'ed on Solaris/x86 with as for GCC 14.0.1.

[Bug ipa/70582] [11/12/13/14 regression] gcc.dg/attr-weakref-1.c FAILs

2024-02-26 Thread ro at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70582

Rainer Orth  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
   Assignee|hubicka at gcc dot gnu.org |ro at gcc dot gnu.org
   Target Milestone|11.5|14.0
 Resolution|--- |FIXED

--- Comment #20 from Rainer Orth  ---
Mine, fixed for GCC 14.0.1.

[Bug target/100799] Stackoverflow in optimized code on PPC

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799

--- Comment #28 from Jakub Jelinek  ---
(In reply to Peter Bergner from comment #27)
> So I looked closer at what the failure mode was in this PR (versus the one
> you're seeing with flexiblas).  As in your case, there is a mismatch in the
> number of parameters the C caller thinks there are (8 args, so no param save
> area needed) versus what the Fortran callee thinks there are (9 params which
> include the one hidden arg, so there is a param save area).  The Fortran
> function doesn't actually access the hidden argument in our test case above,
> in fact the character argument is never used either.  What I see in the rtl
> dumps is that *all* incoming args have a REG_EQUIV generated that points to
> the param save area (this doesn't happen when there are 8 or fewer formal
> params), even for the first 8 args that are passed in registers:

Yes, so it is the backend that told function.cc that there is a parameter save
area and it should be adding REG_EQUIV notes.  So, the idea would be that for
the case we talk about (<= 8 normal arguments, then only unused
DECL_HIDDEN_STRING_LENGTH ones) that the backend would also say that there is
no parameter save area, basically pretend there are <= 8 arguments.

> > Doing the workaround on the caller side is impossible, this is for calls
> > from C/C++ to Fortran code, directly or indirectly called and there is
> > nothing the compiler could use to guess that it actually calls Fortran code
> > with hidden Fortran character arguments.
> As a HUGE hammer, every caller could always allocate a param save area. 
> That would "fix" the problem from this bug, but would that also fix the bug
> you're seeing in flexiblas?

Most likely yes.  Though of course that is way too high a price to pay, even with
some non-default option.  If we can't work around it in the backend just on the
callee side of calls which have the unused hidden string length arguments, then
better to make no changes on the GCC side.

[Bug middle-end/114109] New: x264 satd vectorization vs LLVM

2024-02-26 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109

Bug ID: 114109
   Summary: x264 satd vectorization vs LLVM
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rdapp at gcc dot gnu.org
CC: juzhe.zhong at rivai dot ai, law at gcc dot gnu.org
  Target Milestone: ---
Target: x86_64-*-* riscv*-*-*

Looking at the following code of x264 (SPEC 2017):

typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;

static inline uint32_t abs2 (uint32_t a)
{
uint32_t s = ((a >> 15) & 0x10001) * 0xffff;
return (a + s) ^ s;
}

int x264_pixel_satd_8x4 (uint8_t *pix1, int i_pix1, uint8_t *pix2, int i_pix2)
{
uint32_t tmp[4][4];
uint32_t a0, a1, a2, a3;
int sum = 0;

for( int i = 0; i < 4; i++, pix1 += i_pix1, pix2 += i_pix2 )
{
a0 = (pix1[0] - pix2[0]) + ((pix1[4] - pix2[4]) << 16);
a1 = (pix1[1] - pix2[1]) + ((pix1[5] - pix2[5]) << 16);
a2 = (pix1[2] - pix2[2]) + ((pix1[6] - pix2[6]) << 16);
a3 = (pix1[3] - pix2[3]) + ((pix1[7] - pix2[7]) << 16);
{
  int t0 = a0 + a1;
  int t1 = a0 - a1;
  int t2 = a2 + a3;
  int t3 = a2 - a3;
  tmp[i][0] = t0 + t2;
  tmp[i][1] = t1 + t3;
  tmp[i][2] = t0 - t2;
  tmp[i][3] = t1 - t3;
};
}
for( int i = 0; i < 4; i++ )
{
{ int t0 = tmp[0][i] + tmp[1][i];
  int t1 = tmp[0][i] - tmp[1][i];
  int t2 = tmp[2][i] + tmp[3][i];
  int t3 = tmp[2][i] - tmp[3][i];
  a0 = t0 + t2;
  a2 = t0 - t2;
  a1 = t1 + t3;
  a3 = t1 - t3;
};
sum += abs2 (a0) + abs2 (a1) + abs2 (a2) + abs2 (a3);
}
return (((uint16_t) sum) + ((uint32_t) sum >> 16)) >> 1;
}

I first checked on riscv but x86 and aarch64 are pretty similar.  (See
https://godbolt.org/z/vzf5ha44r, which compares the two at -O3 -mavx512f.)

Vectorizing the first loop seems to be a costing issue.  By default we don't
vectorize and the code becomes much larger when disabling vector costing, so
the costing decision in itself seems correct.
Clang's version is significantly shorter and it looks like it just directly
vec_sets/vec_inits the individual elements.  On riscv it can be handled rather
elegantly with strided loads that we don't emit right now.
As there are only 4 active vector elements and the loop is likely load bound it
might be debatable whether LLVM's version is better?

The second loop we do vectorize (4 elements at a time) but end up with e.g.
four XORs for the four inlined abs2 calls while clang chooses a larger
vectorization factor and does all the xors in one.

On my laptop (no avx512) I don't see a huge difference (113s GCC vs 108s LLVM)
but I guess the general case is still interesting?
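
For readers unfamiliar with abs2: it takes the absolute value of two signed 16-bit lanes packed into one 32-bit word, which is why a single XOR per word suffices. The sketch below assumes the multiplier is 0xffff, as in upstream x264; pack2 is a hypothetical helper that mirrors how the first loop builds a0..a3.

```c
#include <stdint.h>

/* Absolute value of two signed 16-bit lanes packed in one 32-bit word.
   s holds 0xffff in every lane whose sign bit is set, 0 elsewhere.  */
uint32_t abs2(uint32_t a)
{
    uint32_t s = ((a >> 15) & 0x10001) * 0xffff;
    return (a + s) ^ s;
}

/* Pack two small signed values as lo + (hi << 16), computed in unsigned
   arithmetic so that negative inputs are well defined.  */
uint32_t pack2(int lo, int hi)
{
    return (uint32_t)lo + ((uint32_t)hi << 16);
}
```

For example, packing lo = -3 and hi = 5 and passing the result through abs2 gives 0x00050003, i.e. 3 in the low lane and 5 in the high lane.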

[Bug target/114097] Missed register optimization in _Noreturn functions

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097

--- Comment #6 from GCC Commits  ---
The master branch has been updated by H.J. Lu :

https://gcc.gnu.org/g:bb98f71bac8aace4e685e648a81dfaf365123833

commit r14-9178-gbb98f71bac8aace4e685e648a81dfaf365123833
Author: H.J. Lu 
Date:   Sun Feb 25 13:14:39 2024 -0800

x86: Check interrupt instead of noreturn attribute

ix86_set_func_type checks noreturn attribute to avoid incompatible
attribute error in LTO1 on interrupt functions.  Since TREE_THIS_VOLATILE
is set also for _Noreturn without noreturn attribute, check interrupt
attribute for interrupt functions instead.

gcc/

PR target/114097
* config/i386/i386-options.cc (ix86_set_func_type): Check
interrupt instead of noreturn attribute.

gcc/testsuite/

PR target/114097
* gcc.target/i386/pr114097-1.c: New test.

[Bug ipa/113996] [11/12/13/14 Regression] ICE with LTO at -O2 and above with some Ada code

2024-02-26 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113996

Eric Botcazou  changed:

   What|Removed |Added

 CC||ebotcazou at gcc dot gnu.org
   Last reconfirmed||2024-02-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #3 from Eric Botcazou  ---
A good reason not to compile with -gnatp. ;-)  It's this assertion:

  /* Initialize the static chain.  */
  p = DECL_STRUCT_FUNCTION (fn)->static_chain_decl;
  gcc_assert (fn != current_function_decl);
  if (p)
{
  /* No static chain?  Seems like a bug in tree-nested.cc.  */
  gcc_assert (static_chain);

  setup_one_parameter (id, p, static_chain, fn, bb, &vars);
}

[Bug tree-optimization/114099] [14 regression] ICE in find_uses_to_rename_use when building darktable-4.6.1

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #9 from Richard Biener  ---
Let me handle this as well.

[Bug middle-end/114109] x264 satd vectorization vs LLVM

2024-02-26 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109

--- Comment #1 from JuzheZhong  ---
It seems RISC-V Clang didn't vectorize it?

https://godbolt.org/z/G4han6vM3

[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)

2024-02-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103

--- Comment #1 from Jonathan Wakely  ---
We define them as:

#ifdef __cpp_lib_atomic_lock_free_type_aliases
# ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
  using atomic_signed_lock_free
= atomic<make_signed_t<__detail::__platform_wait_t>>;
  using atomic_unsigned_lock_free
= atomic<make_unsigned_t<__detail::__platform_wait_t>>;
# elif ATOMIC_INT_LOCK_FREE || !(ATOMIC_LONG_LOCK_FREE || ATOMIC_CHAR_LOCK_FREE)
  using atomic_signed_lock_free = atomic<signed int>;
  using atomic_unsigned_lock_free = atomic<unsigned int>;
# elif ATOMIC_LONG_LOCK_FREE
  using atomic_signed_lock_free = atomic<signed long>;
  using atomic_unsigned_lock_free = atomic<unsigned long>;
# elif ATOMIC_CHAR_LOCK_FREE
  using atomic_signed_lock_free = atomic<signed char>;
  using atomic_unsigned_lock_free = atomic<unsigned char>;
# else
# error "libstdc++ bug: no lock-free atomics but they were emitted in <version>"
# endif
#endif


And then test them with:

static_assert( std::atomic_signed_lock_free::is_always_lock_free );
static_assert( std::atomic_unsigned_lock_free::is_always_lock_free );


I assume the problem is that the ATOMIC_xxx_LOCK_FREE macros have value 1 not
2, so they're not unconditionally lock-free.

Are any of the atomic integer types always lock-free for this target?
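
The difference between the values 1 and 2 can be illustrated with the C11 <stdatomic.h> macros, which use the same 0/1/2 encoding as the C++ ones. This is only a sketch; both function names are hypothetical.

```c
#include <stdatomic.h>

/* Meaning of an ATOMIC_*_LOCK_FREE value:
   0 = never lock-free, 1 = sometimes lock-free, 2 = always lock-free.  */
const char *lock_free_kind(int value)
{
    if (value == 2)
        return "always";
    if (value == 1)
        return "sometimes";
    return "never";
}

/* A plain "#if ATOMIC_INT_LOCK_FREE" is true for both 1 and 2; only a
   comparison against 2 guarantees unconditional lock freedom.  */
int int_is_always_lock_free(void)
{
    return ATOMIC_INT_LOCK_FREE == 2;
}
```

On a target where ATOMIC_INT_LOCK_FREE is 1, the #elif branch above would still be taken, yet a static_assert on is_always_lock_free would presumably fail, which matches the failure described here.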

[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)

2024-02-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103

--- Comment #2 from Jonathan Wakely  ---
Created attachment 57539
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57539&action=edit
make lock-free aliases actually check for lock freedom

Maybe we want to do this.

[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)

2024-02-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103

Jonathan Wakely  changed:

   What|Removed |Added

  Attachment #57539|0   |1
is obsolete||

--- Comment #3 from Jonathan Wakely  ---
Created attachment 57540
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57540&action=edit
make lock-free aliases actually check for lock freedom

Oops, let's try that again without unrelated changes in the patch.

[Bug ada/114065] gnat build with -D_TIME_BITS=64 -D_FILE_OFFSET_BITS=64 fails on 32bit archs

2024-02-26 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114065

Eric Botcazou  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-26
 CC||ebotcazou at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Eric Botcazou  ---
> however that's not the correct fix. Is there any way to fix this in a better
> way?

s-parame__posix2008.ads already has the 64-bit time_t so you just need to tweak
Makefile.rtl.

[Bug target/114097] Missed register optimization in _Noreturn functions

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097

H.J. Lu  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #7 from H.J. Lu  ---
Fixed.

[Bug middle-end/114109] x264 satd vectorization vs LLVM

2024-02-26 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109

--- Comment #2 from Robin Dapp  ---
It is vectorized with a higher zvl, e.g. zvl512b, refer
https://godbolt.org/z/vbfjYn5Kd.

[Bug middle-end/114109] x264 satd vectorization vs LLVM

2024-02-26 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109

--- Comment #3 from JuzheZhong  ---
(In reply to Robin Dapp from comment #2)
> It is vectorized with a higher zvl, e.g. zvl512b, refer
> https://godbolt.org/z/vbfjYn5Kd.

OK, I see. But Clang generates many slide instructions, which are expensive on
real hardware.

vluxei64 is also expensive.

I am not sure which is better. It should be tested on real RISC-V hardware to
evaluate performance, rather than judged only by SPIKE/QEMU dynamic
instruction counts.

[Bug middle-end/114109] x264 satd vectorization vs LLVM

2024-02-26 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109

--- Comment #4 from Robin Dapp  ---
Yes, as mentioned, vectorization of the first loop is debatable.

[Bug c++/114110] New: unhelpful message about non-movable types

2024-02-26 Thread f.heckenbach--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114110

Bug ID: 114110
   Summary: unhelpful message about non-movable types
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: f.heckenb...@fh-soft.de
  Target Milestone: ---

Here's another gcc output that's less than helpful:

% cat test.cpp
#include <utility>
#include <memory>

struct S
{
  std::unique_ptr <int> a, b, c, d, e, f, g, h, i, j;
  ~S () = default;
};

int main ()
{
  S a, b = std::move (a);
}
% g++ test.cpp
test.cpp: In function 'int main()':
test.cpp:12:24: error: use of deleted function 'S::S(const S&)'
   12 |   S a, b = std::move (a);
  |^
test.cpp:4:8: note: 'S::S(const S&)' is implicitly deleted because the default
definition would be ill-formed:
4 | struct S
  |^
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
In file included from /usr/include/c++/12/memory:76,
 from test.cpp:2:
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~
test.cpp:4:8: error: use of deleted function 'std::unique_ptr<_Tp,
_Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp =
std::default_delete<int>]'
4 | struct S
  |^
/usr/include/c++/12/bits/unique_ptr.h:514:7: note: declared here
  514 |   unique_ptr(const unique_ptr&) = delete;
  |   ^~

gcc explains in great detail (and not very readably, cf.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113854#c4, esp. the bottom part),
repeating the same message for each member, why the type is not copyable. Well,
I know it's not copyable, that's why I'm trying to move it. Sure, a
copy constructor can be used when there is no usable move constructor, but if
neither the copy nor the move constructor is usable, it's quite misleading to
only explain why the copy constructor is not usable when moving was requested
in the first place.

The actual problem here is that the move

[Bug middle-end/114111] New: [avr] Expensive code instead of conditional branch.

2024-02-26 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114111

Bug ID: 114111
   Summary: [avr] Expensive code instead of conditional branch.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57541
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57541&action=edit
addcc.c: C test case

Compile the code with avr-gcc -S -Os -dp:

int add_ge0 (int x, char c) {
return x + (c >= 0);
}

int add_eq0 (int x, char c) {
return x + (c == 0);
}

int add_le0 (int x, char c) {
return x + (c <= 0);
}

int add_ge1 (int x, char c) {
return x + (c >= 1);
}

int add_ltm3 (int x, char c) {
return x + (c < -3);
}

int add_bit6 (int x, char c) {
return x + !!(c & (1 << 6));
}

int add_nbit6 (int x, char c) {
return x + !(c & (1 << 6));
}

All these could be performed by a test and the addition of x in an if-block. 
But what the compiler does is to extend the 8-bit value c to 16 bit, then
complement it, then shift the MSB to the LSB:

add_ge0:
mov __tmp_reg__,r22  ;  23  [c=12 l=3]  *extendqihi2/0
lsl r0  
sbc r23,r23
com r22  ;  24  [c=8 l=2]  *one_cmplhi2
com r23
bst r23,7;  31  [c=16 l=4]  *lshrhi3_const/3
clr r22
clr r23
bld r22,0
add r24,r22  ;  26  [c=8 l=2]  *addhi3/0
adc r25,r23
ret  ;  29  [c=0 l=1]  return

Even when it does a conditional to set the addend, it should rather have the
addition in the if-block (and moving x to R18 adds even more bloat):

add_eq0:
mov r18,r24  ;  44  [c=4 l=1]  movqi_insn/0
mov r19,r25  ;  45  [c=4 l=1]  movqi_insn/0
ldi r24,lo8(1)   ;  46  [c=4 l=2]  *movhi/4
ldi r25,0   
cp r22, __zero_reg__ ;  47  [c=4 l=1]  cmpqi3/0
breq .L3 ;  48  [c=4 l=1]  branch
ldi r24,0;  43  [c=4 l=2]  *movhi/1
ldi r25,0   
.L3:
add r24,r18  ;  42  [c=8 l=2]  *addhi3/0
adc r25,r19
ret  ;  51  [c=0 l=1]  return

...
.ident  "GCC: (GNU) 14.0.1 20240212 (experimental)"

With avr-gcc 3.4.6 from around 2006, the generated code is as follows:

add_ge0:
sbrs r22,7   ;  38  *sbrx_branch[length = 2]
adiw r24,1   ;  15  *addhi3/2   [length = 1]
.L2:
ret  ;  37  return  [length = 1]

add_eq0:
tst r22  ;  13  tstqi   [length = 1]
brne .L4 ;  14  branch  [length = 1]
adiw r24,1   ;  15  *addhi3/2   [length = 1]
.L4:
ret  ;  35  return  [length = 1]

etc.  So at some point in time GCC lost all that smartness.

Appears to be around emit_store_flag and friends; as far as I can see it doesn't
even try to work out costs.
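
For reference, a minimal C++ sketch (not taken from the attachment) of the two source-level shapes being contrasted: the flag form used in the test case and the branch form that the old sbrs/adiw output corresponds to. The function names and the use of signed char instead of plain char are assumptions made for illustration. The static_asserts only show that both forms compute the same value; they say nothing about what avr-gcc emits for either.

```cpp
#include <climits>

// Flag form from the report: the comparison result (0 or 1) is added to x.
constexpr int add_ge0_flag(int x, signed char c) {
    return x + (c >= 0);
}

// Branch form: test c and increment x only inside the if-block.
constexpr int add_ge0_branch(int x, signed char c) {
    if (c >= 0)
        x += 1;
    return x;
}

// Compare both forms over every signed char value for a given x.
constexpr bool forms_agree(int x) {
    for (int c = SCHAR_MIN; c <= SCHAR_MAX; ++c) {
        const signed char sc = static_cast<signed char>(c);
        if (add_ge0_flag(x, sc) != add_ge0_branch(x, sc))
            return false;
    }
    return true;
}

static_assert(forms_agree(0), "flag and branch forms must agree");
static_assert(forms_agree(1234), "flag and branch forms must agree");
```

Since the two forms are equivalent, which one is cheaper is purely a question of target costs, which is the point of the report.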

[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2

2024-02-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |segher at gcc dot gnu.org

--- Comment #6 from Kewen Lin  ---
Segher will clean up this rs6000-*-* thing in next release, please use
powerpc*-*-* instead.

[Bug tree-optimization/114074] [11/12/13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r8-343

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114074

--- Comment #6 from Richard Biener  ---
   [local count: 1014686025]:
  # a.4_18 = PHI <_4(8), 0(2)>
  b = 2147480647;
  _1 = ~a.4_18;
  _2 = _1 * 2147480647;
  a = _2;
  foo ();
  a.2_3 = a;
  if (a.2_3 == 0)
goto ; [5.50%]
  else
goto ; [94.50%]

   [local count: 958878295]:
  _4 = a.4_18 + -2;
  a = _4;
  if (_4 >= -2)
goto ; [94.50%]
  else
goto ; [5.50%]

   [local count: 906139989]:
  goto ; [100.00%]

and we get

(set_scalar_evolution
  instantiated_below = 2
  (scalar = _1)
  (scalar_evolution = {-1, +, 2}_1))
)

this is ~{0, +, -2}_1 which I think we handle as -1 - X

And we get

(set_scalar_evolution
  instantiated_below = 2
  (scalar = _2)
  (scalar_evolution = {-2147480647, +, -6002(OVF)}_1))
)

and that's wrong, the 2nd iteration _2 should be 1 * 2147480647 but indeed
the difference isn't representable in the signed integer increment of the
CHREC.

It's probably safest to go chrec_dont_know here; the alternative would
probably be (int){-2147480647u, +, -6002u}_1, which likely doesn't help much
in practice?
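
The arithmetic behind the (OVF) step can be checked with a short C++ sketch. It is only an illustration of the numbers quoted above, not of how SCEV computes them; the helper names are hypothetical, and everything is evaluated in 64 bits so that the sketch itself never overflows.

```cpp
#include <cstdint>

constexpr std::int64_t kFactor = 2147480647;   // the multiplier in the dump
constexpr std::int64_t kTwo32 = 4294967296;    // 2^32

// _1 evolves as {-1, +, 2}_1, so iteration i (0-based) sees -1 + 2*i.
constexpr std::int64_t scalar1(std::int64_t i) { return -1 + 2 * i; }

// _2 = _1 * 2147480647, evaluated exactly in 64 bits.
constexpr std::int64_t scalar2(std::int64_t i) { return scalar1(i) * kFactor; }

// Reduce an exact value to what a 32-bit two's complement int would hold.
constexpr std::int64_t wrap_to_int32(std::int64_t v) {
    std::int64_t m = v % kTwo32;             // remainder in (-2^32, 2^32)
    if (m < 0) m += kTwo32;                  // now in [0, 2^32)
    return m >= kTwo32 / 2 ? m - kTwo32 : m;
}

// Exact step between the first two iterations of _2.
constexpr std::int64_t kExactStep = scalar2(1) - scalar2(0);

static_assert(scalar2(0) == -2147480647, "initial value of the CHREC");
static_assert(scalar2(1) == 2147480647, "value in the 2nd iteration");
static_assert(kExactStep > INT32_MAX, "exact step is not a signed 32-bit value");
static_assert(wrap_to_int32(kExactStep) == -6002, "wrapped step matches the dump");
```

Both values of _2 fit in a signed 32-bit int, but the step between them only exists modulo 2^32, which is why a signed {-2147480647, +, -6002}_1 misdescribes the sequence.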

[Bug tree-optimization/114074] [11/12/13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r8-343

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114074

--- Comment #7 from Richard Biener  ---
In fact SCEV does, in chrec_fold_multiply for a mixed multiplication:

  return build_polynomial_chrec
(CHREC_VARIABLE (op0),
 chrec_fold_multiply (type, CHREC_LEFT (op0), op1),
 chrec_fold_multiply (type, CHREC_RIGHT (op0), op1));

but that's (a + b) * c -> a * c + b * c which does not preserve overflow
behavior and thus isn't valid when the result is a signed CHREC with
undefined behavior on overflow and 'a + b' evaluates to 0, 1 or -1 as is
the case here with a == -1 and b == 2.  In the SCEV case it's OK to
do CHREC_LEFT * op1 but CHREC_RIGHT * op1 may not be evaluated this way.

fold-const.cc:fold_plusminus_mult deals with this by doing the multiplications
and addition in unsigned.

The exception we can definitely make is when for {a, +, b}_1 a and b have
the same sign or a is zero.  I'm not sure we can generally handle any
other case - we are already special-casing c == 0 and c == 1.
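
A small C++ sketch of the unsigned trick mentioned above, using the values from this PR (a == -1, b == 2, c == 2147480647). This is an illustration of the idea only and not the fold_plusminus_mult code; the function names are hypothetical and 32-bit int is assumed.

```cpp
#include <cstdint>

// (a + b) * c evaluated directly; with a + b == 1 this cannot overflow.
constexpr std::int32_t direct(std::int32_t a, std::int32_t b, std::int32_t c) {
    return (a + b) * c;
}

// a * c + b * c with every operation carried out in uint32_t, where
// wraparound is well defined, and converted back only at the end.
constexpr std::int32_t distributed_unsigned(std::int32_t a, std::int32_t b,
                                            std::int32_t c) {
    const std::uint32_t ua = static_cast<std::uint32_t>(a);
    const std::uint32_t ub = static_cast<std::uint32_t>(b);
    const std::uint32_t uc = static_cast<std::uint32_t>(c);
    const std::uint32_t sum = ua * uc + ub * uc;
    return static_cast<std::int32_t>(sum);
}

// The partial product b * c on its own is outside the signed 32-bit range,
// so evaluating it in signed arithmetic would be undefined behavior.
static_assert(std::int64_t{2} * 2147480647 > INT32_MAX,
              "b * c overflows a signed 32-bit int");

static_assert(direct(-1, 2, 2147480647) == 2147480647, "direct result");
static_assert(distributed_unsigned(-1, 2, 2147480647) ==
                  direct(-1, 2, 2147480647),
              "unsigned distribution gives the same final value");
```

The final results agree, but only because the intermediate wraparound happens in an unsigned type; the same distribution in a signed CHREC has no such guarantee.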

[Bug tree-optimization/114086] Boolean switches could have a lot better codegen, possibly utilizing bit-vectors

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114086

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |amacleod at redhat dot com

--- Comment #8 from Jakub Jelinek  ---
Unfortunately doing the ((682 >> x) & 1) to x & 1 optimization in match.pd
isn't possible; we can only use global ranges there and we need a path-specific
range here.
Can it be done in the VRP pass?  Though, I'm afraid I'm quite lost where it
actually has the statement optimizations (rather than mere computing of
ranges).  Aldy/Andrew, any hints?  I mean like what old tree-vrp.c was doing in
simplify_stmt_using_ranges.
Guess we could duplicate that in match.pd for the cases which can use the
global range or don't need any range at all.
I mean
unsigned int
foo (int x)
{
  return (0xU >> x) & 1;
}

unsigned int
bar (int x)
{
  return (0xU >> x) & 1;
}

unsigned int
baz (int x)
{
  if (x >= 22) __builtin_unreachable ();
  return (0x5aU >> x) & 1;
}
can be optimized even with global ranges (or the first one with no ranges).
foo and baz are equivalent to x & 1, while bar is (~x) & 1 or (x & 1) ^ 1;
dunno which is more canonical.
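
To illustrate why the range matters, here is a minimal C++ sketch that checks the bit-vector identity by brute force. The helper names are hypothetical, 32-bit unsigned is assumed, and the masks 0xaaaaaaaaU and 0x55555555U are constants picked for this illustration (not copied from the functions above): one has every odd bit set, the other every even bit.

```cpp
// True if ((mask >> x) & 1) == (x & 1) for every x in [0, limit).
constexpr bool is_parity(unsigned mask, int limit) {
    for (int x = 0; x < limit; ++x)
        if (((mask >> x) & 1u) != (static_cast<unsigned>(x) & 1u))
            return false;
    return true;
}

// True if ((mask >> x) & 1) == ((x & 1) ^ 1) for every x in [0, limit).
constexpr bool is_inverted_parity(unsigned mask, int limit) {
    for (int x = 0; x < limit; ++x)
        if (((mask >> x) & 1u) != ((static_cast<unsigned>(x) & 1u) ^ 1u))
            return false;
    return true;
}

// 682 == 0x2aa has bits 1, 3, 5, 7 and 9 set, so the identity holds only
// while x is known to be below 10; x == 11 already breaks it.
static_assert(is_parity(682u, 10), "682 acts as x & 1 for x in [0, 10)");
static_assert(!is_parity(682u, 12), "but not once x can reach 11");

// Masks with every odd (or every even) bit set need no range at all
// beyond the shift count being valid.
static_assert(is_parity(0xaaaaaaaau, 32), "all odd bits set: x & 1");
static_assert(is_inverted_parity(0x55555555u, 32), "all even bits set: (x & 1) ^ 1");
```

The 682 case is the one that needs a path-specific range: the identity is true only because the switch guarantees x < 10 on that path.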

[Bug c++/114110] unhelpful message about non-movable types

2024-02-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114110

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |diagnostic
   Last reconfirmed|                            |2024-02-26

--- Comment #1 from Jonathan Wakely  ---
(In reply to Frank Heckenbach from comment #0)
> The actual problem here is that the move-constructor was deleted because a

No, the move constructor is simply *not declared*. If it had been deleted then
it would have been selected by overload resolution as a better match and you'd
get an error telling you the move constructor was deleted.

The distinction is important (and frequently misunderstood).

> destructor was declared, but gcc does not mention this at all (instead of,
> or at least in addition to, why the copy-constructor was deleted).

That would be better.

If move construction fails, it's unlikely that the type is
copyable-but-not-movable, because such types are stupid and should not exist.
So then the question is whether it's supposed to be completely non-copyable and
non-movable, or whether the class has a bug that we can help to diagnose.

The right heuristic is probably:

If initializing T from a T rvalue is ill-formed, check whether an implicit move
constructor was suppressed. If it was, print a fix-it suggesting:

  S(S&&) = default;

Reduced:

struct MO {
  MO(MO&&) { }
};
struct S {
  ~S() = default;
  MO m;
};
S&& f();
S s = f();

This prints:

mo.cc:7:9: error: use of deleted function ‘S::S(const S&)’
7 | S s = f();
  | ^
mo.cc:2:8: note: ‘S::S(const S&)’ is implicitly deleted because the default
definition would be ill-formed:
2 | struct S {
  |^
mo.cc:2:8: error: use of deleted function ‘constexpr MO::MO(const MO&)’
mo.cc:1:8: note: ‘constexpr MO::MO(const MO&)’ is implicitly declared as
deleted because ‘MO’ declares a move constructor or move assignment operator
1 | struct MO {
  |^~


I think it would be better to print:

mo.cc:7:9: error: use of deleted function ‘S::S(const S&)’
7 | S s = f();
  | ^
mo.cc:5:4: note: ‘S::S(S&&)’ is not implicitly declared because 'S' declares a
destructor:
2 |   ~S() = default;
  |   ^~
mo.cc:5:4: note: add a user-declared move constructor to fix this:
   ~S() = default;
   ^
   S(S&&) = default;

mo.cc:2:8: note: ‘S::S(const S&)’ is implicitly deleted because the default
definition would be ill-formed:
2 | struct S {
  |^
mo.cc:2:8: error: use of deleted function ‘constexpr MO::MO(const MO&)’
mo.cc:1:8: note: ‘constexpr MO::MO(const MO&)’ is implicitly declared as
deleted because ‘MO’ declares a move constructor or move assignment operator
1 | struct MO {
  |^~


N.B. I don't think showing the locations "struct S {" and "struct MO {" for the
implicitly deleted copy constructors is useful, but I think I've said that in
another bug report somewhere.
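
The "not declared" versus "deleted" distinction can be observed with type traits. The following is a minimal C++ sketch; the struct names other than MO are hypothetical and chosen only for this illustration.

```cpp
#include <type_traits>

struct MO { MO(MO&&) { } };          // move-only member, as in the reduced test

// User-declared destructor: the implicit move constructor is not declared,
// so an rvalue falls back to the copy constructor, which is deleted here
// because MO cannot be copied.
struct Suppressed {
    ~Suppressed() = default;
    MO m;
};

// Same class with the suggested fix-it applied.
struct Fixed {
    ~Fixed() = default;
    Fixed(Fixed&&) = default;
    MO m;
};

// No move constructor declared, but the copy constructor is usable, so
// construction from an rvalue silently copies.
struct CopyFallback {
    ~CopyFallback() = default;
    int value;
};

// Move constructor explicitly deleted: overload resolution selects it for
// rvalues and fails, even though copying would have worked.
struct DeletedMove {
    DeletedMove(const DeletedMove&) = default;
    DeletedMove(DeletedMove&&) = delete;
};

static_assert(!std::is_move_constructible<Suppressed>::value,
              "falls back to the deleted copy constructor");
static_assert(std::is_move_constructible<Fixed>::value,
              "defaulted move constructor works");
static_assert(std::is_move_constructible<CopyFallback>::value,
              "undeclared move constructor: the copy constructor is used");
static_assert(std::is_copy_constructible<DeletedMove>::value &&
                  !std::is_move_constructible<DeletedMove>::value,
              "deleted move constructor wins overload resolution");
```

CopyFallback and DeletedMove show why the wording matters: an undeclared move constructor lets rvalues reach the copy constructor, while a deleted one does not.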

[Bug ada/113893] finalization of object allocated by anonymous access type designating local type

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113893

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Eric Botcazou :

https://gcc.gnu.org/g:39c07c5a3bf4a865175727bf60d5758372543b87

commit r14-9179-g39c07c5a3bf4a865175727bf60d5758372543b87
Author: Eric Botcazou 
Date:   Mon Feb 26 13:13:34 2024 +0100

Finalization of object allocated by anonymous access designating local type

The finalization of objects dynamically allocated through an anonymous
access type is deferred to the enclosing library unit in the current
implementation and a warning is given on each of them.

However this cannot be done if the designated type is local, because this
would generate dangling references to the local finalization routine, so
the finalization needs to be dropped in this case and the warning adjusted.

gcc/ada/
PR ada/113893
* exp_ch7.adb (Build_Anonymous_Master): Do not build the master
for a local designated type.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Force Needs_Fin
to false if no finalization master is attached to an access type
and assert that it is anonymous in this case.
* sem_res.adb (Resolve_Allocator): Mention that the object might
not be finalized at all in the warning given when the type is an
anonymous access-to-controlled type.

gcc/testsuite/
* gnat.dg/access10.adb: New test.

[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)

2024-02-26 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103

--- Comment #4 from dave.anglin at bell dot net ---
On 2024-02-26 5:54 a.m., redi at gcc dot gnu.org wrote:
> I assume the problem is that the ATOMIC_xxx_LOCK_FREE macros have value 1 not
> 2, so they're not unconditionally lock-free.
>
> Are any of the atomic integer types always lock-free for this target?
No.  The only "lock free" operations are load and clear word/double word.

On Linux, we fudge the support in the kernel where we can disable interrupts,
but the operation can still spin.
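
For readers unfamiliar with the macro values discussed here, a minimal C++17 sketch follows. The enum and helper names are hypothetical; ATOMIC_INT_LOCK_FREE and std::atomic<int>::is_always_lock_free are the standard facilities, and the sketch assumes the library keeps the two consistent, as the standard says it should.

```cpp
#include <atomic>

// The ATOMIC_xxx_LOCK_FREE macros take one of three values.
enum class LockFreedom { Never = 0, Sometimes = 1, Always = 2 };

constexpr LockFreedom classify(int macro_value) {
    return macro_value == 2 ? LockFreedom::Always
         : macro_value == 1 ? LockFreedom::Sometimes
                            : LockFreedom::Never;
}

// What this particular target reports for int.
constexpr LockFreedom kIntLockFreedom = classify(ATOMIC_INT_LOCK_FREE);

// is_always_lock_free is expected to be true exactly when the macro is 2.
static_assert(std::atomic<int>::is_always_lock_free ==
                  (kIntLockFreedom == LockFreedom::Always),
              "trait and macro should agree");
```

On a target where every integer macro is 1, no integer type classifies as Always, which is the situation described for this target.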

[Bug ada/113893] finalization of object allocated by anonymous access type designating local type

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113893

--- Comment #5 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Eric Botcazou:

https://gcc.gnu.org/g:88661078eac2440fbc2cd5b32ee31cec93f84d08

commit r13-8363-g88661078eac2440fbc2cd5b32ee31cec93f84d08
Author: Eric Botcazou 
Date:   Mon Feb 26 13:13:34 2024 +0100

Finalization of object allocated by anonymous access designating local type

The finalization of objects dynamically allocated through an anonymous
access type is deferred to the enclosing library unit in the current
implementation and a warning is given on each of them.

However this cannot be done if the designated type is local, because this
would generate dangling references to the local finalization routine, so
the finalization needs to be dropped in this case and the warning adjusted.

gcc/ada/
PR ada/113893
* exp_ch7.adb (Build_Anonymous_Master): Do not build the master
for a local designated type.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Force Needs_Fin
to false if no finalization master is attached to an access type
and assert that it is anonymous in this case.
* sem_res.adb (Resolve_Allocator): Mention that the object might
not be finalized at all in the warning given when the type is an
anonymous access-to-controlled type.

gcc/testsuite/
* gnat.dg/access10.adb: New test.

[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)

2024-02-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103

--- Comment #5 from Jonathan Wakely  ---
OK then I think we don't want these aliases to be defined at all (which means
we cannot be fully C++20 conformant) and the test should be xfailed or skipped.

[Bug ada/113893] finalization of object allocated by anonymous access type designating local type

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113893

--- Comment #6 from GCC Commits  ---
The releases/gcc-12 branch has been updated by Eric Botcazou:

https://gcc.gnu.org/g:1a915f6ab52eff19eb3c890a127c6693c8ce4f65

commit r12-10178-g1a915f6ab52eff19eb3c890a127c6693c8ce4f65
Author: Eric Botcazou 
Date:   Mon Feb 26 13:13:34 2024 +0100

Finalization of object allocated by anonymous access designating local type

The finalization of objects dynamically allocated through an anonymous
access type is deferred to the enclosing library unit in the current
implementation and a warning is given on each of them.

However this cannot be done if the designated type is local, because this
would generate dangling references to the local finalization routine, so
the finalization needs to be dropped in this case and the warning adjusted.

gcc/ada/
PR ada/113893
* exp_ch7.adb (Build_Anonymous_Master): Do not build the master
for a local designated type.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Force Needs_Fin
to false if no finalization master is attached to an access type
and assert that it is anonymous in this case.
* sem_res.adb (Resolve_Allocator): Mention that the object might
not be finalized at all in the warning given when the type is an
anonymous access-to-controlled type.

gcc/testsuite/
* gnat.dg/access10.adb: New test.

[Bug ada/113893] finalization of object allocated by anonymous access type designating local type

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113893

--- Comment #7 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Eric Botcazou:

https://gcc.gnu.org/g:bbf799a972201e82f54ee17dc3bf7a093a98077a

commit r11-11255-gbbf799a972201e82f54ee17dc3bf7a093a98077a
Author: Eric Botcazou 
Date:   Mon Feb 26 13:13:34 2024 +0100

Finalization of object allocated by anonymous access designating local type

The finalization of objects dynamically allocated through an anonymous
access type is deferred to the enclosing library unit in the current
implementation and a warning is given on each of them.

However this cannot be done if the designated type is local, because this
would generate dangling references to the local finalization routine, so
the finalization needs to be dropped in this case and the warning adjusted.

gcc/ada/
PR ada/113893
* exp_ch7.adb (Build_Anonymous_Master): Do not build the master
for a local designated type.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Force Needs_Fin
to false if no finalization master is attached to an access type
and assert that it is anonymous in this case.
* sem_res.adb (Resolve_Allocator): Mention that the object might
not be finalized at all in the warning given when the type is an
anonymous access-to-controlled type.

gcc/testsuite/
* gnat.dg/access10.adb: New test.

[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)

2024-02-26 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103

--- Comment #6 from dave.anglin at bell dot net ---
On 2024-02-26 7:22 a.m., redi at gcc dot gnu.org wrote:
> OK then I think we don't want these aliases to be defined at all (which means
> we cannot be fully C++20 conformant) and the test should be xfailed or 
> skipped.
That's what I was thinking.

[Bug ada/113893] finalization of object allocated by anonymous access type designating local type

2024-02-26 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113893

Eric Botcazou  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.5
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #8 from Eric Botcazou  ---
This should run to completion now.

[Bug c/114112] New: Error message is translatable but inserts untranslated substring

2024-02-26 Thread goeran at uddeborg dot se via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114112

Bug ID: 114112
   Summary: Error message is translatable but inserts untranslated
substring
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: goeran at uddeborg dot se
  Target Milestone: ---

At
https://gcc.gnu.org/git?p=gcc.git;a=blob;f=gcc/c-family/c-omp.cc;h=5117022e330c95592d7731eec161ab1b5c6925d9;hb=HEAD#l1810
the function check_loop_binding_expr emits an error message where it inserts a
"context". This "context" comes from the call and is sent as a string not
available for translation. Even if those inserted strings were marked for
translation, it is in general a bad idea to compose a message from smaller
strings in that way if they are to be correctly translated.

(See
https://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html if
one wants a further discussion around this.)
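
A minimal C++ sketch of the alternative: one complete message per context instead of a sentence assembled around an inserted fragment. Everything here is hypothetical; tr() merely stands in for gettext's _() marker, and the message texts are invented for the illustration rather than taken from c-omp.cc.

```cpp
#include <string_view>

// Hypothetical contexts a caller might pass in.
enum class Context { Initializer, Condition, Increment };

// Stand-in for a gettext lookup; a real build would consult the message
// catalog for the current locale instead of returning the msgid.
constexpr std::string_view tr(std::string_view msgid) { return msgid; }

// Each context maps to a full sentence that a translator sees as one unit,
// so word order and agreement can differ freely per language.
constexpr std::string_view whole_message(Context c) {
    switch (c) {
    case Context::Initializer:
        return tr("initializer expression refers to the loop variable");
    case Context::Condition:
        return tr("condition expression refers to the loop variable");
    case Context::Increment:
        return tr("increment expression refers to the loop variable");
    }
    return tr("expression refers to the loop variable");
}

static_assert(whole_message(Context::Condition) ==
                  std::string_view("condition expression refers to the loop variable"),
              "each context yields a complete sentence");
```

Passing an enumerator rather than a preformatted string keeps the caller from injecting text that the translation catalog has never seen.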

[Bug libstdc++/113450] [14 Regression] std/format/functions/format.cc FAILs

2024-02-26 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113450

--- Comment #19 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
I'm talking with Oracle Solaris Engineering and they're amenable to
making the int8_t change from char to signed char.

To assess the possible impact, the plan is to compare the public symbols
of C++ libraries delivered with Solaris now and after a rebuild with
 changed.

Are there other important issues to consider with such a change?

[Bug c/114113] New: bogus -Walloc-zero warning

2024-02-26 Thread f.heckenbach--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114113

Bug ID: 114113
   Summary: bogus -Walloc-zero warning
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: f.heckenb...@fh-soft.de
  Target Milestone: ---

% cat test.c
void *malloc (long unsigned int size);

int p[2] = { 1, 0 };
int *a;

int main ()
{
  int n = 0;
  while (p[n])
n++;
  a = (int *) malloc (n * sizeof (int));
  int i;
  for (i = 0; i < n; i++)
a[i] = 0;
}
% gcc -O -Walloc-zero test.c
test.c: In function 'main':
test.c:11:15: warning: argument 1 value is zero [-Walloc-zero]
   11 |   a = (int *) malloc (n * sizeof (int));
  |   ^
test.c:1:7: note: in a call to allocation function 'malloc' declared here
1 | void *malloc (long unsigned int size);
  |   ^~

[Bug c++/114114] New: Internal compiler error on function-local conditional noexcept

2024-02-26 Thread yves.bailly at hexagon dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114114

Bug ID: 114114
   Summary: Internal compiler error on function-local conditional
noexcept
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yves.bailly at hexagon dot com
  Target Milestone: ---

Created attachment 57542
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57542&action=edit
Preprocessed file from -save-temps

Tested on:
- Ubuntu 22.04 with distribution's GCC 11.4.0
- Ubuntu 22.04 with "home-build" GCC 13.2.0
- Ubuntu 23.10 with distribution's GCC 13.2.0
- Godbolt's compiler explorer with GCC x86-64 "trunk"

The following code causes an internal compiler error on (1):

--8<-8<-8<-8<-8<-8<-8<-8<---
#include <iostream>

enum class YesNo: bool { Yes, No };
template <typename E>
[[nodiscard]] constexpr bool isYes(const E e) noexcept {
   return e == E::Yes;
}

template<YesNo yes_or_no>
constexpr void test() {
    [[maybe_unused]] constexpr bool is_yes = isYes(yes_or_no); // (1)

    struct S
    {
#if true // (2)
        constexpr S() noexcept(is_yes)
        { std::cout << "boo\n"; }

// The following compiles fine:
#else
        constexpr S() noexcept(yes_or_no == YesNo::Yes)
        { std::cout << "boo ok\n"; }
#endif
    };

    S s;
}

int main()
{
    test<YesNo::Yes>();
}

--8<-8<-8<-8<-8<-8<-8<-8<---

Changing the "true" to "false" on (2) makes the code compile, link and run
fine.

Note: this code is accepted by Clang and MSVC.


Output of gcc -v:
Using built-in specs.
COLLECT_GCC=/home/ybailly/gcc13/bin/g++
COLLECT_LTO_WRAPPER=/home/ybailly/gcc13/libexec/gcc/x86_64-pc-linux-gnu/13.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-13.2.0/configure --prefix=/home/ybailly/gcc13
--enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (GCC) 


Output of "~/gcc13/bin/g++ -o test_gcc.x -std=c++20 test_gcc.cpp":
test_gcc.cpp: In instantiation of ‘constexpr test()::S::S() [with YesNo
yes_or_no = YesNo::Yes]’:
test_gcc.cpp:16:19:   required from ‘constexpr test()::S::S() [with YesNo
yes_or_no = YesNo::Yes]’
test_gcc.cpp:24:5:   required from ‘constexpr void test() [with YesNo yes_or_no
= YesNo::Yes]’
test_gcc.cpp:31:21:   required from here
test_gcc.cpp:11:37: internal compiler error: Segmentation fault
   11 | [[maybe_unused]] constexpr bool is_yes = isYes(yes_or_no); // (1)
  | ^~
0xe013bf crash_signal
../../gcc-13.2.0/gcc/toplev.cc:314
0x7f673d7e851f ???
./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x88089e hash_table, tree_node*>
>::hash_entry, false, xcallocator>::find_slot_with_hash(tree_node* const&,
unsigned int, insert_option)
../../gcc-13.2.0/gcc/hash-table.h:1039
0x88089e hash_map, tree_node*>
>::put(tree_node* const&, tree_node* const&)
../../gcc-13.2.0/gcc/hash-map.h:170
0x88089e register_local_specialization(tree_node*, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:1970
0x896009 tsubst_decl
../../gcc-13.2.0/gcc/cp/pt.cc:15446
0x885904 tsubst_copy
../../gcc-13.2.0/gcc/cp/pt.cc:17417
0x886588 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:21676
0x889bed maybe_instantiate_noexcept(tree_node*, int)
../../gcc-13.2.0/gcc/cp/pt.cc:26754
0x88ddb2 regenerate_decl_from_template
../../gcc-13.2.0/gcc/cp/pt.cc:26553
0x88ddb2 instantiate_body
../../gcc-13.2.0/gcc/cp/pt.cc:26865
0x88e678 instantiate_decl(tree_node*, bool, bool)
../../gcc-13.2.0/gcc/cp/pt.cc:27217
0x897f22 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:19397
0x898431 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:18844
0x898431 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:18858
0x898265 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:18844
0x898265 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:19238
0x88de06 tsubst_expr(tree_node*, tree_node*, int, tree_node*)
../../gcc-13.2.0/gcc/cp/pt.cc:26930
0x88de06 instantiate_body
../../gcc-13.2.0/gcc/cp/pt.cc:26930
0x88e678 instantiate_decl(tree_node*, bool, bool)
../../gcc-13.2.0/gcc/cp/pt.cc:27217


Preprocessed file attached, greatly reduced by removing <iostream> - the same
error appears without it.

Regards,

[Bug tree-optimization/107855] gcc.dg/vect/vect-ifcvt-18.c FAILs

2024-02-26 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107855

--- Comment #8 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #6 from Xi Ruoyao  ---
> Hmm, the test contains
>
> "/* { dg-additional-options "-Ofast -mavx" { target avx_runtime } } */"
>
> So it passes on AVX capable native builds, but fails otherwise.

I can reproduce things in a VM now: when it doesn't have avx support,
the test is compiled with -msse2 only and FAILs both for the dump and
execution:

FAIL: gcc.dg/vect/vect-ifcvt-18.c -flto -ffat-lto-objects  scan-tree-dump vect
"vectorized 3 loops"
FAIL: gcc.dg/vect/vect-ifcvt-18.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/vect-ifcvt-18.c execution test
FAIL: gcc.dg/vect/vect-ifcvt-18.c scan-tree-dump vect "vectorized 3 loops"

The test aborts here:

Thread 2 received signal SIGABRT, Aborted.

#0  0xfe26e385 in __lwp_sigqueue () from /lib/libc.so.1
#1  0xfe2660ef in thr_kill () from /lib/libc.so.1
#2  0xfe19db82 in raise () from /lib/libc.so.1
#3  0xfe16b1f4 in abort () from /lib/libc.so.1
#4  0x08050d58 in main ()
at
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:34

and the dump shows

/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
note:  === analyze_loop_nest ===
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
note:   === vect_analyze_loop_form ===
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
note:   using as main loop exit: 13 -> 14 [AUX: 0]
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
missed:   not vectorized: unsupported control flow in loop.
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
missed:  bad loop form.
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c:28:17:
missed: couldn't vectorize loop

When I add avx support to the VM, the test PASSes.

It seems the test is missing some requirement here.

[Bug gcov-profile/114115] New: xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

Bug ID: 114115
   Summary: xz-utils segfaults when built with -fprofile-generate
(bad interaction between IFUNC and binding?)
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

This was first reported downstream in Gentoo at https://bugs.gentoo.org/925415.

xz-utils-5.6.0 (it started to use IFUNC recently for crc32) started to
segfault, but only when built with -march=x86-64-v3 & -fprofile-generate.

For convenience, a broken builddir is available at
http://dev.gentoo.org/~sam/bugs/xz/pgo/xz-5.6.0-abi_x86_64.amd64.tar.xz.

```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x41b6 in ?? ()
(gdb) bt
#0  0x41b6 in ?? ()
#1  0x7f861b2fcc75 in crc32_resolve () at
/var/tmp/portage/app-arch/xz-utils-5.6.0/work/xz-5.6.0/src/liblzma/check/crc32_fast.c:140
#2  0x7f861b3541e4 in elf_machine_rela (map=,
scope=, reloc=0x7f861b2e05c8, sym=0x7f861b2ddfd8,
version=,
reloc_addr_arg=0x7f861b32ab10 , skip_ifunc=) at ../sysdeps/x86_64/dl-machine.h:314
#3  elf_dynamic_do_Rela (map=0x7f861b343160, scope=,
reladdr=, relsize=, nrelative=,
lazy=,
skip_ifunc=) at
/var/tmp/portage/sys-libs/glibc-2.39-r1/work/glibc-2.39/elf/do-rel.h:147
#4  _dl_relocate_object (l=l@entry=0x7f861b343160, scope=,
reloc_mode=, consider_profiling=,
consider_profiling@entry=0) at dl-reloc.c:301
#5  0x7f861b363d61 in dl_main (phdr=, phnum=,
user_entry=, auxv=) at rtld.c:2311
#6  0x7f861b36059f in _dl_sysdep_start
(start_argptr=start_argptr@entry=0x7ffdeae5bd20,
dl_main=dl_main@entry=0x7f861b362060 )
at ../sysdeps/unix/sysv/linux/dl-sysdep.c:140
#7  0x7f861b361da2 in _dl_start_final (arg=0x7ffdeae5bd20) at rtld.c:494
#8  _dl_start (arg=0x7ffdeae5bd20) at rtld.c:581
#9  0x7f861b360b88 in _start () from /lib64/ld-linux-x86-64.so.2
#10 0x0006 in ?? ()
#11 0x7ffdeae5cfc9 in ?? ()
#12 0x7ffdeae5d021 in ?? ()
#13 0x7ffdeae5d026 in ?? ()
#14 0x7ffdeae5d034 in ?? ()
#15 0x7ffdeae5d03a in ?? ()
#16 0x7ffdeae5d04b in ?? ()
#17 0x in ?? ()
(gdb)
```

```
(gdb) frame 1
#1  0x7f861b2fcc75 in crc32_resolve () at
/var/tmp/portage/app-arch/xz-utils-5.6.0/work/xz-5.6.0/src/liblzma/check/crc32_fast.c:140
140 {
(gdb) list
135 // This resolver is shared between all three dispatch methods. It
serves as
136 // the ifunc resolver if ifunc is supported, otherwise it is called as
a
137 // regular function by the constructor or first call resolution
methods.
138 static crc32_func_type
139 crc32_resolve(void)
140 {
141 return is_arch_extension_supported()
142 ? &crc32_arch_optimized : &crc32_generic;
143 }
144
(gdb)
```
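
For context, the resolver pattern shown in the listing can be sketched portably in C++ as a function that returns a function pointer. This is only an illustration of the dispatch idea: all names are hypothetical, the "optimized" routine is a stand-in, and a real IFUNC resolver is called by the dynamic loader during relocation (frames #1 to #4 in the backtrace above) rather than by ordinary code.

```cpp
#include <cstddef>
#include <cstdint>

using checksum_fn = std::uint32_t (*)(const unsigned char*, std::size_t);

// Generic implementation: plain byte sum, standing in for crc32_generic.
constexpr std::uint32_t sum_generic(const unsigned char* p, std::size_t n) {
    std::uint32_t s = 0;
    for (std::size_t i = 0; i < n; ++i)
        s += p[i];
    return s;
}

// Stand-in for an architecture-specific variant: same result, two bytes
// per loop iteration.
constexpr std::uint32_t sum_unrolled(const unsigned char* p, std::size_t n) {
    std::uint32_t s = 0;
    std::size_t i = 0;
    for (; i + 1 < n; i += 2)
        s += p[i] + p[i + 1];
    if (i < n)
        s += p[i];
    return s;
}

// Resolver: picks an implementation once, based on a capability check that
// is passed in here so the sketch stays deterministic.
constexpr checksum_fn resolve(bool extension_supported) {
    return extension_supported ? &sum_unrolled : &sum_generic;
}

constexpr unsigned char kSample[] = {1, 2, 3};

static_assert(resolve(true) == &sum_unrolled, "optimized variant selected");
static_assert(resolve(false) == &sum_generic, "generic fallback selected");
static_assert(resolve(true)(kSample, 3) == resolve(false)(kSample, 3),
              "both variants compute the same checksum");
```

With a real ifunc, the equivalent of resolve() runs before the rest of the library is necessarily relocated, so anything the compiler adds to the resolver body has to be safe at that point.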

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #1 from Sam James  ---
One of the xz developers, Jia Tan, has kindly minimised it to not need
BIND_NOW. I've adapted it a bit to cleanup flags and warnings.

I can reproduce it with the following, at least:
```
#!/bin/sh
gcc-14 -O2 -march=znver2 -fvisibility=hidden -fPIC -fprofile-update=atomic
-fprofile-dir=$(pwd) -fprofile-generate=$(pwd) -c test.c -o test.o -Wall
-Wextra
gcc-14 -o libapp.so test.o -shared -Wl,-z,now -fPIC -lgcov
gcc-14 -o app main.c -lgcov -L. -lapp
LD_LIBRARY_PATH=. ./app
```

main.c:
```
#include <stdio.h>

extern int func();

int main(void)
{
printf( "Hello world %p\n", func);

return 0;
}
```

test.c:
```
__attribute__((visibility("default")))
void *foo_ifunc2() __attribute__((ifunc("foo_resolver")));


__attribute__((visibility("default")))
void bar(void)
{
}

static int f3()
{
return 5;
}


__attribute__((visibility("default")))
void (*foo_resolver(void))(void)
{
f3();
return bar;
}


__attribute__((optimize("O0")))
__attribute__((visibility("default")))
int func()
{
foo_ifunc2();
return 0;
}
```

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #2 from Sam James  ---
The reproducer succeeds for me with Clang 17.0.6, but fails for me with GCC
10..14.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #3 from Sam James  ---
(In reply to Sam James from comment #1)
> One of the xz developers, Jia Tan, has kindly minimised it to not need
> BIND_NOW. I've adapted it a bit to cleanup flags and warnings.

(oops, sorry, this one does need it - we were discussing whether we could elide
it but didn't get there yet.)

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068

--- Comment #15 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:8293df8019adfffae3384cb6fb9cb6f496fe8608

commit r14-9181-g8293df8019adfffae3384cb6fb9cb6f496fe8608
Author: Richard Biener 
Date:   Mon Feb 26 11:25:50 2024 +0100

tree-optimization/114068 - missed virtual LC PHI after vect peeling

When we choose the IV exit to be one leading to no virtual use we
fail to have a virtual LC PHI even though we need it for the epilog
entry.  The following makes sure to create it so that later updating
works.

PR tree-optimization/114068
* tree-vect-loop-manip.cc (get_live_virtual_operand_on_edge):
New function.
(slpeel_tree_duplicate_loop_to_edge_cfg): Add a virtual LC PHI
on the main exit if needed.  Remove band-aid for the case
it was missing.

* gcc.dg/vect/vect-early-break_118-pr114068.c: New testcase.
* gcc.dg/vect/vect-early-break_119-pr114068.c: Likewise.

[Bug tree-optimization/114099] [14 regression] ICE in find_uses_to_rename_use when building darktable-4.6.1

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:fb68e2cac1283f731a3a979cb714621afb1ddfcc

commit r14-9182-gfb68e2cac1283f731a3a979cb714621afb1ddfcc
Author: Richard Biener 
Date:   Mon Feb 26 12:27:42 2024 +0100

tree-optimization/114099 - virtual LC PHIs and early exit vect

In some cases exits can lack LC PHI nodes for the virtual operand.
We have to create them when the epilog loop requires them which also
allows us to remove some only halfway correct fixups.  This is the
variant triggering for alternate exits.

PR tree-optimization/114099
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Create and fill in a needed virtual LC PHI for the alternate
exits.  Remove code dealing with that missing.

* gcc.dg/vect/vect-early-break_120-pr114099.c: New testcase.

[Bug tree-optimization/114068] [14 regression] ICE when building darktable-4.6.1 (error: PHI node with wrong VUSE on edge from BB 25) since r14-8768

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #16 from Richard Biener  ---
Should be fixed.

[Bug tree-optimization/114099] [14 regression] ICE in find_uses_to_rename_use when building darktable-4.6.1

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Richard Biener  ---
Fixed.

[Bug c++/114104] nodiscard not diagnosed on synthesized operator!=

2024-02-26 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114104

Patrick Palka  changed:

   What|Removed |Added

 CC||ppalka at gcc dot gnu.org

--- Comment #4 from Patrick Palka  ---
(In reply to Harald van Dijk from comment #2)
> For similar useless operations, such as f() ^ true;, GCC emits a similar
> warning "warning: value computed is not used [-Wunused-value]". Presumably,
> if that warning were implemented in GCC for ! as well, it should also fire
> for your original x != 0 test?
That sounds plausible.  The relevant code is

gcc/cp/cvt.cc
@@ -1647,11 +1647,6 @@ convert_to_void (tree expr, impl_conv_void implicit,
tsubst_flags_t complain)
  enum tree_code code = TREE_CODE (e);
  enum tree_code_class tclass = TREE_CODE_CLASS (code);
  if (tclass == tcc_comparison
      || tclass == tcc_unary
      || tclass == tcc_binary
      || code == VEC_PERM_EXPR
      || code == VEC_COND_EXPR)
    warn_if_unused_value (e, loc);

which doesn't consider boolean operations (TRUTH_NOT_EXPR / TRUTH_AND_EXPR /
TRUTH_OR_EXPR) because their class is tcc_expression.  This is probably just an
oversight (even the C front end warns for !f()).
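
To make the discussed cases concrete, here is a minimal C sketch. The names
f, calls and discard_results are made up for illustration and are not taken
from the PR. Each statement computes a value and throws it away; which of them
draw -Wunused-value depends on the front end and the GCC version, so no
particular diagnostic output is claimed here.

```c
#include <stdbool.h>

/* Counts how often f() ran, so the side effect of each call is observable.  */
static int calls;

static bool f (void)
{
  calls++;
  return true;
}

void discard_results (void)
{
  f () ^ true;  /* binary operation whose result is unused */
  !f ();        /* boolean negation whose result is unused */
}
```

Both statements still evaluate f(); only the computed value is dropped, which
is what the warning is meant to point out.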

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #4 from Andrew Pinski  ---
It is the use of TLS inside an ifunc resolver which seems to be causing issues
...

[Bug tree-optimization/114107] poor vectorization at -O3 when dealing with arrays of different multiplicity, good with -O2

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114107

Richard Biener  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #14 from Richard Biener  ---
Mine.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #5 from Andrew Pinski  ---
The obvious workaround is to mark the ifunc resolver with the
no_profile_instrument_function attribute, since it is only ever called once and
really does not need to be PGO'ed anyway.
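
A minimal sketch of that workaround, assuming an ELF target with ifunc support
(e.g. x86_64 GNU/Linux). The names add_one, add_one_generic and resolve_add_one
are hypothetical and not taken from xz-utils; the two attributes themselves are
documented GCC function attributes.

```c
typedef int (*add_one_fn) (int);

/* The only implementation in this sketch; a real resolver would pick one of
   several based on CPU features.  */
static int add_one_generic (int x)
{
  return x + 1;
}

/* The resolver runs once, during relocation processing, so it is excluded
   from profile instrumentation.  */
__attribute__ ((no_profile_instrument_function))
static add_one_fn resolve_add_one (void)
{
  return add_one_generic;
}

/* Calls to add_one are bound to whatever resolve_add_one returns.  */
int add_one (int x) __attribute__ ((ifunc ("resolve_add_one")));
```

Whether this is enough for the xz-utils case is not something the sketch can
show; it only illustrates where the attribute goes.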

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #6 from Richard Biener  ---
Maybe we can automatically consider that when handling the ifunc attribute?

[Bug c/114113] bogus -Walloc-zero warning

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114113

Richard Biener  changed:

   What|Removed |Added

   Keywords||diagnostic

--- Comment #1 from Richard Biener  ---
I bet we thread the p[n] == 0 case because of the later loop i < n condition.

Consider when p[0] == 0, the code would then call malloc (0).

[Bug target/114116] New: [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

Bug ID: 114116
   Summary: [14 Regression] Broken backtraces in bootstrapped
x86_64 gcc
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org
  Target Milestone: ---

The expected ICE on
void
foo (void)
{
  unsigned _BitInt (575) a = 3;
  __builtin_clzg (a);
}
with -fno-tree-dce -O1 (might go away soon when PR114044 is fixed) is from
stage1-gcc/cc1
~/src/gcc/obj88/stage1-gcc/cc1 -quiet pr114044-2.c -fno-tree-dce -O1
during RTL pass: expand
pr114044-2.c: In function ‘foo’:
pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at
internal-fn.cc:208
5 |   __builtin_clzg (a);
  |   ^~
0x12e77f6 expand_fn_using_insn
../../gcc/internal-fn.cc:208
0x12f6321 expand_direct_optab_fn
../../gcc/internal-fn.cc:3817
0x12fce16 expand_CLZ
../../gcc/internal-fn.def:444
0x12fdb14 expand_internal_call(internal_fn, gcall*)
../../gcc/internal-fn.cc:4913
0x12fdb3f expand_internal_call(gcall*)
../../gcc/internal-fn.cc:4921
0xf39343 expand_call_stmt
../../gcc/cfgexpand.cc:2771
0xf3e2db expand_gimple_stmt_1
../../gcc/cfgexpand.cc:3932
0xf3e99b expand_gimple_stmt
../../gcc/cfgexpand.cc:4077
0xf48362 expand_gimple_basic_block
../../gcc/cfgexpand.cc:6133
0xf4a9d2 execute
../../gcc/cfgexpand.cc:6872
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

but from gcc/cc1

during RTL pass: expand
pr114044-2.c: In function ‘foo’:
pr114044-2.c:5:3: internal compiler error: in expand_fn_using_insn, at
internal-fn.cc:208
5 |   __builtin_clzg (a);
  |   ^~
0x7d9246 expand_fn_using_insn
../../gcc/internal-fn.cc:208

pr114044-2.c:5:3: internal compiler error: Segmentation fault
0x1554262 crash_signal
../../gcc/toplev.cc:319
0x2b20320 x86_64_fallback_frame_state
./md-unwind-support.h:63
0x2b20320 uw_frame_state_for
../../../libgcc/unwind-dw2.c:1013
0x2b2165d _Unwind_Backtrace
../../../libgcc/unwind.inc:303
0x2acbd69 backtrace_full
../../libbacktrace/backtrace.c:127
0x2a32fa6 diagnostic_context::action_after_output(diagnostic_t)
../../gcc/diagnostic.cc:781
0x2a331bb diagnostic_action_after_output(diagnostic_context*, diagnostic_t)
../../gcc/diagnostic.h:1002
0x2a331bb diagnostic_context::report_diagnostic(diagnostic_info*)
../../gcc/diagnostic.cc:1633
0x2a33543 diagnostic_impl
../../gcc/diagnostic.cc:1767
0x2a33c26 internal_error(char const*, ...)
../../gcc/diagnostic.cc:2225
0xe232c8 fancy_abort(char const*, int, char const*)
../../gcc/diagnostic.cc:2336
0x7d9246 expand_fn_using_insn
../../gcc/internal-fn.cc:208
Segmentation fault (core dumped)

I believe this is caused by the r14-8470 change.
The problem can be also seen when running the cc1 in the debugger.
When a breakpoint is added on fancy_abort (.gdbinit normally does that), the
backtrace still looks sane:
#0  fancy_abort (file=file@entry=0x2bd70fb "../../gcc/internal-fn.cc",
line=line@entry=208, function=function@entry=0x2bd76cf "expand_fn_using_insn")
at ../../gcc/diagnostic.cc:2313
#1  0x007d9247 in expand_fn_using_insn (stmt=,
icode=CODE_FOR_nothing, ninputs=1, noutputs=1) at ../../gcc/internal-fn.cc:208
#2  0x00fcd1d0 in expand_call_stmt (stmt=0x7fffea307000) at
../../gcc/cfgexpand.cc:2771
#3  expand_gimple_stmt_1 (stmt=) at
../../gcc/cfgexpand.cc:3932
#4  expand_gimple_stmt (stmt=) at
../../gcc/cfgexpand.cc:4077
#5  0x00fcdf18 in expand_gimple_basic_block (bb=,
disable_tail_calls=false) at ../../gcc/cfgexpand.cc:6133
#6  0x00fd059f in (anonymous namespace)::pass_expand::execute
(this=, fun=) at ../../gcc/cfgexpand.cc:6872
#7  0x0140bff8 in execute_one_pass (pass=) at ../../gcc/passes.cc:2646
#8  0x0140c890 in execute_pass_list_1 (pass=) at ../../gcc/passes.cc:2755
#9  0x0140c8c9 in execute_pass_list (fn=0x7fffea302000, pass=) at ../../gcc/passes.cc:2766
#10 0x01011a26 in cgraph_node::expand (this=) at ../../gcc/context.h:48
#11 cgraph_node::expand (this=) at
../../gcc/cgraphunit.cc:1798
#12 0x010137fb in expand_all_functions () at
../../gcc/cgraphunit.cc:2028
#13 symbol_table::compile (this=0x7fffea13) at ../../gcc/cgraphunit.cc:2402
#14 0x01015e18 in symbol_table::compile (this=0x7fffea13) at
../../gcc/cgraphunit.cc:2315
#15 symbol_table::finalize_compilation_unit (this=0x7fffea13) at
../../gcc/cgraphunit.cc:2587
#16 0x01554742 in compile_file () at ../../gcc/toplev.cc:476
#17 0x00e281cc in do_compile () at ../../

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

Jakub Jelinek  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com
   Priority|P3  |P1
   Target Milestone|--- |14.0

[Bug tree-optimization/114086] Boolean switches could have a lot better codegen, possibly utilizing bit-vectors

2024-02-26 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114086

--- Comment #9 from Andrew Macleod  ---
(In reply to Jakub Jelinek from comment #8)
> Unfortunately doing the ((682 >> x) & 1) to x & 1 optimization in match.pd
> isn't possible, we can only use global ranges there and we need path
> specific range here.
> Can it be done in VRP pass?  Though, I'm afraid I'm quite lost where it
> actually has
> the statement optimizations (rather than mere computing of ranges),
> Aldy/Andrew, any hints?  I mean like what old tree-vrp.c was doing in
> simplify_stmt_using_ranges.

I don't think much has changed there... We still call into all the code in
vr-values.cc to do simplifications.  I think Aldy changed it all to be
contained in class 'simplify_using_ranges', but those routines are all still
in vr-values.cc.  tree-vrp calls into the top level simplify() routine.

  bool fold_stmt (gimple_stmt_iterator *gsi) override
  {
    bool ret = m_simplifier.simplify (gsi);
    if (!ret)
      ret = ::fold_stmt (gsi, follow_single_use_edges);
    return ret;
  }


If that fails, then ranger's fold_stmt() is invoked.  That is merely a
contextual wrapper around a call to gimple-fold::fold_stmt to see if normal
folding can find anything.  Under the covers I believe that invokes match.pd
which, if it was using the current range_query, would get contextual info.



> Guess we could duplicate that in match.pd for the case which can use global
> range or
> doesn't need any range at all.
> I mean
> unsigned int
> foo (int x)
> {
>   return (0xU >> x) & 1;
> }
> 
> unsigned int
> bar (int x)
> {
>   return (0xU >> x) & 1;
> }
> 
> unsigned int
> baz (int x)
> {
>   if (x >= 22) __builtin_unreachable ();
>   return (0x5aU >> x) & 1;
> }
> can be optimized even with global ranges (or the first one with no ranges).
> foo and baz equivalent is x & 1, while bar is (~x) & 1 or (x & 1) ^ 1, dunno
> what is more canonical.
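
As a quick sanity check of the (682 >> x) & 1 to x & 1 equivalence mentioned at
the top of the quote, the sketch below compares both forms over a range of x.
The helper names are made up for illustration. 682 is 0x2AA, i.e. bits 1, 3, 5,
7 and 9 are set, so the two forms agree only while x stays small, which is why
range information is needed for the transformation.

```c
#include <stdbool.h>

/* Original form: index the constant bit mask 682 (0x2AA) by x.  */
static unsigned mask_form (unsigned x)
{
  return (682u >> x) & 1u;
}

/* Simplified form the optimization would produce.  */
static unsigned parity_form (unsigned x)
{
  return x & 1u;
}

/* Return true if both forms agree for every x in [lo, hi].
   hi must be smaller than the bit width of unsigned.  */
bool forms_agree (unsigned lo, unsigned hi)
{
  for (unsigned x = lo; x <= hi; x++)
    if (mask_form (x) != parity_form (x))
      return false;
  return true;
}
```

The first mismatch is at x == 11, where the mask has run out of set bits but
x & 1 is still 1.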

[Bug middle-end/114111] [avr] Expensive code instead of conditional branch.

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114111

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-26
 Target||avr

--- Comment #1 from Richard Biener  ---
I think RTL expansion only (if even) considers BRANCH_COST.  I also think
that while we have if () to non-branchy code conversion we don't have the
reverse on GIMPLE so RTL expansion sees code like

  _7 = c_3(D) & 64;
  _1 = _7 != 0;
  _2 = (int) _1;
  _5 = _2 + x_4(D);
  return _5;

and when setcc is available it doesn't consider test & branch.  It would
only effectively do

  if (_7 != 0)
    _1 = 1;
  else
    _1 = 0;
  _2 = (int) _1;
  _5 = _2 + x_4(D);
  return _5;

so it would probably not help much in practice unless we move the computation
below back into the branch during RTL optimization.

This possibly asks for a better GIMPLE representation, at least for the
purpose of getting good code for AVR.  RTL expansion probably isn't the
best place to fix this.
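
For reference, the two source-level shapes being contrasted can be written in C
roughly as follows. This is only a sketch: the function names are made up and
the original testcase from the report is not reproduced here.

```c
/* Branchless shape: corresponds to the first GIMPLE snippet above, where the
   bit test becomes a 0/1 value that is added to x.  */
int add_flag_branchless (unsigned char c, int x)
{
  return x + ((c & 64) != 0);
}

/* Branchy shape: test the bit and conditionally increment, which is what a
   target with cheap test-and-branch would prefer.  */
int add_flag_branchy (unsigned char c, int x)
{
  if (c & 64)
    x++;
  return x;
}
```

Both functions compute the same result; the question in the PR is only which
shape reaches RTL expansion.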

[Bug middle-end/114109] x264 satd vectorization vs LLVM

2024-02-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114109

Richard Biener  changed:

   What|Removed |Added

 Blocks||53947
 CC||rguenth at gcc dot gnu.org

--- Comment #5 from Richard Biener  ---
There's at least one other bug about this (or a similar) pattern.  Note using
-fno-vect-cost-model isn't really recommended.

Might want to relate the various x264 missed-opt bugs.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug target/66874] RFE: x86_64_fallback_frame_state more robust

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66874

--- Comment #6 from Sam James  ---
Pretty sure my issue is indeed PR114116.

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #1 from Jakub Jelinek  ---
Maybe introduce TYPE_NO_CALLEE_SAVED_REGISTERS_EXCEPT_BP or something similar?

[Bug rtl-optimization/10837] noreturn attribute causes no sibling calling optimization

2024-02-26 Thread lukas.graetz--- via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10837

--- Comment #20 from Lukas Grätz  ---
(In reply to Petr Skocik from comment #19)
> IMO(In reply to Xi Ruoyao from comment #16)
>  
> > In practice most _Noreturn functions are abort, exit, ..., i.e. they are
> > only executed one time so optimizing against a cold path does not help much.
> > I don't think it's a good idea to encourage people to construct some fancy
> > code by a recursive _Noreturn function (why not just use a loop?!)  And if
> > you must write such fancy code anyway IMO musttail attribute (PR83324) will
> > be a better solution.
> 
> There's also longjmp, which may not be all that super cold and may be
> executed multiple times. And while yeah, nobody will notice a single call vs
> jmp time save against a process spawn/exit, for a longjmp wrapper, it'll
> make it a few % faster (as would utilizing _Noreturn attributes for better
> register allocation: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114097,
> which would also save a bit of codesize too). Tail calls can also save a bit
> of codesize if the target is near.


Just to emphasize, tail call optimization is not just for speed. It is
essential to avoid wasting stack space. In particular, avoiding potential stack
overflows should _not_ require replacing all recursion with loops, as Xi Ruoyao
suggests. I also think that recursion in C is not fancy (anymore), since
everyone expects the compiler to do sibcall or similar optimizations. Noreturn
functions are the exception to that. So it would indeed be consistent to do
sibcall optimization for noreturn functions, too!

Personally, I would be satisfied with the new musttail attribute to enforce
tail calls whenever necessary (given that this will be available for C, not C++
only). But speed-wise, musttail might not have the desired effect. It is meant
for preserving stack space.

---

Following Petr Skocik, I quick-tested on my computer:

= longjmp_wrapper.c =
#include <setjmp.h>

__attribute__((noreturn))
void longjmp_wrapper(jmp_buf env, int val) {
    longjmp(env, val);
}

= longjmp_main.c 
#include <limits.h>
#include <setjmp.h>

__attribute__((noreturn))
void longjmp_wrapper(jmp_buf env, int val);

int main(void) {
    jmp_buf env;
    for (int i = 0; i < INT_MAX; i++) {
        if (setjmp(env) == 0) {
            longjmp_wrapper(env, 1);
        }
    }
}
=

After compiling with

$ gcc -O3 -m32 -c -S longjmp_wrapper.c -o longjmp_wrapper.S

I copied and manually modified the generated longjmp_wrapper.S as follows:

9,15c9
<   subl    $20, %esp
<   .cfi_def_cfa_offset 24
<   pushl   28(%esp)
<   .cfi_def_cfa_offset 28
<   pushl   28(%esp)
<   .cfi_def_cfa_offset 32
<   call    longjmp
---
>   jmp longjmp


Then I compiled both versions with longjmp_main.c, again with -m32. Measured
with "time", the sibcall and unmodified versions took around 23.5 sec and 24.5
sec respectively on my computer. So around 4 % improvement for 32 bit x86. For
64 bit x86, both took around 18 secs without noticeable speed difference
(perhaps because both arguments are passed in registers instead of on the
stack by 64 bit calling conventions).

[Bug rtl-optimization/114044] ICE: in expand_fn_using_insn, at internal-fn.cc:208 with _BitInt() and -O -fno-tree-dce

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114044

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Jakub Jelinek  ---
Created attachment 57543
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57543&action=edit
gcc14-pr114044.patch

Untested fix on the ifn expansion side.

[Bug c/114042] diagnostics about __builtin_stdc_bit_ceil() mentions __builtin_clzg()

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114042

--- Comment #5 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:77576915cfd26e603aba5295dfdac54a5545f5f2

commit r14-9184-g77576915cfd26e603aba5295dfdac54a5545f5f2
Author: Jakub Jelinek 
Date:   Mon Feb 26 16:30:16 2024 +0100

c: Improve some diagnostics for __builtin_stdc_bit_* [PR114042]

The PR complains that for the __builtin_stdc_bit_* "builtins" the
diagnostics don't mention the name of the builtin the user used, but
__builtin_{clz,ctz,popcount}g instead (which is what the FE
immediately lowers it to).

The following patch repeats the checks from
check_builtin_function_arguments which are done there on
BUILT_IN_{CLZ,CTZ,POPCOUNT}G, such that they diagnose it with the name of
the "builtin" the user actually used before it is gone.

2024-02-26  Jakub Jelinek  

PR c/114042
* c-parser.cc (c_parser_postfix_expression): Diagnose
__builtin_stdc_bit_* argument with ENUMERAL_TYPE or BOOLEAN_TYPE
type or if signed here rather than on the replacement builtins
in check_builtin_function_arguments.

* gcc.dg/builtin-stdc-bit-2.c: Adjust testcase for actual builtin
names rather than names of builtin replacements.

[Bug c/114042] diagnostics about __builtin_stdc_bit_ceil() mentions __builtin_clzg()

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114042

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Jakub Jelinek  ---
Fixed.

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

H.J. Lu  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-26
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

[Bug c/114117] New: -Wno-foo handling

2024-02-26 Thread pto at linuxbog dot dk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114117

Bug ID: 114117
   Summary: -Wno-foo handling
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pto at linuxbog dot dk
  Target Milestone: ---

I have worked a lot with clang and gcc compilers for many years, with focus on
C and C++.

If we take something really simple

int f()
{
int x = 1;
return x;
}

and compile with Gcc 13.2 - all fine -> see https://godbolt.org/z/Wxqxzzj1G

However if I then add a "-Wno-" pattern e.g. -Wno-comment I still have a clean
compilation -> https://godbolt.org/z/j5Yf5ozqo

Let me then try to ignore an unknown option "-Wno-petertoft" for the same code;
surprisingly, gcc is happy - see https://godbolt.org/z/efxGzhcM1

If I try the same with clang 17 then clang returns the expected
warning: unknown warning option '-Wno-petertoft'; did you mean '-Wno-selector'?
[-Wunknown-warning-option]
See https://godbolt.org/z/TvbzWPaPP

When working with large code bases of different origins, it is quite
challenging that gcc silently drops an option like
-Wno-say-hello-to-rms-from-me. The clang behaviour is much more consistent if
you ask me.

Can gcc adopt the clang style of giving a warning if -Wno-foo is used for
cases where -Wfoo does not exist?

[Bug c/114117] -Wno-foo handling

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114117

Sam James  changed:

   What|Removed |Added

 CC||sjames at gcc dot gnu.org

--- Comment #1 from Sam James  ---
See -Wunknown-warning and the part under -Wfatal-errors at
https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wfatal-errors.

[Bug c/114117] -Wno-foo handling

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114117

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
(In reply to Peter Toft from comment #0)
> Can gcc adopt the clang-style of giving a warning if -Wno-foo is used
> for cases where -Wfoo does not exist?

No, the current behavior is 100% intentional and documented.

[Bug c/114117] -Wno-foo handling

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114117

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Andrew Pinski  ---
Dup.

*** This bug has been marked as a duplicate of bug 63499 ***

[Bug c/63499] gcc treats unknown -Wno-xxx options differently than -Wxxx

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63499

Andrew Pinski  changed:

   What|Removed |Added

 CC||pto at linuxbog dot dk

--- Comment #6 from Andrew Pinski  ---
*** Bug 114117 has been marked as a duplicate of this bug. ***

[Bug other/28322] GCC new warnings and compatibility

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=28322

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |4.4.0

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

H.J. Lu  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2024-02-26
   Target Milestone|--- |14.0
   Assignee|unassigned at gcc dot gnu.org  |hjl.tools at gmail dot com

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115

--- Comment #7 from H.J. Lu  ---
Created attachment 57544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57544&action=edit
A patch

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #2 from Jakub Jelinek  ---
Created attachment 57545
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57545&action=edit
gcc14-pr114116.patch

This seems to fix it, so far tested just on the small testcase, back to the
expected backtrace there.

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #3 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #2)
> Created attachment 57545 [details]
> gcc14-pr114116.patch
> 
> This seems to fix it, so far tested just on the small testcase, back to the
> expected backtrace there.

Should we check -g? Without -g, I don't think we need to save FP.

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #4 from Andrew Pinski  ---
(In reply to H.J. Lu from comment #3)
> (In reply to Jakub Jelinek from comment #2)
> > Created attachment 57545 [details]
> > gcc14-pr114116.patch
> > 
> > This seems to fix it, so far tested just on the small testcase, back to the
> > expected backtrace there.
> 
> Should we check -g? Without -g, I don't think we need to save FP.

NO, the code generated with -g should be the same as without ...

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

--- Comment #5 from Jakub Jelinek  ---
Yeah.  Not to mention, one can call backtrace even if -g0; you just don't get
nice names for the addresses.  Without the patch you get crashes in the
unwinder when doing backtrace.
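
For illustration, a minimal sketch of taking a backtrace from ordinary C code.
It assumes glibc's backtrace() from <execinfo.h>; the comment above may equally
refer to libbacktrace or the libgcc unwinder, and count_stack_frames is a
made-up name.

```c
#include <execinfo.h>

#define MAX_FRAMES 64

/* Capture the current call stack and return how many return addresses were
   recorded.  This works without debug info; -g only affects how nicely the
   addresses can be symbolized afterwards.  */
int count_stack_frames (void)
{
  void *frames[MAX_FRAMES];
  return backtrace (frames, MAX_FRAMES);
}
```

The point is that unwinding happens at run time regardless of -g, so code
generation must keep the unwinder working in both cases.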

[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra

2024-02-26 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257

--- Comment #6 from Sam James  ---
Created attachment 57546
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57546&action=edit
gcc 14 test results

$ gcc-13 --version
gcc-13 (Gentoo 13.2.1_p20240210 p13) 13.2.1 20240210
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc-13 -v -E -x c /dev/null -o /dev/null -march=native 2>&1 | grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64 -dumpbase null

$ gcc-13 -v -E -x c /dev/null -o /dev/null -mcpu=native 2>&1 | grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64 -dumpbase null

$ gcc-13 -v -E -x c /dev/null -o /dev/null -march=native -mcpu=native 2>&1 |
grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64
-march=armv8-a+crc+lse+rcpc+rdma+dotprod+aes+sha3+fp16fml+sb+ssbs+i8mm+bf16+flagm+pauth
-dumpbase null

$ gcc-14 --version
gcc-14 (Gentoo 14.0.1_pre20240211-r1 p22) 14.0.1 20240211 (experimental)
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc-14 -v -E -x c /dev/null -o /dev/null -march=native 2>&1 | grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64 -dumpbase null

$ gcc-14 -v -E -x c /dev/null -o /dev/null -mcpu=native 2>&1 | grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64 -dumpbase null

$ gcc-14 -v -E -x c /dev/null -o /dev/null -march=native -mcpu=native 2>&1 |
grep /cc1
 /usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o
/dev/null -mlittle-endian -mabi=lp64
-march=armv8-a+flagm+dotprod+rdma+lse+crc+aes+sha3+fp16fml+rcpc+i8mm+bf16+sb+ssbs+pauth
-dumpbase null

Still hosed :(

[Bug middle-end/114111] [avr] Expensive code instead of conditional branch.

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114111

--- Comment #2 from Andrew Pinski  ---
Maybe this is something that could be done during isel to undo what was done in
phiopt ...

[Bug c++/114114] [11/12/13/14 Regression] Internal compiler error on function-local conditional noexcept

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114114

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |11.5
Summary|Internal compiler error on  |[11/12/13/14 Regression]
   |function-local conditional  |Internal compiler error on
   |noexcept|function-local conditional
   ||noexcept
  Known to fail||10.1.0, 9.3.0, 9.5.0
  Known to work||9.1.0, 9.2.0

[Bug c++/114114] [11/12/13/14 Regression] Internal compiler error on function-local conditional noexcept

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114114

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-26
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed, reduced further:
```
template
constexpr void test() {
constexpr bool is_yes = yes_or_no;
struct S
{
constexpr S() noexcept(is_yes){}
};
S s;
}
int main()
{
test();
}
```

[Bug rtl-optimization/113617] [14 Regression] Symbol ... referenced in section `.data.rel.ro.local' of ...: defined in discarded section ... since r14-4944

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113617

--- Comment #17 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:1931c40364bb9fb0a7c4b650917e3ac0e88bf6f4

commit r14-9185-g1931c40364bb9fb0a7c4b650917e3ac0e88bf6f4
Author: Jakub Jelinek 
Date:   Mon Feb 26 17:55:07 2024 +0100

varasm: Handle private COMDAT function symbol reference in readonly data
section [PR113617]

If default_elf_select_rtx_section is called to put a reference to some
local symbol defined in a comdat section into memory, which happens more
often since the r14-4944 RA change, linking might fail.
default_elf_select_rtx_section puts such constants into .data.rel.ro.local
etc. sections and if the linker chooses comdat sections from some other TU
and discards the one to which a relocation in .data.rel.ro.local remains,
the linker diagnoses an error.  References to private comdat symbols can only
appear from functions or data objects in the same comdat group, so the
following patch arranges using .data.rel.ro.local.pool. and similar
sections.

2024-02-26  Jakub Jelinek  
H.J. Lu  

PR rtl-optimization/113617
* varasm.cc (default_elf_select_rtx_section): For
references to private symbols in comdat sections
use .data.relro.local.pool., .data.relro.pool.
or .rodata. comdat sections.

* g++.dg/other/pr113617.C: New test.
* g++.dg/other/pr113617.h: New test.
* g++.dg/other/pr113617-aux.cc: New test.

[Bug c/114112] Error message is translatable but inserts untranslated substring

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114112

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-02-26
 Ever confirmed|0   |1

--- Comment #1 from Andrew Pinski  ---
Confirmed.
An enum should be used instead, and then N_() should be used around the
strings, as is done for format_specifier_kind in c-family/c-format.cc:
```
/* Enum describing the kind of specifiers present in the format and
   requiring an argument.  */
enum format_specifier_kind {
  CF_KIND_FORMAT,
  CF_KIND_FIELD_WIDTH,
  CF_KIND_FIELD_PRECISION
};

static const char *kind_descriptions[] = {
  N_("format"),
  N_("field width specifier"),
  N_("field precision specifier")
};
```
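
A minimal, self-contained sketch of the same pattern, for illustration only. The `N_` definition here is a stand-in that expands to its argument, and `arg_kind`, `arg_kind_descriptions` and `describe` are hypothetical names, not GCC's:

```cpp
// Stand-in for GCC's N_() macro (an assumption for this sketch): it only
// marks a literal for extraction into the message catalog and expands to
// the literal itself.
#define N_(msgid) msgid

// Hypothetical enum; the names are illustrative, not taken from GCC.
enum arg_kind { AK_FORMAT, AK_FIELD_WIDTH, AK_FIELD_PRECISION, AK_COUNT };

// Each entry is marked with N_() so that it becomes translatable.
constexpr const char *arg_kind_descriptions[AK_COUNT] = {
  N_("format"),
  N_("field width specifier"),
  N_("field precision specifier")
};

// Returns the untranslated msgid.  A caller would translate it at the
// point of use instead of pasting a raw English substring into the
// already-translated message.
constexpr const char *describe (arg_kind kind)
{
  return arg_kind_descriptions[kind];
}
```

The point of indexing by enum is that every string that can reach the diagnostic is a literal visible to the message extractor.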

[Bug rtl-optimization/113617] [14 Regression] Symbol ... referenced in section `.data.rel.ro.local' of ...: defined in discarded section ... since r14-4944

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113617

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Jakub Jelinek  ---
Fixed.

[Bug libstdc++/114118] New: std::is_floating_point<_Float32> and __is_floating<_Float32> are false in C++20 and older

2024-02-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114118

Bug ID: 114118
   Summary: std::is_floating_point<_Float32> and
__is_floating<_Float32> are false in C++20 and older
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

Since GCC 13 we defined _Float32 etc. as distinct types, but the library only
considers them to be floating-point types for C++23 and later, when <stdfloat>
declares the aliases std::float32_t etc.

This means that the proposed solution for PR 114018 only works in C++23:

  // _GLIBCXX_RESOLVE_LIB_DEFECTS
  // 3790. P1467 accidentally changed nexttoward's signature
  template<typename _Tp>
    typename __gnu_cxx::__enable_if<__is_floating<_Tp>::__value, _Tp>::__type
    nexttoward(_Tp, long double) = delete; // not defined for extended FP types

For C++20 std::nexttoward(_Float32(0), 0.0L) compiles and selects the float
overload.
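
The effect can be reproduced without libstdc++ internals. The sketch below is an illustration under stated assumptions, not the library code: `ext_known` and `ext_unknown` are hypothetical stand-ins for extended floating-point types that convert implicitly to float, `is_ext_fp` plays the role of `__is_floating`, and `can_call` is a helper added only to observe the result at compile time.

```cpp
#include <type_traits>
#include <utility>

namespace demo {

// Hypothetical stand-ins for two extended FP types, both implicitly
// convertible to float (as _Float32 is).
struct ext_known   { constexpr operator float () const { return 0.0f; } };
struct ext_unknown { constexpr operator float () const { return 0.0f; } };

// Hypothetical trait in the role of __is_floating: it recognizes only
// ext_known, mimicking a trait that is false for some extended type.
template<typename T> struct is_ext_fp : std::false_type {};
template<> struct is_ext_fp<ext_known> : std::true_type {};

// Ordinary overload; a declaration is enough for this compile-time check.
float nexttoward (float, long double);

// Deleted overload, constrained the same way as the proposed fix.
template<typename T>
  typename std::enable_if<is_ext_fp<T>::value, T>::type
  nexttoward (T, long double) = delete;

// Detects whether nexttoward (T, long double) is a well-formed call.
template<typename T, typename = void>
  struct can_call : std::false_type {};
template<typename T>
  struct can_call<T, decltype (void (nexttoward (std::declval<T> (), 0.0L)))>
  : std::true_type {};

} // namespace demo
```

With these definitions `can_call<ext_known>` is false because the deleted template is the best match, while `can_call<ext_unknown>` is true: the constraint removes the deleted template from the overload set and the call falls through to the float overload via the implicit conversion.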

To consistently delete them we would need to do:

#if __FLT32_DIG__
  void nexttoward(_Float32, long double) = delete;
#endif

We should probably just make __is_floating<_Float32> true for all -std modes.

And also define __gnu_cxx::__numeric_traits<_Float32>.

[Bug target/114116] [14 Regression] Broken backtraces in bootstrapped x86_64 gcc

2024-02-26 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114116

H.J. Lu  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
   Assignee|hjl.tools at gmail dot com |unassigned at gcc dot gnu.org

--- Comment #6 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #5)
> Yeah.  Not to mention, one can call backtrace even if -g0; you just don't
> get nice names for the addresses.  Without the patch you get crashes in the
> unwinder when doing backtrace.

Should we generate REG_CFA_UNDEFINED for unsaved callee-saved registers to
help unwinder:

https://patchwork.sourceware.org/project/gcc/list/?series=30327

[Bug fortran/114012] overloaded unary operator called twice

2024-02-26 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114012

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:2f71e801ad0bb1f620334aadbd7c99cc4efe6309

commit r14-9186-g2f71e801ad0bb1f620334aadbd7c99cc4efe6309
Author: Harald Anlauf 
Date:   Sun Feb 25 21:18:23 2024 +0100

Fortran: do not evaluate polymorphic functions twice in assignment
[PR114012]

PR fortran/114012

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): Evaluate non-trivial
arguments just once before assigning to an unlimited polymorphic
dummy variable.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr114012.f90: New test.

[Bug libstdc++/114118] std::is_floating_point<_Float32> and __is_floating<_Float32> are false in C++20 and older

2024-02-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114118

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
The reason was mainly that without -std=c++23, most of the library support just
isn't there.  The f16/f32/f64/f128 etc. literal suffixes will result in
pedwarns, __STDCPP_FLOAT*_T__ isn't defined, <stdfloat> is a C++23 header, etc.
Most of the library changes were guarded with __STDCPP_FLOAT*_T__ macros.
If you think it is worth enabling it for C++20 or older as well and such
changes wouldn't cause problems for valid C++20 or 17 etc. code not using the
types, then all that (perhaps except for bfloat16_t stuff?) would need to start
using __FLT*_MANT_DIG__ and similar macros instead.  But then we also run into
a problem that I think clang++ predefines those even when it doesn't implement
the C++23 paper.

[Bug analyzer/105898] RFE: -fanalyzer should complain about overlapping args to mempcpy, wmemcpy, and wmempcpy

2024-02-26 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105898

Eric Gallager  changed:

   What|Removed |Added

Summary|RFE: -fanalyzer should  |RFE: -fanalyzer should
   |complain about overlapping  |complain about overlapping
   |args to memcpy and mempcpy  |args to mempcpy, wmemcpy,
   ||and wmempcpy

--- Comment #5 from Eric Gallager  ---
(In reply to David Malcolm from comment #4)
> I implemented this a different way, for memcpy, in r14-3556-g034d99e81484fb
> (by special-casing it).
> 
> We don't yet check mempcpy, wmemcpy, or wmempcpy; keeping bug open to handle
> those.

Retitling.

[Bug tree-optimization/114119] New: add reduction promotion from unsigned char to unsigned not vectorized

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114119

Bug ID: 114119
   Summary: add reduction promotion from unsigned char to unsigned
not vectorized
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

Take:
```
unsigned f(unsigned char *src)
{
  unsigned sum = 0;
  for (int y = 0; y < 8; y++)
    {
      sum += src[y];
    }
  return sum;
}
```

This is not vectorized for aarch64 but it is for x86_64.

[Bug tree-optimization/114120] New: add reduction with promotion and then truncation poorly vectorized

2024-02-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114120

Bug ID: 114120
   Summary: add reduction with promotion and then truncation
poorly vectorized
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
Blocks: 53947
  Target Milestone: ---
Target: x86_64

Take:
```
unsigned char f(unsigned char *src)
{
  unsigned sum = 0;
  for (int y = 0; y < 8; y++)
    {
      sum += src[y];
    }
  return sum;
}
```

On x86_64 we should vectorize to the same as what is done for:
```
unsigned char f0(unsigned char *src)
{
  unsigned char sum = 0;
  for (int y = 0; y < 8; y++)
    {
      sum += src[y];
    }
  return sum;
}
```

But GCC does not as GCC keeps sum in unsigned and the reduction is done in
`unsigned int`.
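
The narrower reduction is valid because unsigned arithmetic is modular: truncating the `unsigned` sum gives the same result as accumulating in `unsigned char`. A small compile-time check of that equivalence, as a sketch assuming 8-bit `unsigned char`; the function names and the `sample` data are made up for illustration:

```cpp
// Widened accumulation, truncated on return (the shape of f above).
constexpr unsigned char sum_wide_then_truncate (const unsigned char *src)
{
  unsigned sum = 0;
  for (int y = 0; y < 8; y++)
    sum += src[y];
  return static_cast<unsigned char> (sum);
}

// Accumulation directly in unsigned char (the shape of f0 above).
constexpr unsigned char sum_narrow (const unsigned char *src)
{
  unsigned char sum = 0;
  for (int y = 0; y < 8; y++)
    sum = static_cast<unsigned char> (sum + src[y]);
  return sum;
}

// Illustrative input whose true sum (636) does not fit in 8 bits.
constexpr unsigned char sample[8] = { 200, 100, 50, 25, 255, 1, 2, 3 };
```

Both functions return 636 mod 256 = 124 for `sample`, so a vectorizer may perform the whole reduction in the narrow type.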

Note LLVM is able to vectorize this decently.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
