[Bug middle-end/115863] [15 Regression] zlib-1.3.1 miscompilation since r15-1936-g80e446e829d818

2024-07-17 Thread lin1.hu at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863

--- Comment #18 from Hu Lin  ---
(In reply to Uroš Bizjak from comment #17)
> (In reply to Hongtao Liu from comment #16)
> > > Unfortunately, x86 has no vector mode .SAT_TRUNC instruction.
> > No, AVX512 supports both signed and unsigned saturation
> Indeed.
> 
> BTW: PACKUSmn (despite the name) is not what we are looking for.

Indeed.
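
A minimal scalar sketch of the distinction, assuming the operation wanted is an unsigned 16-to-8-bit saturating truncation. The second function emulates a single PACKUSWB lane, which reads its source word as signed before clamping to the unsigned byte range. Both function names are hypothetical and only serve the illustration.

```cpp
#include <cstdint>

// Unsigned saturating truncation of one 16-bit lane to 8 bits:
// every input above 255 clamps to 255.
std::uint8_t sat_trunc_u16_to_u8(std::uint16_t x) {
    return x > 0xFF ? std::uint8_t{0xFF} : static_cast<std::uint8_t>(x);
}

// Scalar emulation of one PACKUSWB lane (hypothetical helper): the source
// word is interpreted as *signed*, then clamped to [0, 255].
std::uint8_t packus_lane(std::uint16_t bits) {
    const std::int16_t s = static_cast<std::int16_t>(bits);
    if (s < 0) return 0;
    if (s > 0xFF) return 0xFF;
    return static_cast<std::uint8_t>(s);
}
```

The two agree for inputs up to 0x7FFF and diverge above that: an input of 0x8000 saturates to 255 in the first function but becomes 0 in the second.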

[Bug target/115950] Missed SVE fold to INCP

2024-07-17 Thread ktkachov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115950

--- Comment #3 from ktkachov at gcc dot gnu.org ---
(In reply to Andrew Pinski from comment #2)
> Hmm actually there are patterns there but they are not matching. Something
> seems to be going wrong with define_insn_and_rewrite ...

The MD pattern requires a (const_int SVE_KNOWN_PTRUE) in one of its operands
but the attempted match has (const_int 0) i.e. SVE_MAYBE_NOT_PTRUE which blocks
matching.

[Bug tree-optimization/115766] [12/13/14 Regression] wrong code at optimization levels -O2, -O3

2024-07-17 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115766

--- Comment #8 from Sam James  ---
Are you sure the reduced one is accurate? For me, it behaves the same with
-O0..-O3 for GCC. For Clang, it has the same behaviour as GCC with -O0, and
"passes" with > -O0.

[Bug tree-optimization/115868] [14 Regression] ICE: in exact_div, at poly-int.h:2156

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115868

--- Comment #3 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:c58bede01c06c84f0b36881fafd1e5d6456a38f4

commit r14-10443-gc58bede01c06c84f0b36881fafd1e5d6456a38f4
Author: Richard Biener 
Date:   Thu Jul 11 09:56:56 2024 +0200

tree-optimization/115868 - ICE with .MASK_CALL in simdclone

The following adjusts mask recording which didn't take into account
that we can merge call arguments from two vectors like

  _50 = {vect_d_1.253_41, vect_d_1.254_43};
  _51 = VIEW_CONVERT_EXPR(mask__19.257_49);
  _52 = (unsigned int) _51;
  _53 = _Z3bazd.simdclone.7 (_50, _52);
  _54 = BIT_FIELD_REF <_53, 256, 0>;
  _55 = BIT_FIELD_REF <_53, 256, 256>;

The testcase g++.dg/vect/pr68762-2.cc exercises this on x86_64 with
partial vector usage enabled and AVX512 support.

PR tree-optimization/115868
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Correctly
compute the number of mask copies required for
vect_record_loop_mask.

(cherry picked from commit abf3964711f05b6858d9775c3595ec2b45483e14)

[Bug tree-optimization/105769] [11/12/13/14/15 Regression] program segmentation fault with -ftree-vectorize and nested lambdas

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105769

Richard Biener  changed:

   What|Removed |Added

  Known to fail||12.4.0, 13.3.0, 14.1.0
  Known to work||10.5.0

--- Comment #17 from Richard Biener  ---
I'm not actually seeing the problematic use of the hoisted address - the
address value itself is stored and the trick of looking at SSA uses defs to
pick up indirect address uses later doesn't work here as the only use is in
the vector CTOR:

  _15 = (long unsigned int) &bias;
  _10 = (long unsigned int) &cov_jn;
  _12 = {_10, _15};
...
  bias ={v} {CLOBBER(bob)}; 

but _12 is only used in

  MEM  [(void *)&D.5715 + 32B] = _12;

and then maybe indirectly

  __ct_comp  (_14, &D.5715.__est);

I can fix the miscompile with the following patch - we're treating all
CLOBBER kinds as invalidating earlier mentions.  I'm not sure that's
really necessary and it's definitely harmful when there are hoisted
address mentions.  It also explains that -fstack-reuse=none doesn't
help as the gimplifier only inserts CLOBBER_STORAGE_END clobbers.
I'm also allowing CLOBBER_OBJECT_END here.

I do not remember whether we discussed doing sth like this instead of the
special SSA use handling we added?

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index eef565eddb5..92968075b04 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -632,6 +632,13 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool
for_conflict)
 that are COMPONENT_REFs.  */
  if (!VAR_P (lhs))
continue;
+ tree cl = gimple_assign_rhs1 (stmt);
+ /* When the clobber is possibly a object/storage start do not
+ignore previous mentions at this point.  Those might
+include hoisted address uses.  */
+ if (CLOBBER_KIND (cl) != CLOBBER_STORAGE_END
+ && CLOBBER_KIND (cl) != CLOBBER_OBJECT_END)
+   continue;
  if (DECL_RTL_IF_SET (lhs) == pc_rtx
  && (v = decl_to_stack_part->get (lhs)))
bitmap_clear_bit (work, *v);

[Bug tree-optimization/115382] Wrong code with in-order conditional reduction and masked loops

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382

--- Comment #8 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:bf64404280a90715d1228edef0d5756e81635a64

commit r14-10444-gbf64404280a90715d1228edef0d5756e81635a64
Author: Robin Dapp 
Date:   Fri Jun 7 14:36:41 2024 +0200

vect: Merge loop mask and cond_op mask in fold-left reduction [PR115382].

Currently we discard the cond-op mask when the loop is fully masked
which causes wrong code in
gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
when compiled with
-O3 -march=cascadelake --param vect-partial-vector-usage=2.

This patch ANDs both masks.

gcc/ChangeLog:

PR tree-optimization/115382

* tree-vect-loop.cc (vectorize_fold_left_reduction): Use
prepare_vec_mask.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors):
Remove static of prepare_vec_mask.
* tree-vectorizer.h (prepare_vec_mask): Export.

(cherry picked from commit 2b438a0d2aa80f051a09b245a58f643540d4004b)
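
A scalar sketch of why the two masks have to be combined. It is an emulation under stated assumptions, not the vectorizer's code: the vector length VF = 4 and all function names are hypothetical. The loop mask disables lanes past the end of the array, the condition mask disables lanes whose condition is false, and the in-order reduction may only accumulate lanes that are active in both.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t VF = 4;  // assumed vector length for this sketch
using Mask = std::array<bool, VF>;
using Vec = std::array<double, VF>;

// In-order (fold-left) masked add: lanes are accumulated strictly left to
// right, and a lane contributes only if its mask bit is set.
double fold_left_masked_add(double acc, const Vec& v, const Mask& m) {
    for (std::size_t i = 0; i < VF; ++i)
        if (m[i]) acc += v[i];
    return acc;
}

// Lane-wise AND of the loop mask and the condition mask.
Mask and_masks(const Mask& loop_mask, const Mask& cond_mask) {
    Mask r{};
    for (std::size_t i = 0; i < VF; ++i) r[i] = loop_mask[i] && cond_mask[i];
    return r;
}

// Reference semantics: "if (c[i]) acc += a[i]" over n elements.
double scalar_cond_sum(const double* a, const bool* c, std::size_t n,
                       double acc) {
    for (std::size_t i = 0; i < n; ++i)
        if (c[i]) acc += a[i];
    return acc;
}

// Fully masked vector-style loop. With merge_cond_mask == false the
// condition mask is dropped, which mimics the behaviour described as wrong.
double masked_cond_sum(const double* a, const bool* c, std::size_t n,
                       double acc, bool merge_cond_mask) {
    for (std::size_t base = 0; base < n; base += VF) {
        Vec v{};
        Mask loop{}, cond{};
        for (std::size_t i = 0; i < VF; ++i) {
            loop[i] = base + i < n;
            if (loop[i]) {
                v[i] = a[base + i];
                cond[i] = c[base + i];
            }
        }
        acc = fold_left_masked_add(
            acc, v, merge_cond_mask ? and_masks(loop, cond) : loop);
    }
    return acc;
}
```

For six elements 1..6 with the condition true only at indices 0, 2 and 5, the merged-mask variant matches the scalar loop, while the variant that drops the condition mask sums every in-bounds element.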

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-07-17 Thread mkretz at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908

--- Comment #10 from Matthias Kretz (Vir)  ---
(In reply to Richard Biener from comment #9)
> One issue with
> 
> V load3(const unsigned long* ptr)
> {
>   V ret = {};
>   __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
> 
> is that we cannot load a vector worth of data from ptr because that might
> trap

Unless the target has a masked load instruction (e.g. AVX512) or ptr is known
to be aligned to at least 16 Bytes (in which case we know there cannot be a
page boundary at ptr + 24 Bytes). No? In this specific example, ptr is pointing
to a 32-Byte vector object.

The library can do this and it makes a difference:

if (__builtin_object_size(ptr, 0) >= 4 * sizeof(T))
  __builtin_memcpy(&ret, ptr, 4 * sizeof(T));
else
  __builtin_memcpy(&ret, ptr, 3 * sizeof(T));

[Bug c++/115964] New: GCC accepts invalid program with explicit object member function overloads

2024-07-17 Thread jlame646 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115964

Bug ID: 115964
   Summary: GCC accepts invalid program with explicit object
member function overloads
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jlame646 at gmail dot com
  Target Milestone: ---

The following invalid program is accepted by gcc. Demo:
https://godbolt.org/z/dG4qEKzb5

```
struct C {
   void j(this const C);
   void j() const ;  //gcc ok, clang: nope, edg:ok 

   void f(this C );   
   void f(C);   //gcc ok, clang ok, edg:nope
};

```
Both these overloads are ill-formed as explained here:
https://stackoverflow.com/questions/78758215/explicit-member-function-discrepancies-between-different-compilers

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread uecker at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #67 from uecker at gcc dot gnu.org ---
(In reply to Andrew Church from comment #66)
> (In reply to Andrew Church from comment #65)
> > As one of the advocates for this behavior, it stems (at least in my case)
> > from pre-C23 code in which [[attribute]] syntax was not available.  If
> > [[nodiscard]] suppresses the warning, I'd accept that as a solution.
> 
> Premature reply (apologies) - I somehow gave myself the impression that
> [[nodiscard]] could be put at the place of use.  Since it's something the
> library writer has to do, I think this is still not enough from the library
> user's point of view.

I agree that this is an argument for having a compiler switch.

But also the library could switch to "discard" or add a condition that
lets the user of the library choose it.

As a last resort, on the compiler side one can already use a pragma to turn off
the warning at a specific point.

So I am a bit unsure about whether the flag is worth having

[Bug tree-optimization/105769] [11/12/13/14/15 Regression] program segmentation fault with -ftree-vectorize and nested lambdas

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105769

--- Comment #18 from Richard Biener  ---
(In reply to Richard Biener from comment #17)
> I'm not actually seeing the problematic use of the hoisted address - the
> address value itself is stored and the trick of looking at SSA uses defs to
> pick up
> indirect address uses later doesn't work here as the only use is in the
> vector CTOR:
> 
>   _15 = (long unsigned int) &bias;
>   _10 = (long unsigned int) &cov_jn;
>   _12 = {_10, _15};
> ...
>   bias ={v} {CLOBBER(bob)}; 
> 
> but _12 is only used in
> 
>   MEM  [(void *)&D.5715 + 32B] = _12;
> 
> and then maybe indirectly
> 
>   __ct_comp  (_14, &D.5715.__est);
> 
> I can fix the miscompile with the following patch - we're treating all
> CLOBBER kinds as invalidating earlier mentions.  I'm not sure that's
> really necessary and it's definitely harmful when there are hoisted
> address mentions.  It also explains that -fstack-reuse=none doesn't
> help as the gimplifier only inserts CLOBBER_STORAGE_END clobbers.
> I'm also allowing CLOBBER_OBJECT_END here.
> 
> I do not remember whether we discussed doing sth like this instead of the
> special SSA use handling we added?
> 
> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> index eef565eddb5..92968075b04 100644
> --- a/gcc/cfgexpand.cc
> +++ b/gcc/cfgexpand.cc
> @@ -632,6 +632,13 @@ add_scope_conflicts_1 (basic_block bb, bitmap work,
> bool for_conflict)
>  that are COMPONENT_REFs.  */
>   if (!VAR_P (lhs))
> continue;
> + tree cl = gimple_assign_rhs1 (stmt);
> + /* When the clobber is possibly a object/storage start do not
> +ignore previous mentions at this point.  Those might
> +include hoisted address uses.  */
> + if (CLOBBER_KIND (cl) != CLOBBER_STORAGE_END
> + && CLOBBER_KIND (cl) != CLOBBER_OBJECT_END)
> +   continue;
>   if (DECL_RTL_IF_SET (lhs) == pc_rtx
>   && (v = decl_to_stack_part->get (lhs)))
> bitmap_clear_bit (work, *v);

It breaks g++.dg/opt/pr86214-1.C and gcc.target/i386/stack-check-17.c

[Bug tree-optimization/104515] [11/12/13/14/15 Regression] trivially-destructible destructors interfere with loop optimization - maybe related to lifetime-dse.

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104515

--- Comment #9 from Richard Biener  ---
I think not re-emitting the clobber on the exit might be OK semantically - it's
at most a missed optimization for a store.  I think it's also OK for the
stack slot sharing logic as any object becoming live after a clobber
would need to have a mention in between the clobber and the exit.  I'll note
that if that's not OK then duplicating the clobber wouldn't either.
As we don't know whether the object we store to ends its lifetime (it's only
a may-alias) we can't do both - extend its lifetime and preserve the
original one.

We could maybe move the clobber (duplicating it to each edge but removing
it from the loop body).

I'm testing a patch.

[Bug c++/96717] -flifetime-dse=2 breaks webkit-gtk-2.28.4

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96717

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #10 from Richard Biener  ---
This was fixed a long time ago.

[Bug tree-optimization/96881] [11 Regression] Clobbers on NULL vs. DCE since r8-1519

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96881
Bug 96881 depends on bug 96717, which changed state.

Bug 96717 Summary: -flifetime-dse=2 breaks webkit-gtk-2.28.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96717

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/96722] [8/9 Regression] Clobbers on NULL since r8-1519

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96722
Bug 96722 depends on bug 96717, which changed state.

Bug 96717 Summary: -flifetime-dse=2 breaks webkit-gtk-2.28.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96717

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug c++/96721] [11 Regression] pseudo-destructor calls on pointers since r11-2238

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96721
Bug 96721 depends on bug 96717, which changed state.

Bug 96717 Summary: -flifetime-dse=2 breaks webkit-gtk-2.28.4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96717

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/104515] [11/12/13/14/15 Regression] trivially-destructible destructors interfere with loop optimization - maybe related to lifetime-dse.

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104515

--- Comment #10 from Richard Biener  ---
Testcase from PR96717, also fixed with my patch in testing.

#include <vector>

void pop_many(std::vector& v, unsigned n) {
for (unsigned i = 0; i < n; ++i) {
v.pop_back();
}
}

[Bug tree-optimization/115868] [14 Regression] ICE: in exact_div, at poly-int.h:2156

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115868

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Richard Biener  ---
Fixed now.

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 115868, which changed state.

Bug 115868 Summary: [14 Regression] ICE: in exact_div, at poly-int.h:2156
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115868

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/115382] Wrong code with in-order conditional reduction and masked loops

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
   Target Milestone|--- |14.2
 Resolution|--- |FIXED

--- Comment #9 from Richard Biener  ---
Fixed now.

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 115382, which changed state.

Bug 115382 Summary: Wrong code with in-order conditional reduction and masked loops
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115382

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug testsuite/110445] [14 Regression] FAIL: gcc.dg/vect/slp-46.c with AVX2

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110445

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|14.2|15.0

--- Comment #5 from Richard Biener  ---
So fixed.  I'm not planning to backport.

[Bug other/115958] [15 regression] varasm.cc:8546:27: error: comparison of integer expressions of different signedness since r15-2039-g9964edfb4abdec

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115958

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |15.0

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread achurch+gcc at achurch dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #68 from Andrew Church  ---
(In reply to uecker from comment #67)
> But also the library could switch to "discard" or add a condition that
> lets the user of the library choose it.

The issue here is that the library user has no control over what the library
author chooses to do.  If the library author does not make that change, the
user currently has no recourse (other than the pragma workaround you suggest).

> As a last resort, on the compiler side one can already use a pragma to turn
> off the
> warning at a specific point.

While true, this also introduces both a compiler-specific hack and a lot of
verboseness around what ought to be a simple declaration of "I wish to ignore
this return value", and I feel like most code authors who encounter this
problem are more likely to add -Wno-unused-result to their compiler flags (thus
losing the check everywhere) than do a whole
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-result"
system(foo);
#pragma GCC diagnostic pop
just to make that one instance go away.

I do agree that "(void)" is very idiomatic, and something like a [[discard]]
statement attribute (which would silence warnings for both __attribute__((wur))
and [[nodiscard]]) would make the intent clearer.  Perhaps something to suggest
for a future version of the C standard?
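
For comparison, a small sketch of two ways to drop such a result today. The pragma form is the one described above. The `discard` helper is a different technique, not something proposed in this thread: passing the value to a function counts as a use, so the warning is not triggered. `must_check`, `discard` and `run_both` are hypothetical names.

```cpp
static int calls = 0;

// Hypothetical function whose author wants the result checked.
__attribute__((warn_unused_result))
int must_check(int x) {
    ++calls;
    return x * 2;
}

// Hypothetical helper: the argument is consumed by the call, so the
// result is no longer "unused" from the compiler's point of view.
template <class T>
void discard(T&&) {}

int run_both() {
    // Pragma form: the warning is disabled for this one statement only.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-result"
    must_check(1);
#pragma GCC diagnostic pop

    // Helper form: no pragma needed, intent is visible at the call site.
    discard(must_check(2));
    return calls;
}
```

Neither form addresses the request itself, which is for the plain `(void)` cast or a statement-level attribute to express the same intent.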

[Bug tree-optimization/114908] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-07-17 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908

--- Comment #11 from rguenther at suse dot de  ---
On Wed, 17 Jul 2024, mkretz at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114908
> 
> --- Comment #10 from Matthias Kretz (Vir)  ---
> (In reply to Richard Biener from comment #9)
> > One issue with
> > 
> > V load3(const unsigned long* ptr)
> > {
> >   V ret = {};
> >   __builtin_memcpy(&ret, ptr, 3 * sizeof(unsigned long));
> > 
> > is that we cannot load a vector worth of data from ptr because that might
> > trap
> 
> Unless the target has a masked load instruction (e.g. AVX512) or ptr is known
> to be aligned to at least 16 Bytes (in which case we know there cannot be a
> page boundary at ptr + 24 Bytes). No? In this specific example, ptr is 
> pointing
> to a 32-Byte vector object.

Sure but here we have no alignment info available (at most 8 byte 
alignment from the pointer type).  I don't think introducing a .MASK_LOAD
for the purpose of eliding a memcpy is a good thing to do (locally,
just taking into account the memcpy on its own).

> The library can do this and it makes a difference:
> 
> if (__builtin_object_size(ptr, 0) >= 4 * sizeof(T))
>   __builtin_memcpy(&ret, ptr, 4 * sizeof(T));
> else
>   __builtin_memcpy(&ret, ptr, 3 * sizeof(T));

I see, but that's then of course after inlining.

In my former C++ times I've used template metaprogramming to implement
this as an unrolled element-by-element copy (emitting a loop would
be possible as well, of course).
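
A sketch of what such an unrolled element-by-element copy could look like in C++17, using an index sequence and a fold expression. This is one possible shape, not the code referred to above. `V4` is a plain struct standing in for the vector type `V`, and `copy_unrolled` is a hypothetical name.

```cpp
#include <cstddef>
#include <utility>

// Copies exactly sizeof...(I) elements, one assignment per index. The fold
// expression is expanded at compile time, so the source contains no loop.
template <class T, std::size_t... I>
void copy_unrolled_impl(T* dst, const T* src, std::index_sequence<I...>) {
    ((dst[I] = src[I]), ...);
}

template <std::size_t N, class T>
void copy_unrolled(T* dst, const T* src) {
    copy_unrolled_impl(dst, src, std::make_index_sequence<N>{});
}

// Stand-in for the 4-lane vector type in the example (assumption).
struct V4 {
    unsigned long lane[4];
};

V4 load3(const unsigned long* ptr) {
    V4 ret = {};                      // lane 3 stays zero
    copy_unrolled<3>(ret.lane, ptr);  // reads ptr[0..2] only
    return ret;
}
```

Because only three elements are ever read, the source never touches memory past ptr + 24 bytes, whatever the optimizer later makes of the three scalar loads.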

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #69 from Jonathan Wakely  ---
(In reply to Andrew Church from comment #68)
> I do agree that "(void)" is very idiomatic, and something like a [[discard]]
> statement attribute (which would silence warnings for both
> __attribute__((wur)) and [[nodiscard]]) would make the intent clearer. 
> Perhaps something to suggest for a future version of the C standard?

Maybe you want:

 [[maybe_unused]] auto _ = foo();

This is already in C23.

[Bug c++/115960] gcc throws an error when I use Optional in c++17.

2024-07-17 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115960

--- Comment #4 from Jonathan Wakely  ---
> The problem seems to be that, despite passing "-std=c++17", it doesn't use 
> c++17
> header files for the Optional identifier.

Why should it? The name "Optional" is not part of any C++ standard.
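
Assuming the report meant the standard library facility, the C++17 spelling is the lowercase `std::optional` from `<optional>`. A minimal sketch (`parse_digit` is a hypothetical function):

```cpp
#include <optional>

// std::optional is the C++17 type; a capitalized "Optional" would have to
// come from user code or a third-party library.
std::optional<int> parse_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    return std::nullopt;
}
```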

[Bug c++/115965] New: Stack smashing depending on order of declaration

2024-07-17 Thread nathan.teodosio at canonical dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

Bug ID: 115965
   Summary: Stack smashing depending on order of declaration
   Product: gcc
   Version: 14.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: nathan.teodosio at canonical dot com
  Target Milestone: ---

If I execute the binary I get

--->
% ./e
*** stack smashing detected ***: terminated
Aborted (core dumped)
<---

However, no error is raised if I swap lines 17 (where a and b are declared) and
18 (where c is declared), or if I move the definition of either a or b to after c.

Valgrind says:

--->
% valgrind -s --track-origins=yes --leak-check=full --show-leak-kinds=all ./e
==173999== Memcheck, a memory error detector
==173999== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==173999== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==173999== Command: ./e
==173999==
==173999== Conditional jump or move depends on uninitialised value(s)
==173999==at 0x1092FA: main (in /tmp/e)
==173999==  Uninitialised value was created by a stack allocation
==173999==at 0x109244: main (in /tmp/e)
==173999==
*** stack smashing detected ***: terminated
==173999==
==173999== Process terminating with default action of signal 6 (SIGABRT):
dumping core
==173999==at 0x4928B1C: __pthread_kill_implementation (pthread_kill.c:44)
==173999==by 0x4928B1C: __pthread_kill_internal (pthread_kill.c:78)
==173999==by 0x4928B1C: pthread_kill@@GLIBC_2.34 (pthread_kill.c:89)
==173999==by 0x48CF26D: raise (raise.c:26)
==173999==by 0x48B28FE: abort (abort.c:79)
==173999==by 0x48B37B5: __libc_message_impl.cold (libc_fatal.c:132)
==173999==by 0x49C0C18: __fortify_fail (fortify_fail.c:24)
==173999==by 0x49C1EA3: __stack_chk_fail (stack_chk_fail.c:24)
==173999==by 0x109300: main (in /tmp/e)
==173999==
==173999== HEAP SUMMARY:
==173999== in use at exit: 0 bytes in 0 blocks
==173999==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==173999==
==173999== All heap blocks were freed -- no leaks are possible
==173999==
==173999== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==173999==
==173999== 1 errors in context 1 of 1:
==173999== Conditional jump or move depends on uninitialised value(s)
==173999==at 0x1092FA: main (in /tmp/e)
==173999==  Uninitialised value was created by a stack allocation
==173999==at 0x109244: main (in /tmp/e)
==173999==
==173999== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Aborted (core dumped)
<---

[Bug c++/115965] Stack smashing depending on order of declaration

2024-07-17 Thread nathan.teodosio at canonical dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

--- Comment #1 from Nathan Teodosio  ---
Created attachment 58690
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58690&action=edit
Preprocessed file (compressed with Gzip)

[Bug c++/115965] Stack smashing depending on order of declaration

2024-07-17 Thread nathan.teodosio at canonical dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

--- Comment #2 from Nathan Teodosio  ---
Created attachment 58691
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58691&action=edit
Source file

[Bug tree-optimization/115959] [15 Regression] rv64gcv ICE: segfault during GIMPLE pass: vect

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115959

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
I will have a look,

(gdb) p debug (cond_node)
t.c:3:6: note: node (constant) 0x5a76f60 (max_nunits=1, refcnt=1) vector(4)
unsigned char
t.c:3:6: note:  { 0 }

not sure how this happened, it seems STMT_VINFO_REDUC_IDX got "off".

t.c:3:6: note: node 0x5a76e30 (max_nunits=4, refcnt=2) vector(4) int
t.c:3:6: note: op template: patt_31 = _4 != 0 ? t_14 : 0;
t.c:3:6: note:  [l] stmt 0 patt_31 = _4 != 0 ? t_14 : 0;
t.c:3:6: note:  children 0x5a76ec8 0x5a76f60 0x5a76ff8 0x5a77090

The stmts reduc_idx is 1 which is OK.  Ah, but we have four children
for this frankenstein COND_EXPR which still has a GENERIC first operand.
Partial transitions haunt us here ...

[Bug target/115954] Alignment of _Atomic structs incompatible between GCC and LLVM

2024-07-17 Thread wilco at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115954

--- Comment #12 from Wilco  ---
This came out of the AArch64 Atomic ABI design work:
https://github.com/ARM-software/abi-aa/pull/256

[Bug middle-end/115527] incorrect folding of __builtin_clear_padding()

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115527

--- Comment #11 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:8b5919bae11754f4b65a17e63663d3143f9615ac

commit r15-2090-g8b5919bae11754f4b65a17e63663d3143f9615ac
Author: Jakub Jelinek 
Date:   Wed Jul 17 11:38:33 2024 +0200

gimple-fold: Fix up __builtin_clear_padding lowering [PR115527]

The builtin-clear-padding-6.c testcase fails as clear_padding_type
doesn't correctly recompute the buf->size and buf->off members after
expanding clearing of an array using a runtime loop.
buf->size should be in that case the offset after which it should continue
with next members or padding before them modulo UNITS_PER_WORD and
buf->off that offset minus buf->size.  That is what the code was doing,
but with off being the start of the loop cleared array, not its end.
So, the last hunk in gimple-fold.cc fixes that.
When adding the testcase, I've noticed that the
c-c++-common/torture/builtin-clear-padding-* tests, although clearly
written as runtime tests to test the builtins at runtime, didn't have
{ dg-do run } directive and were just compile tests because of that.
When adding that to the tests, builtin-clear-padding-1.c was already
failing without that clear_padding_type hunk too, but
builtin-clear-padding-5.c was still failing even after the change.
That is due to a bug in clear_padding_flush which the patch fixes as
well - when clear_padding_flush is called with full=true (that happens
at the end of the whole __builtin_clear_padding or on those array
padding clears done by a runtime loop), it wants to flush all the pending
padding clearings rather than just some.  If it is at the end of the whole
object, it decreases wordsize when needed to make sure the code never writes
including RMW cycles to something outside of the object:
  if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize)
  > (unsigned HOST_WIDE_INT) buf->sz)
{
  gcc_assert (wordsize > 1);
  wordsize /= 2;
  i -= wordsize;
  continue;
}
but if it is full==true flush in the middle, this doesn't happen, but we
still process just the buffer bytes before the current end.  If that end
is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test
the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18,
nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones
might be true, so in some spots we just didn't emit any clearing in that
last chunk.

2024-07-17  Jakub Jelinek  

PR middle-end/115527
* gimple-fold.cc (clear_padding_flush): Introduce endsize
variable and use it instead of wordsize when comparing it against
nonzero_last.
(clear_padding_type): Increment off by sz.

* c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run
directive.
* c-c++-common/torture/builtin-clear-padding-2.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-3.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-4.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-5.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-6.c: New test.
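
A much smaller runtime check in the same spirit as the tests named above, as a sketch only. It does not reproduce the array case the commit fixes. The struct layout is an assumption (padding between `c` and `i` on common ABIs) and the function name is hypothetical. Where the compiler does not provide `__builtin_clear_padding`, the padding check is skipped.

```cpp
#include <cstddef>
#include <cstring>

struct S {
    char c;
    int i;  // on common ABIs, padding bytes sit between c and i
};

// Returns true if the fields keep their values and, where the builtin is
// available, every byte between c and i reads as zero after clearing.
bool padding_cleared_after_builtin() {
    S s;
    std::memset(&s, 0xff, sizeof s);  // poison everything, padding included
    s.c = 'x';
    s.i = 42;
#if defined(__has_builtin)
# if __has_builtin(__builtin_clear_padding)
    __builtin_clear_padding(&s);
    unsigned char bytes[sizeof s];
    std::memcpy(bytes, &s, sizeof s);
    for (std::size_t k = sizeof(char); k < offsetof(S, i); ++k)
        if (bytes[k] != 0) return false;
# endif
#endif
    return s.c == 'x' && s.i == 42;
}
```

Like the torture tests after this change, a check of this kind is only meaningful when it is actually executed rather than just compiled.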

[Bug other/115958] [15 regression] varasm.cc:8546:27: error: comparison of integer expressions of different signedness since r15-2039-g9964edfb4abdec

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115958

--- Comment #1 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:74bcef4cf16b35fe64767c1e8e529bdd229841a3

commit r15-2091-g74bcef4cf16b35fe64767c1e8e529bdd229841a3
Author: Jakub Jelinek 
Date:   Wed Jul 17 11:40:03 2024 +0200

varasm: Fix bootstrap after the .base64 changes [PR115958]

Apparently there is a -Wsign-compare warning if ptrdiff_t has precision of
int, then (t - s + 1 + 2) / 3 * 4 has int type while cnt unsigned int.
This doesn't warn if ptrdiff_t has larger precision, say on x86_64
it is 64-bit and so (t - s + 1 + 2) / 3 * 4 has long type and cnt unsigned
int.  And it doesn't warn when using older binutils (in my tests I've
used new binutils on x86_64 and old binutils on i686).
Anyway, earlier condition guarantees that t - s is at most 256-ish and
t >= s by construction, so we can just cast it to (unsigned) to avoid
the warning.

2024-07-17  Jakub Jelinek  

PR other/115958
* varasm.cc (default_elf_asm_output_ascii): Cast t - s to unsigned
to avoid -Wsign-compare warnings.
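
A sketch of the pattern described in the commit message, not the actual varasm.cc code. It assumes that `t - s + 1` is the number of input bytes, so the expression is the usual base64 length formula. `base64_len`, `fits` and the direction of the comparison are hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// Number of base64 characters for the inclusive byte range [s, t].
unsigned base64_len(const char* s, const char* t) {
    // t - s is a ptrdiff_t, i.e. signed; without the cast the whole
    // expression is signed and comparing it with an unsigned count can
    // trigger -Wsign-compare when ptrdiff_t has the precision of int.
    static_assert(std::is_signed<decltype(t - s)>::value,
                  "pointer difference is signed");
    // Safe only because the caller guarantees t >= s and a small range.
    return ((unsigned) (t - s) + 1 + 2) / 3 * 4;
}

// Hypothetical comparison against an unsigned count.
bool fits(const char* s, const char* t, unsigned cnt) {
    return base64_len(s, t) <= cnt;
}
```

On targets where ptrdiff_t is wider than int, the uncast expression converts to a wider signed type that can represent every unsigned int value, which is why the warning did not show up on x86_64.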

[Bug tree-optimization/114966] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-07-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966

--- Comment #5 from Hongtao Liu  ---
I saw pass_esra (early SRA) optimize the BIT_FIELD_REFs of the big memory
object into loads from smaller memory pieces


Created a replacement for D.161366 offset: 0, size: 64: SR.20D.170101
Created a replacement for D.161366 offset: 64, size: 64: SR.21D.170102
Created a replacement for D.161366 offset: 128, size: 64: SR.22D.170103
Created a replacement for D.161547 offset: 0, size: 256: SR.23D.170104


  _8 = BIT_FIELD_REF ;
_9 = BIT_FIELD_REF ;
_10 = BIT_FIELD_REF ;
  _11 = {0, _8, _9, _10};

to 

  SR.20_3 = MEM  [(struct simd *)&data];
  SR.21_13 = MEM  [(struct simd *)&data + 8B];
  SR.22_14 = MEM  [(struct simd *)&data + 16B];
  _7 = SR.20_3;
  _8 = SR.21_13;
  _9 = SR.22_14;
  _10 = {0, _7, _8, _9};


So I guess that for the latter, GCC somehow can't be sure the whole 256-bit
memory is valid and fails to optimize it with vec_perm_expr?

[Bug tree-optimization/114966] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-07-17 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966

--- Comment #6 from rguenther at suse dot de  ---
On Wed, 17 Jul 2024, liuhongt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966
> 
> --- Comment #5 from Hongtao Liu  ---
> I saw pass_eras optimize BIT_FIELD_REF of big memory into load from small
> memory
> 
> 
> Created a replacement for D.161366 offset: 0, size: 64: SR.20D.170101
> Created a replacement for D.161366 offset: 64, size: 64: SR.21D.170102
> Created a replacement for D.161366 offset: 128, size: 64: SR.22D.170103
> Created a replacement for D.161547 offset: 0, size: 256: SR.23D.170104
> 
> 
>   _8 = BIT_FIELD_REF  *)&D.159286].D.158970._M_data, 64, 0>;
> _9 = BIT_FIELD_REF  *)&D.159286].D.158970._M_data, 64, 64>;
> _10 = BIT_FIELD_REF  *)&D.159286].D.158970._M_data, 64, 128>;
>   _11 = {0, _8, _9, _10};
> 
> to 
> 
>   SR.20_3 = MEM  [(struct simd *)&data];
>   SR.21_13 = MEM  [(struct simd *)&data + 8B];
>   SR.22_14 = MEM  [(struct simd *)&data + 16B];
>   _7 = SR.20_3;
>   _8 = SR.21_13;
>   _9 = SR.22_14;
>   _10 = {0, _7, _8, _9};
> 
> 
> So I guess for the later GCC somehow can't be sure the whole 256-bit memory is
> valid and fail to optimize it with vec_perm_expr?

I think the above would be a candidate for SLP vectorization of the
vector CTOR.  A specific example we don't handle right now of course.
Or alternatively by instruction combination in 
simplify_vector_constructor.

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread achurch+gcc at achurch dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #70 from Andrew Church  ---
(In reply to Jonathan Wakely from comment #69)
> Maybe you want:
> 
>  [[maybe_unused]] auto _ = foo();

If I could apply that attribute to the value itself, i.e.:

[[maybe_unused]] foo();

that would do what I want.  Since it only applies to a declaration, we're still
left with idiosyncratic code ("auto _ =" instead of "(void)") which ideally
should not be needed.

[Bug middle-end/115887] ICE: in gsi_insert_on_edge_immediate, at gimple-iterator.cc:849 with -O -fnon-call-exceptions -finstrument-functions and _BitInt()

2024-07-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115887

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #2 from Jakub Jelinek  ---
Created attachment 58692
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58692&action=edit
gcc15-pr115887.patch

Untested fix.

[Bug bootstrap/115951] [15 Regression] pgo+lto enabled bootstrap fails building gnat (ICE in fold_stmt, at gimple-range-fold.cc:701)

2024-07-17 Thread doko at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115951

--- Comment #5 from Matthias Klose  ---
a new build survived on x86_64-linux-gnu. will wait on the results on other
architectures.

[Bug target/115966] New: [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64

2024-07-17 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966

Bug ID: 115966
   Summary: [15 Regression] Miscompilation of 403.gcc with -Ofast
-march=native on x86_64
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Keywords: needs-bisection, wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pheeck at gcc dot gnu.org
Blocks: 26163
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-linux

Compiling the 403.gcc CPU SPEC 2006 benchmark with -g -Ofast -march=native
-fpermissive (403.gcc no longer compiles without -fpermissive) on an x86_64
machine results in a miscompilation. Here is what the benchmark reports:

*** Miscompare of scilab.s; for details see
   
/home/gcc/buildworker/source/cpu2006/benchspec/CPU2006/403.gcc/run/run_peak_ref_amd64-m64-mine./scilab.s.mis
157474: .long   1764174565
.long   1248582442
 ^
157475: .long   1072684140
.long   1072430610
^
157476: .long   103477976
.long   3882853149
^
157477: .long   1072638526
.long   1072384995
^
157478: .long   2763002310
.long   3786780864
^
157479: .long   1072546777
.long   1072746777
^
157484: .long   579723672
.long   1095315795
^
157485: .long   1072020921
.long   1072274451
^
157486: .long   1075822880
.long   1120461514
 ^
157487: .long   1071776283
.long   1071400834
^

I've seen this on AMD Zen4, Zen3 and Zen2 machines and on an Intel Ice Lake
machine.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64

2024-07-17 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966

Filip Kastl  changed:

   What|Removed |Added

   Target Milestone|--- |15.0

[Bug tree-optimization/115959] [15 Regression] rv64gcv ICE: segfault during GIMPLE pass: vect

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115959

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:24689b84b8ec0c74c2b9a72ec4fb467069806bda

commit r15-2093-g24689b84b8ec0c74c2b9a72ec4fb467069806bda
Author: Richard Biener 
Date:   Wed Jul 17 11:42:13 2024 +0200

tree-optimization/115959 - ICE with SLP condition reduction

The following fixes how during reduction epilogue generation we
gather conditional compares for condition reductions, thereby
following the reduction chain via STMT_VINFO_REDUC_IDX.  The issue
is that SLP nodes for COND_EXPRs can have either three or four
children dependent on whether we have legacy GENERIC expressions
in the transitional pattern GIMPLE for the COND_EXPR condition.

PR tree-optimization/115959
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Get at the REDUC_IDX child in a safer way for COND_EXPR
nodes.

* gcc.dg/vect/pr115959.c: New testcase.

[Bug tree-optimization/115959] [15 Regression] rv64gcv ICE: segfault during GIMPLE pass: vect

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115959

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Richard Biener  ---
Fixed.

[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966

--- Comment #1 from Richard Biener  ---
I think we have a few other gcc miscompiles now (but from SPEC CPU 2017), so
this one is probably related.  Does -fno-strict-aliasing fix it?

[Bug target/80881] Implement Windows native TLS

2024-07-17 Thread tanksherman27 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80881

--- Comment #24 from Julian Waters  ---
Thanks for the patch, I've been looking through it these past few days. While
the simpler parts of it I can manage, I'm struggling terribly with
understanding the RTL shifting code in legitimize_tls_address and the RTL
templates in the machine definitions file (i386.md to be specific). Do you
happen to know how to read the RTL code in the patch? I definitely need some
help with figuring out how it works mechanically

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread pskocik at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #71 from Petr Skocik  ---
An Ignore macro that works everywhere where a (void) cast syntactically works
(i.e., even on void types for whatever reason) is easy:

#define IGN$(Val) (__extension__({  \
__auto_type IGN$ = _Generic((typeof(Val)*)0, \
void*: ((void)(Val),0), default: Val); (void)IGN$; }))

///
__attribute((warn_unused_result)) int getInt(void);
void getVoid(void);

void ign_test(void){
getInt(); //warning
getVoid(); //no warning

(void)getInt(); //traditionally with a warning
(void)getVoid(); //no warning
IGN$(getInt()); //no warning
IGN$(getVoid()); //no warning
}


https://godbolt.org/z/4qa8TcWMM

(It can easily be done without __auto_type (use typeof instead) or without
(__extension__({ })) too (use do ; while(0)).)

Would strongly prefer if the current semantics of warn_unused_result were not
broken by a late "correction".
The time for a discussion on the semantics of warn_unused_result in combo with
void cast is long gone. It's now been long established that simple (void) casts
do NOT silence warn_unused_result.
Let's not break code that expects such semantics.

A conditional compiler flag to enable void casts to silence WUR might be in
order, however, considering that clang disregards the established semantics and
a void cast does silence WUR on clang (https://godbolt.org/z/4qa8TcWMM).

[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64

2024-07-17 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966

--- Comment #2 from Filip Kastl  ---
(In reply to Richard Biener from comment #1)
> I think we have a few other gcc miscompiles now (but from SPEC CPU 2017), so
> this one is probably related.  Does -fno-strict-aliasing fix it?

Just tested it, -fno-strict-aliasing doesn't help

[Bug middle-end/115967] New: ubsan: shift exponent 64 is too large for 64-bit type HOST_WIDE_INT in ext-dce.cc on line 600 since r15-1901-g98914f9eba5f19

2024-07-17 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115967

Bug ID: 115967
   Summary: ubsan: shift exponent 64 is too large for 64-bit type
HOST_WIDE_INT in ext-dce.cc on line 600 since
r15-1901-g98914f9eba5f19
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jamborm at gcc dot gnu.org
CC: law at gcc dot gnu.org
Blocks: 63426
  Target Milestone: ---
  Host: x86_64-linux
Target: x86_64-linux

Undefined behavior sanitizer reports a failure when running Fortran
testcase gfortran.dg/ieee/large_1.f90 at -O2 and higher:

  /home/mjambor/gcc/mine/src/gcc/ext-dce.cc:600:15: runtime error: shift
exponent 64 is too large for 64-bit type 'long unsigned int'
  /home/mjambor/gcc/mine/src/gcc/ext-dce.cc:404:23: runtime error: left shift
of negative value -1
  FAIL: gfortran.dg/ieee/large_1.f90   -O2  (test for excess errors)

The failure is present since the introduction of the source file
ext-dce.cc with commit r15-1901-g98914f9eba5f19 (Jeff Law:
[to-be-committed][RISC-V][V3] DCE analysis for extension elimination)

One way to reproduce the issue is to bootstrap GCC with Fortran
enabled and with --with-build-config=bootstrap-ubsan and then run the
test case as usual.

It is however much easier to (on an x86_64-linux at least) simply
apply the following patch and then run
  make -k check-gfortran RUNTESTFLAGS="ieee.exp=large_1.f90"

--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -597,6 +597,7 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap live_tmp)
  bit = subreg_lsb (y).to_constant ();
  if (dst_mask)
{
+ gcc_assert (bit < 64);
  dst_mask <<= bit;
  if (!dst_mask)
dst_mask = -0x1ULL;
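
The shift flagged above is undefined because the shift count equals the width
of the 64-bit type. A minimal sketch of one way to guard such a shift follows.
It is not the fix that went into GCC: the names shl_or_zero and shift_dst_mask
are hypothetical, and treating an over-wide shift as "every bit shifted out"
is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a left shift that is defined for every count.
// A count of 64 or more is treated as shifting all bits out (assumption).
std::uint64_t shl_or_zero(std::uint64_t value, unsigned bit)
{
  return bit < 64 ? value << bit : 0;
}

// Same shape as the quoted hunk: shift the mask, and fall back to an
// all-ones mask if every set bit was shifted out.
std::uint64_t shift_dst_mask(std::uint64_t dst_mask, unsigned bit)
{
  if (dst_mask)
    {
      dst_mask = shl_or_zero(dst_mask, bit);
      if (!dst_mask)
        dst_mask = ~std::uint64_t{0};
      assert(dst_mask != 0);  // a nonzero input mask never becomes zero
    }
  return dst_mask;
}
```

Using an unsigned type throughout also avoids the second report (left shift of
a negative value), since left shifts of unsigned values are always defined for
counts below the type width.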


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426
[Bug 63426] [meta-bug] Issues found with -fsanitize=undefined

[Bug c++/115965] Stack smashing depending on order of declaration

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

--- Comment #3 from Richard Biener  ---
I think that's somewhat expected.

[Bug c++/115965] Stack smashing depending on order of declaration

2024-07-17 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

--- Comment #4 from Jonathan Wakely  ---
Just to be clear on the bug being reported, are you expecting the error to be
detected in all cases?

[Bug c++/115965] Stack smashing depending on order of declaration

2024-07-17 Thread nathan.teodosio at canonical dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

--- Comment #5 from Nathan Teodosio  ---
In none of them. Or am I overlooking a buffer overrun here? Also with Clang I
get no stack smashing even with -fstack-protector-all.

In any case I fail to see why that would be dependent on which of the array
definitions in main come first.

[Bug c++/115963] P3144R2 is not yet completely implemented with a nullptr constant

2024-07-17 Thread de34 at live dot cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115963

--- Comment #2 from Jiang An  ---
> The question becomes is that an oversight of P3144R2 or not and should a null 
> pointer
> constant be valid always since that was never undefined or even had a chance 
> of being
> undefined.

CWG2392 (https://cplusplus.github.io/CWG/issues/2392.html) might be related.

In order to keep deleting (constant) null pointer well-formed, we may need to
choose one of following strategies:

1. Make a delete-expression whose operand (after conversion, if any) is a
pointer to an incomplete class type potentially constant evaluated. So that we
can distinguish constant null pointer operand in unevaluated operands (e.g. in
decltype).

2. Always accept such a "bad" delete-expression in unevaluated operands.

Both options reject non-constant and constant but non-null operands.

It seems that GCC is behaving like option 1 (rejecting `decltype(delete
std::declval())` while accepting `delete
static_cast(nullptr)`).
However, it doesn't seem possible to make it conforming to accept
potentially-evaluated `delete reinterpret_cast(std::uintptr_t{})`,
because the operand is not a constant (sub)expression.

On the other hand, I tend to believe "the object being deleted has incomplete
class type" is equivalent to "the pointed to type of the operand is an
incomplete class type", but it's also not very clear whether such reading is
correct.

[Bug fortran/84246] [11/12/13/14/15 Regression] [Coarray] ICE in conv_caf_send, at fortran/trans-intrinsic.c:1950

2024-07-17 Thread vehre at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84246

Andre Vehreschild  changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING

--- Comment #9 from Andre Vehreschild  ---
Patch proposed: https://gcc.gnu.org/pipermail/fortran/2024-July/060692.html
Waiting for review.

[Bug c/106800] Expose more vector extensions in C

2024-07-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106800

--- Comment #2 from Richard Biener  ---
I have something in the works.

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #72 from Segher Boessenkool  ---
The correct way to not get the warning about unused results, is to _do_ use
the function return value, of course, as I explained in #c18 already.

Like:

if (foo()) {
  /* The return value of foo can be ignored here because X and Y.  */
}

You *always* should explain why you can ignore it here (not just *that* you
can, that's not an explanation, that's merely a statement), anyway, so this
gives a nicely readable flow.

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #73 from Zdenek Sojka  ---
See MISRA C:2012 Rule 17.7:
"...  If the return value of a function is intended not to be used explicitly,
it should be cast to the void type. ..."

It would be helpful if gcc could be used to write MISRA-compliant code, or at
least if it wouldn't generate compilation warnings when the programmer is
targeting MISRA-compliancy.

I understand this might be a very complex topic, due to gcc being free
software, and other companies possibly providing / selling the tool
qualification.

[Bug tree-optimization/114440] Fail to recognize a chain of lane-reduced operations for loop reduction vect

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114440

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Feng Xue :

https://gcc.gnu.org/g:178cc419512f7e358f88dfe2336625aa99cd7438

commit r15-2096-g178cc419512f7e358f88dfe2336625aa99cd7438
Author: Feng Xue 
Date:   Wed May 29 17:22:36 2024 +0800

vect: Support multiple lane-reducing operations for loop reduction
[PR114440]

For a lane-reducing operation (dot-prod/widen-sum/sad) in a loop reduction,
the current vectorizer can only handle the pattern if the reduction chain
contains no other operation, whether normal or lane-reducing.

This patch removes some constraints in reduction analysis to allow multiple
arbitrary lane-reducing operations with mixed input vectypes in a loop
reduction chain. For example:

   int sum = 1;
   for (i)
 {
   sum += d0[i] * d1[i];  // dot-prod 
   sum += w[i];   // widen-sum 
   sum += abs(s0[i] - s1[i]); // sad 
 }

The vector size is 128-bit and the vectorization factor is 16. Reduction
statements would be transformed as:

   vector<4> int sum_v0 = { 0, 0, 0, 1 };
   vector<4> int sum_v1 = { 0, 0, 0, 0 };
   vector<4> int sum_v2 = { 0, 0, 0, 0 };
   vector<4> int sum_v3 = { 0, 0, 0, 0 };

   for (i / 16)
 {
   sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0);
   sum_v1 = sum_v1;  // copy
   sum_v2 = sum_v2;  // copy
   sum_v3 = sum_v3;  // copy

   sum_v0 = WIDEN_SUM (w_v0[i: 0 ~ 15], sum_v0);
   sum_v1 = sum_v1;  // copy
   sum_v2 = sum_v2;  // copy
   sum_v3 = sum_v3;  // copy

   sum_v0 = SAD (s0_v0[i: 0 ~ 7 ], s1_v0[i: 0 ~ 7 ], sum_v0);
   sum_v1 = SAD (s0_v1[i: 8 ~ 15], s1_v1[i: 8 ~ 15], sum_v1);
   sum_v2 = sum_v2;  // copy
   sum_v3 = sum_v3;  // copy
 }

sum_v = sum_v0 + sum_v1 + sum_v2 + sum_v3;   // = sum_v0 + sum_v1

2024-03-22 Feng Xue 

gcc/
PR tree-optimization/114440
* tree-vectorizer.h (vectorizable_lane_reducing): New function
declaration.
* tree-vect-stmts.cc (vect_analyze_stmt): Call new function
vectorizable_lane_reducing to analyze lane-reducing operation.
* tree-vect-loop.cc (vect_model_reduction_cost): Remove cost computation
code related to emulated_mixed_dot_prod.
(vectorizable_lane_reducing): New function.
(vectorizable_reduction): Allow multiple lane-reducing operations in
loop reduction. Move some original lane-reducing related code to
vectorizable_lane_reducing.
(vect_transform_reduction): Adjust comments with updated example.

gcc/testsuite/
PR tree-optimization/114440
* gcc.dg/vect/vect-reduc-chain-1.c
* gcc.dg/vect/vect-reduc-chain-2.c
* gcc.dg/vect/vect-reduc-chain-3.c
* gcc.dg/vect/vect-reduc-chain-dot-slp-1.c
* gcc.dg/vect/vect-reduc-chain-dot-slp-2.c
* gcc.dg/vect/vect-reduc-chain-dot-slp-3.c
* gcc.dg/vect/vect-reduc-chain-dot-slp-4.c
* gcc.dg/vect/vect-reduc-dot-slp-1.c

[Bug tree-optimization/114440] Fail to recognize a chain of lane-reduced operations for loop reduction vect

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114440

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Feng Xue :

https://gcc.gnu.org/g:db3c8c9726d0bafbb9f85b6d7027fe83602643e7

commit r15-2097-gdb3c8c9726d0bafbb9f85b6d7027fe83602643e7
Author: Feng Xue 
Date:   Wed May 29 17:28:14 2024 +0800

vect: Optimize order of lane-reducing operations in loop def-use cycles

When transforming multiple lane-reducing operations in a loop reduction
chain, the corresponding vectorized statements were originally generated
into def-use cycles starting from 0. The def-use cycle with a smaller index
would contain more statements, which means more instruction dependency. For
example:

   int sum = 1;
   for (i)
 {
   sum += d0[i] * d1[i];  // dot-prod 
   sum += w[i];   // widen-sum 
   sum += abs(s0[i] - s1[i]); // sad 
   sum += n[i];   // normal 
 }

Original transformation result:

   for (i / 16)
 {
   sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0);
   sum_v1 = sum_v1;  // copy
   sum_v2 = sum_v2;  // copy
   sum_v3 = sum_v3;  // copy

   sum_v0 = WIDEN_SUM (w_v0[i: 0 ~ 15], sum_v0);
   sum_v1 = sum_v1;  // copy
   sum_v2 = sum_v2;  // copy
   sum_v3 = sum_v3;  // copy

   sum_v0 = SAD (s0_v0[i: 0 ~ 7 ], s1_v0[i: 0 ~ 7 ], sum_v0);
   sum_v1 = SAD (s0_v1[i: 8 ~ 15], s1_v1[i: 8 ~ 15], sum_v1);
   sum_v2 = sum_v2;  // copy
   sum_v3 = sum_v3;  // copy

   ...
 }

For higher instruction parallelism in the final vectorized loop, the best
approach is to distribute the effective vector lane-reducing ops evenly
among all def-use cycles. Transformed as below, DOT_PROD, WIDEN_SUM and the
SADs are generated into separate cycles, so the instruction dependency among
them is eliminated.

   for (i / 16)
 {
   sum_v0 = DOT_PROD (d0_v0[i: 0 ~ 15], d1_v0[i: 0 ~ 15], sum_v0);
   sum_v1 = sum_v1;  // copy
   sum_v2 = sum_v2;  // copy
   sum_v3 = sum_v3;  // copy

   sum_v0 = sum_v0;  // copy
   sum_v1 = WIDEN_SUM (w_v1[i: 0 ~ 15], sum_v1);
   sum_v2 = sum_v2;  // copy
   sum_v3 = sum_v3;  // copy

   sum_v0 = sum_v0;  // copy
   sum_v1 = sum_v1;  // copy
   sum_v2 = SAD (s0_v2[i: 0 ~ 7 ], s1_v2[i: 0 ~ 7 ], sum_v2);
   sum_v3 = SAD (s0_v3[i: 8 ~ 15], s1_v3[i: 8 ~ 15], sum_v3);

   ...
 }

2024-03-22 Feng Xue 

gcc/
PR tree-optimization/114440
* tree-vectorizer.h (struct _stmt_vec_info): Add a new field
reduc_result_pos.
* tree-vect-loop.cc (vect_transform_reduction): Generate lane-reducing
statements in an optimized order.

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread fw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #74 from Florian Weimer  ---
(In reply to Zdenek Sojka from comment #73)
> See MISRA C:2012 Rule 17.7:
> "...  If the return value of a function is intended not to be used
> explicitly, it should be cast to the void type. ..."
> 
> It would be helpful if gcc could be used to write MISRA-compliant code, or
> at least if it wouldn't generate compilation warnings when the programmer is
> targeting MISRA-compliancy.

Doesn't this (interpretation of MISRA) mean that compliant code cannot use
__attribute__((warn_unused_result))? That doesn't require any GCC changes.

[Bug c++/115968] New: g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)

2024-07-17 Thread summersnow9403 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968

Bug ID: 115968
   Summary: g++ 12 and above incorrectly optimize the code with
Eigen (-O2 or -O1)
   Product: gcc
   Version: 14.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: summersnow9403 at gmail dot com
  Target Milestone: ---

The code example can be found on godbolt:

https://godbolt.org/z/qPs4zhb3j

The two functions test_eigen and test_eigen2 are expected to output the same
result. However, when compiled with gcc 12 and above versions with -O2, the
program prints all zeros for test_eigen. When compiled with gcc 12 and above
without -O2, or any versions of clang with -O2, the program behaves correctly.

[Bug c++/115968] g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Andrew Pinski  ---
const auto x

See https://eigen.tuxfamily.org/dox/TopicPitfalls.html

[Bug c++/115968] g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968

--- Comment #2 from Andrew Pinski  ---
See C++11 and the auto keyword section of
https://eigen.tuxfamily.org/dox/TopicPitfalls.html


The problem is that eval() returns a temporary object (in this case a MatrixXd)
which is then referenced by the Transpose<> expression. However, this temporary
is deleted right after the first line, and then the C expression references a
dead object.
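
The lifetime problem described above can be sketched without Eigen. The
types Matrix and TransposeView and the functions eval_product, transpose and
safe_first below are hypothetical stand-ins, not Eigen's real classes; the
only assumption carried over is that the lazy expression object stores a
reference to its operand rather than a copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an owning matrix type.
struct Matrix
{
  std::vector<double> data;
};

// Hypothetical lazy expression: it stores only a reference to its operand.
struct TransposeView
{
  const Matrix& source;
  double first() const
  {
    assert(!source.data.empty());
    return source.data.front();
  }
};

// Returns a temporary, like eval() in the report.
Matrix eval_product() { return Matrix{{1.0, 2.0, 3.0}}; }

TransposeView transpose(const Matrix& m) { return TransposeView{m}; }

double safe_first()
{
  // Buggy shape (not executed here):
  //   auto view = transpose(eval_product());
  // The temporary Matrix dies at the end of that statement, so any later
  // view.first() would read a dead object.
  const Matrix owned = eval_product();  // keep the result alive explicitly
  auto view = transpose(owned);         // the view now refers to a live object
  return view.first();
}
```

Naming the owning type explicitly, instead of letting auto deduce the
expression type, is what keeps the referenced object alive.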

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #75 from Zdenek Sojka  ---
(In reply to Florian Weimer from comment #74)
> (In reply to Zdenek Sojka from comment #73)
> > See MISRA C:2012 Rule 17.7:
> > "...  If the return value of a function is intended not to be used
> > explicitly, it should be cast to the void type. ..."
> > 
> > It would be helpful if gcc could be used to write MISRA-compliant code, or
> > at least if it wouldn't generate compilation warnings when the programmer is
> > targeting MISRA-compliancy.
> 
> Doesn't this (interpretation of MISRA) mean that compliant code cannot use
> __attribute__((warn_unused_result))? That doesn't require any GCC changes.

With the current state of things, yes. MISRA suggests adding the (void) cast,
which does not suppress the warning.

For me the ideal state would be to have a -Wwarn-any-unused-result, to consider
all functions as having the "__attribute__((__warn_unused_result__))"
attribute, with the option of (void) cast to suppress the warning.

Just the following sentence of the MISRA C:2012 explains the "(void)" cast as
the way of preventing dead code; the example from comment #72:

> if (foo()) {
>   /* The return value of foo can be ignored here because X and Y.  */
> }

creates a condition that needs to be covered, even though it might not be
possible to trigger either FALSE or TRUE outcome.

Of course there are other solutions (and possible justifications); I just
wanted to show that the (void) cast of unused function result might not be that
uncommon.

[Bug c++/115968] g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)

2024-07-17 Thread summersnow9403 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968

--- Comment #3 from HanatoK  ---
(In reply to Andrew Pinski from comment #2)
> See C++11 and the auto keyword section of
> https://eigen.tuxfamily.org/dox/TopicPitfalls.html
> 
> 
> The problem is that eval() returns a temporary object (in this case a
> MatrixXd) which is then referenced by the Transpose<> expression. However,
> this temporary is deleted right after the first line, and then the C
> expression references a dead object.

Thanks! I can confirm that with the clang sanitizers
"-fsanitize=address,undefined,leak" I encounter the "stack-use-after-scope"
error, so the clang result is actually printed from out-of-scope variables. I
think you are right that I misuse the auto keyword.

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #76 from Jakub Jelinek  ---
(void) casts not quieting the warning was an intentional request when the
warning has been added, I really don't think it is a good idea to change that.
The fact that clang people can't properly implement Perhaps you can ask glibc
to recategorize some of the declarations to use [[nodiscard]] instead of
__attribute__((__warn_unused_result__)), IMHO it is helpful to have different
badnesses of ignoring the result, WUR attribute should be used for the cases
where it is always or pretty much always a very severe bug, while nodiscard can
be used for the lighter cases (using the result is nice to have, but usually
nothing wrong will happen if it is ignored).
E.g. ignoring return value of realloc is pretty much always a bad idea and just
(void) realloc (...); is something that shouldn't be supported.
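
A small sketch of the distinction drawn above, assuming GCC (or a compiler
that accepts the GNU attribute syntax). The function names light_status,
severe_status and use_both are hypothetical; the behaviour shown in the
comments is the one discussed in this thread for GCC.

```cpp
#include <cassert>

// Lighter case: using the result is nice to have.
[[nodiscard]] int light_status() { return 0; }

// Severe case: ignoring the result is almost always a bug.
__attribute__((warn_unused_result)) int severe_status() { return 0; }

int use_both()
{
  (void) light_status();         // a void cast is enough for [[nodiscard]]

  // With GCC, "(void) severe_status();" would still warn, so the value
  // is actually used instead.
  int status = severe_status();
  assert(status == 0);
  return status;
}
```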

[Bug rtl-optimization/115876] [15 regression] ext-dce.cc has ubsan issues; shifting negative values

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115876

Andrew Pinski  changed:

   What|Removed |Added

 CC||jamborm at gcc dot gnu.org

--- Comment #10 from Andrew Pinski  ---
*** Bug 115967 has been marked as a duplicate of this bug. ***

[Bug middle-end/115967] ubsan: shift exponent 64 is too large for 64-bit type HOST_WIDE_INT in ext-dce.cc on line 600 since r15-1901-g98914f9eba5f19

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115967

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
Already recorded as PR 115876 .

*** This bug has been marked as a duplicate of bug 115876 ***

[Bug other/63426] [meta-bug] Issues found with -fsanitize=undefined

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426
Bug 63426 depends on bug 115967, which changed state.

Bug 115967 Summary: ubsan: shift exponent 64 is too large for 64-bit type 
HOST_WIDE_INT in ext-dce.cc on line 600 since r15-1901-g98914f9eba5f19
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115967

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

[Bug c++/111890] ICE in tsubst_friend_function with friend function declared inside a concept constrainted class inside a template class

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111890

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:247335823f420eb1dd56f4bf32ac78d441f5ccc2

commit r15-2098-g247335823f420eb1dd56f4bf32ac78d441f5ccc2
Author: Patrick Palka 
Date:   Wed Jul 17 11:08:35 2024 -0400

c++: constrained partial spec type context [PR111890]

maybe_new_partial_specialization wasn't propagating TYPE_CONTEXT when
creating a new class type corresponding to a constrained partial spec,
which do_friend relies on via template_class_depth to distinguish a
template friend from a non-template friend, and so in the below testcase
we were incorrectly instantiating the non-template operator+ as if it
were a template leading to an ICE.

PR c++/111890

gcc/cp/ChangeLog:

* pt.cc (maybe_new_partial_specialization): Propagate TYPE_CONTEXT
to the newly created partial specialization.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-partial-spec15.C: New test.

Reviewed-by: Jason Merrill 

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread achurch+gcc at achurch dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #77 from Andrew Church  ---
(In reply to Segher Boessenkool from comment #72)
> if (foo()) {
>   /* The return value of foo can be ignored here because X and Y.  */
> }

This is just another idiom, with "if(){}" replacing "(void)"; it does not
directly indicate that the value is unused, as a hypothetical [[discard]] would
do.  I would even argue that it is worse because (as Zdenek points out) it adds
a branch which would either need to be tested, potentially requiring additional
failure injection logic to trigger the failing case, or documented as not
needing to be covered by a test.

In general, I would consider any code structure with no behavioral effect but a
semantic side-effect (including casting to void, assigning to an unused
variable, or testing in a conditional with an empty block) a code smell, and
would prefer an explicit [[discard]] to make the intent clear.  Given that we
have no [[discard]], I still hold that cast-to-void is the best existing option
due both to conciseness and to widespread recognition of its intent.

There's also an argument to be made that allowing the warning to be bypassed
with if(){} or assignment to an unused variable is weakening the original
intent behind WUR, as Jakub mentions.

(In reply to Jakub Jelinek from comment #76)
> (void) casts not quieting the warning was an intentional request when the
> warning has been added, I really don't think it is a good idea to change
> that.

This is why I initially suggested a compiler option (-Wunused-result=strict) to
select the behavior.  It could of course be coded in reverse, defaulting to the
current behavior and having e.g. -Wunused-result=lax to inhibit WUR warnings.

The fundamental problem with the request behind this feature (in particular,
with the fact that the request comes from a library author) is that the end
user of the compiler is the library user, not the library author, and if the
end user considers the warnings useless, they will find one or another way
around them, however much collateral damage (in the form of missed errors) that
may cause.  Given that, I think it's reasonable to offer a middle-ground option
that lets the end user reject the library author's original intent of forcing
return value usage but retain the ability to check for accidentally unused
return values.
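
For readers following the idioms discussed above, here is a minimal sketch in C.
The function save_settings() and the macro SHOW_CAST_WARNING are hypothetical
names made up for illustration; only the warn_unused_result attribute itself
comes from the thread.  The cast variant is compiled only on request, since
that is the form GCC still warns about.

```c
#include <assert.h>

static int calls;   /* counts invocations so the idioms can be checked */

/* Hypothetical library function marked with warn_unused_result. */
__attribute__((warn_unused_result))
int save_settings(int value)
{
    ++calls;
    return value < 0 ? -1 : 0;
}

/* Idiom from comment #72: an empty conditional that holds the rationale. */
void ignore_with_if(int value)
{
    if (save_settings(value)) {
        /* The result can be ignored here because ... */
    }
}

/* Idiom: assign the result to a variable that is otherwise unused. */
void ignore_with_variable(int value)
{
    int unused = save_settings(value);
    (void) unused;
}

#ifdef SHOW_CAST_WARNING
/* The cast this PR is about: GCC still emits -Wunused-result here. */
void ignore_with_cast(int value)
{
    (void) save_settings(value);
}
#endif

/* Runs both accepted idioms and returns how many calls were made. */
int demo(void)
{
    int before = calls;
    ignore_with_if(-1);
    ignore_with_variable(-1);
    assert(calls == before + 2);
    return calls - before;
}
```

Building with -DSHOW_CAST_WARNING should reproduce the diagnostic under
discussion; without it the file is expected to compile quietly.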

[Bug c++/115965] Stack smashing depending on order of declaration

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #6 from Andrew Pinski  ---
(In reply to Nathan Teodosio from comment #5)
> In none of them. Or am I overlooking a buffer overrun here?

You definitely are overlooking one.

  for (size_t i = 0; i < size; i += hn::Lanes(d)) {


hn::Store(x, d, x_array + i);


hn::Lanes(d) is 4.

so you are storing 0,1,2,3 and then 4,5,6,7.  Except there are only 5 elements
in x_array, so the stores to 5,6,7 are out of bounds.

>In any case I fail to see why that would be dependent on which of the array 
>definitions in main come first.


Because -fstack-protector-all only checks one place in the stack rather than
after each array.  When there is a tie, the arrays are laid out on the stack in
the order the user declared them.  So you only get the stack smashing error
when the overrun array happens to be the one at the end.

With -fsanitize=address all arrays have a redzone and you get the following
error message, which is independent of the order of the arrays since all
loads/stores are checked.


=
==1==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x782bede00030
at pc 0x00401b3e bp 0x7ffeb05deab0 sp 0x7ffeb05deaa8
READ of size 16 at 0x782bede00030 thread T0
#0 0x401b3d in _mm_load_si128(long long __vector(2) const*)
/opt/compiler-explorer/gcc-trunk-20240717/lib/gcc/x86_64-linux-gnu/15.0.0/include/emmintrin.h:701
#1 0x401b3d in Load >
/opt/compiler-explorer/libs/highway/trunk/hwy/ops/x86_128-inl.h:2069
#2 0x401311 in MulAddLoop(int const*, int const*, unsigned long, int*)
/app/example.cpp:11
#3 0x401954 in main /app/example.cpp:22
#4 0x782befa29d8f  (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f) (BuildId:
490fef8403240c91833978d494d39e537409b92e)
#5 0x782befa29e3f in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x29e3f) (BuildId:
490fef8403240c91833978d494d39e537409b92e)
#6 0x401104 in _start (/app/output.s+0x401104) (BuildId:
4af5893bdf93a048dba77151f2e0b5e5a0ee46bd)

Address 0x782bede00030 is located in stack of thread T0 at offset 48 in frame
#0 0x4014cf in main /app/example.cpp:18

  This frame has 3 object(s):
[32, 52) 'a' (line 19) <== Memory access at offset 48 partially overflows
this variable
[96, 116) 'b' (line 19)
[160, 180) 'c' (line 20)
HINT: this may be a false positive if your program uses some custom stack
unwind mechanism, swapcontext or vfork
  (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow
/opt/compiler-explorer/gcc-trunk-20240717/lib/gcc/x86_64-linux-gnu/15.0.0/include/emmintrin.h:701
in _mm_load_si128(long long __vector(2) const*)
Shadow bytes around the buggy address:
  0x782beddffd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x782beddffe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x782beddffe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x782beddfff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x782beddfff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x782bede00000: f1 f1 f1 f1 00 00[04]f2 f2 f2 f2 f2 00 00 04 f2
  0x782bede00080: f2 f2 f2 f2 00 00 04 f3 f3 f3 f3 f3 00 00 00 00
  0x782bede00100: f1 f1 f1 f1 f1 f1 01 f2 00 00 f2 f2 f8 f8 f2 f2
  0x782bede00180: f8 f8 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x782bede00200: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x782bede00280: f5 f5 f5 f5 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:   fa
  Freed heap region:   fd
  Stack left redzone:  f1
  Stack mid redzone:   f2
  Stack right redzone: f3
  Stack after return:  f5
  Stack use after scope:   f8
  Global redzone:  f9
  Global init order:   f6
  Poisoned by user:f7
  Container overflow:  fc
  Array cookie:ac
  Intra object redzone:bb
  ASan internal:   fe
  Left alloca redzone: ca
  Right alloca redzone:cb
==1==ABORTING
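
The arithmetic behind the overrun can be sketched in plain C without Highway.
This is only a scalar model under stated assumptions: elements_touched() and
copy_bounded() are hypothetical helpers, and "lanes" stands in for the value
of hn::Lanes(d).  The first function shows how far a loop that always handles
a full vector reaches; the second shows one common way to stay in bounds,
using whole vectors followed by a scalar tail.

```c
#include <assert.h>
#include <stddef.h>

/* One past the highest element index touched by a loop that always
   processes `lanes` elements per iteration, as in the reported code. */
size_t elements_touched(size_t size, size_t lanes)
{
    assert(lanes > 0);
    size_t end = 0;
    for (size_t i = 0; i < size; i += lanes)
        end = i + lanes;
    return end;
}

/* Scalar model of a bounded copy: whole vectors first, then the tail. */
void copy_bounded(int *dst, const int *src, size_t size, size_t lanes)
{
    assert(lanes > 0);
    size_t i = 0;
    for (; i + lanes <= size; i += lanes)      /* full vectors only */
        for (size_t l = 0; l < lanes; ++l)
            dst[i + l] = src[i + l];
    for (; i < size; ++i)                      /* remaining elements */
        dst[i] = src[i];
}
```

With size 5 and 4 lanes the first function reports 8, that is, three elements
past the end of a 5-element array, which matches the accesses flagged above.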

[Bug c++/115965] Stack smashing depending on order of declaration

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115965

--- Comment #7 from Andrew Pinski  ---
Note that valgrind cannot always catch buffer overruns in this case because it
cannot easily add a redzone (a buffer to detect overruns) around stack arrays.
This is why -fsanitize=address is more powerful than both of the other two here.

[Bug tree-optimization/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:0135a90de5a99b51001b6152d8b548151ebfa1c3

commit r15-2099-g0135a90de5a99b51001b6152d8b548151ebfa1c3
Author: Tamar Christina 
Date:   Wed Jul 17 16:22:14 2024 +0100

middle-end: fix 0 offset creation and folding [PR115936]

As shown in PR115936 SCEV and IVOPTS create an invalid IV when the IV is
a pointer type:

ivtmp.39_65 = ivtmp.39_59 + 0B;

where the IVs are DI mode and the offset is a pointer.
This comes from this weird candidate:

Candidate 8:
  Var befor: ivtmp.39_59
  Var after: ivtmp.39_65
  Incr POS: before exit test
  IV struct:
Type:   sizetype
Base:   0
Step:   0B
Biv:N
Overflowness wrto loop niter:   No-overflow

This IV was always created; it just ended up not being used.

This is created by SCEV.

simple_iv_with_niters in the case where no CHREC is found creates an IV
with
base == ev, offset == 0;

however in this case EV is a POINTER_PLUS_EXPR and so the type is a
pointer.
it ends up creating an unusable expression.

gcc/ChangeLog:

PR tree-optimization/115936
* tree-scalar-evolution.cc (simple_iv_with_niters): Use sizetype
for pointers.

[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #3 from Andrew Pinski  ---
Does -fno-ext-dce fix it?

There are a few bugs that Jeff has been working on in ext-dce, so I wonder if
this might be one of them?

Also do you have a last known to work?

[Bug tree-optimization/115936] [15 Regression] GCN vs. ivopts: replace constant_multiple_of with aff_combination_constant_multiple_p [PR114932]

2024-07-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115936

Tamar Christina  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #8 from Tamar Christina  ---
Fixed, thanks for the report.  Bug is latent on branches so won't backport for
now.

[Bug middle-end/115887] ICE: in gsi_insert_on_edge_immediate, at gimple-iterator.cc:849 with -O -fnon-call-exceptions -finstrument-functions and _BitInt()

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115887

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:5104fe4c7808a66ed3041a8da8e4720585cc8a1f

commit r15-2101-g5104fe4c7808a66ed3041a8da8e4720585cc8a1f
Author: Jakub Jelinek 
Date:   Wed Jul 17 17:32:21 2024 +0200

bitint: Use gsi_insert_on_edge rather than gsi_insert_on_edge_immediate
[PR115887]

The following testcase ICEs on x86_64-linux, because we try to
gsi_insert_on_edge_immediate a statement on an edge which already has
statements queued with gsi_insert_on_edge, and the deferral has been
intentional so that we don't need to deal with cfg changes in between.

The following patch uses the delayed insertion as well.

2024-07-17  Jakub Jelinek  

PR middle-end/115887
* gimple-lower-bitint.cc (gimple_lower_bitint): Use gsi_insert_on_edge
instead of gsi_insert_on_edge_immediate and set edge_insertions to true.

* gcc.dg/bitint-108.c: New test.

[Bug c++/115754] [14 Regression] C++26 ICE on constexpr new

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115754

--- Comment #4 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:297ea7e5bb5c4d92cf3fe29182d432694de858cc

commit r14-10445-g297ea7e5bb5c4d92cf3fe29182d432694de858cc
Author: Jakub Jelinek 
Date:   Tue Jul 2 22:09:58 2024 +0200

c++: Fix ICE on constexpr placement new [PR115754]

C++26 makes placement new constexpr (paper P2747R2).
While working on a patch for that, I've noticed we ICE starting with
GCC 14 on the following testcase.
The problem is that e.g. for the void * to sometype * casts checks,
we really assume the casts have their operand constant evaluated
as prvalue, but on the testcase the cast itself is evaluated with
vc_discard and that means op can end up e.g. a VAR_DECL which the
later code doesn't like and asserts on.
If the result type is void, we don't really need the cast operand
for anything, so can use vc_discard for the recursive call,
VIEW_CONVERT_EXPR can appear on the lhs, so we need to honor the
lval but otherwise the patch uses vc_prvalue.
I'd like to get this patch in before the rest of P2747R2 implementation,
so that it can be backported to 14.2 later on.

2024-07-02  Jakub Jelinek  
Jason Merrill  

PR c++/115754
* constexpr.cc (cxx_eval_constant_expression) :
For conversions to void, pass vc_discard to the recursive call
and otherwise for tcode other than VIEW_CONVERT_EXPR pass
vc_prvalue.

* g++.dg/cpp26/pr115754.C: New test.

(cherry picked from commit 1250540a98e0a1dfa4d7834672d88d8543ea70b1)

[Bug middle-end/115887] ICE: in gsi_insert_on_edge_immediate, at gimple-iterator.cc:849 with -O -fnon-call-exceptions -finstrument-functions and _BitInt()

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115887

--- Comment #4 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:01dfc5b4add9a5ed48c46f6b25cde6e55b9f3ff1

commit r14-10447-g01dfc5b4add9a5ed48c46f6b25cde6e55b9f3ff1
Author: Jakub Jelinek 
Date:   Wed Jul 17 17:32:21 2024 +0200

bitint: Use gsi_insert_on_edge rather than gsi_insert_on_edge_immediate
[PR115887]

The following testcase ICEs on x86_64-linux, because we try to
gsi_insert_on_edge_immediate a statement on an edge which already has
statements queued with gsi_insert_on_edge, and the deferral has been
intentional so that we don't need to deal with cfg changes in between.

The following patch uses the delayed insertion as well.

2024-07-17  Jakub Jelinek  

PR middle-end/115887
* gimple-lower-bitint.cc (gimple_lower_bitint): Use gsi_insert_on_edge
instead of gsi_insert_on_edge_immediate and set edge_insertions to true.

* gcc.dg/bitint-108.c: New test.

(cherry picked from commit 5104fe4c7808a66ed3041a8da8e4720585cc8a1f)

[Bug middle-end/115527] incorrect folding of __builtin_clear_padding()

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115527

--- Comment #12 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:d668f875985cf61d3a898d95cf01df90a720e5c2

commit r14-10446-gd668f875985cf61d3a898d95cf01df90a720e5c2
Author: Jakub Jelinek 
Date:   Wed Jul 17 11:38:33 2024 +0200

gimple-fold: Fix up __builtin_clear_padding lowering [PR115527]

The builtin-clear-padding-6.c testcase fails as clear_padding_type
doesn't correctly recompute the buf->size and buf->off members after
expanding clearing of an array using a runtime loop.
buf->size should be in that case the offset after which it should continue
with next members or padding before them modulo UNITS_PER_WORD and
buf->off that offset minus buf->size.  That is what the code was doing,
but with off being the start of the loop cleared array, not its end.
So, the last hunk in gimple-fold.cc fixes that.
When adding the testcase, I've noticed that the
c-c++-common/torture/builtin-clear-padding-* tests, although clearly
written as runtime tests to test the builtins at runtime, didn't have
{ dg-do run } directive and were just compile tests because of that.
When adding that to the tests, builtin-clear-padding-1.c was already
failing without that clear_padding_type hunk too, but
builtin-clear-padding-5.c was still failing even after the change.
That is due to a bug in clear_padding_flush which the patch fixes as
well - when clear_padding_flush is called with full=true (that happens
at the end of the whole __builtin_clear_padding or on those array
padding clears done by a runtime loop), it wants to flush all the pending
padding clearings rather than just some.  If it is at the end of the whole
object, it decreases wordsize when needed to make sure the code never
writes, including RMW cycles, to something outside of the object:
  if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize)
      > (unsigned HOST_WIDE_INT) buf->sz)
    {
      gcc_assert (wordsize > 1);
      wordsize /= 2;
      i -= wordsize;
      continue;
    }
but if it is full==true flush in the middle, this doesn't happen, but we
still process just the buffer bytes before the current end.  If that end
is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test
the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18,
nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones
might be true, so in some spots we just didn't emit any clearing in that
last chunk.

2024-07-17  Jakub Jelinek  

PR middle-end/115527
* gimple-fold.cc (clear_padding_flush): Introduce endsize
variable and use it instead of wordsize when comparing it against
nonzero_last.
(clear_padding_type): Increment off by sz.

* c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run
directive.
* c-c++-common/torture/builtin-clear-padding-2.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-3.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-4.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-5.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-6.c: New test.

(cherry picked from commit 8b5919bae11754f4b65a17e63663d3143f9615ac)
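
The wordsize-halving step quoted in the commit message can be illustrated with
a small standalone model.  This is not the GCC code: count_stores() is a
hypothetical function that only mimics the idea of shrinking a power-of-two
store width until the store no longer reaches past the end of the object, and
it ignores everything else clear_padding_flush does (masks, RMW, nonzero_last).

```c
#include <assert.h>
#include <stddef.h>

/* Covers [0, sz) with power-of-two wide stores, starting at width
   `wordsize` and halving it whenever a store would run past sz.
   Returns the number of stores; *last_width receives the final width. */
size_t count_stores(size_t sz, size_t wordsize, size_t *last_width)
{
    /* The model assumes a nonzero power-of-two starting width. */
    assert(wordsize > 0 && (wordsize & (wordsize - 1)) == 0);

    size_t stores = 0;
    size_t i = 0;
    while (i < sz) {
        /* Never write outside the object: shrink the store instead. */
        while (i + wordsize > sz)
            wordsize /= 2;
        ++stores;
        i += wordsize;
    }
    *last_width = wordsize;
    return stores;
}
```

For the 18-byte case mentioned above with 8-byte words, the model issues two
8-byte stores and then a 2-byte store for the chunk starting at offset 16.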

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #78 from Segher Boessenkool  ---
(In reply to Andrew Church from comment #77)
> (In reply to Segher Boessenkool from comment #72)
> > if (foo()) {
> >   /* The return value of foo can be ignored here because X and Y.  */
> > }
> 
> This is just another idiom, with "if(){}" replacing "(void)"; it does not
> directly indicate that the value is unused,

That is not what I said either, of course.  It *does* give the author a great
place to add commentary why the return value is not actually used here.  I.e.
the only thing that *actually matters*: it encourages one human (the author)
to communicate important information to the code reader.

Code is for humans to read (and write)!  If you care about the compiler
first, you are doing it Wrong(tm).

> as a hypothetical [[discard]]
> would do.  I would even argue that it is worse because (as Zdenek points
> out) it adds a branch which would either need to be tested, potentially
> requiring additional failure injection logic to trigger the failing case, or
> documented as not needing to be covered by a test.

If your coverage testing framework does not handle empty BBs specially, get
a better coverage framework.

> There's also an argument to be made that allowing the warning to be bypassed
> with if(){} or assignment to an unused variable is weakening the original
> intent behind WUR, as Jakub mentions.

It does the opposite, as I have explained many times now.  I haven't seen Jakub
say anything like you say btw.

(I find the unused var thing just clumsy, noisy, inelegant, and distracting.
Not something someone who cares about readable code would ever do.  There are
much better options!)

> The fundamental problem with the request behind this feature (in particular,
> with the fact that the request comes from a library author) is that the end
> user of the compiler is the library user, not the library author, and if the
> end user considers the warnings useless, they will find one or another way
> around them, however much collateral damage (in the form of missed errors)
> that may cause.  Given that, I think it's reasonable to offer a
> middle-ground option that lets the end user reject the library author's
> original intent of forcing return value usage but retain the ability to
> check for accidentally unused return values.

The warning is exactly for cases like realloc().  If the user finds the warning
useless in that case, the user is a fool.

If someone (the user, the author, anyone) used warn_unused_result where it is
not appropriate, just fix *that*.  The attribute is specifically for cases
where not looking at the result value is a big (often hard to find) bug, or
even a security problem.  In cases where you just have some silly coding
standard with silly rules you want to sillily follow, well, that is your own
problem!

[Bug target/90616] Suboptimal code generated for accessing an aligned array.

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90616

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Georg-Johann Lay :

https://gcc.gnu.org/g:e21fef7da92ef36af1e1b020ae5f35ef4f3c3fce

commit r15-2102-ge21fef7da92ef36af1e1b020ae5f35ef4f3c3fce
Author: Georg-Johann Lay 
Date:   Thu Jul 4 12:08:34 2024 +0200

AVR: target/90616 - Improve adding constants that are 0 mod 256.

This patch introduces a new insn that works as an insn combine
pattern for

   (plus:HI (zero_extend:HI (reg:QI))
            (const_0mod256_operand:HI))

which requires at most 2 instructions.  When the input register operand
is already in HImode, the addhi3 printer only adds the hi8 part when
it sees a SYMBOL_REF or CONST aligned to at least 256 bytes.
(The CONST_INT case was already handled).

gcc/
PR target/90616
* config/avr/predicates.md (const_0mod256_operand): New predicate.
* config/avr/constraints.md (Cp8): New constraint.
* config/avr/avr.md (*aligned_add_symbol): New insn.
* config/avr/avr.cc (avr_out_plus_symbol) [HImode]:
When op2 is a multiple of 256, there is no need to add / subtract
the lo8 part.
(avr_rtx_costs_1) [PLUS && HImode]: Return expected costs for
new insn *aligned_add_symbol as it applies.
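
Why a constant that is 0 mod 256 is cheap to add can be shown with ordinary
16-bit arithmetic.  The following is a sketch in portable C, not AVR code;
add_aligned_zext() and add_aligned_hi() are hypothetical names.  The first
models the zero_extend case, where the low byte is just the 8-bit input and
the high byte is the high byte of the constant; the second models the HImode
case, where only the high byte needs an addition.

```c
#include <assert.h>
#include <stdint.h>

/* Models (plus:HI (zero_extend:HI lo) k) with k a multiple of 256:
   no addition is needed, the two bytes are simply put side by side. */
uint16_t add_aligned_zext(uint8_t lo, uint16_t k)
{
    assert((k & 0xFF) == 0);
    uint8_t hi = (uint8_t) (k >> 8);
    return (uint16_t) ((hi << 8) | lo);
}

/* Models x + k for a 16-bit x and k a multiple of 256:
   the low byte is unchanged, only the high byte is added to. */
uint16_t add_aligned_hi(uint16_t x, uint16_t k)
{
    assert((k & 0xFF) == 0);
    uint8_t lo = (uint8_t) (x & 0xFF);
    uint8_t hi = (uint8_t) ((x >> 8) + (k >> 8));   /* wraps mod 256 */
    return (uint16_t) ((hi << 8) | lo);
}
```

Both functions agree with a plain 16-bit addition whenever the constant's low
byte is zero, which is the property the new pattern relies on.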

[Bug target/90616] Suboptimal code generated for accessing an aligned array.

2024-07-17 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90616

Georg-Johann Lay  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Build|amd64-portbld-freebsd10.4   |
           Priority|P3                          |P5
   Target Milestone|---                         |15.0
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Georg-Johann Lay  ---
Added in v15.

[Bug middle-end/115527] incorrect folding of __builtin_clear_padding()

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115527

--- Comment #13 from GCC Commits  ---
The releases/gcc-11 branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:3eec2d768d72944ed209b51ba60455d751b9aede

commit r11-11580-g3eec2d768d72944ed209b51ba60455d751b9aede
Author: Jakub Jelinek 
Date:   Wed Jul 17 11:38:33 2024 +0200

gimple-fold: Fix up __builtin_clear_padding lowering [PR115527]

The builtin-clear-padding-6.c testcase fails as clear_padding_type
doesn't correctly recompute the buf->size and buf->off members after
expanding clearing of an array using a runtime loop.
buf->size should be in that case the offset after which it should continue
with next members or padding before them modulo UNITS_PER_WORD and
buf->off that offset minus buf->size.  That is what the code was doing,
but with off being the start of the loop cleared array, not its end.
So, the last hunk in gimple-fold.cc fixes that.
When adding the testcase, I've noticed that the
c-c++-common/torture/builtin-clear-padding-* tests, although clearly
written as runtime tests to test the builtins at runtime, didn't have
{ dg-do run } directive and were just compile tests because of that.
When adding that to the tests, builtin-clear-padding-1.c was already
failing without that clear_padding_type hunk too, but
builtin-clear-padding-5.c was still failing even after the change.
That is due to a bug in clear_padding_flush which the patch fixes as
well - when clear_padding_flush is called with full=true (that happens
at the end of the whole __builtin_clear_padding or on those array
padding clears done by a runtime loop), it wants to flush all the pending
padding clearings rather than just some.  If it is at the end of the whole
object, it decreases wordsize when needed to make sure the code never
writes, including RMW cycles, to something outside of the object:
  if ((unsigned HOST_WIDE_INT) (buf->off + i + wordsize)
      > (unsigned HOST_WIDE_INT) buf->sz)
    {
      gcc_assert (wordsize > 1);
      wordsize /= 2;
      i -= wordsize;
      continue;
    }
but if it is full==true flush in the middle, this doesn't happen, but we
still process just the buffer bytes before the current end.  If that end
is not on a wordsize boundary, e.g. on the builtin-clear-padding-5.c test
the last chunk is 2 bytes, '\0', '\xff', i is 16 and end is 18,
nonzero_last might be equal to the end - i, i.e. 2 here, but still all_ones
might be true, so in some spots we just didn't emit any clearing in that
last chunk.

2024-07-17  Jakub Jelinek  

PR middle-end/115527
* gimple-fold.c (clear_padding_flush): Introduce endsize
variable and use it instead of wordsize when comparing it against
nonzero_last.
(clear_padding_type): Increment off by sz.

* c-c++-common/torture/builtin-clear-padding-1.c: Add dg-do run
directive.
* c-c++-common/torture/builtin-clear-padding-2.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-3.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-4.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-5.c: Likewise.
* c-c++-common/torture/builtin-clear-padding-6.c: New test.

(cherry picked from commit 8b5919bae11754f4b65a17e63663d3143f9615ac)

[Bug target/115966] [15 Regression] Miscompilation of 403.gcc with -Ofast -march=native on x86_64

2024-07-17 Thread pheeck at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115966

--- Comment #4 from Filip Kastl  ---
My last known to work is r15-1566-gfd536b8412d4da.

And yes, -fno-ext-dce does fix the issue!

[Bug target/115526] [14/15 regression] invalid assember emitted for alpha, "Error: duplicate !tlsgd!62" since r14-5109-ga291237b628f41

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526

--- Comment #23 from GCC Commits  ---
The master branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:0841fd4c42ab053be951b7418233f0478282d020

commit r15-2104-g0841fd4c42ab053be951b7418233f0478282d020
Author: Uros Bizjak 
Date:   Wed Jul 17 18:11:26 2024 +0200

alpha: Fix duplicate !tlsgd!62 assemble error [PR115526]

Add missing "cannot_copy" attribute to instructions that have to
stay in 1-1 correspondence with another insn.

PR target/115526

gcc/ChangeLog:

* config/alpha/alpha.md (movdi_er_high_g): Add cannot_copy
attribute.
(movdi_er_tlsgd): Ditto.
(movdi_er_tlsldm): Ditto.
(call_value_osf_): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/alpha/pr115526.c: New test.

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread achurch+gcc at achurch dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #79 from Andrew Church  ---
(In reply to Segher Boessenkool from comment #78)
> If someone (the user, the author, anyone) used warn_unused_result where it is
> not appropriate, just fix *that*.  The attribute is specifically for cases
> where not looking at the result value is a big (often hard to find) bug,

The issue here is that the library user _cannot_ (realistically) fix improper
usage of WUR by the library author.  The intent of -Wunused-result=... is to
offer a low-resistance path with fewer side effects than just a blanket
-Wno-unused-result.

> or even a security problem.

The question of whether ignoring a return value from a function is a security
problem is rarely a static determination.  Does the following function raise a
security problem?

void spawn_command(const char *cmd) {
    (void) system(cmd);
}

In some cases certainly, but if cmd is just setting keyboard LEDs to indicate
progress, probably not.  Only the library user knows for sure, so the library
author should not be using WUR here (though the weaker [[nodiscard]] would
arguably be appropriate).

If glibc had stuck to just using WUR on realloc(), this entire discussion would
probably never have arisen, because everyone can agree that ignoring the return
value from realloc() is an error (or a deliberate sticking-out-of-the-tongue to
show that there's exactly one case it's safe to ignore the return value from
realloc(), which is when it's called with a size of zero, and _that_ is a case
I'll happily disregard.)
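
As a concrete illustration of why the realloc() result is the uncontroversial
case, here is a minimal sketch of the usual pattern.  grow_buffer() is a
hypothetical helper, not something from glibc; it only shows that the return
value carries both the success indication and the (possibly moved) pointer, so
dropping it either leaks memory or leaves a stale pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Grows *buf to new_size bytes.  Returns 0 on success, -1 on failure.
   On failure *buf is left untouched and still owned by the caller. */
int grow_buffer(char **buf, size_t new_size)
{
    assert(buf != NULL && new_size > 0);
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;          /* old block is still valid */
    *buf = tmp;             /* the block may have moved */
    return 0;
}
```

A caller would keep using its own pointer variable afterwards and free() it as
usual; the temporary is what keeps the old block reachable if the call fails.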

[Bug c/66425] (void) cast doesn't suppress __attribute__((warn_unused_result))

2024-07-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425

--- Comment #80 from Segher Boessenkool  ---
(In reply to Andrew Church from comment #79)
> (In reply to Segher Boessenkool from comment #78)
> > If someone (the user, the author, anyone) used warn_unused_result where it is
> > not appropriate, just fix *that*.  The attribute is specifically for cases
> > where not looking at the result value is a big (often hard to find) bug,
> 
> The issue here is that the library user _cannot_ (realistically) fix
> improper usage of WUR by the library author.

This is exactly the same problem as any other problematic bug in library
code.  Sane users vote with their feet in such cases, if a bug report (or
whatever equivalent) does not easily get it satisfactorily fixed.

[Bug target/115526] [14/15 regression] invalid assember emitted for alpha, "Error: duplicate !tlsgd!62" since r14-5109-ga291237b628f41

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526

--- Comment #24 from GCC Commits  ---
The releases/gcc-14 branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:3a963d441a68797956a5f67dcb351b2dbd4ac1d0

commit r14-10448-g3a963d441a68797956a5f67dcb351b2dbd4ac1d0
Author: Uros Bizjak 
Date:   Wed Jul 17 18:11:26 2024 +0200

alpha: Fix duplicate !tlsgd!62 assemble error [PR115526]

Add missing "cannot_copy" attribute to instructions that have to
stay in 1-1 correspondence with another insn.

PR target/115526

gcc/ChangeLog:

* config/alpha/alpha.md (movdi_er_high_g): Add cannot_copy
attribute.
(movdi_er_tlsgd): Ditto.
(movdi_er_tlsldm): Ditto.
(call_value_osf_): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/alpha/pr115526.c: New test.

(cherry picked from commit 0841fd4c42ab053be951b7418233f0478282d020)

[Bug tree-optimization/107200] False positive -Wdangling-pointer?

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107200

--- Comment #5 from Andrew Pinski  ---
See https://libeigen.gitlab.io/docs/TopicPitfalls.html, specifically the
section "C++11 and the auto keyword".

[Bug c++/115291] armv8-a GCC emits float32x2_t loads from uninitialized stack

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115291

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
.

*** This bug has been marked as a duplicate of bug 107200 ***

[Bug tree-optimization/107200] False positive -Wdangling-pointer?

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107200

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |akihiko.odaki at daynix dot com

--- Comment #6 from Andrew Pinski  ---
*** Bug 115291 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/107200] False positive -Wdangling-pointer?

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107200

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |summersnow9403 at gmail dot com

--- Comment #7 from Andrew Pinski  ---
*** Bug 115968 has been marked as a duplicate of this bug. ***

[Bug c++/115968] g++ 12 and above incorrectly optimize the code with Eigen (-O2 or -O1)

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115968

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |DUPLICATE

--- Comment #4 from Andrew Pinski  ---
.

*** This bug has been marked as a duplicate of bug 107200 ***

[Bug c++/111890] ICE in tsubst_friend_function with friend function declared inside a concept constrainted class inside a template class

2024-07-17 Thread heb1001 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111890

--- Comment #8 from Harry Butterworth  ---
I applied the patch to 14.1.0 and my code compiles now.  Thanks.

[Bug target/115526] [14/15 regression] invalid assember emitted for alpha, "Error: duplicate !tlsgd!62" since r14-5109-ga291237b628f41

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115526

--- Comment #25 from GCC Commits  ---
The releases/gcc-13 branch has been updated by Uros Bizjak :

https://gcc.gnu.org/g:37bd7d5c4e17c97d2b7d50f630b1cf8b347a31f4

commit r13-8920-g37bd7d5c4e17c97d2b7d50f630b1cf8b347a31f4
Author: Uros Bizjak 
Date:   Wed Jul 17 18:11:26 2024 +0200

alpha: Fix duplicate !tlsgd!62 assemble error [PR115526]

Add missing "cannot_copy" attribute to instructions that have to
stay in 1-1 correspondence with another insn.

PR target/115526

gcc/ChangeLog:

* config/alpha/alpha.md (movdi_er_high_g): Add cannot_copy
attribute.
(movdi_er_tlsgd): Ditto.
(movdi_er_tlsldm): Ditto.
(call_value_osf_): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/alpha/pr115526.c: New test.

(cherry picked from commit 0841fd4c42ab053be951b7418233f0478282d020)

[Bug tree-optimization/111150] (vec CMP vec) != (vec CMP vec) should just produce (vec CMP vec) ^ (vec CMP vec)

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111150

--- Comment #1 from GCC Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:44fcc1ca11e7ea35dc9fb25a5317346bc1eaf7b2

commit r15-2106-g44fcc1ca11e7ea35dc9fb25a5317346bc1eaf7b2
Author: Eikansh Gupta 
Date:   Wed May 22 23:28:48 2024 +0530

MATCH: Simplify (a ? x : y) eq/ne (b ? x : y) [PR111150]

This patch adds match pattern for `(a ? x : y) eq/ne (b ? x : y)`.
In forwprop1 pass, depending on the type of `a` and `b`, GCC produces
`vec_cond` or `cond_expr`. Based on the observation that `(x != y)` is
TRUE, the pattern can be optimized to produce `(a^b ? TRUE : FALSE)`.

The patch adds match pattern for a, b:
(a ? x : y) != (b ? x : y) --> (a^b) ? TRUE  : FALSE
(a ? x : y) == (b ? x : y) --> (a^b) ? FALSE : TRUE
(a ? x : y) != (b ? y : x) --> (a^b) ? FALSE : TRUE
(a ? x : y) == (b ? y : x) --> (a^b) ? TRUE  : FALSE

PR tree-optimization/111150

gcc/ChangeLog:

* match.pd (`(a ? x : y) eq/ne (b ? x : y)`): New pattern.
(`(a ? x : y) eq/ne (b ? y : x)`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr111150.c: New test.
* gcc.dg/tree-ssa/pr111150-1.c: New test.
* g++.dg/tree-ssa/pr111150.C: New test.

Signed-off-by: Eikansh Gupta 
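
As an illustration of the same-order forms in the commit message above, here is a minimal C++ sketch that checks the identity exhaustively over all (a, b) combinations. It is not part of the patch or its testsuite; the function names are made up for this example, and it relies on the commit's stated assumption that x != y.

```cpp
// Illustration only: hypothetical helper names, not GCC code.
// Checks (a ? x : y) != (b ? x : y)  <=>   (a ^ b)
// and    (a ? x : y) == (b ? x : y)  <=>  !(a ^ b)
// which is expected to hold only when x != y.

constexpr bool ne_original (bool a, bool b, int x, int y)
{
  return (a ? x : y) != (b ? x : y);
}

constexpr bool eq_original (bool a, bool b, int x, int y)
{
  return (a ? x : y) == (b ? x : y);
}

constexpr bool ne_simplified (bool a, bool b)
{
  return a ^ b;
}

constexpr bool eq_simplified (bool a, bool b)
{
  return !(a ^ b);
}

// Compare original and simplified forms for all four (a, b) pairs.
constexpr bool forms_agree (int x, int y)
{
  for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b)
      {
        if (ne_original (a, b, x, y) != ne_simplified (a, b))
          return false;
        if (eq_original (a, b, x, y) != eq_simplified (a, b))
          return false;
      }
  return true;
}

static_assert (forms_agree (1, 2), "forms agree when x != y");
static_assert (!forms_agree (3, 3), "forms differ when x == y");
```

The second assertion shows why the x != y precondition matters: with equal arms the original comparison no longer depends on a or b, while a ^ b still does.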

[Bug tree-optimization/111150] (vec CMP vec) != (vec CMP vec) should just produce (vec CMP vec) ^ (vec CMP vec)

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111150

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |15.0

[Bug tree-optimization/111150] (vec CMP vec) != (vec CMP vec) should just produce (vec CMP vec) ^ (vec CMP vec)

2024-07-17 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111150

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #2 from Andrew Pinski  ---
Fixed.

[Bug c++/115964] GCC accepts invalid program with explicit object member function overloads

2024-07-17 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115964

Marek Polacek  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-07-17
 Status|UNCONFIRMED |NEW
 CC||mpolacek at gcc dot gnu.org

--- Comment #1 from Marek Polacek  ---
Confirmed, I guess.  I hope there isn't a DR changing this to be well-formed
that I missed :).

Thanks for the report.

[Bug c++/110343] [C++26] P2558R2 - Add @, $, and ` to the basic character set

2024-07-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110343

Jakub Jelinek  changed:

   What|Removed |Added

 CC||redi at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek  ---
I've tried to understand the preprocessor issue mentioned in the paper, but am
confused on what is the right behavior and why.

Consider
#define STR(x) #x
const char *a = "\u00b7";
const char *b = STR(\u00b7);
const char *c = "\u0041";
const char *d = STR(\u0041);
const char *e = STR(a\u00b7);
const char *f = STR(a\u0041);
const char *g = STR(a \u00b7);
const char *h = STR(a \u0041);
const char *i = "\u066d";
const char *j = STR(\u066d);
const char *k = "\u0040";
const char *l = STR(\u0040);
const char *m = STR(a\u066d);
const char *n = STR(a\u0040);
const char *o = STR(a \u066d);
const char *p = STR(a \u0040);

Neither clang nor gcc emit any diagnostics on the a, c, i and k initializers,
those are certainly valid.
g++ emits with -pedantic-errors errors on all the others, while clang++ on the
ones with STR involving \u0041, \u0040 and a\u066d.
The chosen values are \u0040 '@' as something being changed by this paper,
\u0041 'A',
\u00b7 as an example of character which is pedantically valid in identifiers if
not at the start and \u066d as something pedantically not valid in identifiers.

Now, https://eel.is/c++draft/lex.charset#6 says that UCN used outside of a
string/character literal which corresponds to basic character set character (or
control character) is ill-formed, that would make d, f, h cases invalid for C++
and l, n, p cases invalid for C++26.

https://eel.is/c++draft/lex.name states which characters can appear at the
start of the identifier and which can appear after the start.
And https://eel.is/c++draft/lex.pptoken states that preprocessing-token is
either identifier, or tons of other things, or
"each non-whitespace character that cannot be one of the above"

Then https://eel.is/c++draft/lex.pptoken#1 says that this last category is
invalid if the preprocessing token is being converted into token.

And https://eel.is/c++draft/lex.pptoken#2 includes
"If any character not in the basic character set matches the last category, the
program is ill-formed."

Now, e.g. for the C++23 STR(\u0040) case, \u0040 is there not in the basic
character set, so valid outside of the literals (not the case anymore in
C++26), but it isn't nondigit and doesn't have XID_Start property, so it isn't
IMHO an identifier and so must be the "each non-whitespace character that
cannot be one of the above" case.
Why doesn't the above mentioned https://eel.is/c++draft/lex.pptoken#2 sentence
make that invalid?  Ignoring that, I'd say it would be then stringized and that
feels like it is what clang++ is doing.
Now, e.g. for the STR(a\u066d) case, I wonder why that isn't lexed as an
identifier followed by a \u066d "each non-whitespace character that cannot be
one of the above" token and stringified similarly; clang++ rejects that.

What GCC libcpp seems to be doing is that forms_identifier_p calls
_cpp_valid_utf8 or _cpp_valid_ucn with an argument which tells whether the
character is first or second+ in the identifier, and e.g. _cpp_valid_ucn then,
for UCNs valid in string literals, does
  else if (identifier_pos)
    {
      int validity = ucn_valid_in_identifier (pfile, result, nst);

      if (validity == 0)
        cpp_error (pfile, CPP_DL_ERROR,
                   "universal character %.*s is not valid in an identifier",
                   (int) (str - base), base);
      else if (validity == 2 && identifier_pos == 1)
        cpp_error (pfile, CPP_DL_ERROR,
                   "universal character %.*s is not valid at the start of an identifier",
                   (int) (str - base), base);
    }
so basically all those invalid in identifiers cases emit an error and pretend
to be valid in identifiers, rather than what e.g. _cpp_valid_utf8 does for C
but not for C++ and only for the chars completely invalid in identifiers rather
than just valid in identifiers but not at the start:
  /* In C++, this is an error for invalid character in an identifier
     because logically, the UTF-8 was converted to a UCN during
     translation phase 1 (even though we don't physically do it that
     way).  In C, this byte rather becomes grammatically a separate
     token.  */

  if (CPP_OPTION (pfile, cplusplus))
    cpp_error (pfile, CPP_DL_ERROR,
               "extended character %.*s is not valid in an identifier",
               (int) (*pstr - base), base);
  else
    {
      *pstr = base;
      return false;
    }
The comment doesn't really match what is done in recent C++ versions because
there UCNs are translated to characters and not the other way around.
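
To make the decision logic quoted from _cpp_valid_ucn easier to follow, here is a small C++ sketch of it. Everything in it is hypothetical and for illustration: the enum and function names are invented, the classifier is a toy that knows only the two code points discussed above (treating every other code point as valid is an assumption), and none of it is libcpp code. It mirrors the three validity values as the quoted code appears to use them: 0 for not valid in an identifier, 2 for valid except at the start, and any other value for valid.

```cpp
// Illustration only: hypothetical names, not libcpp code.
enum class Validity { invalid = 0, valid = 1, not_at_start = 2 };
enum class Diag { none, not_valid_in_identifier, not_valid_at_start };

// Toy stand-in for the real classification; it covers only the code points
// from the discussion above and assumes everything else is valid.
constexpr Validity toy_validity (unsigned cp)
{
  return cp == 0x00b7 ? Validity::not_at_start
         : cp == 0x066d ? Validity::invalid
         : Validity::valid;
}

// identifier_pos: 1 means first character of the identifier, 2 means later.
// Mirrors the quoted if / else if chain: an invalid character is always
// diagnosed, a "not at start" character only in position 1.
constexpr Diag diagnose (unsigned cp, int identifier_pos)
{
  return toy_validity (cp) == Validity::invalid
           ? Diag::not_valid_in_identifier
         : (toy_validity (cp) == Validity::not_at_start && identifier_pos == 1)
           ? Diag::not_valid_at_start
         : Diag::none;
}

static_assert (diagnose (0x066d, 2) == Diag::not_valid_in_identifier,
               "U+066D is diagnosed in any position");
static_assert (diagnose (0x00b7, 1) == Diag::not_valid_at_start,
               "U+00B7 is diagnosed at the start");
static_assert (diagnose (0x00b7, 2) == Diag::none,
               "U+00B7 is accepted after the start");
```

Note that the sketch only reports which diagnostic would be chosen. The point made above is what happens afterwards: in the quoted code the character is still treated as part of the identifier after the error, instead of ending the identifier and becoming a separate token.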

[Bug c++/115900] [14/15 Regression] constexpr object modification during construction gives "Modifying a const object is not allowed in a constant expression"

2024-07-17 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115900

--- Comment #8 from GCC Commits  ---
The trunk branch has been updated by Marek Polacek :

https://gcc.gnu.org/g:d890b04197fb0ddba4fbfb32f88e266fa27e02f3

commit r15-2108-gd890b04197fb0ddba4fbfb32f88e266fa27e02f3
Author: Marek Polacek 
Date:   Wed Jul 17 11:19:32 2024 -0400

c++: wrong error initializing empty class [PR115900]

In r14-409, we started handling empty bases first in cxx_fold_indirect_ref_1
so that we don't need to recurse and waste time.

This caused a bogus "modifying a const object" error.  I'm appending my
analysis from the PR, but basically, cxx_fold_indirect_ref now returns
a different object than before, and we mark the wrong thing as const,
but since we're initializing an empty object, we should avoid setting
the object constness.

~~
Pre-r14-409: we're evaluating the call to C::C(), which is in the body of
B::B(), which is the body of D::D(&d):

  C::C ((struct C *) this, NON_LVALUE_EXPR <0>)

It's a ctor so we get here:

 3118   /* Remember the object we are constructing or destructing.  */
 3119   tree new_obj = NULL_TREE;
 3120   if (DECL_CONSTRUCTOR_P (fun) || DECL_DESTRUCTOR_P (fun))
 3121     {
 3122       /* In a cdtor, it should be the first `this' argument.
 3123          At this point it has already been evaluated in the call
 3124          to cxx_bind_parameters_in_call.  */
 3125       new_obj = TREE_VEC_ELT (new_call.bindings, 0);
new_obj=(struct C *) &d.D.2656

 3126       new_obj = cxx_fold_indirect_ref (ctx, loc, DECL_CONTEXT (fun), new_obj);

new_obj=d.D.2656.D.2597

We proceed to evaluate the call, then we get here:

 3317   /* At this point, the object's constructor will have run, so
 3318      the object is no longer under construction, and its possible
 3319      'const' semantics now apply.  Make a note of this fact by
 3320      marking the CONSTRUCTOR TREE_READONLY.  */
 3321   if (new_obj && DECL_CONSTRUCTOR_P (fun))
 3322     cxx_set_object_constness (ctx, new_obj, /*readonly_p=*/true,
 3323                               non_constant_p, overflow_p);

new_obj is still d.D.2656.D.2597, its type is "C", cxx_set_object_constness
doesn't set anything as const.  This is fine.

After r14-409: on line 3125, new_obj is (struct C *) &d.D.2656 as before,
but we go to cxx_fold_indirect_ref_1:

 5739   if (is_empty_class (type)
 5740       && CLASS_TYPE_P (optype)
 5741       && lookup_base (optype, type, ba_any, NULL, tf_none, off))
 5742     {
 5743       if (empty_base)
 5744         *empty_base = true;
 5745       return op;

type is C, which is an empty class; optype is "const D", and C is a base of D.
So we return the VAR_DECL 'd'.  Then we get to cxx_set_object_constness with
object=d, which is const, so we mark the constructor READONLY.

Then we're evaluating A::A() which has

  ((A*)this)->data = 0;

we evaluate the LHS to d.D.2656.a, for which the initializer is
{.D.2656={.a={.data=}}} which is TREE_READONLY and 'd' is const, so we think
we're modifying a const object and fail the constexpr evaluation.

PR c++/115900

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Set new_obj to NULL_TREE
if cxx_fold_indirect_ref set empty_base to true.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-init23.C: New test.
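
For readers who want a concrete picture of the class layout the analysis walks through, here is a hypothetical reconstruction in C++. It is not the constexpr-init23.C test added by the commit; the class bodies are guessed from the names A, B, C, D and the member names used in the analysis, and it has not been checked here whether this exact snippet triggers the bogus error on an affected compiler. On a compiler without the regression it should be accepted, since the assignment happens while 'd' is still under construction.

```cpp
// Hypothetical reconstruction for illustration; not the committed testcase.
#include <type_traits>

struct A
{
  int data = 1;
  // Assignment in the constructor body, like ((A*)this)->data = 0 above.
  constexpr A () { data = 0; }
};

// Empty class with a constexpr constructor: the C::C() call in the analysis.
struct C
{
  constexpr C () {}
};

// B has the empty base C and the member 'a'; C::C() runs before A::A().
struct B : C
{
  A a;
  constexpr B () : C (), a () {}
};

struct D : B
{
  constexpr D () : B () {}
};

// 'd' is const, so marking it read-only too early (after C::C() returns)
// would make the later store in A::A() look like a write to a const object.
constexpr D d;

static_assert (std::is_empty<C>::value, "C is an empty class");
static_assert (d.a.data == 0, "the store in A::A() took effect");
```

The ordering is the important part of the sketch: the empty base is constructed first, so any constness set on the enclosing object at that point would be in effect while the remaining members are still being initialized.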
