Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack

2022-02-25 Thread Dan Li via Gcc-patches




On 2/11/22 07:35, Richard Sandiford wrote:

Dan Li  writes:

On 2/11/22 01:53, Richard Sandiford wrote:

Dan Li  writes:

On 2/10/22 01:55, Richard Sandiford wrote:



void f();
int g(int x) {
  if (x == 0) {
  __asm__ ("":::"x19", "x20");
  return 1;
  }
  f();
  return 2;
}

Then it seems X30 is treat as a "component" (the test
result of aarch64.exp also seems fine).

g:
  stp x19, x20, [sp, -32]!
  cbnzw0, .L2
  mov w0, 1
  ldp x19, x20, [sp], 32
  ret
.L2:
  str x30, [sp, 16]
  bl  f
  ldr x30, [sp, 16]
  mov w0, 2
  ldp x19, x20, [sp], 32
  ret

And I think maybe we could handle this through three patches:
1.Keep current patch (a V5) unchanged for scs.
2.Add shrink-warpping for X30:
logically this might be a separate topic, and I think more testing
might be needed here (Well, I'm a little worried about if there might
be other effects, since I just read this part of the code roughly
yesterday).
3.Add scs push/pop to shrink-wrapping (and maybe we can do the same for
the PAC code in pro/epilogue, since it's also the operation of the X30).


Yeah, that's fair.

(Like I said earlier, I wasn't asking for the shrink-wrapping change.
It was just a note in passing.  But as you point out, the individual
shrink-wrapping support would be even more work than I'd imagined.)



Hi, Richard,

As said before, I try to enable R30 as component[1] in
shrink-wrapping separate.

The test results of this patch in x86-64 native compile/aarch64
cross-compile are consistent with the mainline (but there are
still few use cases not working properly in my test environment).

please let me know if I'm missing something :)

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590906.html



Re: [PATCH][V3][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Qing Zhao wrote:

> Hi, Jakub and Richard:
> 
> This is the 3rd version of the patch, the major change compared to the 
> previous version are:
> 
> 1. Add warning_enabled_at guard before “suppress_warning”
> 2. Add location to the call to __builtin_clear_padding for auto init.
> 
> The patch has been bootstrapped and regress tested on both x86 and aarch64.
> Okay for trunk?
> 
> Thanks.
> 
> Qing
> 
> ==
> From 8314ded4ca0f59c5a3ec431c9c3768fcaf2a0949 Mon Sep 17 00:00:00 2001
> From: Qing Zhao 
> Date: Thu, 24 Feb 2022 22:38:38 +
> Subject: [PATCH] Suppress uninitialized warnings for new created uses from
>  __builtin_clear_padding folding [PR104550]
> 
> __builtin_clear_padding(&object) will clear all the padding bits of the 
> object.
> actually, it doesn't involve any use of an user variable. Therefore, users do
> not expect any uninitialized warning from it. It's reasonable to suppress
> uninitialized warnings for all new created uses from __builtin_clear_padding
> folding.
> 
>   PR middle-end/104550
> 
> gcc/ChangeLog:
> 
>   * gimple-fold.cc (clear_padding_flush): Suppress warnings for new
>   created uses.
>   * gimplify.cc (gimple_add_padding_init_for_auto_var): Set
>   location for new created call to __builtin_clear_padding.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/auto-init-pr104550-1.c: New test.
>   * gcc.dg/auto-init-pr104550-2.c: New test.
>   * gcc.dg/auto-init-pr104550-3.c: New test.
> ---
>  gcc/gimple-fold.cc  | 11 ++-
>  gcc/gimplify.cc |  1 +
>  gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
>  gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
>  gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
>  5 files changed, 43 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
> 
> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
> index 16f02c2d098d..67b4963ffd96 100644
> --- a/gcc/gimple-fold.cc
> +++ b/gcc/gimple-fold.cc
> @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "attribs.h"
>  #include "asan.h"
>  #include "diagnostic-core.h"
> +#include "diagnostic.h"
>  #include "intl.h"
>  #include "calls.h"
>  #include "tree-vector-builder.h"
> @@ -4379,7 +4380,15 @@ clear_padding_flush (clear_padding_struct *buf, bool 
> full)
> else
>   {
> src = make_ssa_name (type);
> -   g = gimple_build_assign (src, unshare_expr (dst));
> +   tree tmp_dst = unshare_expr (dst);
> +   /* The folding introduces a read from the tmp_dst, we should
> +  prevent uninitialized warning analysis from issuing warning
> +  for such fake read.  */
> +   if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
> +   || warning_enabled_at (buf->loc,
> +  OPT_Wmaybe_uninitialized))
> + suppress_warning (tmp_dst, OPT_Wuninitialized);
> +   g = gimple_build_assign (src, tmp_dst);

So what about just gimple_set_no_warning (g, true); ?  (sorry for
the ping-pong between us three...)

> gimple_set_location (g, buf->loc);
> gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
> tree mask = native_interpret_expr (type,
> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
> index f570daa015a5..977cf458f858 100644
> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -1823,6 +1823,7 @@ gimple_add_padding_init_for_auto_var (tree decl, bool 
> is_vla,
>  
>gimple *call = gimple_build_call (fn, 2, addr_of_decl,
>   build_one_cst (TREE_TYPE (addr_of_decl)));
> +  gimple_set_location (call, EXPR_LOCATION (decl));

I believe EXPR_LOCATION (decl) is bogus, 'decl's have DECL_LOCATION,
EXPR_LOCATION here will just return UNKNOWN_LOCATION.

>gimplify_seq_add_stmt (seq_p, call);
>  }
>  
> diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c 
> b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
> new file mode 100644
> index ..a08110c3a170
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
> @@ -0,0 +1,10 @@
> +/* PR 104550*/
> +/* { dg-do compile } */
> +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
> +struct vx_audio_level {
> + int has_monitor_level : 1;
> +};
> +
> +void vx_set_monitor_level() {
> + struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } 
> */
> +}
> diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c 
> b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
> new file mode 100644
> index ..2c395b32d322
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
>

[PATCH] AVX512F: Add helper enumeration for ternary logic intrinsics.

2022-02-25 Thread Hongyu Wang via Gcc-patches
Hi,

This patch intends to sync with llvm change in
https://reviews.llvm.org/D120307 to add enumeration and truncate
imm to unsigned char, so users could use ~ on immediates.

Bootstraped/regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for master?

gcc/ChangeLog:

* config/i386/avx512fintrin.h (_MM_TERNLOG_ENUM): New enum.
(_mm512_ternarylogic_epi64): Truncate imm to unsigned
char to avoid error when using ~enum as parameter.
(_mm512_mask_ternarylogic_epi64): Likewise.
(_mm512_maskz_ternarylogic_epi64): Likewise.
(_mm512_ternarylogic_epi32): Likewise.
(_mm512_mask_ternarylogic_epi32): Likewise.
(_mm512_maskz_ternarylogic_epi32): Likewise.
* config/i386/avx512vlintrin.h (_mm256_ternarylogic_epi64):
Adjust imm param type to unsigned char.
(_mm256_mask_ternarylogic_epi64): Likewise.
(_mm256_maskz_ternarylogic_epi64): Likewise.
(_mm256_ternarylogic_epi32): Likewise.
(_mm256_mask_ternarylogic_epi32): Likewise.
(_mm256_maskz_ternarylogic_epi32): Likewise.
(_mm_ternarylogic_epi64): Likewise.
(_mm_mask_ternarylogic_epi64): Likewise.
(_mm_maskz_ternarylogic_epi64): Likewise.
(_mm_ternarylogic_epi32): Likewise.
(_mm_mask_ternarylogic_epi32): Likewise.
(_mm_maskz_ternarylogic_epi32): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512f-vpternlogd-1.c: Use new enum.
* gcc.target/i386/avx512f-vpternlogq-1.c: Likewise.
* gcc.target/i386/avx512vl-vpternlogd-1.c: Likewise.
* gcc.target/i386/avx512vl-vpternlogq-1.c: Likewise.
* gcc.target/i386/testimm-10.c: Remove imm check for vpternlog
insns since the imm has been truncated in intrinsic.
---
 gcc/config/i386/avx512fintrin.h   | 132 ++---
 gcc/config/i386/avx512vlintrin.h  | 278 +++---
 .../gcc.target/i386/avx512f-vpternlogd-1.c|   7 +-
 .../gcc.target/i386/avx512f-vpternlogq-1.c|   7 +-
 .../gcc.target/i386/avx512vl-vpternlogd-1.c   |  13 +-
 .../gcc.target/i386/avx512vl-vpternlogq-1.c   |  14 +-
 gcc/testsuite/gcc.target/i386/testimm-10.c|   7 -
 7 files changed, 285 insertions(+), 173 deletions(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index bc10c823c76..29511fd2831 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -1639,16 +1639,27 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, 
__m128 __B,
 
 #endif
 
+/* Constant helper to represent the ternary logic operations among
+   vector A, B and C.  */
+typedef enum
+{
+  _MM_TERNLOG_A = 0xF0,
+  _MM_TERNLOG_B = 0xCC,
+  _MM_TERNLOG_C = 0xAA
+} _MM_TERNLOG_ENUM;
+
 #ifdef __OPTIMIZE__
 extern __inline __m512i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_ternarylogic_epi64 (__m512i __A, __m512i __B, __m512i __C,
   const int __imm)
 {
-  return (__m512i) __builtin_ia32_pternlogq512_mask ((__v8di) __A,
-(__v8di) __B,
-(__v8di) __C, __imm,
-(__mmask8) -1);
+  return (__m512i)
+__builtin_ia32_pternlogq512_mask ((__v8di) __A,
+ (__v8di) __B,
+ (__v8di) __C,
+ (unsigned char) __imm,
+ (__mmask8) -1);
 }
 
 extern __inline __m512i
@@ -1656,10 +1667,12 @@ __attribute__ ((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm512_mask_ternarylogic_epi64 (__m512i __A, __mmask8 __U, __m512i __B,
__m512i __C, const int __imm)
 {
-  return (__m512i) __builtin_ia32_pternlogq512_mask ((__v8di) __A,
-(__v8di) __B,
-(__v8di) __C, __imm,
-(__mmask8) __U);
+  return (__m512i)
+__builtin_ia32_pternlogq512_mask ((__v8di) __A,
+ (__v8di) __B,
+ (__v8di) __C,
+ (unsigned char) __imm,
+ (__mmask8) __U);
 }
 
 extern __inline __m512i
@@ -1667,10 +1680,12 @@ __attribute__ ((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm512_maskz_ternarylogic_epi64 (__mmask8 __U, __m512i __A, __m512i __B,
 __m512i __C, const int __imm)
 {
-  return (__m512i) __builtin_ia32_pternlogq512_maskz ((__v8di) __A,
- (__v8di) __B,
- (__v8di) __C,
- __imm, (__mmask8) __U);
+  return (__m512i)
+__builtin_ia32_pternlogq512_maskz ((__v8di) __A,
+ 

Re: [PATCH] Check if loading const from mem is faster

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Jiufu Guo wrote:

> Richard Biener  writes:
> 
> > On Thu, 24 Feb 2022, Jiufu Guo wrote:
> >
> >> Jiufu Guo via Gcc-patches  writes:
> >> 
> >> > Segher Boessenkool  writes:
> >> >
> >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote:
> >> >>> I'm assuming we're always dealing with
> >> >>> 
> >> >>>   (set (reg:MODE ..) )
> >> >>> 
> >> >>> here and CSE is not substituting into random places of an
> >> >>> instruction(?).  I don't know what 'rtx_cost' should evaluate
> >> >>> to for a constant, if it should implicitely evaluate the cost
> >> >>> of putting the result into a register for example.
> >> >>
> >> >> rtx_cost is no good here (and in most places).  rtx_cost should be 0
> >> >> for anything that is used as input in a machine instruction -- but you
> >> >> need much more context to determine that.  insn_cost is much simpler and
> >> >> much easier to use.
> >> >>
> >> >>> Using RTX_COST with SET and 1 at least looks no worse than using
> >> >>> your proposed new target hook and comparing it with the original
> >> >>> unfolded src (again with SET and 1).
> >> >>
> >> >> It is required to generate valid instructions no matter what, before
> >> >> the pass has finished that is.  On all more modern architectures it is
> >> >> futile to think you can usefully consider the cost of an RTL expression
> >> >> and derive a real-world cost of the generated code from that.
> >> >
> >> > Thanks Segher for pointing out these!  Here is  another reason that I
> >> > did not use rtx_cost: in a few passes, there are codes to check the
> >> > constants and store them in constant pool.  I'm thinking to integerate
> >> > those codes in a consistent way.
> >> 
> >> Hi Segher, Richard!
> >> 
> >> I'm thinking the way like: For a constant,
> >> 1. if the constant could be used as an immediate for the
> >> instruction, then retreated as an operand;
> >> 2. otherwise if the constant can not be stored into a
> >> constant pool, then handle through instructions;
> >> 3. if it is faster to access constant from pool, then emit
> >> constant as data(.rodata);
> >> 4. otherwise, handle the constant by instructions.
> >> 
> >> And to store the constant into a pool, besides force_const_mem,
> >> create reference (toc) may be needed on some platforms.
> >> 
> >> For this particular issue in CSE, there is already code that
> >> tries to put constant into a pool (invoke force_const_mem).
> >> While the code is too late.  So, we may check the constant
> >> earlier and store it into constant pool if profitable.
> >> 
> >> And another thing as Segher pointed out, CSE is doing too
> >> much work.  It may be ok to separate the constant handling
> >> logic from CSE.
> >
> > Not sure - CSE just is value numbering, I don't see that it does
> > more than that.  Yes, it might have developed "heuristics" over
> > the years what to CSE and to what and where to substitute and
> > where not.  But in the end it does just value numbering.
> >
> >> 
> >> I update a new version patch as follow (did not seprate CSE):
> >
> > How is the new target hook better in any way compared to rtx_cost
> > or insn_cost?  It looks like a total hack.
> >
> > I suppose the actual way of materializing a constant is done
> > behind GCCs back and not exposed anywhere?  But instead you
> > claim the constants are valid when they actually are not?
> > Isn't the problem then that the rs6000 backend lies?
> 
> Hi Richard,
> 
> Thanks for your comments and sugguestions!
> 
> Materializing a constant should be done behind GCC.
> On rs6000, in expand pass, during emit_move, the constant is
> checked and store into constant pool if necessary.
> Some other platforms are doing a similar thing, e.g.
> ix86_expand_vector_move, alpha_expand_mov,...
> mips_legitimize_const_move.
> 
> But, it does not as we expect, force_const_mem is also 
> exposed other places (not only ira/reload for stack reference).
> 
> CSE is one place, for example, CSE first retrieve the constant
> from insn's equal, but it also tries to put constant into
> pool for some condition (the condition was introduced at
> early age of cse.c, and it is rare to run into in trunk).
> In some aspects, IMHO, this seems not a great work of CSE.
> 
> And this is how the 'invalid(or say slow)' constant occurs.
> e.g.  before cse:
> 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47]
>   REG_EQUAL 0x100803004101001
> after cse: 
> 7: r119:DI=0x100803004101001
>   REG_EQUAL 0x100803004101001
> 
> As you pointed out: we can also avoid this transformation through
> rtx_cost/insn_cost by estimating the COST more accurately for
>  "r119:DI=0x100803004101001". (r119:DI=0x100803004101001 will not
> be a real final instruction.)

At which point does this become the final instruction?  I suppose
CSE belives this constant is legitimate and the insn is recognized
just fine in CSE?  (that's what I mean with "behind GCCs back")

> Is it necessary to refine this constant handling for CSE, or even
> 

Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 25, 2022 at 08:11:40AM +0100, Richard Biener via Gcc-patches wrote:
> On Thu, 24 Feb 2022, Qing Zhao wrote:
> 
> > I briefly checked all the usages of suppress_warning within the current 
> > gcc, 
> > and see that most of them are not guarded by any condition. 
> > 
> > So, the current change should be fine without introducing new issues. -:)
> > 
> > Another thing is, if we use “warning_enable_at” to guard, I just checked, 
> > this routine is not used by any routine right now, so it might be possible 
> > that this 
> > routine has some bug itself.  And now it’s the stage 4, we might need to be
> > conservative. 
> > 
> > Based on this, I think that it might be better to put the change in as it 
> > right now. 
> > If we think that all the “suppress_warning” need to be guarded by a specific
> > condition, we can do it in gcc13 earlier stage.
> > 
> > What’s your opinion?
> 
> I would agree here.  Maybe a compromise would be to simply
> set gimple_set_no_warning () on the actual stmt.  That shouldn't

I think it is set_no_warning_bit but it isn't exported from
warning-control.cc.
One can still use TREE_NO_WARNING though, but seems Martin converted
all usages of that to the suppress_warning APIs.
I thought the hash tables warning-control.cc use are based on the gimple *
and tree pointers, but apparently they aren't, they are location_t based.
So if we keep using suppress_warning (tmp_dst, OPT_Wuninitialized);
there, it actually means just that one entry will be added to the hash table
because all loads in one clear_padding_flush use the same buf->loc.
It is true that every suppress_warning will need to look it up in the hash
table.
But you're right that because the tmp_dst we've created is artificial, I
don't see a reason for any warning.  Unfortunately, even with
suppress_warning (tmp_dst, all_warnings); it looks up the hash tables
and notes stuff in there rather than just making sure the TREE_NO_WARNING
bit is set.
So let's go with the unconditional
suppress_warning (tmp_dst, OPT_Wuninitialized);
for now.

Jakub



Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Jakub Jelinek wrote:

> On Fri, Feb 25, 2022 at 08:11:40AM +0100, Richard Biener via Gcc-patches 
> wrote:
> > On Thu, 24 Feb 2022, Qing Zhao wrote:
> > 
> > > I briefly checked all the usages of suppress_warning within the current 
> > > gcc, 
> > > and see that most of them are not guarded by any condition. 
> > > 
> > > So, the current change should be fine without introducing new issues. -:)
> > > 
> > > Another thing is, if we use “warning_enable_at” to guard, I just checked, 
> > > this routine is not used by any routine right now, so it might be 
> > > possible that this 
> > > routine has some bug itself.  And now it’s the stage 4, we might need to 
> > > be
> > > conservative. 
> > > 
> > > Based on this, I think that it might be better to put the change in as it 
> > > right now. 
> > > If we think that all the “suppress_warning” need to be guarded by a 
> > > specific
> > > condition, we can do it in gcc13 earlier stage.
> > > 
> > > What’s your opinion?
> > 
> > I would agree here.  Maybe a compromise would be to simply
> > set gimple_set_no_warning () on the actual stmt.  That shouldn't
> 
> I think it is set_no_warning_bit but it isn't exported from
> warning-control.cc.

I meant

/* Set the no_warning flag of STMT to NO_WARNING.  */

static inline void
gimple_set_no_warning (gimple *stmt, bool no_warning)
{
  stmt->no_warning = (unsigned) no_warning;
}

on the artificial stmt.

> One can still use TREE_NO_WARNING though, but seems Martin converted
> all usages of that to the suppress_warning APIs.

Yep, which is why I didn't suggest that.  Note that we don't have
a great location to use - the other patch suggests DECL_LOCATION
which will then silence all other uninit warnings on the decl
which would be IMHO bad.  Since the load is artificial we won't
ever have a more specific location for that unless there's a way
to invent it.  Thus my suggestion to set the no-warning bit
on the GIMPLE stmt itself ...

Richard.

> I thought the hash tables warning-control.cc use are based on the gimple *
> and tree pointers, but apparently they aren't, they are location_t based.
> So if we keep using suppress_warning (tmp_dst, OPT_Wuninitialized);
> there, it actually means just that one entry will be added to the hash table
> because all loads in one clear_padding_flush use the same buf->loc.
> It is true that every suppress_warning will need to look it up in the hash
> table.
> But you're right that because the tmp_dst we've created is artificial, I
> don't see a reason for any warning.  Unfortunately, even with
> suppress_warning (tmp_dst, all_warnings); it looks up the hash tables
> and notes stuff in there rather than just making sure the TREE_NO_WARNING
> bit is set.
> So let's go with the unconditional
>   suppress_warning (tmp_dst, OPT_Wuninitialized);
> for now.
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 25, 2022 at 10:13:33AM +0100, Richard Biener via Gcc-patches wrote:
> I meant
> 
> /* Set the no_warning flag of STMT to NO_WARNING.  */
> 
> static inline void
> gimple_set_no_warning (gimple *stmt, bool no_warning)
> {
>   stmt->no_warning = (unsigned) no_warning;
> }
> 
> on the artificial stmt.

That seems to be the same case as TREE_NO_WARNING, all those calls
have been replaced:
$ grep -w gimple_set_no_warning *.{h,cc} */*.{h,cc}
gimple.h:gimple_set_no_warning (gimple *stmt, bool no_warning)

But perhaps it is a good idea to start using that when the location_t
isn't a good idea to use for suppression.
I think it will work.

Jakub



[PATCH] match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]

2022-02-25 Thread Jakub Jelinek via Gcc-patches
Hi!

We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types,
&/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR.
So, we should avoid simplifications which turn valid complex type
expressions into something that will ICE during expansion.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-02-24  Jakub Jelinek  

PR tree-optimization/104675
* match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for
COMPLEX_TYPE.

* gcc.dg/pr104675-1.c: New test.
* gcc.dg/pr104675-2.c: New test.

--- gcc/match.pd.jj 2022-02-23 12:03:20.552435367 +0100
+++ gcc/match.pd2022-02-24 11:55:20.205673823 +0100
@@ -2776,13 +2776,15 @@ (define_operator_list SYNC_FETCH_AND_AND
   (simplify
(minus (convert? (negate @0)) integer_each_onep)
(if (!TYPE_OVERFLOW_TRAPS (type)
+   && TREE_CODE (type) != COMPLEX_TYPE
&& tree_nop_conversion_p (type, TREE_TYPE (@0)))
 (bit_not (convert @0
 
   /* -1 - A -> ~A */
   (simplify
(minus integer_all_onesp @0)
-   (bit_not @0))
+   (if (TREE_CODE (type) != COMPLEX_TYPE)
+(bit_not @0)))
 
   /* (T)(P + A) - (T)P -> (T) A */
   (simplify
--- gcc/testsuite/gcc.dg/pr104675-1.c.jj2022-02-24 11:56:33.253653436 
+0100
+++ gcc/testsuite/gcc.dg/pr104675-1.c   2022-02-24 11:56:14.450916087 +0100
@@ -0,0 +1,29 @@
+/* PR tree-optimization/104675 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+_Complex int
+foo (_Complex int a)
+{
+  return (-1 + -1i) - a;
+}
+
+_Complex int
+bar (_Complex int a)
+{
+  return -a - (1 + 1i);
+}
+
+_Complex int
+baz (_Complex int a)
+{
+  _Complex int b = -1 + -1i;
+  return b - a;
+}
+
+_Complex int
+qux (_Complex int a)
+{
+  _Complex int b = 1 + 1i;
+  return -a - b;
+}
--- gcc/testsuite/gcc.dg/pr104675-2.c.jj2022-02-24 11:57:09.200151312 
+0100
+++ gcc/testsuite/gcc.dg/pr104675-2.c   2022-02-24 11:57:04.646214924 +0100
@@ -0,0 +1,18 @@
+/* PR tree-optimization/104675 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void baz (int i);
+
+void
+bar (_Complex int c, short s)
+{
+  c -= s;
+  baz (__real__ c + __imag__ c);
+}
+
+void
+foo (void)
+{
+  bar (-1 - 1i, 0);
+}

Jakub



Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Jakub Jelinek wrote:

> On Fri, Feb 25, 2022 at 10:13:33AM +0100, Richard Biener via Gcc-patches 
> wrote:
> > I meant
> > 
> > /* Set the no_warning flag of STMT to NO_WARNING.  */
> > 
> > static inline void
> > gimple_set_no_warning (gimple *stmt, bool no_warning)
> > {
> >   stmt->no_warning = (unsigned) no_warning;
> > }
> > 
> > on the artificial stmt.
> 
> That seems to be the same case as TREE_NO_WARNING, all those calls
> have been replaced:
> $ grep -w gimple_set_no_warning *.{h,cc} */*.{h,cc}
> gimple.h:gimple_set_no_warning (gimple *stmt, bool no_warning)
> 
> But perhaps it is a good idea to start using that when the location_t
> isn't a good idea to use for suppression.

I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent"
a creative location for the artificial stmt it will of course
affect other stmts/expressions using that location.

> I think it will work.

Yes, I think so.  OTOH the uninit pass does

  /* Avoid warning if we've already done so or if the warning has been
 suppressed.  */
  if (((warning_suppressed_p (context, OPT_Wuninitialized)
|| (gimple_assign_single_p (context)
&& get_no_uninit_warning (gimple_assign_rhs1 (context)
  || (var && get_no_uninit_warning (var))
  || (var_name_str
  && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized)))
return;

that's a mightly complicated way to test and I'm not sure we get
to the bit on the stmt reliably.  So maybe TREE_NO_WARNING on the
reference (or making sure it has UNKNOWN_LOCATION and using
suppress_warning on it) is a better idea after all...

Richard.


Re: [PATCH] match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Jakub Jelinek wrote:

> Hi!
> 
> We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types,
> &/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR.
> So, we should avoid simplifications which turn valid complex type
> expressions into something that will ICE during expansion.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK (also for backports when applicable).

Richard.

> 2022-02-24  Jakub Jelinek  
> 
>   PR tree-optimization/104675
>   * match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for
>   COMPLEX_TYPE.
> 
>   * gcc.dg/pr104675-1.c: New test.
>   * gcc.dg/pr104675-2.c: New test.
> 
> --- gcc/match.pd.jj   2022-02-23 12:03:20.552435367 +0100
> +++ gcc/match.pd  2022-02-24 11:55:20.205673823 +0100
> @@ -2776,13 +2776,15 @@ (define_operator_list SYNC_FETCH_AND_AND
>(simplify
> (minus (convert? (negate @0)) integer_each_onep)
> (if (!TYPE_OVERFLOW_TRAPS (type)
> + && TREE_CODE (type) != COMPLEX_TYPE
>   && tree_nop_conversion_p (type, TREE_TYPE (@0)))
>  (bit_not (convert @0
>  
>/* -1 - A -> ~A */
>(simplify
> (minus integer_all_onesp @0)
> -   (bit_not @0))
> +   (if (TREE_CODE (type) != COMPLEX_TYPE)
> +(bit_not @0)))
>  
>/* (T)(P + A) - (T)P -> (T) A */
>(simplify
> --- gcc/testsuite/gcc.dg/pr104675-1.c.jj  2022-02-24 11:56:33.253653436 
> +0100
> +++ gcc/testsuite/gcc.dg/pr104675-1.c 2022-02-24 11:56:14.450916087 +0100
> @@ -0,0 +1,29 @@
> +/* PR tree-optimization/104675 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +_Complex int
> +foo (_Complex int a)
> +{
> +  return (-1 + -1i) - a;
> +}
> +
> +_Complex int
> +bar (_Complex int a)
> +{
> +  return -a - (1 + 1i);
> +}
> +
> +_Complex int
> +baz (_Complex int a)
> +{
> +  _Complex int b = -1 + -1i;
> +  return b - a;
> +}
> +
> +_Complex int
> +qux (_Complex int a)
> +{
> +  _Complex int b = 1 + 1i;
> +  return -a - b;
> +}
> --- gcc/testsuite/gcc.dg/pr104675-2.c.jj  2022-02-24 11:57:09.200151312 
> +0100
> +++ gcc/testsuite/gcc.dg/pr104675-2.c 2022-02-24 11:57:04.646214924 +0100
> @@ -0,0 +1,18 @@
> +/* PR tree-optimization/104675 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +void baz (int i);
> +
> +void
> +bar (_Complex int c, short s)
> +{
> +  c -= s;
> +  baz (__real__ c + __imag__ c);
> +}
> +
> +void
> +foo (void)
> +{
> +  bar (-1 - 1i, 0);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


[PATCH] i386: Use a new temp slot kind for splitter to floatdi2_i387_with_xmm [PR104674]

2022-02-25 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, the following testcase is miscompiled for similar
reasons as the already fixed PR78791 - we use SLOT_TEMP slots in various
places during expansion and during expansion we can guarantee that the
lifetime of those temporary slot doesn't overlap.  But the following
splitter uses SLOT_TEMP too and in between expansion and split1 there is
a possibility that something extends the lifetime of SLOT_TEMP created
slots across an instruction that will be split by this splitter.

The following patch fixes it by using a new temp slot kind to make sure
it doesn't reuse a SLOT_TEMP that could be live across the instruction.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-02-24  Jakub Jelinek  

PR target/104674
* config/i386/i386.h (enum ix86_stack_slot): Add SLOT_FLOATxFDI_387.
* config/i386/i386.md (splitter to floatdi2_i387_with_xmm): Use
SLOT_FLOATxFDI_387 rather than SLOT_TEMP.

* gcc.target/i386/pr104674.c: New test.

--- gcc/config/i386/i386.h.jj   2022-01-18 11:58:59.118988685 +0100
+++ gcc/config/i386/i386.h  2022-02-24 13:47:02.809289843 +0100
@@ -2414,6 +2414,7 @@ enum ix86_stack_slot
   SLOT_CW_FLOOR,
   SLOT_CW_CEIL,
   SLOT_STV_TEMP,
+  SLOT_FLOATxFDI_387,
   MAX_386_STACK_LOCALS
 };
 
--- gcc/config/i386/i386.md.jj  2022-02-12 11:17:35.14347 +0100
+++ gcc/config/i386/i386.md 2022-02-24 13:35:57.832561968 +0100
@@ -5412,9 +5412,8 @@ (define_split
&& can_create_pseudo_p ()"
   [(const_int 0)]
 {
-  emit_insn (gen_floatdi2_i387_with_xmm
-(operands[0], operands[1],
- assign_386_stack_local (DImode, SLOT_TEMP)));
+  rtx s = assign_386_stack_local (DImode, SLOT_FLOATxFDI_387);
+  emit_insn (gen_floatdi2_i387_with_xmm (operands[0], operands[1], s));
   DONE;
 })
 
--- gcc/testsuite/gcc.target/i386/pr104674.c.jj 2022-02-24 13:45:31.630561330 
+0100
+++ gcc/testsuite/gcc.target/i386/pr104674.c2022-02-24 13:45:21.278705687 
+0100
@@ -0,0 +1,31 @@
+/* PR target/104674 */
+/* { dg-do run { target sse2_runtime } } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+
+__attribute__((noipa)) double
+bar (double x, double y)
+{
+  return x + y;
+}
+
+__attribute__((noipa)) double
+foo (long long x)
+{
+  long long a = x / 1000;
+  int b = x % 1000;
+  double s = (double) a;
+  double n = (double) b / 1e7;
+  double t = s + n;
+  if (t == s + 1.0)
+t = bar (t, s);
+  return t;
+}
+
+int
+main ()
+{
+  long long n = 88;
+  n = n * 1000;
+  if (foo (n) != 88.0)
+__builtin_abort ();
+}

Jakub



[PATCH] internal-fn: Call do_pending_stack_adjust in expand_SPACESHIP [PR104679]

2022-02-25 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase is miscompiled on ia32 at -O2, because
when expand_SPACESHIP is called, we have pending stack adjustment
from the foo call right before it.
Now, ix86_expand_fp_spaceship uses emit_jump_insn several times
but then emit_jump also several times.  While emit_jump_insn doesn't
do do_pending_stack_adjust (), emit_jump does, so we end up with:
...
8: call [`_Z3foodl'] argc:0x10
  REG_CALL_DECL `_Z3foodl'
9: r88:DF=[`a']
   10: r89:HI=unspec[cmp(r88:DF,0.0)] 25
   11: flags:CC=unspec[r89:HI] 26
   12: pc={(unordered(flags:CCFP,0))?L27:pc}
  REG_BR_PROB 536868
   66: NOTE_INSN_BASIC_BLOCK 4
   13: pc={(uneq(flags:CCFP,0))?L19:pc}
  REG_BR_PROB 214748364
   67: NOTE_INSN_BASIC_BLOCK 5
   14: pc={(flags:CCFP>0)?L23:pc}
  REG_BR_PROB 536870916
   68: NOTE_INSN_BASIC_BLOCK 6
   15: r86:SI=0x
   16: {sp:SI=sp:SI+0x10;clobber flags:CC;}
  REG_ARGS_SIZE 0
   17: pc=L29
   18: barrier
   19: L19:
   69: NOTE_INSN_BASIC_BLOCK 7
...
The sp += 16 pending stuck adjust was emitted in the middle of the
sequence and is effective only for the single case of the 4 possibilities
where .SPACESHIP returns -1, in all other cases the stack isn't adjusted
and so we ICE during dwarf2cfi.

Now, we could either call do_pending_stack_adjust in
ix86_expand_fp_spaceship, or use there calls that actually don't call
do_pending_stack_adjust (but having the stack adjustment across branches is
generally undesirable), or we can call it in expand_SPACESHIP for all
targets (note, just i386 currently implements it).
I chose the generic code because e.g. expand_{addsub,neg,mul}_overflow
in the same file also call do_pending_stack_adjust in internal-fn.cc for the
same reasons, that it is expected that most if not all targets will expand
those through jumps and we don't want all of the targets to need to deal
with that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Or do you prefer it in ix86_expand_fp_spaceship instead?

2022-02-24  Jakub Jelinek  

PR middle-end/104679
* internal-fn.cc (expand_SPACESHIP): Call do_pending_stack_adjust.

* g++.dg/torture/pr104679.C: New test.

--- gcc/internal-fn.cc.jj   2022-02-11 13:51:07.757597854 +0100
+++ gcc/internal-fn.cc  2022-02-24 17:46:01.476722703 +0100
@@ -4437,6 +4437,8 @@ expand_SPACESHIP (internal_fn, gcall *st
   tree rhs2 = gimple_call_arg (stmt, 1);
   tree type = TREE_TYPE (rhs1);
 
+  do_pending_stack_adjust ();
+
   rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   rtx op1 = expand_normal (rhs1);
   rtx op2 = expand_normal (rhs2);
--- gcc/testsuite/g++.dg/torture/pr104679.C.jj  2022-02-24 17:48:08.309959261 
+0100
+++ gcc/testsuite/g++.dg/torture/pr104679.C 2022-02-24 17:48:00.226071655 
+0100
@@ -0,0 +1,22 @@
+// PR middle-end/104679
+// { dg-do compile }
+
+struct A { ~A (); };
+void foo (double, long);
+void bar ();
+double a;
+long b;
+
+void
+baz ()
+{
+  foo (a, b);
+  if (a == 0.0)
+;
+  else
+while (a > 0.0)
+  {
+A c;
+bar ();
+  }
+}

Jakub



Re: [PATCH] internal-fn: Call do_pending_stack_adjust in expand_SPACESHIP [PR104679]

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled on ia32 at -O2, because
> when expand_SPACESHIP is called, we have pending stack adjustment
> from the foo call right before it.
> Now, ix86_expand_fp_spaceship uses emit_jump_insn several times
> but then emit_jump also several times.  While emit_jump_insn doesn't
> do do_pending_stack_adjust (), emit_jump does, so we end up with:
> ...
> 8: call [`_Z3foodl'] argc:0x10
>   REG_CALL_DECL `_Z3foodl'
> 9: r88:DF=[`a']
>10: r89:HI=unspec[cmp(r88:DF,0.0)] 25
>11: flags:CC=unspec[r89:HI] 26
>12: pc={(unordered(flags:CCFP,0))?L27:pc}
>   REG_BR_PROB 536868
>66: NOTE_INSN_BASIC_BLOCK 4
>13: pc={(uneq(flags:CCFP,0))?L19:pc}
>   REG_BR_PROB 214748364
>67: NOTE_INSN_BASIC_BLOCK 5
>14: pc={(flags:CCFP>0)?L23:pc}
>   REG_BR_PROB 536870916
>68: NOTE_INSN_BASIC_BLOCK 6
>15: r86:SI=0x
>16: {sp:SI=sp:SI+0x10;clobber flags:CC;}
>   REG_ARGS_SIZE 0
>17: pc=L29
>18: barrier
>19: L19:
>69: NOTE_INSN_BASIC_BLOCK 7
> ...
> The sp += 16 pending stuck adjust was emitted in the middle of the
> sequence and is effective only for the single case of the 4 possibilities
> where .SPACESHIP returns -1, in all other cases the stack isn't adjusted
> and so we ICE during dwarf2cfi.
> 
> Now, we could either call do_pending_stack_adjust in
> ix86_expand_fp_spaceship, or use there calls that actually don't call
> do_pending_stack_adjust (but having the stack adjustment across branches is
> generally undesirable), or we can call it in expand_SPACESHIP for all
> targets (note, just i386 currently implements it).
> I chose the generic code because e.g. expand_{addsub,neg,mul}_overflow
> in the same file also call do_pending_stack_adjust in internal-fn.cc for the
> same reasons, that it is expected that most if not all targets will expand
> those through jumps and we don't want all of the targets to need to deal
> with that.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> Or do you prefer it in ix86_expand_fp_spaceship instead?

No, as you say others will likely repeat the mistake otherwise.

Richard.

> 2022-02-24  Jakub Jelinek  
> 
>   PR middle-end/104679
>   * internal-fn.cc (expand_SPACESHIP): Call do_pending_stack_adjust.
> 
>   * g++.dg/torture/pr104679.C: New test.
> 
> --- gcc/internal-fn.cc.jj 2022-02-11 13:51:07.757597854 +0100
> +++ gcc/internal-fn.cc2022-02-24 17:46:01.476722703 +0100
> @@ -4437,6 +4437,8 @@ expand_SPACESHIP (internal_fn, gcall *st
>tree rhs2 = gimple_call_arg (stmt, 1);
>tree type = TREE_TYPE (rhs1);
>  
> +  do_pending_stack_adjust ();
> +
>rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>rtx op1 = expand_normal (rhs1);
>rtx op2 = expand_normal (rhs2);
> --- gcc/testsuite/g++.dg/torture/pr104679.C.jj2022-02-24 
> 17:48:08.309959261 +0100
> +++ gcc/testsuite/g++.dg/torture/pr104679.C   2022-02-24 17:48:00.226071655 
> +0100
> @@ -0,0 +1,22 @@
> +// PR middle-end/104679
> +// { dg-do compile }
> +
> +struct A { ~A (); };
> +void foo (double, long);
> +void bar ();
> +double a;
> +long b;
> +
> +void
> +baz ()
> +{
> +  foo (a, b);
> +  if (a == 0.0)
> +;
> +  else
> +while (a > 0.0)
> +  {
> +A c;
> +bar ();
> +  }
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


[committed] warning-control: Comment spelling fix

2022-02-25 Thread Jakub Jelinek via Gcc-patches
Hi!

This fixes a spelling mistake I found while looking at warning-control
implementation.

Committed as obvious.

2022-02-25  Jakub Jelinek  

* warning-control.cc (get_nowarn_spec): Comment spelling fix.

--- gcc/warning-control.cc.jj   2022-01-17 18:05:04.861306618 +0100
+++ gcc/warning-control.cc  2022-02-25 10:21:44.231071154 +0100
@@ -98,7 +98,7 @@ get_nowarn_spec (const_tree expr)
   return nowarn_map ? nowarn_map->get (loc) : NULL;
 }
 
-/* Return the no-warning bitmap for stateemt STMT.  */
+/* Return the no-warning bitmap for statement STMT.  */
 
 static nowarn_spec_t *
 get_nowarn_spec (const gimple *stmt)


Jakub



Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote:
> I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent"
> a creative location for the artificial stmt it will of course
> affect other stmts/expressions using that location.
> 
> > I think it will work.
> 
> Yes, I think so.  OTOH the uninit pass does
> 
>   /* Avoid warning if we've already done so or if the warning has been
>  suppressed.  */
>   if (((warning_suppressed_p (context, OPT_Wuninitialized)
> || (gimple_assign_single_p (context)
> && get_no_uninit_warning (gimple_assign_rhs1 (context)
>   || (var && get_no_uninit_warning (var))
>   || (var_name_str
>   && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized)))
> return;
> 
> that's a mightly complicated way to test and I'm not sure we get
> to the bit on the stmt reliably.  So maybe TREE_NO_WARNING on the
> reference (or making sure it has UNKNOWN_LOCATION and using
> suppress_warning on it) is a better idea after all...

So
SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
suppress_warning (tmp_dst, OPT_Wuninitialized);
with a comment explaing why we do that?
LGTM.

Jakub



Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Jakub Jelinek wrote:

> On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote:
> > I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent"
> > a creative location for the artificial stmt it will of course
> > affect other stmts/expressions using that location.
> > 
> > > I think it will work.
> > 
> > Yes, I think so.  OTOH the uninit pass does
> > 
> >   /* Avoid warning if we've already done so or if the warning has been
> >  suppressed.  */
> >   if (((warning_suppressed_p (context, OPT_Wuninitialized)
> > || (gimple_assign_single_p (context)
> > && get_no_uninit_warning (gimple_assign_rhs1 (context)
> >   || (var && get_no_uninit_warning (var))
> >   || (var_name_str
> >   && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized)))
> > return;
> > 
> > that's a mightly complicated way to test and I'm not sure we get
> > to the bit on the stmt reliably.  So maybe TREE_NO_WARNING on the
> > reference (or making sure it has UNKNOWN_LOCATION and using
> > suppress_warning on it) is a better idea after all...
> 
> So
>   SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
>   suppress_warning (tmp_dst, OPT_Wuninitialized);
> with a comment explaing why we do that?

Yes.  Depending on the context tmp_dst will already reliably have
UNKNOWN_LOCATION?  If tmp_dst might be a plain decl we have to
be more careful of course.

Richard.

> LGTM.
> 
>   Jakub


Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 25, 2022 at 11:12:13AM +0100, Richard Biener wrote:
> On Fri, 25 Feb 2022, Jakub Jelinek wrote:
> 
> > On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote:
> > > I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent"
> > > a creative location for the artificial stmt it will of course
> > > affect other stmts/expressions using that location.
> > > 
> > > > I think it will work.
> > > 
> > > Yes, I think so.  OTOH the uninit pass does
> > > 
> > >   /* Avoid warning if we've already done so or if the warning has been
> > >  suppressed.  */
> > >   if (((warning_suppressed_p (context, OPT_Wuninitialized)
> > > || (gimple_assign_single_p (context)
> > > && get_no_uninit_warning (gimple_assign_rhs1 (context)
> > >   || (var && get_no_uninit_warning (var))
> > >   || (var_name_str
> > >   && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized)))
> > > return;
> > > 
> > > that's a mightly complicated way to test and I'm not sure we get
> > > to the bit on the stmt reliably.  So maybe TREE_NO_WARNING on the
> > > reference (or making sure it has UNKNOWN_LOCATION and using
> > > suppress_warning on it) is a better idea after all...
> > 
> > So
> > SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
> > suppress_warning (tmp_dst, OPT_Wuninitialized);
> > with a comment explaing why we do that?
> 
> Yes.  Depending on the context tmp_dst will already reliably have
> UNKNOWN_LOCATION?  If tmp_dst might be a plain decl we have to
> be more careful of course.

tmp_dst is:
  tree dst = build2_loc (buf->loc, MEM_REF, atype, buf->base,
 build_int_cst (buf->alias_type, off));
...
tmp_dst = unshare_expr (dst);
We could build2 the dst and set its location only after unshare_expr
too, but I guess clearing it with a comment on why we do that is better.

Jakub



Re: [PATCH] i386: Use a new temp slot kind for splitter to floatdi2_i387_with_xmm [PR104674]

2022-02-25 Thread Uros Bizjak via Gcc-patches
On Fri, Feb 25, 2022 at 10:33 AM Jakub Jelinek  wrote:
>
> Hi!
>
> As mentioned in the PR, the following testcase is miscompiled for similar
> reasons as the already fixed PR78791 - we use SLOT_TEMP slots in various
> places during expansion and during expansion we can guarantee that the
> lifetime of those temporary slot doesn't overlap.  But the following
> splitter uses SLOT_TEMP too and in between expansion and split1 there is
> a possibility that something extends the lifetime of SLOT_TEMP created
> slots across an instruction that will be split by this splitter.
>
> The following patch fixes it by using a new temp slot kind to make sure
> it doesn't reuse a SLOT_TEMP that could be live across the instruction.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-02-24  Jakub Jelinek  
>
> PR target/104674
> * config/i386/i386.h (enum ix86_stack_slot): Add SLOT_FLOATxFDI_387.
> * config/i386/i386.md (splitter to floatdi2_i387_with_xmm): Use
> SLOT_FLOATxFDI_387 rather than SLOT_TEMP.
>
> * gcc.target/i386/pr104674.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/i386.h.jj   2022-01-18 11:58:59.118988685 +0100
> +++ gcc/config/i386/i386.h  2022-02-24 13:47:02.809289843 +0100
> @@ -2414,6 +2414,7 @@ enum ix86_stack_slot
>SLOT_CW_FLOOR,
>SLOT_CW_CEIL,
>SLOT_STV_TEMP,
> +  SLOT_FLOATxFDI_387,
>MAX_386_STACK_LOCALS
>  };
>
> --- gcc/config/i386/i386.md.jj  2022-02-12 11:17:35.14347 +0100
> +++ gcc/config/i386/i386.md 2022-02-24 13:35:57.832561968 +0100
> @@ -5412,9 +5412,8 @@ (define_split
> && can_create_pseudo_p ()"
>[(const_int 0)]
>  {
> -  emit_insn (gen_floatdi2_i387_with_xmm
> -(operands[0], operands[1],
> - assign_386_stack_local (DImode, SLOT_TEMP)));
> +  rtx s = assign_386_stack_local (DImode, SLOT_FLOATxFDI_387);
> +  emit_insn (gen_floatdi2_i387_with_xmm (operands[0], operands[1], s));
>DONE;
>  })
>
> --- gcc/testsuite/gcc.target/i386/pr104674.c.jj 2022-02-24 13:45:31.630561330 
> +0100
> +++ gcc/testsuite/gcc.target/i386/pr104674.c2022-02-24 13:45:21.278705687 
> +0100
> @@ -0,0 +1,31 @@
> +/* PR target/104674 */
> +/* { dg-do run { target sse2_runtime } } */
> +/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
> +
> +__attribute__((noipa)) double
> +bar (double x, double y)
> +{
> +  return x + y;
> +}
> +
> +__attribute__((noipa)) double
> +foo (long long x)
> +{
> +  long long a = x / 1000;
> +  int b = x % 1000;
> +  double s = (double) a;
> +  double n = (double) b / 1e7;
> +  double t = s + n;
> +  if (t == s + 1.0)
> +t = bar (t, s);
> +  return t;
> +}
> +
> +int
> +main ()
> +{
> +  long long n = 88;
> +  n = n * 1000;
> +  if (foo (n) != 88.0)
> +__builtin_abort ();
> +}
>
> Jakub
>


[PATCH] tree-optimization/103037 - PRE simplifying valueized expressions

2022-02-25 Thread Richard Biener via Gcc-patches
This fixes a long-standing issue in PRE where we track valueized
expressions in our expression sets that we use for PHI translation,
code insertion but also feed into match-and-simplify via
vn_nary_simplify.  But that's not what is expected from vn_nary_simplify
or match-and-simplify which assume we are simplifying with operands
available at the point of the expression so they can use contextual
information on the SSA names like ranges.  While the VN side was
updated to ensure this with the rewrite to RPO VN, thereby removing
all workarounds that nullified such contextual info on all SSA names,
the PRE side still suffers from this.

The following patch tries to apply minimal surgery at this point
and makes PRE track un-valueized expressions in the expression sets
but only for the NARY kind (both NAME and CONSTANT do not suffer
from this issue), leaving the REFERENCE kind alone.  The REFERENCE
kind is important when trying to remove the workarounds still in
place in compute_avail for code hoisting, but that's a separate issue
and we have a working workaround in place.

Doing this comes at the cost of duplicating the VN IL on the PRE side
for NARY and eventually some extra overhead for translated expressions
that is difficult to assess.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

2022-02-25  Richard Biener  

PR tree-optimization/103037
* tree-ssa-sccvn.h (alloc_vn_nary_op_noinit): Declare.
(vn_nary_length_from_stmt): Likewise.
(init_vn_nary_op_from_stmt): Likewise.
(vn_nary_op_compute_hash): Likewise.
* tree-ssa-sccvn.cc (alloc_vn_nary_op_noinit): Export.
(vn_nary_length_from_stmt): Likewise.
(init_vn_nary_op_from_stmt): Likewise.
(vn_nary_op_compute_hash): Likewise.
* tree-ssa-pre.cc (pre_expr_obstack): New obstack.
(get_or_alloc_expr_for_nary): Pass in the value-id to use,
(re-)compute the hash value and if the expression is not
found allocate it from pre_expr_obstack.
(phi_translate_1): Do not insert the NARY found in the
VN tables but build a PRE expression from the valueized
NARY with the value-id we eventually found.
(find_or_generate_expression): Assert we have an entry
for constant values.
(compute_avail): Insert not valueized expressions into
EXP_GEN using the value-id from the VN tables.
(init_pre): Allocate pre_expr_obstack.
(fini_pre): Free pre_expr_obstack.

* gcc.dg/torture/pr103037.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr103037.c | 23 +
 gcc/tree-ssa-pre.cc | 43 +
 gcc/tree-ssa-sccvn.cc   | 19 ++-
 gcc/tree-ssa-sccvn.h|  4 +++
 4 files changed, 65 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr103037.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr103037.c 
b/gcc/testsuite/gcc.dg/torture/pr103037.c
new file mode 100644
index 000..8b3bb1e4c8b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr103037.c
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+
+static inline const unsigned short *
+min(unsigned short *d, const unsigned short *e)
+{
+  return *e < *d ? e : d;
+}
+
+unsigned short __attribute__((noipa))
+test(unsigned short arr, unsigned short val)
+{
+  unsigned short tem = 1;
+  unsigned short tem2 = *min(&arr, &tem);
+  return tem2 / (arr ? arr : val);
+}
+
+int
+main()
+{
+  if (test (2, 2) != 0)
+__builtin_abort();
+  return 0;
+}
diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc
index 7c2a52e1850..4138511cab4 100644
--- a/gcc/tree-ssa-pre.cc
+++ b/gcc/tree-ssa-pre.cc
@@ -323,6 +323,7 @@ static unsigned int next_expression_id;
 static vec expressions;
 static hash_table *expression_to_id;
 static vec name_to_id;
+static obstack pre_expr_obstack;
 
 /* Allocate an expression id for EXPR.  */
 
@@ -433,7 +434,7 @@ get_or_alloc_expr_for_name (tree name)
 /* Given an NARY, get or create a pre_expr to represent it.  */
 
 static pre_expr
-get_or_alloc_expr_for_nary (vn_nary_op_t nary,
+get_or_alloc_expr_for_nary (vn_nary_op_t nary, unsigned value_id,
location_t loc = UNKNOWN_LOCATION)
 {
   struct pre_expr_d expr;
@@ -442,6 +443,7 @@ get_or_alloc_expr_for_nary (vn_nary_op_t nary,
 
   expr.kind = NARY;
   expr.id = 0;
+  nary->hashcode = vn_nary_op_compute_hash (nary);
   PRE_EXPR_NARY (&expr) = nary;
   result_id = lookup_expression_id (&expr);
   if (result_id != 0)
@@ -450,8 +452,10 @@ get_or_alloc_expr_for_nary (vn_nary_op_t nary,
   result = pre_expr_pool.allocate ();
   result->kind = NARY;
   result->loc = loc;
-  result->value_id = nary->value_id;
-  PRE_EXPR_NARY (result) = nary;
+  result->value_id = value_id ? value_id : get_next_value_id ();
+  PRE_EXPR_NARY (result)
+= alloc_vn_nary_op_noinit (nary->length, &pre_expr_obstack);
+  memcpy (PRE_EXPR_NARY (result), nary, sizeof_vn_nary_op (nary->length));

[PATCH] s390: Change SET rtx_cost handling.

2022-02-25 Thread Robin Dapp via Gcc-patches
Hi,

the IF_THEN_ELSE detection currently prevents us from properly costing
register-register moves which causes the lower-subreg pass to assume
that a VR-VR move is as expensive as two GPR-GPR moves.

This patch adds handling for SETs containing REGs as well as MEMs and is
inspired by the aarch64 implementation.

Bootstrapped and regtested on z900 up to z15. Is it OK?

Regards
 Robin

--

gcc/ChangeLog:

* config/s390/s390.cc (s390_address_cost): Declare.
(s390_hard_regno_nregs): Declare.
(s390_rtx_costs): Add handling for REG and MEM in SET.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New
test.


>From 8c4c6f029dbf0c9db12c2877189a5ec0ce0a9c89 Mon Sep 17 00:00:00 2001
From: Robin Dapp 
Date: Thu, 3 Feb 2022 12:50:04 +0100
Subject: [PATCH] s390: Change SET rtx_cost handling.

The IF_THEN_ELSE detection currently prevents us from properly costing
register-register moves which causes the lower-subreg pass to assume that
a VR-VR move is as expensive as two GPR-GPR moves.

This patch adds handling for SETs containing REGs as well as MEMs and is
inspired by the aarch64 implementation.

gcc/ChangeLog:

* config/s390/s390.cc (s390_address_cost): Declare.
(s390_hard_regno_nregs): Declare.
(s390_rtx_costs): Add handling for REG and MEM in SET.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New
test.
---
 gcc/config/s390/s390.cc   | 140 +-
 .../vector/vec-sum-across-no-lower-subreg-1.c |  17 +++
 2 files changed, 118 insertions(+), 39 deletions(-)
 create mode 100644
gcc/testsuite/gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index d2af6d8813d..e647c90ab29 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -429,6 +429,14 @@ struct s390_address
bytes on a z10 (or higher) CPU.  */
 #define PREDICT_DISTANCE (TARGET_Z10 ? 384 : 2048)

+static int
+s390_address_cost (rtx addr, machine_mode mode ATTRIBUTE_UNUSED,
+  addr_space_t as ATTRIBUTE_UNUSED,
+  bool speed ATTRIBUTE_UNUSED);
+
+static unsigned int
+s390_hard_regno_nregs (unsigned int regno, machine_mode mode);
+
 /* Masks per jump target register indicating which thunk need to be
generated.  */
 static GTY(()) int indirect_branch_prez10thunk_mask = 0;
@@ -3619,50 +3627,104 @@ s390_rtx_costs (rtx x, machine_mode mode, int
outer_code,
 case MEM:
   *total = 0;
   return true;
-
 case SET:
-  {
-   /* Without this a conditional move instruction would be
-  accounted as 3 * COSTS_N_INSNS (set, if_then_else,
-  comparison operator).  That's a bit pessimistic.  */
+   {
+ rtx dest = SET_DEST (x);
+ rtx src = SET_SRC (x);

-   if (!TARGET_Z196 || GET_CODE (SET_SRC (x)) != IF_THEN_ELSE)
- return false;
+ switch (GET_CODE (src))
+   {
+   case IF_THEN_ELSE:
+   {
+ /* Without this a conditional move instruction would be
+accounted as 3 * COSTS_N_INSNS (set, if_then_else,
+comparison operator).  That's a bit pessimistic.  */
+
+ if (!TARGET_Z196)
+   return false;
+
+ rtx cond = XEXP (src, 0);
+ if (!CC_REG_P (XEXP (cond, 0)) || !CONST_INT_P (XEXP (cond, 
1)))
+   return false;
+
+ /* It is going to be a load/store on condition.  Make it
+slightly more expensive than a normal load.  */
+ *total = COSTS_N_INSNS (1) + 2;
+
+ rtx dst = SET_DEST (src);
+ rtx then = XEXP (src, 1);
+ rtx els = XEXP (src, 2);
+
+ /* It is a real IF-THEN-ELSE.  An additional move will be
+needed to implement that.  */
+ if (!TARGET_Z15
+ && reload_completed
+ && !rtx_equal_p (dst, then)
+ && !rtx_equal_p (dst, els))
+   *total += COSTS_N_INSNS (1) / 2;
+
+ /* A minor penalty for constants we cannot directly handle.  
*/
+ if ((CONST_INT_P (then) || CONST_INT_P (els))
+ && (!TARGET_Z13 || MEM_P (dst)
+ || (CONST_INT_P (then) && !satisfies_constraint_K 
(then))
+ || (CONST_INT_P (els) && !satisfies_constraint_K 
(els
+   *total += COSTS_N_INSNS (1) / 2;
+
+ /* A store on condition can only handle register src 
operands.  */
+ if (MEM_P (dst) && (!REG_P (then) || !REG_P (els)))
+   *total += COSTS_N_INSNS (1) / 2;
+
+ return true;
+   }
+   default:
+ break;
+   }

-   rtx cond = XEXP (SET_SRC (x), 0);

Re: [PATCH] tree-optimization/103037 - PRE simplifying valueized expressions

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Richard Biener wrote:

> This fixes a long-standing issue in PRE where we track valueized
> expressions in our expression sets that we use for PHI translation,
> code insertion but also feed into match-and-simplify via
> vn_nary_simplify.  But that's not what is expected from vn_nary_simplify
> or match-and-simplify which assume we are simplifying with operands
> available at the point of the expression so they can use contextual
> information on the SSA names like ranges.  While the VN side was
> updated to ensure this with the rewrite to RPO VN, thereby removing
> all workarounds that nullified such contextual info on all SSA names,
> the PRE side still suffers from this.
> 
> The following patch tries to apply minimal surgery at this point
> and makes PRE track un-valueized expressions in the expression sets
> but only for the NARY kind (both NAME and CONSTANT do not suffer
> from this issue), leaving the REFERENCE kind alone.  The REFERENCE
> kind is important when trying to remove the workarounds still in
> place in compute_avail for code hoisting, but that's a separate issue
> and we have a working workaround in place.
> 
> Doing this comes at the cost of duplicating the VN IL on the PRE side
> for NARY and eventually some extra overhead for translated expressions
> that is difficult to assess.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu.

I've attached the wrong version - the following is the cleaned up
and minimal variant I tested and will push.

Richard.

2022-02-25  Richard Biener  

PR tree-optimization/103037
* tree-ssa-sccvn.h (alloc_vn_nary_op_noinit): Declare.
(vn_nary_length_from_stmt): Likewise.
(init_vn_nary_op_from_stmt): Likewise.
(vn_nary_op_compute_hash): Likewise.
* tree-ssa-sccvn.cc (alloc_vn_nary_op_noinit): Export.
(vn_nary_length_from_stmt): Likewise.
(init_vn_nary_op_from_stmt): Likewise.
(vn_nary_op_compute_hash): Likewise.
* tree-ssa-pre.cc (pre_expr_obstack): New obstack.
(get_or_alloc_expr_for_nary): Pass in the value-id to use,
(re-)compute the hash value and if the expression is not
found allocate it from pre_expr_obstack.
(phi_translate_1): Do not insert the NARY found in the
VN tables but build a PRE expression from the valueized
NARY with the value-id we eventually found.
(find_or_generate_expression): Assert we have an entry
for constant values.
(compute_avail): Insert not valueized expressions into
EXP_GEN using the value-id from the VN tables.
(init_pre): Allocate pre_expr_obstack.
(fini_pre): Free pre_expr_obstack.

* gcc.dg/torture/pr103037.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr103037.c | 23 +++
 gcc/tree-ssa-pre.cc | 52 ++---
 gcc/tree-ssa-sccvn.cc   | 11 ++
 gcc/tree-ssa-sccvn.h|  4 ++
 4 files changed, 68 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr103037.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr103037.c 
b/gcc/testsuite/gcc.dg/torture/pr103037.c
new file mode 100644
index 000..8b3bb1e4c8b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr103037.c
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+
+static inline const unsigned short *
+min(unsigned short *d, const unsigned short *e)
+{
+  return *e < *d ? e : d;
+}
+
+unsigned short __attribute__((noipa))
+test(unsigned short arr, unsigned short val)
+{
+  unsigned short tem = 1;
+  unsigned short tem2 = *min(&arr, &tem);
+  return tem2 / (arr ? arr : val);
+}
+
+int
+main()
+{
+  if (test (2, 2) != 0)
+__builtin_abort();
+  return 0;
+}
diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc
index 7c2a52e1850..d6c83a72dd8 100644
--- a/gcc/tree-ssa-pre.cc
+++ b/gcc/tree-ssa-pre.cc
@@ -323,6 +323,7 @@ static unsigned int next_expression_id;
 static vec expressions;
 static hash_table *expression_to_id;
 static vec name_to_id;
+static obstack pre_expr_obstack;
 
 /* Allocate an expression id for EXPR.  */
 
@@ -430,18 +431,23 @@ get_or_alloc_expr_for_name (tree name)
   return result;
 }
 
-/* Given an NARY, get or create a pre_expr to represent it.  */
+/* Given an NARY, get or create a pre_expr to represent it.  Assign
+   VALUE_ID to it or allocate a new value-id if it is zero.  Record
+   LOC as the original location of the expression.  */
 
 static pre_expr
-get_or_alloc_expr_for_nary (vn_nary_op_t nary,
+get_or_alloc_expr_for_nary (vn_nary_op_t nary, unsigned value_id,
location_t loc = UNKNOWN_LOCATION)
 {
   struct pre_expr_d expr;
   pre_expr result;
   unsigned int result_id;
 
+  gcc_assert (value_id == 0 || !value_id_constant_p (value_id));
+
   expr.kind = NARY;
   expr.id = 0;
+  nary->hashcode = vn_nary_op_compute_hash (nary);
   PRE_EXPR_NARY (&expr) = nary;
   result_id = lookup_expression_id (&expr);
   if (res

Re: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.

2022-02-25 Thread Richard Biener via Gcc-patches
On Thu, 24 Feb 2022, Qing Zhao wrote:

> 
> 
> > On Feb 24, 2022, at 4:16 AM, Richard Biener  wrote:
> > 
> > On Sat, 19 Feb 2022, Qing Zhao wrote:
> > 
> >> Hi,
> >> 
> >> This is the 2nd patch for fixing pr102276.
> >> 
> >> Adding -Wtrivial-auto-var-init and update documentation.
> >> 
> >> Adding a new warning option -Wtrivial-auto-var-init to report cases when
> >> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
> >> time, update documentation for -ftrivial-auto-var-init to connect it with
> >> the new warning option -Wtrivial-auto-var-init,  and add documentation
> >> for -Wtrivial-auto-var-init.
> >> 
> >> Bootstraped and regression tested on both x86 and aarch64.
> >> 
> >> Okay for committing?
> >> 
> >> thanks.
> >> 
> >> Qing.
> >> 
> >> ==
> >> From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001
> >> From: Qing Zhao 
> >> Date: Fri, 18 Feb 2022 15:53:15 +
> >> Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update 
> >> documentation.
> >> 
> >> Adding a new warning option -Wtrivial-auto-var-init to report cases when
> >> -ftrivial-auto-var-init cannot initialize the auto variable. At the same
> >> time, update documentation for -ftrivial-auto-var-init to connect it with
> >> the new warning option -Wtrivial-auto-var-init,  and add documentation
> >> for -Wtrivial-auto-var-init.
> >> 
> >> 2022-02-18 Qing Zhao  
> >> gcc/ChangeLog:
> >> 
> >>* common.opt (-Wtrivial-auto-var-init): New option.
> >>* doc/invoke.texi (-Wtrivial-auto-var-init): Document new option.
> >>(-ftrivial-auto-var-init): Update option;
> >>* gimplify.cc (maybe_warn_switch_unreachable): Rename...
> >>(maybe_warn_switch_unreachable_and_auto_init): ...to this.
> >>(gimplify_switch_expr): Call new function.
> >> 
> >> gcc/testsuite/ChangeLog:
> >> 
> >>* gcc.dg/auto-init-pr102276-3.c: New test.
> >>* gcc.dg/auto-init-pr102276-4.c: New test.
> >> ---
> >> gcc/common.opt  |   4 +
> >> gcc/doc/invoke.texi |  14 ++-
> >> gcc/gimplify.cc | 100 +++-
> >> gcc/testsuite/gcc.dg/auto-init-pr102276-3.c |  40 
> >> gcc/testsuite/gcc.dg/auto-init-pr102276-4.c |  40 
> >> 5 files changed, 175 insertions(+), 23 deletions(-)
> >> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c
> >> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c
> >> 
> >> diff --git a/gcc/common.opt b/gcc/common.opt
> >> index c21e5273ae3..22c95dbfa49 100644
> >> --- a/gcc/common.opt
> >> +++ b/gcc/common.opt
> >> @@ -801,6 +801,10 @@ Wtrampolines
> >> Common Var(warn_trampolines) Warning
> >> Warn whenever a trampoline is generated.
> >> 
> >> +Wtrivial-auto-var-init
> >> +Common Var(warn_trivial_auto_var_init) Warning Init(0)
> >> +Warn about where -ftrivial-auto-var-init cannot initialize the auto 
> >> variable.
> >> +
> > 
> > Warn about cases where ... initialize a variable.
> 
> Okay. 
> 
> > 
> >> Wtype-limits
> >> Common Var(warn_type_limits) Warning EnabledBy(Wextra)
> >> Warn if a comparison is always true or always false due to the limited 
> >> range of the data type.
> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> >> index e1a00c80307..c61a5b4b4a5 100644
> >> --- a/gcc/doc/invoke.texi
> >> +++ b/gcc/doc/invoke.texi
> >> @@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}.
> >> -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum @gol
> >> -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand @gol
> >> -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
> >> --Wtsan -Wtype-limits  -Wundef @gol
> >> +-Wtrivial-auto-var-init -Wtsan -Wtype-limits  -Wundef @gol
> >> -Wuninitialized  -Wunknown-pragmas @gol
> >> -Wunsuffixed-float-constants  -Wunused @gol
> >> -Wunused-but-set-parameter  -Wunused-but-set-variable @gol
> >> @@ -6953,6 +6953,14 @@ This warning is enabled by default for C and C++ 
> >> programs.
> >> Warn when @code{__sync_fetch_and_nand} and @code{__sync_nand_and_fetch}
> >> built-in functions are used.  These functions changed semantics in GCC 4.4.
> >> 
> >> +@item -Wtrivial-auto-var-init
> >> +@opindex Wtrivial-auto-var-init
> >> +@opindex Wno-trivial-auto-var-init
> >> +Warn when @code{-ftrivial-auto-var-init} cannot initialize the automatic
> >> +variable.  A common situation is an automatic variable that is declared
> >> +between the controlling expression and the first case lable of a 
> >> @code{switch}
> >> +statement.
> >> +
> >> @item -Wunused-but-set-parameter
> >> @opindex Wunused-but-set-parameter
> >> @opindex Wno-unused-but-set-parameter
> >> @@ -12314,6 +12322,10 @@ initializer as uninitialized, 
> >> @option{-Wuninitialized} and
> >> warning messages on such automatic variables.
> >> With this option, GCC will also initialize any padding of automatic 
> >> variables
> >> that have structure or union types to zeroes.
> >> +H

[committed] arc: Fail conditional move expand patterns

2022-02-25 Thread Claudiu Zissulescu via Gcc-patches
If the movcc comparison is not valid it triggers an assert in the
current implementation.  This behavior is not needed as we can FAIL
the movcc expand pattern.

gcc/
* config/arc/arc.cc (gen_compare_reg): Return NULL_RTX if the
comparison is not valid.
* config/arc/arc.md (movsicc): Fail if comparison is not valid.
(movdicc): Likewise.
(movsfcc): Likewise.
(movdfcc): Likewise.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/config/arc/arc.cc |  3 ++-
 gcc/config/arc/arc.md | 25 -
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/gcc/config/arc/arc.cc b/gcc/config/arc/arc.cc
index 8cc173519ab..c27ba99eb60 100644
--- a/gcc/config/arc/arc.cc
+++ b/gcc/config/arc/arc.cc
@@ -2256,7 +2256,8 @@ gen_compare_reg (rtx comparison, machine_mode omode)
   cmode = GET_MODE (x);
   if (cmode == VOIDmode)
 cmode = GET_MODE (y);
-  gcc_assert (cmode == SImode || cmode == SFmode || cmode == DFmode);
+  if (cmode != SImode && cmode != SFmode && cmode != DFmode)
+return NULL_RTX;
   if (cmode == SImode)
 {
   if (!register_operand (x, SImode))
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index ace3cb70424..39b358052c1 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1618,8 +1618,11 @@ (define_expand "movsicc"
 (match_operand:SI 2 "nonmemory_operand" "")
 (match_operand:SI 3 "register_operand" "")))]
   ""
-  "operands[1] = gen_compare_reg (operands[1], VOIDmode);")
-
+  "
+  operands[1] = gen_compare_reg (operands[1], VOIDmode);
+  if (operands[1] == NULL_RTX)
+FAIL;
+  ")
 
 (define_expand "movdicc"
   [(set (match_operand:DI 0 "dest_reg_operand" "")
@@ -1627,7 +1630,11 @@ (define_expand "movdicc"
(match_operand:DI 2 "nonmemory_operand" "")
(match_operand:DI 3 "register_operand" "")))]
   ""
-  "operands[1] = gen_compare_reg (operands[1], VOIDmode);")
+  "
+  operands[1] = gen_compare_reg (operands[1], VOIDmode);
+  if (operands[1] == NULL_RTX)
+FAIL;
+  ")
 
 
 (define_expand "movsfcc"
@@ -1636,7 +1643,11 @@ (define_expand "movsfcc"
  (match_operand:SF 2 "nonmemory_operand" "")
  (match_operand:SF 3 "register_operand" "")))]
   ""
-  "operands[1] = gen_compare_reg (operands[1], VOIDmode);")
+  "
+  operands[1] = gen_compare_reg (operands[1], VOIDmode);
+  if (operands[1] == NULL_RTX)
+FAIL;
+  ")
 
 (define_expand "movdfcc"
   [(set (match_operand:DF 0 "dest_reg_operand" "")
@@ -1644,7 +1655,11 @@ (define_expand "movdfcc"
  (match_operand:DF 2 "nonmemory_operand" "")
  (match_operand:DF 3 "register_operand" "")))]
   ""
-  "operands[1] = gen_compare_reg (operands[1], VOIDmode);")
+  "
+  operands[1] = gen_compare_reg (operands[1], VOIDmode);
+  if (operands[1] == NULL_RTX)
+FAIL;
+  ")
 
 (define_insn "*movsicc_insn"
   [(set (match_operand:SI 0 "dest_reg_operand" "=w,w")
-- 
2.35.1



Re: [committed] arc: Fail conditional move expand patterns

2022-02-25 Thread Robin Dapp via Gcc-patches
> If the movcc comparison is not valid it triggers an assert in the
> current implementation.  This behavior is not needed as we can FAIL
> the movcc expand pattern.

In case of a MODE_CC comparison you can also just return it as described
here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154

or here: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590639.html

If there already is a "CC comparison" the backend does not need to
create one and ifcvt can make use of this, creating better sequences.

Regards
 Robin


Re: [PATCH] Check if loading const from mem is faster

2022-02-25 Thread Jiufu Guo via Gcc-patches
Richard Biener  writes:

> On Fri, 25 Feb 2022, Jiufu Guo wrote:
>
>> Richard Biener  writes:
>> 
>> > On Thu, 24 Feb 2022, Jiufu Guo wrote:
>> >
>> >> Jiufu Guo via Gcc-patches  writes:
>> >> 
>> >> > Segher Boessenkool  writes:
>> >> >
>> >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote:
>> >> >>> I'm assuming we're always dealing with
>> >> >>> 
>> >> >>>   (set (reg:MODE ..) )
>> >> >>> 
>> >> >>> here and CSE is not substituting into random places of an
>> >> >>> instruction(?).  I don't know what 'rtx_cost' should evaluate
>> >> >>> to for a constant, if it should implicitely evaluate the cost
>> >> >>> of putting the result into a register for example.
>> >> >>
>> >> >> rtx_cost is no good here (and in most places).  rtx_cost should be 0
>> >> >> for anything that is used as input in a machine instruction -- but you
>> >> >> need much more context to determine that.  insn_cost is much simpler 
>> >> >> and
>> >> >> much easier to use.
>> >> >>
>> >> >>> Using RTX_COST with SET and 1 at least looks no worse than using
>> >> >>> your proposed new target hook and comparing it with the original
>> >> >>> unfolded src (again with SET and 1).
>> >> >>
>> >> >> It is required to generate valid instructions no matter what, before
>> >> >> the pass has finished that is.  On all more modern architectures it is
>> >> >> futile to think you can usefully consider the cost of an RTL expression
>> >> >> and derive a real-world cost of the generated code from that.
>> >> >
>> >> > Thanks Segher for pointing out these!  Here is  another reason that I
>> >> > did not use rtx_cost: in a few passes, there are codes to check the
>> >> > constants and store them in constant pool.  I'm thinking to integerate
>> >> > those codes in a consistent way.
>> >> 
>> >> Hi Segher, Richard!
>> >> 
>> >> I'm thinking the way like: For a constant,
>> >> 1. if the constant could be used as an immediate for the
>> >> instruction, then retreated as an operand;
>> >> 2. otherwise if the constant can not be stored into a
>> >> constant pool, then handle through instructions;
>> >> 3. if it is faster to access constant from pool, then emit
>> >> constant as data(.rodata);
>> >> 4. otherwise, handle the constant by instructions.
>> >> 
>> >> And to store the constant into a pool, besides force_const_mem,
>> >> create reference (toc) may be needed on some platforms.
>> >> 
>> >> For this particular issue in CSE, there is already code that
>> >> tries to put constant into a pool (invoke force_const_mem).
>> >> While the code is too late.  So, we may check the constant
>> >> earlier and store it into constant pool if profitable.
>> >> 
>> >> And another thing as Segher pointed out, CSE is doing too
>> >> much work.  It may be ok to separate the constant handling
>> >> logic from CSE.
>> >
>> > Not sure - CSE just is value numbering, I don't see that it does
>> > more than that.  Yes, it might have developed "heuristics" over
>> > the years what to CSE and to what and where to substitute and
>> > where not.  But in the end it does just value numbering.
>> >
>> >> 
>> >> I update a new version patch as follow (did not seprate CSE):
>> >
>> > How is the new target hook better in any way compared to rtx_cost
>> > or insn_cost?  It looks like a total hack.
>> >
>> > I suppose the actual way of materializing a constant is done
>> > behind GCCs back and not exposed anywhere?  But instead you
>> > claim the constants are valid when they actually are not?
>> > Isn't the problem then that the rs6000 backend lies?
>> 
>> Hi Richard,
>> 
>> Thanks for your comments and sugguestions!
>> 
>> Materializing a constant should be done behind GCC.
>> On rs6000, in expand pass, during emit_move, the constant is
>> checked and store into constant pool if necessary.
>> Some other platforms are doing a similar thing, e.g.
>> ix86_expand_vector_move, alpha_expand_mov,...
>> mips_legitimize_const_move.
>> 
>> But, it does not as we expect, force_const_mem is also 
>> exposed other places (not only ira/reload for stack reference).
>> 
>> CSE is one place, for example, CSE first retrieve the constant
>> from insn's equal, but it also tries to put constant into
>> pool for some condition (the condition was introduced at
>> early age of cse.c, and it is rare to run into in trunk).
>> In some aspects, IMHO, this seems not a great work of CSE.
>> 
>> And this is how the 'invalid(or say slow)' constant occurs.
>> e.g.  before cse:
>> 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47]
>>   REG_EQUAL 0x100803004101001
>> after cse: 
>> 7: r119:DI=0x100803004101001
>>   REG_EQUAL 0x100803004101001
>> 
>> As you pointed out: we can also avoid this transformation through
>> rtx_cost/insn_cost by estimating the COST more accurately for
>>  "r119:DI=0x100803004101001". (r119:DI=0x100803004101001 will not
>> be a real final instruction.)
>
> At which point does this become the final instruction?  I suppose
> CSE belives this constant is legitimate and the ins

Re: [PATCH] Check if loading const from mem is faster

2022-02-25 Thread Richard Biener via Gcc-patches
On Fri, 25 Feb 2022, Jiufu Guo wrote:

> Richard Biener  writes:
> 
> > On Fri, 25 Feb 2022, Jiufu Guo wrote:
> >
> >> Richard Biener  writes:
> >> 
> >> > On Thu, 24 Feb 2022, Jiufu Guo wrote:
> >> >
> >> >> Jiufu Guo via Gcc-patches  writes:
> >> >> 
> >> >> > Segher Boessenkool  writes:
> >> >> >
> >> >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote:
> >> >> >>> I'm assuming we're always dealing with
> >> >> >>> 
> >> >> >>>   (set (reg:MODE ..) )
> >> >> >>> 
> >> >> >>> here and CSE is not substituting into random places of an
> >> >> >>> instruction(?).  I don't know what 'rtx_cost' should evaluate
> >> >> >>> to for a constant, if it should implicitely evaluate the cost
> >> >> >>> of putting the result into a register for example.
> >> >> >>
> >> >> >> rtx_cost is no good here (and in most places).  rtx_cost should be 0
> >> >> >> for anything that is used as input in a machine instruction -- but 
> >> >> >> you
> >> >> >> need much more context to determine that.  insn_cost is much simpler 
> >> >> >> and
> >> >> >> much easier to use.
> >> >> >>
> >> >> >>> Using RTX_COST with SET and 1 at least looks no worse than using
> >> >> >>> your proposed new target hook and comparing it with the original
> >> >> >>> unfolded src (again with SET and 1).
> >> >> >>
> >> >> >> It is required to generate valid instructions no matter what, before
> >> >> >> the pass has finished that is.  On all more modern architectures it 
> >> >> >> is
> >> >> >> futile to think you can usefully consider the cost of an RTL 
> >> >> >> expression
> >> >> >> and derive a real-world cost of the generated code from that.
> >> >> >
> >> >> > Thanks Segher for pointing out these!  Here is  another reason that I
> >> >> > did not use rtx_cost: in a few passes, there are codes to check the
> >> >> > constants and store them in constant pool.  I'm thinking to integerate
> >> >> > those codes in a consistent way.
> >> >> 
> >> >> Hi Segher, Richard!
> >> >> 
> >> >> I'm thinking the way like: For a constant,
> >> >> 1. if the constant could be used as an immediate for the
> >> >> instruction, then retreated as an operand;
> >> >> 2. otherwise if the constant can not be stored into a
> >> >> constant pool, then handle through instructions;
> >> >> 3. if it is faster to access constant from pool, then emit
> >> >> constant as data(.rodata);
> >> >> 4. otherwise, handle the constant by instructions.
> >> >> 
> >> >> And to store the constant into a pool, besides force_const_mem,
> >> >> create reference (toc) may be needed on some platforms.
> >> >> 
> >> >> For this particular issue in CSE, there is already code that
> >> >> tries to put constant into a pool (invoke force_const_mem).
> >> >> While the code is too late.  So, we may check the constant
> >> >> earlier and store it into constant pool if profitable.
> >> >> 
> >> >> And another thing as Segher pointed out, CSE is doing too
> >> >> much work.  It may be ok to separate the constant handling
> >> >> logic from CSE.
> >> >
> >> > Not sure - CSE just is value numbering, I don't see that it does
> >> > more than that.  Yes, it might have developed "heuristics" over
> >> > the years what to CSE and to what and where to substitute and
> >> > where not.  But in the end it does just value numbering.
> >> >
> >> >> 
> >> >> I update a new version patch as follow (did not seprate CSE):
> >> >
> >> > How is the new target hook better in any way compared to rtx_cost
> >> > or insn_cost?  It looks like a total hack.
> >> >
> >> > I suppose the actual way of materializing a constant is done
> >> > behind GCCs back and not exposed anywhere?  But instead you
> >> > claim the constants are valid when they actually are not?
> >> > Isn't the problem then that the rs6000 backend lies?
> >> 
> >> Hi Richard,
> >> 
> >> Thanks for your comments and sugguestions!
> >> 
> >> Materializing a constant should be done behind GCC.
> >> On rs6000, in expand pass, during emit_move, the constant is
> >> checked and store into constant pool if necessary.
> >> Some other platforms are doing a similar thing, e.g.
> >> ix86_expand_vector_move, alpha_expand_mov,...
> >> mips_legitimize_const_move.
> >> 
> >> But, it does not as we expect, force_const_mem is also 
> >> exposed other places (not only ira/reload for stack reference).
> >> 
> >> CSE is one place, for example, CSE first retrieve the constant
> >> from insn's equal, but it also tries to put constant into
> >> pool for some condition (the condition was introduced at
> >> early age of cse.c, and it is rare to run into in trunk).
> >> In some aspects, IMHO, this seems not a great work of CSE.
> >> 
> >> And this is how the 'invalid(or say slow)' constant occurs.
> >> e.g.  before cse:
> >> 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47]
> >>   REG_EQUAL 0x100803004101001
> >> after cse: 
> >> 7: r119:DI=0x100803004101001
> >>   REG_EQUAL 0x100803004101001
> >> 
> >> As you pointed out: we can also avoid this transformation thr

[PATCH][pushed] testsuite: Fix ASAN error [PR104687]

2022-02-25 Thread Martin Liška

Installed as obvious.

Martin

PR testsuite/104687

gcc/testsuite/ChangeLog:

* gcc.dg/lto/20090717_0.c: Fix asan error.
---
 gcc/testsuite/gcc.dg/lto/20090717_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/lto/20090717_0.c 
b/gcc/testsuite/gcc.dg/lto/20090717_0.c
index fe13bdf78d4..41558df7a7c 100644
--- a/gcc/testsuite/gcc.dg/lto/20090717_0.c
+++ b/gcc/testsuite/gcc.dg/lto/20090717_0.c
@@ -1,4 +1,4 @@
 struct variable {
 const char *string;
 };
-struct variable table[] = { };
+struct variable table[] = { { 0 } };
--
2.35.1



[PATCH] rtl-optimization/104686 - speed up conflict iteration

2022-02-25 Thread Richard Biener via Gcc-patches
The following replaces

   /* Skip bits that are zero.  */
   for (; (word & 1) == 0; word >>= 1)
 bit_num++;

idioms in ira-int.h in the attempt to speedup update_conflict_hard_regno_costs
which we're bound on in PR104686.  The trick is to use ctz_hwi here
which should pay off even with dense bitmaps on architectures that
have HW support for this.

For the PR in question this speeds up compile-time from 31s to 24s for
me.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK for trunk?

Thanks,
Richard.

2022-02-25  Richard Biener  

PR rtl-optimization/104686
* ira-int.h (minmax_set_iter_cond): Use ctz_hwi to elide loop
skipping bits that are zero.
(ira_object_conflict_iter_cond): Likewise.
---
 gcc/ira-int.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/ira-int.h b/gcc/ira-int.h
index 957604b22e9..f42a314fa7f 100644
--- a/gcc/ira-int.h
+++ b/gcc/ira-int.h
@@ -764,8 +764,9 @@ minmax_set_iter_cond (minmax_set_iterator *i, int *n)
 }
 
   /* Skip bits that are zero.  */
-  for (; (i->word & 1) == 0; i->word >>= 1)
-i->bit_num++;
+  int off = ctz_hwi (i->word);
+  i->bit_num += off;
+  i->word >>= off;
 
   *n = (int) i->bit_num + i->start_val;
 
@@ -1379,8 +1380,9 @@ ira_object_conflict_iter_cond 
(ira_object_conflict_iterator *i,
}
 
   /* Skip bits that are zero.  */
-  for (; (word & 1) == 0; word >>= 1)
-   bit_num++;
+  int off = ctz_hwi (word);
+  bit_num += off;
+  word >>= off;
 
   obj = ira_object_id_map[bit_num + i->base_conflict_id];
   i->bit_num = bit_num + 1;
-- 
2.34.1


Re: [PATCH][V3][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Qing Zhao


> On Feb 25, 2022, at 2:38 AM, Richard Biener  wrote:
> 
> On Fri, 25 Feb 2022, Qing Zhao wrote:
> 
>> Hi, Jakub and Richard:
>> 
>> This is the 3rd version of the patch, the major change compared to the 
>> previous version are:
>> 
>> 1. Add warning_enabled_at guard before “suppress_warning”
>> 2. Add location to the call to __builtin_clear_padding for auto init.
>> 
>> The patch has been bootstrapped and regress tested on both x86 and aarch64.
>> Okay for trunk?
>> 
>> Thanks.
>> 
>> Qing
>> 
>> ==
>> From 8314ded4ca0f59c5a3ec431c9c3768fcaf2a0949 Mon Sep 17 00:00:00 2001
>> From: Qing Zhao 
>> Date: Thu, 24 Feb 2022 22:38:38 +
>> Subject: [PATCH] Suppress uninitialized warnings for new created uses from
>> __builtin_clear_padding folding [PR104550]
>> 
>> __builtin_clear_padding(&object) will clear all the padding bits of the 
>> object.
>> actually, it doesn't involve any use of an user variable. Therefore, users do
>> not expect any uninitialized warning from it. It's reasonable to suppress
>> uninitialized warnings for all new created uses from __builtin_clear_padding
>> folding.
>> 
>>  PR middle-end/104550
>> 
>> gcc/ChangeLog:
>> 
>>  * gimple-fold.cc (clear_padding_flush): Suppress warnings for new
>>  created uses.
>>  * gimplify.cc (gimple_add_padding_init_for_auto_var): Set
>>  location for new created call to __builtin_clear_padding.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/auto-init-pr104550-1.c: New test.
>>  * gcc.dg/auto-init-pr104550-2.c: New test.
>>  * gcc.dg/auto-init-pr104550-3.c: New test.
>> ---
>> gcc/gimple-fold.cc  | 11 ++-
>> gcc/gimplify.cc |  1 +
>> gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
>> gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
>> gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
>> 5 files changed, 43 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
>> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
>> 
>> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
>> index 16f02c2d098d..67b4963ffd96 100644
>> --- a/gcc/gimple-fold.cc
>> +++ b/gcc/gimple-fold.cc
>> @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
>> #include "attribs.h"
>> #include "asan.h"
>> #include "diagnostic-core.h"
>> +#include "diagnostic.h"
>> #include "intl.h"
>> #include "calls.h"
>> #include "tree-vector-builder.h"
>> @@ -4379,7 +4380,15 @@ clear_padding_flush (clear_padding_struct *buf, bool 
>> full)
>>else
>>  {
>>src = make_ssa_name (type);
>> -  g = gimple_build_assign (src, unshare_expr (dst));
>> +  tree tmp_dst = unshare_expr (dst);
>> +  /* The folding introduces a read from the tmp_dst, we should
>> + prevent uninitialized warning analysis from issuing warning
>> + for such fake read.  */
>> +  if (warning_enabled_at (buf->loc, OPT_Wuninitialized)
>> +  || warning_enabled_at (buf->loc,
>> + OPT_Wmaybe_uninitialized))
>> +suppress_warning (tmp_dst, OPT_Wuninitialized);
>> +  g = gimple_build_assign (src, tmp_dst);
> 
> So what about just gimple_set_no_warning (g, true); ?  (sorry for
> the ping-pong between us three...)

This didn’t work.  The small testing case still failed. This is due to in 
tree-ssa-uninit.cc,
 it checks get_no_uninit_warning (RHS), not for the whole stmt.

We can update tree-sea-uninit.cc to check the whole stmt, but I am not sure 
whether doing this might introduce other issue.

Qing

> 
>>gimple_set_location (g, buf->loc);
>>gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
>>tree mask = native_interpret_expr (type,
>> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
>> index f570daa015a5..977cf458f858 100644
>> --- a/gcc/gimplify.cc
>> +++ b/gcc/gimplify.cc
>> @@ -1823,6 +1823,7 @@ gimple_add_padding_init_for_auto_var (tree decl, bool 
>> is_vla,
>> 
>>   gimple *call = gimple_build_call (fn, 2, addr_of_decl,
>>  build_one_cst (TREE_TYPE (addr_of_decl)));
>> +  gimple_set_location (call, EXPR_LOCATION (decl));
> 
> I believe EXPR_LOCATION (decl) is bogus, 'decl's have DECL_LOCATION,
> EXPR_LOCATION here will just return UNKNOWN_LOCATION.
> 
>>   gimplify_seq_add_stmt (seq_p, call);
>> }
>> 
>> diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c 
>> b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
>> new file mode 100644
>> index ..a08110c3a170
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
>> @@ -0,0 +1,10 @@
>> +/* PR 104550*/
>> +/* { dg-do compile } */
>> +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern

Re: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.

2022-02-25 Thread Qing Zhao



> On Feb 25, 2022, at 6:43 AM, Richard Biener  wrote:
> 
> On Thu, 24 Feb 2022, Qing Zhao wrote:
> 
>> 
>> 
>>> On Feb 24, 2022, at 4:16 AM, Richard Biener  wrote:
>>> 
>>> On Sat, 19 Feb 2022, Qing Zhao wrote:
>>> 
 Hi,
 
 This is the 2nd patch for fixing pr102276.
 
 Adding -Wtrivial-auto-var-init and update documentation.
 
 Adding a new warning option -Wtrivial-auto-var-init to report cases when
 -ftrivial-auto-var-init cannot initialize the auto variable. At the same
 time, update documentation for -ftrivial-auto-var-init to connect it with
 the new warning option -Wtrivial-auto-var-init,  and add documentation
 for -Wtrivial-auto-var-init.
 
 Bootstraped and regression tested on both x86 and aarch64.
 
 Okay for committing?
 
 thanks.
 
 Qing.
 
 ==
 From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001
 From: Qing Zhao 
 Date: Fri, 18 Feb 2022 15:53:15 +
 Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update 
 documentation.
 
 Adding a new warning option -Wtrivial-auto-var-init to report cases when
 -ftrivial-auto-var-init cannot initialize the auto variable. At the same
 time, update documentation for -ftrivial-auto-var-init to connect it with
 the new warning option -Wtrivial-auto-var-init,  and add documentation
 for -Wtrivial-auto-var-init.
 
 2022-02-18 Qing Zhao  
 gcc/ChangeLog:
 
* common.opt (-Wtrivial-auto-var-init): New option.
* doc/invoke.texi (-Wtrivial-auto-var-init): Document new option.
(-ftrivial-auto-var-init): Update option;
* gimplify.cc (maybe_warn_switch_unreachable): Rename...
(maybe_warn_switch_unreachable_and_auto_init): ...to this.
(gimplify_switch_expr): Call new function.
 
 gcc/testsuite/ChangeLog:
 
* gcc.dg/auto-init-pr102276-3.c: New test.
* gcc.dg/auto-init-pr102276-4.c: New test.
 ---
 gcc/common.opt  |   4 +
 gcc/doc/invoke.texi |  14 ++-
 gcc/gimplify.cc | 100 +++-
 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c |  40 
 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c |  40 
 5 files changed, 175 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c
 
 diff --git a/gcc/common.opt b/gcc/common.opt
 index c21e5273ae3..22c95dbfa49 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -801,6 +801,10 @@ Wtrampolines
 Common Var(warn_trampolines) Warning
 Warn whenever a trampoline is generated.
 
 +Wtrivial-auto-var-init
 +Common Var(warn_trivial_auto_var_init) Warning Init(0)
 +Warn about where -ftrivial-auto-var-init cannot initialize the auto 
 variable.
 +
>>> 
>>> Warn about cases where ... initialize a variable.
>> 
>> Okay. 
>> 
>>> 
 Wtype-limits
 Common Var(warn_type_limits) Warning EnabledBy(Wextra)
 Warn if a comparison is always true or always false due to the limited 
 range of the data type.
 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
 index e1a00c80307..c61a5b4b4a5 100644
 --- a/gcc/doc/invoke.texi
 +++ b/gcc/doc/invoke.texi
 @@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum @gol
 -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand @gol
 -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
 --Wtsan -Wtype-limits  -Wundef @gol
 +-Wtrivial-auto-var-init -Wtsan -Wtype-limits  -Wundef @gol
 -Wuninitialized  -Wunknown-pragmas @gol
 -Wunsuffixed-float-constants  -Wunused @gol
 -Wunused-but-set-parameter  -Wunused-but-set-variable @gol
 @@ -6953,6 +6953,14 @@ This warning is enabled by default for C and C++ 
 programs.
 Warn when @code{__sync_fetch_and_nand} and @code{__sync_nand_and_fetch}
 built-in functions are used.  These functions changed semantics in GCC 4.4.
 
 +@item -Wtrivial-auto-var-init
 +@opindex Wtrivial-auto-var-init
 +@opindex Wno-trivial-auto-var-init
 +Warn when @code{-ftrivial-auto-var-init} cannot initialize the automatic
 +variable.  A common situation is an automatic variable that is declared
 +between the controlling expression and the first case lable of a 
 @code{switch}
 +statement.
 +
 @item -Wunused-but-set-parameter
 @opindex Wunused-but-set-parameter
 @opindex Wno-unused-but-set-parameter
 @@ -12314,6 +12322,10 @@ initializer as uninitialized, 
 @option{-Wuninitialized} and
 warning messages on such automatic variables.
 With this option, GCC will also initialize any padding of automatic 

Re: [PATCH] rtl-optimization/104686 - speed up conflict iteration

2022-02-25 Thread Vladimir Makarov via Gcc-patches



On 2022-02-25 09:14, Richard Biener wrote:

The following replaces

/* Skip bits that are zero.  */
for (; (word & 1) == 0; word >>= 1)
  bit_num++;

idioms in ira-int.h in the attempt to speedup update_conflict_hard_regno_costs
which we're bound on in PR104686.  The trick is to use ctz_hwi here
which should pay off even with dense bitmaps on architectures that
have HW support for this.

For the PR in question this speeds up compile-time from 31s to 24s for
me.

It is a really significant improvement.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK for trunk?

Yes.  Thank you for working on this PR, Richard.

2022-02-25  Richard Biener  

PR rtl-optimization/104686
* ira-int.h (minmax_set_iter_cond): Use ctz_hwi to elide loop
skipping bits that are zero.
(ira_object_conflict_iter_cond): Likewise.




Re: [PATCH] s390: Change SET rtx_cost handling.

2022-02-25 Thread Andreas Krebbel via Gcc-patches
On 2/25/22 12:38, Robin Dapp wrote:
> Hi,
> 
> the IF_THEN_ELSE detection currently prevents us from properly costing
> register-register moves which causes the lower-subreg pass to assume
> that a VR-VR move is as expensive as two GPR-GPR moves.
> 
> This patch adds handling for SETs containing REGs as well as MEMs and is
> inspired by the aarch64 implementation.
> 
> Bootstrapped and regtested on z900 up to z15. Is it OK?
> 
> Regards
>  Robin
> 
> --
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_address_cost): Declare.
>   (s390_hard_regno_nregs): Declare.
>   (s390_rtx_costs): Add handling for REG and MEM in SET.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New
> test.

Ok. Thanks

Andreas


[committed] testsuite: Move pr104540.C test to g++.target/i386/

2022-02-25 Thread Jakub Jelinek via Gcc-patches
On Wed, Feb 23, 2022 at 07:45:20PM -0300, Alexandre Oliva via Gcc-patches wrote:
>   PR middle-end/104540
>   * g++.dg/PR104540.C: New.

Both -mforce-drap and -mstackrealign options are x86 specific.

Tested on x86_64-linux, i686-linux and powerpc64le-linux, committed to
trunk.

2022-02-25  Jakub Jelinek  

* g++.dg/pr104540.C: Move to ...
* g++.target/i386/pr104540.C: ... here.

diff --git a/gcc/testsuite/g++.dg/pr104540.C 
b/gcc/testsuite/g++.target/i386/pr104540.C
similarity index 100%
rename from gcc/testsuite/g++.dg/pr104540.C
rename to gcc/testsuite/g++.target/i386/pr104540.C

Jakub



[PATCH] match.pd: Further complex simplification fixes [PR104675]

2022-02-25 Thread Jakub Jelinek via Gcc-patches
Hi!

Mark mentioned in the PR further 2 simplifications that also ICE
with complex types.
For these, eventually (but IMO GCC 13 materials) we could support it
for vector types if it would be uniform vector constants.
Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs
and we can't use bit_and etc. for complex type.

Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux, ok
for trunk?

2022-02-25  Jakub Jelinek  
Marc Glisse  

PR tree-optimization/104675
* match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1):
Restrict simplifications to INTEGRAL_TYPE_P.

* gcc.dg/pr104675-3.c : New test.

--- gcc/match.pd.jj 2022-02-25 10:55:08.0 +0100
+++ gcc/match.pd2022-02-25 11:48:04.730110154 +0100
@@ -731,7 +731,7 @@ (define_operator_list SYNC_FETCH_AND_AND
 /* Simplify (unsigned t * 2)/2 -> unsigned t & 0x7FFF.  */
 (simplify
  (trunc_div (mult @0 integer_pow2p@1) @1)
- (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0)))
   (bit_and @0 { wide_int_to_tree
(type, wi::mask (TYPE_PRECISION (type)
 - wi::exact_log2 (wi::to_wide (@1)),
@@ -740,7 +740,7 @@ (define_operator_list SYNC_FETCH_AND_AND
 /* Simplify (unsigned t / 2) * 2 -> unsigned t & ~1.  */
 (simplify
  (mult (trunc_div @0 integer_pow2p@1) @1)
- (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0)))
   (bit_and @0 (negate @1
 
 /* Simplify (t * 2) / 2) -> t.  */
--- gcc/testsuite/gcc.dg/pr104675-3.c.jj2022-02-25 11:24:43.111619659 
+0100
+++ gcc/testsuite/gcc.dg/pr104675-3.c   2022-02-25 11:26:35.406056600 +0100
@@ -0,0 +1,29 @@
+/* PR tree-optimization/104675 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+_Complex unsigned int
+foo (_Complex unsigned int x)
+{
+  return (x / 2) * 2;
+}
+
+_Complex unsigned int
+bar (_Complex unsigned int x)
+{
+  return (x * 2) / 2;
+}
+
+_Complex unsigned int
+baz (_Complex unsigned int x)
+{
+  _Complex unsigned int y = x / 2;
+  return y * 2;
+}
+
+_Complex unsigned int
+qux (_Complex unsigned int x)
+{
+  _Complex unsigned int y = x * 2;
+  return y / 2;
+}

Jakub



[committed] rs6000: Use rs6000_emit_move in movmisalign expander [PR104681]

2022-02-25 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs, because for some strange reason it decides to use
movmisaligntf during expansion where the destination is MEM and source is
CONST_DOUBLE.  For normal mov expanders the rs6000 backend uses
rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG
and a few other things, but for movmisalign nothing enforced this.
The middle-end documents that movmisalign shouldn't fail, so we can't
force that through predicates or condition on the expander.

Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (the latter
with -m32/-m64 testing), preapproved in the PR by Segher, committed to trunk
(so far).

2022-02-25  Jakub Jelinek  

PR target/104681
* config/rs6000/vector.md (movmisalign): Use rs6000_emit_move.

* g++.dg/opt/pr104681.C: New test.

--- gcc/config/rs6000/vector.md.jj  2022-01-11 23:11:21.974296000 +0100
+++ gcc/config/rs6000/vector.md 2022-02-25 12:36:53.763307488 +0100
@@ -1519,7 +1519,10 @@ (define_expand "movmisalign"
  [(set (match_operand:VEC_N 0 "nonimmediate_operand")
(match_operand:VEC_N 1 "any_operand"))]
  "VECTOR_MEM_VSX_P (mode) && TARGET_ALLOW_MOVMISALIGN"
- "")
+{
+  rs6000_emit_move (operands[0], operands[1], mode);
+  DONE;
+})
 
 ;; Vector shift right in bits. Currently supported ony for shift
 ;; amounts that can be expressed as byte shifts (divisible by 8).
--- gcc/testsuite/g++.dg/opt/pr104681.C.jj  2022-02-25 12:38:39.419835845 
+0100
+++ gcc/testsuite/g++.dg/opt/pr104681.C 2022-02-25 12:35:30.137472275 +0100
@@ -0,0 +1,19 @@
+// PR target/104681
+// { dg-do compile }
+// { dg-options "-O2" }
+
+void bar ();
+struct A {
+  A (bool) : a(7.0L), b(0) {}
+  long double a;
+  long b;
+};
+struct B {
+  void foo () { c = bar; }
+  A c;
+};
+struct C {
+  void baz ();
+  B d;
+};
+void C::baz () { d.foo (); }

Jakub



Re: [PATCH] match.pd: Further complex simplification fixes [PR104675]

2022-02-25 Thread Richard Biener via Gcc-patches



> Am 25.02.2022 um 18:58 schrieb Jakub Jelinek via Gcc-patches 
> :
> 
> Hi!
> 
> Mark mentioned in the PR further 2 simplifications that also ICE
> with complex types.
> For these, eventually (but IMO GCC 13 materials) we could support it
> for vector types if it would be uniform vector constants.
> Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs
> and we can't use bit_and etc. for complex type.
> 
> Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux, ok
> for trunk?

Ok.

Richard 

> 2022-02-25  Jakub Jelinek  
>Marc Glisse  
> 
>PR tree-optimization/104675
>* match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1):
>Restrict simplifications to INTEGRAL_TYPE_P.
> 
>* gcc.dg/pr104675-3.c : New test.
> 
> --- gcc/match.pd.jj2022-02-25 10:55:08.0 +0100
> +++ gcc/match.pd2022-02-25 11:48:04.730110154 +0100
> @@ -731,7 +731,7 @@ (define_operator_list SYNC_FETCH_AND_AND
> /* Simplify (unsigned t * 2)/2 -> unsigned t & 0x7FFF.  */
> (simplify
>  (trunc_div (mult @0 integer_pow2p@1) @1)
> - (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
> + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0)))
>   (bit_and @0 { wide_int_to_tree
>(type, wi::mask (TYPE_PRECISION (type)
> - wi::exact_log2 (wi::to_wide (@1)),
> @@ -740,7 +740,7 @@ (define_operator_list SYNC_FETCH_AND_AND
> /* Simplify (unsigned t / 2) * 2 -> unsigned t & ~1.  */
> (simplify
>  (mult (trunc_div @0 integer_pow2p@1) @1)
> - (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
> + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0)))
>   (bit_and @0 (negate @1
> 
> /* Simplify (t * 2) / 2) -> t.  */
> --- gcc/testsuite/gcc.dg/pr104675-3.c.jj2022-02-25 11:24:43.111619659 
> +0100
> +++ gcc/testsuite/gcc.dg/pr104675-3.c2022-02-25 11:26:35.406056600 +0100
> @@ -0,0 +1,29 @@
> +/* PR tree-optimization/104675 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +_Complex unsigned int
> +foo (_Complex unsigned int x)
> +{
> +  return (x / 2) * 2;
> +}
> +
> +_Complex unsigned int
> +bar (_Complex unsigned int x)
> +{
> +  return (x * 2) / 2;
> +}
> +
> +_Complex unsigned int
> +baz (_Complex unsigned int x)
> +{
> +  _Complex unsigned int y = x / 2;
> +  return y * 2;
> +}
> +
> +_Complex unsigned int
> +qux (_Complex unsigned int x)
> +{
> +  _Complex unsigned int y = x * 2;
> +  return y / 2;
> +}
> 
>Jakub
> 


Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Qing Zhao
Hi, 

After more study of all the discussion so far and the corresponding code for 
suppress_warning, I think the following suggestion
Should be the best approach right now for this issue:

>   SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
>   suppress_warning (tmp_dst, OPT_Wuninitialized);
> with a comment explaing why we do that?


The reason is:

After “SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION)”, 

152 /* Enable, or by default disable, a warning for the expression.
153The wildcard OPT of -1 controls all warnings.  */
154 
155 void
156 suppress_warning (tree expr, opt_code opt /* = all_warnings */,
157   bool supp /* = true */)
158 {
159   if (opt == no_warning)
160 return;
161 
162   const location_t loc = get_location (expr);
163 
164   if (!RESERVED_LOCATION_P (loc))
165 supp = suppress_warning_at (loc, opt, supp) || supp;
166   set_no_warning_bit (expr, supp);
167 }

Suppress_warning will NOT call “suppress_warning_at” to involve any operation 
on the hash tables. It just
simply call “set_no_warning_bit” to set the no_warning bit for the MEM_REF 
expr. 

And later during the routine “maybe_warn_operand” in tree-sea-uninit.cc, 
“get_no_uninit_warning” will also
Simply check the no_warning bit of the MEM_REF to see whether the warning need 
to be issued. 

This resolved all the concerns we have so far.

I will prepare the patch based on this approach.

Let me know your opinion.

Qing

> On Feb 25, 2022, at 4:04 AM, Jakub Jelinek via Gcc-patches 
>  wrote:
> 
> On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote:
>> I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent"
>> a creative location for the artificial stmt it will of course
>> affect other stmts/expressions using that location.
>> 
>>> I think it will work.
>> 
>> Yes, I think so.  OTOH the uninit pass does
>> 
>>  /* Avoid warning if we've already done so or if the warning has been
>> suppressed.  */
>>  if (((warning_suppressed_p (context, OPT_Wuninitialized)
>>|| (gimple_assign_single_p (context)
>>&& get_no_uninit_warning (gimple_assign_rhs1 (context)
>>  || (var && get_no_uninit_warning (var))
>>  || (var_name_str
>>  && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized)))
>>return;
>> 
>> that's a mightly complicated way to test and I'm not sure we get
>> to the bit on the stmt reliably.  So maybe TREE_NO_WARNING on the
>> reference (or making sure it has UNKNOWN_LOCATION and using
>> suppress_warning on it) is a better idea after all...
> 
> So
>   SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
>   suppress_warning (tmp_dst, OPT_Wuninitialized);
> with a comment explaing why we do that?
> LGTM.
> 
>   Jakub
> 



[PATCH] c++: ICE with attribute on enumerator [PR104667]

2022-02-25 Thread Marek Polacek via Gcc-patches
When processing a template, the enumerators we build don't have a type
yet.  But is_late_template_attribute is not prepared to see a _DECL
without a type, so we crash on

  enum tree_code code = TREE_CODE (type);

(I found that we don't give the "is deprecated" warning for the enumerator
'f' in the test.  Reported as PR104682.)

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11?

PR c++/104667

gcc/cp/ChangeLog:

* decl2.cc (is_late_template_attribute): Cope with a decl without
a type.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attrib64.C: New test.
---
 gcc/cp/decl2.cc |  2 +-
 gcc/testsuite/g++.dg/ext/attrib64.C | 11 +++
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/attrib64.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 2e58419ea51..dc7710660d0 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -1300,7 +1300,7 @@ is_late_template_attribute (tree attr, tree decl)
 
   /* We can't apply any attributes to a completely unknown type until
 instantiation time.  */
-  enum tree_code code = TREE_CODE (type);
+  enum tree_code code = type ? TREE_CODE (type) : ERROR_MARK;
   if (code == TEMPLATE_TYPE_PARM
  || code == BOUND_TEMPLATE_TEMPLATE_PARM
  || code == TYPENAME_TYPE)
diff --git a/gcc/testsuite/g++.dg/ext/attrib64.C 
b/gcc/testsuite/g++.dg/ext/attrib64.C
new file mode 100644
index 000..4a4505fc4b2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/attrib64.C
@@ -0,0 +1,11 @@
+// PR c++/104667
+// { dg-do compile }
+
+template struct A {
+  enum E { // { dg-warning "only applies to function types" }
+e __attribute__ ((access(read_only))),
+f __attribute__ ((deprecated))
+  };
+};
+
+A a;

base-commit: ae3c4e521dd0b66db712639298cd08331d62f315
-- 
2.35.1



[PATCH] c++: Lost deprecated/unavailable attr in class tmpl [PR104682]

2022-02-25 Thread Marek Polacek via Gcc-patches
[ Most likely a GCC 13 patch, but I'm posting it now so that I don't lose it. ]

When looking into the other PR I noticed that we fail to give a warning
for a deprecated enumerator when the enum is in a class template.  This
only happens when the attribute doesn't have an argument.  The reason is
that when we tsubst_enum, we create a new enumerator:

  build_enumerator (DECL_NAME (decl), value, newtag,
   DECL_ATTRIBUTES (decl), DECL_SOURCE_LOCATION (decl));

but DECL_ATTRIBUTES (decl) is null when the attribute was provided
without an argument -- in that case it simply melts into a tree flag.
handle_deprecated_attribute has:

  if (!args)
 *no_add_attrs = true;

so the attribute isn't retained and we lose it when tsubsting.  Same
thing when the attribute is on the enum itself.

Attribute unavailable is a similar case, but it's different in that
it can be a late attribute whereas "deprecated" can't:
is_late_template_attribute has

/* But some attributes specifically apply to templates.  */
&& !is_attribute_p ("abi_tag", name)
&& !is_attribute_p ("deprecated", name)
&& !is_attribute_p ("visibility", name))
 return true;
   else
 return false;

which looks strange, but attr-unavailable-9.C tests that we don't error when
the attribute is applied on a template.

Bootstrapped/regtested on x86_64-pc-linux-gnu.

PR c++/104682

gcc/cp/ChangeLog:

* cp-tree.h (build_enumerator): Adjust.
* decl.cc (finish_enum): Make it return the new decl.
* pt.cc (tsubst_enum): Propagate TREE_DEPRECATED and TREE_UNAVAILABLE.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-unavailable-10.C: New test.
* g++.dg/ext/attr-unavailable-11.C: New test.
* g++.dg/warn/deprecated-17.C: New test.
* g++.dg/warn/deprecated-18.C: New test.
---
 gcc/cp/cp-tree.h  |  2 +-
 gcc/cp/decl.cc|  4 +-
 gcc/cp/pt.cc  | 17 +++--
 .../g++.dg/ext/attr-unavailable-10.C  | 22 +++
 .../g++.dg/ext/attr-unavailable-11.C  | 22 +++
 gcc/testsuite/g++.dg/warn/deprecated-17.C | 35 ++
 gcc/testsuite/g++.dg/warn/deprecated-18.C | 37 +++
 7 files changed, 133 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/attr-unavailable-10.C
 create mode 100644 gcc/testsuite/g++.dg/ext/attr-unavailable-11.C
 create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-17.C
 create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-18.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 37d462fca6e..80994e94793 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6833,7 +6833,7 @@ extern void xref_basetypes(tree, 
tree);
 extern tree start_enum (tree, tree, tree, tree, bool, 
bool *);
 extern void finish_enum_value_list (tree);
 extern void finish_enum(tree);
-extern void build_enumerator   (tree, tree, tree, tree, 
location_t);
+extern tree build_enumerator   (tree, tree, tree, tree, 
location_t);
 extern tree lookup_enumerator  (tree, tree);
 extern bool start_preparsed_function   (tree, tree, int);
 extern bool start_function (cp_decl_specifier_seq *,
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 7b48b56231b..7f80f9d4d7a 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -16409,7 +16409,7 @@ finish_enum (tree enumtype)
Apply ATTRIBUTES if available.  LOC is the location of NAME.
Assignment of sequential values by default is handled here.  */
 
-void
+tree
 build_enumerator (tree name, tree value, tree enumtype, tree attributes,
  location_t loc)
 {
@@ -16611,6 +16611,8 @@ incremented enumerator value is too large for 
%"));
 
   /* Add this enumeration constant to the list for this type.  */
   TYPE_VALUES (enumtype) = tree_cons (name, decl, TYPE_VALUES (enumtype));
+
+  return decl;
 }
 
 /* Look for an enumerator with the given NAME within the enumeration
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 70f02db8757..8fb17349ee1 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -26944,9 +26944,8 @@ tsubst_enum (tree tag, tree newtag, tree args)
   for (e = TYPE_VALUES (tag); e; e = TREE_CHAIN (e))
 {
   tree value;
-  tree decl;
+  tree decl = TREE_VALUE (e);
 
-  decl = TREE_VALUE (e);
   /* Note that in a template enum, the TREE_VALUE is the
 CONST_DECL, not the corresponding INTEGER_CST.  */
   value = tsubst_expr (DECL_INITIAL (decl),
@@ -26958,8 +26957,14 @@ tsubst_enum (tree tag, tree newtag, tree args)
 
   /* Actually build the enumerator itself.  Here we're assuming that
 enumerators can't have dependent attributes.  */
-  build_enumerator (DECL_NAME (decl), value, newtag

[PATCH][V4][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding

2022-02-25 Thread Qing Zhao
Hi, 

This is the 4th version based on the discussion so far.

The major change is:

> SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
>   suppress_warning (tmp_dst, OPT_Wuninitialized);
> with a comment explaing why we do that.


The patch has been bootstrapped and regress tested on both x86 and aarch64.
Okay for trunk?

Thanks.

Qing

=
>From 276975e60827942f0dd4043ce5f52e600327d6a8 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Thu, 24 Feb 2022 22:38:38 +
Subject: [PATCH] Suppress uninitialized warnings for new created uses from
 __builtin_clear_padding folding [PR104550]

__builtin_clear_padding(&object) will clear all the padding bits of the object.
actually, it doesn't involve any use of an user variable. Therefore, users do
not expect any uninitialized warning from it. It's reasonable to suppress
uninitialized warnings for all new created uses from __builtin_clear_padding
folding.

PR middle-end/104550

gcc/ChangeLog:

* gimple-fold.cc (clear_padding_flush): Suppress warnings for new
created uses.

gcc/testsuite/ChangeLog:

* gcc.dg/auto-init-pr104550-1.c: New test.
* gcc.dg/auto-init-pr104550-2.c: New test.
* gcc.dg/auto-init-pr104550-3.c: New test.
---
 gcc/gimple-fold.cc  | 12 +++-
 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++
 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++
 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++
 4 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
 create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 16f02c2d098d..c9179abb27ed 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -4379,7 +4379,17 @@ clear_padding_flush (clear_padding_struct *buf, bool 
full)
  else
{
  src = make_ssa_name (type);
- g = gimple_build_assign (src, unshare_expr (dst));
+ tree tmp_dst = unshare_expr (dst);
+ /* The folding introduces a read from the tmp_dst, we should
+prevent uninitialized warning analysis from issuing warning
+for such fake read.  In order to suppress warning only for
+this expr, we should set the location of tmp_dst to
+UNKNOWN_LOCATION first, then suppress_warning will call
+set_no_warning_bit to set the no_warning flag only for
+tmp_dst.  */
+ SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION);
+ suppress_warning (tmp_dst, OPT_Wuninitialized);
+ g = gimple_build_assign (src, tmp_dst);
  gimple_set_location (g, buf->loc);
  gsi_insert_before (buf->gsi, g, GSI_SAME_STMT);
  tree mask = native_interpret_expr (type,
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
new file mode 100644
index ..a08110c3a170
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c
@@ -0,0 +1,10 @@
+/* PR 104550*/
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } */
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
new file mode 100644
index ..2c395b32d322
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=zero" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info; 
+ __builtin_clear_padding (&info);  /* { dg-bogus "info" "is used 
uninitialized" } */ 
+}
diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c 
b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
new file mode 100644
index ..9893e37f12d8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c
@@ -0,0 +1,11 @@
+/* PR 104550 */
+/* { dg-do compile } */
+/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */
+struct vx_audio_level {
+ int has_monitor_level : 1;
+};
+
+void vx_set_monitor_level() {
+ struct vx_audio_level info;   /* { dg-bogus "info" "is used uninitialized" } 
*/
+ __builtin_clear_padding (&info);  /* { dg-bogus "info" "is used 
uninitialized" } */ 
+}
-- 
2.27.0