Re: [PATCH] [PATCH,v4,1/1,AARCH64][PR102768] aarch64: Add compiler support for Shadow Call Stack
On 2/11/22 07:35, Richard Sandiford wrote: Dan Li writes: On 2/11/22 01:53, Richard Sandiford wrote: Dan Li writes: On 2/10/22 01:55, Richard Sandiford wrote: void f(); int g(int x) { if (x == 0) { __asm__ ("":::"x19", "x20"); return 1; } f(); return 2; } Then it seems X30 is treat as a "component" (the test result of aarch64.exp also seems fine). g: stp x19, x20, [sp, -32]! cbnzw0, .L2 mov w0, 1 ldp x19, x20, [sp], 32 ret .L2: str x30, [sp, 16] bl f ldr x30, [sp, 16] mov w0, 2 ldp x19, x20, [sp], 32 ret And I think maybe we could handle this through three patches: 1.Keep current patch (a V5) unchanged for scs. 2.Add shrink-warpping for X30: logically this might be a separate topic, and I think more testing might be needed here (Well, I'm a little worried about if there might be other effects, since I just read this part of the code roughly yesterday). 3.Add scs push/pop to shrink-wrapping (and maybe we can do the same for the PAC code in pro/epilogue, since it's also the operation of the X30). Yeah, that's fair. (Like I said earlier, I wasn't asking for the shrink-wrapping change. It was just a note in passing. But as you point out, the individual shrink-wrapping support would be even more work than I'd imagined.) Hi, Richard, As said before, I try to enable R30 as component[1] in shrink-wrapping separate. The test results of this patch in x86-64 native compile/aarch64 cross-compile are consistent with the mainline (but there are still few use cases not working properly in my test environment). please let me know if I'm missing something :) [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590906.html
Re: [PATCH][V3][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
On Fri, 25 Feb 2022, Qing Zhao wrote: > Hi, Jakub and Richard: > > This is the 3rd version of the patch, the major change compared to the > previous version are: > > 1. Add warning_enabled_at guard before “suppress_warning” > 2. Add location to the call to __builtin_clear_padding for auto init. > > The patch has been bootstrapped and regress tested on both x86 and aarch64. > Okay for trunk? > > Thanks. > > Qing > > == > From 8314ded4ca0f59c5a3ec431c9c3768fcaf2a0949 Mon Sep 17 00:00:00 2001 > From: Qing Zhao > Date: Thu, 24 Feb 2022 22:38:38 + > Subject: [PATCH] Suppress uninitialized warnings for new created uses from > __builtin_clear_padding folding [PR104550] > > __builtin_clear_padding(&object) will clear all the padding bits of the > object. > actually, it doesn't involve any use of an user variable. Therefore, users do > not expect any uninitialized warning from it. It's reasonable to suppress > uninitialized warnings for all new created uses from __builtin_clear_padding > folding. > > PR middle-end/104550 > > gcc/ChangeLog: > > * gimple-fold.cc (clear_padding_flush): Suppress warnings for new > created uses. > * gimplify.cc (gimple_add_padding_init_for_auto_var): Set > location for new created call to __builtin_clear_padding. > > gcc/testsuite/ChangeLog: > > * gcc.dg/auto-init-pr104550-1.c: New test. > * gcc.dg/auto-init-pr104550-2.c: New test. > * gcc.dg/auto-init-pr104550-3.c: New test. > --- > gcc/gimple-fold.cc | 11 ++- > gcc/gimplify.cc | 1 + > gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++ > gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++ > gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++ > 5 files changed, 43 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c > create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c > create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c > > diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc > index 16f02c2d098d..67b4963ffd96 100644 > --- a/gcc/gimple-fold.cc > +++ b/gcc/gimple-fold.cc > @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3. If not see > #include "attribs.h" > #include "asan.h" > #include "diagnostic-core.h" > +#include "diagnostic.h" > #include "intl.h" > #include "calls.h" > #include "tree-vector-builder.h" > @@ -4379,7 +4380,15 @@ clear_padding_flush (clear_padding_struct *buf, bool > full) > else > { > src = make_ssa_name (type); > - g = gimple_build_assign (src, unshare_expr (dst)); > + tree tmp_dst = unshare_expr (dst); > + /* The folding introduces a read from the tmp_dst, we should > + prevent uninitialized warning analysis from issuing warning > + for such fake read. */ > + if (warning_enabled_at (buf->loc, OPT_Wuninitialized) > + || warning_enabled_at (buf->loc, > + OPT_Wmaybe_uninitialized)) > + suppress_warning (tmp_dst, OPT_Wuninitialized); > + g = gimple_build_assign (src, tmp_dst); So what about just gimple_set_no_warning (g, true); ? (sorry for the ping-pong between us three...) > gimple_set_location (g, buf->loc); > gsi_insert_before (buf->gsi, g, GSI_SAME_STMT); > tree mask = native_interpret_expr (type, > diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc > index f570daa015a5..977cf458f858 100644 > --- a/gcc/gimplify.cc > +++ b/gcc/gimplify.cc > @@ -1823,6 +1823,7 @@ gimple_add_padding_init_for_auto_var (tree decl, bool > is_vla, > >gimple *call = gimple_build_call (fn, 2, addr_of_decl, > build_one_cst (TREE_TYPE (addr_of_decl))); > + gimple_set_location (call, EXPR_LOCATION (decl)); I believe EXPR_LOCATION (decl) is bogus, 'decl's have DECL_LOCATION, EXPR_LOCATION here will just return UNKNOWN_LOCATION. >gimplify_seq_add_stmt (seq_p, call); > } > > diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c > b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c > new file mode 100644 > index ..a08110c3a170 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c > @@ -0,0 +1,10 @@ > +/* PR 104550*/ > +/* { dg-do compile } */ > +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */ > +struct vx_audio_level { > + int has_monitor_level : 1; > +}; > + > +void vx_set_monitor_level() { > + struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } > */ > +} > diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c > b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c > new file mode 100644 > index ..2c395b32d322 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c >
[PATCH] AVX512F: Add helper enumeration for ternary logic intrinsics.
Hi, This patch intends to sync with llvm change in https://reviews.llvm.org/D120307 to add enumeration and truncate imm to unsigned char, so users could use ~ on immediates. Bootstraped/regtested on x86_64-pc-linux-gnu{-m32,}. Ok for master? gcc/ChangeLog: * config/i386/avx512fintrin.h (_MM_TERNLOG_ENUM): New enum. (_mm512_ternarylogic_epi64): Truncate imm to unsigned char to avoid error when using ~enum as parameter. (_mm512_mask_ternarylogic_epi64): Likewise. (_mm512_maskz_ternarylogic_epi64): Likewise. (_mm512_ternarylogic_epi32): Likewise. (_mm512_mask_ternarylogic_epi32): Likewise. (_mm512_maskz_ternarylogic_epi32): Likewise. * config/i386/avx512vlintrin.h (_mm256_ternarylogic_epi64): Adjust imm param type to unsigned char. (_mm256_mask_ternarylogic_epi64): Likewise. (_mm256_maskz_ternarylogic_epi64): Likewise. (_mm256_ternarylogic_epi32): Likewise. (_mm256_mask_ternarylogic_epi32): Likewise. (_mm256_maskz_ternarylogic_epi32): Likewise. (_mm_ternarylogic_epi64): Likewise. (_mm_mask_ternarylogic_epi64): Likewise. (_mm_maskz_ternarylogic_epi64): Likewise. (_mm_ternarylogic_epi32): Likewise. (_mm_mask_ternarylogic_epi32): Likewise. (_mm_maskz_ternarylogic_epi32): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512f-vpternlogd-1.c: Use new enum. * gcc.target/i386/avx512f-vpternlogq-1.c: Likewise. * gcc.target/i386/avx512vl-vpternlogd-1.c: Likewise. * gcc.target/i386/avx512vl-vpternlogq-1.c: Likewise. * gcc.target/i386/testimm-10.c: Remove imm check for vpternlog insns since the imm has been truncated in intrinsic. --- gcc/config/i386/avx512fintrin.h | 132 ++--- gcc/config/i386/avx512vlintrin.h | 278 +++--- .../gcc.target/i386/avx512f-vpternlogd-1.c| 7 +- .../gcc.target/i386/avx512f-vpternlogq-1.c| 7 +- .../gcc.target/i386/avx512vl-vpternlogd-1.c | 13 +- .../gcc.target/i386/avx512vl-vpternlogq-1.c | 14 +- gcc/testsuite/gcc.target/i386/testimm-10.c| 7 - 7 files changed, 285 insertions(+), 173 deletions(-) diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h index bc10c823c76..29511fd2831 100644 --- a/gcc/config/i386/avx512fintrin.h +++ b/gcc/config/i386/avx512fintrin.h @@ -1639,16 +1639,27 @@ _mm_maskz_sub_round_ss (__mmask8 __U, __m128 __A, __m128 __B, #endif +/* Constant helper to represent the ternary logic operations among + vector A, B and C. */ +typedef enum +{ + _MM_TERNLOG_A = 0xF0, + _MM_TERNLOG_B = 0xCC, + _MM_TERNLOG_C = 0xAA +} _MM_TERNLOG_ENUM; + #ifdef __OPTIMIZE__ extern __inline __m512i __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm512_ternarylogic_epi64 (__m512i __A, __m512i __B, __m512i __C, const int __imm) { - return (__m512i) __builtin_ia32_pternlogq512_mask ((__v8di) __A, -(__v8di) __B, -(__v8di) __C, __imm, -(__mmask8) -1); + return (__m512i) +__builtin_ia32_pternlogq512_mask ((__v8di) __A, + (__v8di) __B, + (__v8di) __C, + (unsigned char) __imm, + (__mmask8) -1); } extern __inline __m512i @@ -1656,10 +1667,12 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm512_mask_ternarylogic_epi64 (__m512i __A, __mmask8 __U, __m512i __B, __m512i __C, const int __imm) { - return (__m512i) __builtin_ia32_pternlogq512_mask ((__v8di) __A, -(__v8di) __B, -(__v8di) __C, __imm, -(__mmask8) __U); + return (__m512i) +__builtin_ia32_pternlogq512_mask ((__v8di) __A, + (__v8di) __B, + (__v8di) __C, + (unsigned char) __imm, + (__mmask8) __U); } extern __inline __m512i @@ -1667,10 +1680,12 @@ __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm512_maskz_ternarylogic_epi64 (__mmask8 __U, __m512i __A, __m512i __B, __m512i __C, const int __imm) { - return (__m512i) __builtin_ia32_pternlogq512_maskz ((__v8di) __A, - (__v8di) __B, - (__v8di) __C, - __imm, (__mmask8) __U); + return (__m512i) +__builtin_ia32_pternlogq512_maskz ((__v8di) __A, +
Re: [PATCH] Check if loading const from mem is faster
On Fri, 25 Feb 2022, Jiufu Guo wrote: > Richard Biener writes: > > > On Thu, 24 Feb 2022, Jiufu Guo wrote: > > > >> Jiufu Guo via Gcc-patches writes: > >> > >> > Segher Boessenkool writes: > >> > > >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote: > >> >>> I'm assuming we're always dealing with > >> >>> > >> >>> (set (reg:MODE ..) ) > >> >>> > >> >>> here and CSE is not substituting into random places of an > >> >>> instruction(?). I don't know what 'rtx_cost' should evaluate > >> >>> to for a constant, if it should implicitely evaluate the cost > >> >>> of putting the result into a register for example. > >> >> > >> >> rtx_cost is no good here (and in most places). rtx_cost should be 0 > >> >> for anything that is used as input in a machine instruction -- but you > >> >> need much more context to determine that. insn_cost is much simpler and > >> >> much easier to use. > >> >> > >> >>> Using RTX_COST with SET and 1 at least looks no worse than using > >> >>> your proposed new target hook and comparing it with the original > >> >>> unfolded src (again with SET and 1). > >> >> > >> >> It is required to generate valid instructions no matter what, before > >> >> the pass has finished that is. On all more modern architectures it is > >> >> futile to think you can usefully consider the cost of an RTL expression > >> >> and derive a real-world cost of the generated code from that. > >> > > >> > Thanks Segher for pointing out these! Here is another reason that I > >> > did not use rtx_cost: in a few passes, there are codes to check the > >> > constants and store them in constant pool. I'm thinking to integerate > >> > those codes in a consistent way. > >> > >> Hi Segher, Richard! > >> > >> I'm thinking the way like: For a constant, > >> 1. if the constant could be used as an immediate for the > >> instruction, then retreated as an operand; > >> 2. otherwise if the constant can not be stored into a > >> constant pool, then handle through instructions; > >> 3. if it is faster to access constant from pool, then emit > >> constant as data(.rodata); > >> 4. otherwise, handle the constant by instructions. > >> > >> And to store the constant into a pool, besides force_const_mem, > >> create reference (toc) may be needed on some platforms. > >> > >> For this particular issue in CSE, there is already code that > >> tries to put constant into a pool (invoke force_const_mem). > >> While the code is too late. So, we may check the constant > >> earlier and store it into constant pool if profitable. > >> > >> And another thing as Segher pointed out, CSE is doing too > >> much work. It may be ok to separate the constant handling > >> logic from CSE. > > > > Not sure - CSE just is value numbering, I don't see that it does > > more than that. Yes, it might have developed "heuristics" over > > the years what to CSE and to what and where to substitute and > > where not. But in the end it does just value numbering. > > > >> > >> I update a new version patch as follow (did not seprate CSE): > > > > How is the new target hook better in any way compared to rtx_cost > > or insn_cost? It looks like a total hack. > > > > I suppose the actual way of materializing a constant is done > > behind GCCs back and not exposed anywhere? But instead you > > claim the constants are valid when they actually are not? > > Isn't the problem then that the rs6000 backend lies? > > Hi Richard, > > Thanks for your comments and sugguestions! > > Materializing a constant should be done behind GCC. > On rs6000, in expand pass, during emit_move, the constant is > checked and store into constant pool if necessary. > Some other platforms are doing a similar thing, e.g. > ix86_expand_vector_move, alpha_expand_mov,... > mips_legitimize_const_move. > > But, it does not as we expect, force_const_mem is also > exposed other places (not only ira/reload for stack reference). > > CSE is one place, for example, CSE first retrieve the constant > from insn's equal, but it also tries to put constant into > pool for some condition (the condition was introduced at > early age of cse.c, and it is rare to run into in trunk). > In some aspects, IMHO, this seems not a great work of CSE. > > And this is how the 'invalid(or say slow)' constant occurs. > e.g. before cse: > 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47] > REG_EQUAL 0x100803004101001 > after cse: > 7: r119:DI=0x100803004101001 > REG_EQUAL 0x100803004101001 > > As you pointed out: we can also avoid this transformation through > rtx_cost/insn_cost by estimating the COST more accurately for > "r119:DI=0x100803004101001". (r119:DI=0x100803004101001 will not > be a real final instruction.) At which point does this become the final instruction? I suppose CSE belives this constant is legitimate and the insn is recognized just fine in CSE? (that's what I mean with "behind GCCs back") > Is it necessary to refine this constant handling for CSE, or even >
Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
On Fri, Feb 25, 2022 at 08:11:40AM +0100, Richard Biener via Gcc-patches wrote: > On Thu, 24 Feb 2022, Qing Zhao wrote: > > > I briefly checked all the usages of suppress_warning within the current > > gcc, > > and see that most of them are not guarded by any condition. > > > > So, the current change should be fine without introducing new issues. -:) > > > > Another thing is, if we use “warning_enable_at” to guard, I just checked, > > this routine is not used by any routine right now, so it might be possible > > that this > > routine has some bug itself. And now it’s the stage 4, we might need to be > > conservative. > > > > Based on this, I think that it might be better to put the change in as it > > right now. > > If we think that all the “suppress_warning” need to be guarded by a specific > > condition, we can do it in gcc13 earlier stage. > > > > What’s your opinion? > > I would agree here. Maybe a compromise would be to simply > set gimple_set_no_warning () on the actual stmt. That shouldn't I think it is set_no_warning_bit but it isn't exported from warning-control.cc. One can still use TREE_NO_WARNING though, but seems Martin converted all usages of that to the suppress_warning APIs. I thought the hash tables warning-control.cc use are based on the gimple * and tree pointers, but apparently they aren't, they are location_t based. So if we keep using suppress_warning (tmp_dst, OPT_Wuninitialized); there, it actually means just that one entry will be added to the hash table because all loads in one clear_padding_flush use the same buf->loc. It is true that every suppress_warning will need to look it up in the hash table. But you're right that because the tmp_dst we've created is artificial, I don't see a reason for any warning. Unfortunately, even with suppress_warning (tmp_dst, all_warnings); it looks up the hash tables and notes stuff in there rather than just making sure the TREE_NO_WARNING bit is set. So let's go with the unconditional suppress_warning (tmp_dst, OPT_Wuninitialized); for now. Jakub
Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
On Fri, 25 Feb 2022, Jakub Jelinek wrote: > On Fri, Feb 25, 2022 at 08:11:40AM +0100, Richard Biener via Gcc-patches > wrote: > > On Thu, 24 Feb 2022, Qing Zhao wrote: > > > > > I briefly checked all the usages of suppress_warning within the current > > > gcc, > > > and see that most of them are not guarded by any condition. > > > > > > So, the current change should be fine without introducing new issues. -:) > > > > > > Another thing is, if we use “warning_enable_at” to guard, I just checked, > > > this routine is not used by any routine right now, so it might be > > > possible that this > > > routine has some bug itself. And now it’s the stage 4, we might need to > > > be > > > conservative. > > > > > > Based on this, I think that it might be better to put the change in as it > > > right now. > > > If we think that all the “suppress_warning” need to be guarded by a > > > specific > > > condition, we can do it in gcc13 earlier stage. > > > > > > What’s your opinion? > > > > I would agree here. Maybe a compromise would be to simply > > set gimple_set_no_warning () on the actual stmt. That shouldn't > > I think it is set_no_warning_bit but it isn't exported from > warning-control.cc. I meant /* Set the no_warning flag of STMT to NO_WARNING. */ static inline void gimple_set_no_warning (gimple *stmt, bool no_warning) { stmt->no_warning = (unsigned) no_warning; } on the artificial stmt. > One can still use TREE_NO_WARNING though, but seems Martin converted > all usages of that to the suppress_warning APIs. Yep, which is why I didn't suggest that. Note that we don't have a great location to use - the other patch suggests DECL_LOCATION which will then silence all other uninit warnings on the decl which would be IMHO bad. Since the load is artificial we won't ever have a more specific location for that unless there's a way to invent it. Thus my suggestion to set the no-warning bit on the GIMPLE stmt itself ... Richard. > I thought the hash tables warning-control.cc use are based on the gimple * > and tree pointers, but apparently they aren't, they are location_t based. > So if we keep using suppress_warning (tmp_dst, OPT_Wuninitialized); > there, it actually means just that one entry will be added to the hash table > because all loads in one clear_padding_flush use the same buf->loc. > It is true that every suppress_warning will need to look it up in the hash > table. > But you're right that because the tmp_dst we've created is artificial, I > don't see a reason for any warning. Unfortunately, even with > suppress_warning (tmp_dst, all_warnings); it looks up the hash tables > and notes stuff in there rather than just making sure the TREE_NO_WARNING > bit is set. > So let's go with the unconditional > suppress_warning (tmp_dst, OPT_Wuninitialized); > for now. > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
On Fri, Feb 25, 2022 at 10:13:33AM +0100, Richard Biener via Gcc-patches wrote: > I meant > > /* Set the no_warning flag of STMT to NO_WARNING. */ > > static inline void > gimple_set_no_warning (gimple *stmt, bool no_warning) > { > stmt->no_warning = (unsigned) no_warning; > } > > on the artificial stmt. That seems to be the same case as TREE_NO_WARNING, all those calls have been replaced: $ grep -w gimple_set_no_warning *.{h,cc} */*.{h,cc} gimple.h:gimple_set_no_warning (gimple *stmt, bool no_warning) But perhaps it is a good idea to start using that when the location_t isn't a good idea to use for suppression. I think it will work. Jakub
[PATCH] match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]
Hi! We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types, &/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR. So, we should avoid simplifications which turn valid complex type expressions into something that will ICE during expansion. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2022-02-24 Jakub Jelinek PR tree-optimization/104675 * match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for COMPLEX_TYPE. * gcc.dg/pr104675-1.c: New test. * gcc.dg/pr104675-2.c: New test. --- gcc/match.pd.jj 2022-02-23 12:03:20.552435367 +0100 +++ gcc/match.pd2022-02-24 11:55:20.205673823 +0100 @@ -2776,13 +2776,15 @@ (define_operator_list SYNC_FETCH_AND_AND (simplify (minus (convert? (negate @0)) integer_each_onep) (if (!TYPE_OVERFLOW_TRAPS (type) + && TREE_CODE (type) != COMPLEX_TYPE && tree_nop_conversion_p (type, TREE_TYPE (@0))) (bit_not (convert @0 /* -1 - A -> ~A */ (simplify (minus integer_all_onesp @0) - (bit_not @0)) + (if (TREE_CODE (type) != COMPLEX_TYPE) +(bit_not @0))) /* (T)(P + A) - (T)P -> (T) A */ (simplify --- gcc/testsuite/gcc.dg/pr104675-1.c.jj2022-02-24 11:56:33.253653436 +0100 +++ gcc/testsuite/gcc.dg/pr104675-1.c 2022-02-24 11:56:14.450916087 +0100 @@ -0,0 +1,29 @@ +/* PR tree-optimization/104675 */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +_Complex int +foo (_Complex int a) +{ + return (-1 + -1i) - a; +} + +_Complex int +bar (_Complex int a) +{ + return -a - (1 + 1i); +} + +_Complex int +baz (_Complex int a) +{ + _Complex int b = -1 + -1i; + return b - a; +} + +_Complex int +qux (_Complex int a) +{ + _Complex int b = 1 + 1i; + return -a - b; +} --- gcc/testsuite/gcc.dg/pr104675-2.c.jj2022-02-24 11:57:09.200151312 +0100 +++ gcc/testsuite/gcc.dg/pr104675-2.c 2022-02-24 11:57:04.646214924 +0100 @@ -0,0 +1,18 @@ +/* PR tree-optimization/104675 */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +void baz (int i); + +void +bar (_Complex int c, short s) +{ + c -= s; + baz (__real__ c + __imag__ c); +} + +void +foo (void) +{ + bar (-1 - 1i, 0); +} Jakub
Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
On Fri, 25 Feb 2022, Jakub Jelinek wrote: > On Fri, Feb 25, 2022 at 10:13:33AM +0100, Richard Biener via Gcc-patches > wrote: > > I meant > > > > /* Set the no_warning flag of STMT to NO_WARNING. */ > > > > static inline void > > gimple_set_no_warning (gimple *stmt, bool no_warning) > > { > > stmt->no_warning = (unsigned) no_warning; > > } > > > > on the artificial stmt. > > That seems to be the same case as TREE_NO_WARNING, all those calls > have been replaced: > $ grep -w gimple_set_no_warning *.{h,cc} */*.{h,cc} > gimple.h:gimple_set_no_warning (gimple *stmt, bool no_warning) > > But perhaps it is a good idea to start using that when the location_t > isn't a good idea to use for suppression. I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent" a creative location for the artificial stmt it will of course affect other stmts/expressions using that location. > I think it will work. Yes, I think so. OTOH the uninit pass does /* Avoid warning if we've already done so or if the warning has been suppressed. */ if (((warning_suppressed_p (context, OPT_Wuninitialized) || (gimple_assign_single_p (context) && get_no_uninit_warning (gimple_assign_rhs1 (context) || (var && get_no_uninit_warning (var)) || (var_name_str && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized))) return; that's a mightly complicated way to test and I'm not sure we get to the bit on the stmt reliably. So maybe TREE_NO_WARNING on the reference (or making sure it has UNKNOWN_LOCATION and using suppress_warning on it) is a better idea after all... Richard.
Re: [PATCH] match.pd: Don't create BIT_NOT_EXPRs for COMPLEX_TYPE [PR104675]
On Fri, 25 Feb 2022, Jakub Jelinek wrote: > Hi! > > We don't support BIT_{AND,IOR,XOR,NOT}_EXPR on complex types, > &/|/^ are just rejected for them, and ~ is parsed as CONJ_EXPR. > So, we should avoid simplifications which turn valid complex type > expressions into something that will ICE during expansion. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK (also for backports when applicable). Richard. > 2022-02-24 Jakub Jelinek > > PR tree-optimization/104675 > * match.pd (-A - 1 -> ~A, -1 - A -> ~A): Don't simplify for > COMPLEX_TYPE. > > * gcc.dg/pr104675-1.c: New test. > * gcc.dg/pr104675-2.c: New test. > > --- gcc/match.pd.jj 2022-02-23 12:03:20.552435367 +0100 > +++ gcc/match.pd 2022-02-24 11:55:20.205673823 +0100 > @@ -2776,13 +2776,15 @@ (define_operator_list SYNC_FETCH_AND_AND >(simplify > (minus (convert? (negate @0)) integer_each_onep) > (if (!TYPE_OVERFLOW_TRAPS (type) > + && TREE_CODE (type) != COMPLEX_TYPE > && tree_nop_conversion_p (type, TREE_TYPE (@0))) > (bit_not (convert @0 > >/* -1 - A -> ~A */ >(simplify > (minus integer_all_onesp @0) > - (bit_not @0)) > + (if (TREE_CODE (type) != COMPLEX_TYPE) > +(bit_not @0))) > >/* (T)(P + A) - (T)P -> (T) A */ >(simplify > --- gcc/testsuite/gcc.dg/pr104675-1.c.jj 2022-02-24 11:56:33.253653436 > +0100 > +++ gcc/testsuite/gcc.dg/pr104675-1.c 2022-02-24 11:56:14.450916087 +0100 > @@ -0,0 +1,29 @@ > +/* PR tree-optimization/104675 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +_Complex int > +foo (_Complex int a) > +{ > + return (-1 + -1i) - a; > +} > + > +_Complex int > +bar (_Complex int a) > +{ > + return -a - (1 + 1i); > +} > + > +_Complex int > +baz (_Complex int a) > +{ > + _Complex int b = -1 + -1i; > + return b - a; > +} > + > +_Complex int > +qux (_Complex int a) > +{ > + _Complex int b = 1 + 1i; > + return -a - b; > +} > --- gcc/testsuite/gcc.dg/pr104675-2.c.jj 2022-02-24 11:57:09.200151312 > +0100 > +++ gcc/testsuite/gcc.dg/pr104675-2.c 2022-02-24 11:57:04.646214924 +0100 > @@ -0,0 +1,18 @@ > +/* PR tree-optimization/104675 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +void baz (int i); > + > +void > +bar (_Complex int c, short s) > +{ > + c -= s; > + baz (__real__ c + __imag__ c); > +} > + > +void > +foo (void) > +{ > + bar (-1 - 1i, 0); > +} > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
[PATCH] i386: Use a new temp slot kind for splitter to floatdi2_i387_with_xmm [PR104674]
Hi! As mentioned in the PR, the following testcase is miscompiled for similar reasons as the already fixed PR78791 - we use SLOT_TEMP slots in various places during expansion and during expansion we can guarantee that the lifetime of those temporary slot doesn't overlap. But the following splitter uses SLOT_TEMP too and in between expansion and split1 there is a possibility that something extends the lifetime of SLOT_TEMP created slots across an instruction that will be split by this splitter. The following patch fixes it by using a new temp slot kind to make sure it doesn't reuse a SLOT_TEMP that could be live across the instruction. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2022-02-24 Jakub Jelinek PR target/104674 * config/i386/i386.h (enum ix86_stack_slot): Add SLOT_FLOATxFDI_387. * config/i386/i386.md (splitter to floatdi2_i387_with_xmm): Use SLOT_FLOATxFDI_387 rather than SLOT_TEMP. * gcc.target/i386/pr104674.c: New test. --- gcc/config/i386/i386.h.jj 2022-01-18 11:58:59.118988685 +0100 +++ gcc/config/i386/i386.h 2022-02-24 13:47:02.809289843 +0100 @@ -2414,6 +2414,7 @@ enum ix86_stack_slot SLOT_CW_FLOOR, SLOT_CW_CEIL, SLOT_STV_TEMP, + SLOT_FLOATxFDI_387, MAX_386_STACK_LOCALS }; --- gcc/config/i386/i386.md.jj 2022-02-12 11:17:35.14347 +0100 +++ gcc/config/i386/i386.md 2022-02-24 13:35:57.832561968 +0100 @@ -5412,9 +5412,8 @@ (define_split && can_create_pseudo_p ()" [(const_int 0)] { - emit_insn (gen_floatdi2_i387_with_xmm -(operands[0], operands[1], - assign_386_stack_local (DImode, SLOT_TEMP))); + rtx s = assign_386_stack_local (DImode, SLOT_FLOATxFDI_387); + emit_insn (gen_floatdi2_i387_with_xmm (operands[0], operands[1], s)); DONE; }) --- gcc/testsuite/gcc.target/i386/pr104674.c.jj 2022-02-24 13:45:31.630561330 +0100 +++ gcc/testsuite/gcc.target/i386/pr104674.c2022-02-24 13:45:21.278705687 +0100 @@ -0,0 +1,31 @@ +/* PR target/104674 */ +/* { dg-do run { target sse2_runtime } } */ +/* { dg-options "-O2 -msse2 -mfpmath=sse" } */ + +__attribute__((noipa)) double +bar (double x, double y) +{ + return x + y; +} + +__attribute__((noipa)) double +foo (long long x) +{ + long long a = x / 1000; + int b = x % 1000; + double s = (double) a; + double n = (double) b / 1e7; + double t = s + n; + if (t == s + 1.0) +t = bar (t, s); + return t; +} + +int +main () +{ + long long n = 88; + n = n * 1000; + if (foo (n) != 88.0) +__builtin_abort (); +} Jakub
[PATCH] internal-fn: Call do_pending_stack_adjust in expand_SPACESHIP [PR104679]
Hi! The following testcase is miscompiled on ia32 at -O2, because when expand_SPACESHIP is called, we have pending stack adjustment from the foo call right before it. Now, ix86_expand_fp_spaceship uses emit_jump_insn several times but then emit_jump also several times. While emit_jump_insn doesn't do do_pending_stack_adjust (), emit_jump does, so we end up with: ... 8: call [`_Z3foodl'] argc:0x10 REG_CALL_DECL `_Z3foodl' 9: r88:DF=[`a'] 10: r89:HI=unspec[cmp(r88:DF,0.0)] 25 11: flags:CC=unspec[r89:HI] 26 12: pc={(unordered(flags:CCFP,0))?L27:pc} REG_BR_PROB 536868 66: NOTE_INSN_BASIC_BLOCK 4 13: pc={(uneq(flags:CCFP,0))?L19:pc} REG_BR_PROB 214748364 67: NOTE_INSN_BASIC_BLOCK 5 14: pc={(flags:CCFP>0)?L23:pc} REG_BR_PROB 536870916 68: NOTE_INSN_BASIC_BLOCK 6 15: r86:SI=0x 16: {sp:SI=sp:SI+0x10;clobber flags:CC;} REG_ARGS_SIZE 0 17: pc=L29 18: barrier 19: L19: 69: NOTE_INSN_BASIC_BLOCK 7 ... The sp += 16 pending stuck adjust was emitted in the middle of the sequence and is effective only for the single case of the 4 possibilities where .SPACESHIP returns -1, in all other cases the stack isn't adjusted and so we ICE during dwarf2cfi. Now, we could either call do_pending_stack_adjust in ix86_expand_fp_spaceship, or use there calls that actually don't call do_pending_stack_adjust (but having the stack adjustment across branches is generally undesirable), or we can call it in expand_SPACESHIP for all targets (note, just i386 currently implements it). I chose the generic code because e.g. expand_{addsub,neg,mul}_overflow in the same file also call do_pending_stack_adjust in internal-fn.cc for the same reasons, that it is expected that most if not all targets will expand those through jumps and we don't want all of the targets to need to deal with that. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Or do you prefer it in ix86_expand_fp_spaceship instead? 2022-02-24 Jakub Jelinek PR middle-end/104679 * internal-fn.cc (expand_SPACESHIP): Call do_pending_stack_adjust. * g++.dg/torture/pr104679.C: New test. --- gcc/internal-fn.cc.jj 2022-02-11 13:51:07.757597854 +0100 +++ gcc/internal-fn.cc 2022-02-24 17:46:01.476722703 +0100 @@ -4437,6 +4437,8 @@ expand_SPACESHIP (internal_fn, gcall *st tree rhs2 = gimple_call_arg (stmt, 1); tree type = TREE_TYPE (rhs1); + do_pending_stack_adjust (); + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); rtx op1 = expand_normal (rhs1); rtx op2 = expand_normal (rhs2); --- gcc/testsuite/g++.dg/torture/pr104679.C.jj 2022-02-24 17:48:08.309959261 +0100 +++ gcc/testsuite/g++.dg/torture/pr104679.C 2022-02-24 17:48:00.226071655 +0100 @@ -0,0 +1,22 @@ +// PR middle-end/104679 +// { dg-do compile } + +struct A { ~A (); }; +void foo (double, long); +void bar (); +double a; +long b; + +void +baz () +{ + foo (a, b); + if (a == 0.0) +; + else +while (a > 0.0) + { +A c; +bar (); + } +} Jakub
Re: [PATCH] internal-fn: Call do_pending_stack_adjust in expand_SPACESHIP [PR104679]
On Fri, 25 Feb 2022, Jakub Jelinek wrote: > Hi! > > The following testcase is miscompiled on ia32 at -O2, because > when expand_SPACESHIP is called, we have pending stack adjustment > from the foo call right before it. > Now, ix86_expand_fp_spaceship uses emit_jump_insn several times > but then emit_jump also several times. While emit_jump_insn doesn't > do do_pending_stack_adjust (), emit_jump does, so we end up with: > ... > 8: call [`_Z3foodl'] argc:0x10 > REG_CALL_DECL `_Z3foodl' > 9: r88:DF=[`a'] >10: r89:HI=unspec[cmp(r88:DF,0.0)] 25 >11: flags:CC=unspec[r89:HI] 26 >12: pc={(unordered(flags:CCFP,0))?L27:pc} > REG_BR_PROB 536868 >66: NOTE_INSN_BASIC_BLOCK 4 >13: pc={(uneq(flags:CCFP,0))?L19:pc} > REG_BR_PROB 214748364 >67: NOTE_INSN_BASIC_BLOCK 5 >14: pc={(flags:CCFP>0)?L23:pc} > REG_BR_PROB 536870916 >68: NOTE_INSN_BASIC_BLOCK 6 >15: r86:SI=0x >16: {sp:SI=sp:SI+0x10;clobber flags:CC;} > REG_ARGS_SIZE 0 >17: pc=L29 >18: barrier >19: L19: >69: NOTE_INSN_BASIC_BLOCK 7 > ... > The sp += 16 pending stuck adjust was emitted in the middle of the > sequence and is effective only for the single case of the 4 possibilities > where .SPACESHIP returns -1, in all other cases the stack isn't adjusted > and so we ICE during dwarf2cfi. > > Now, we could either call do_pending_stack_adjust in > ix86_expand_fp_spaceship, or use there calls that actually don't call > do_pending_stack_adjust (but having the stack adjustment across branches is > generally undesirable), or we can call it in expand_SPACESHIP for all > targets (note, just i386 currently implements it). > I chose the generic code because e.g. expand_{addsub,neg,mul}_overflow > in the same file also call do_pending_stack_adjust in internal-fn.cc for the > same reasons, that it is expected that most if not all targets will expand > those through jumps and we don't want all of the targets to need to deal > with that. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. > Or do you prefer it in ix86_expand_fp_spaceship instead? No, as you say others will likely repeat the mistake otherwise. Richard. > 2022-02-24 Jakub Jelinek > > PR middle-end/104679 > * internal-fn.cc (expand_SPACESHIP): Call do_pending_stack_adjust. > > * g++.dg/torture/pr104679.C: New test. > > --- gcc/internal-fn.cc.jj 2022-02-11 13:51:07.757597854 +0100 > +++ gcc/internal-fn.cc2022-02-24 17:46:01.476722703 +0100 > @@ -4437,6 +4437,8 @@ expand_SPACESHIP (internal_fn, gcall *st >tree rhs2 = gimple_call_arg (stmt, 1); >tree type = TREE_TYPE (rhs1); > > + do_pending_stack_adjust (); > + >rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); >rtx op1 = expand_normal (rhs1); >rtx op2 = expand_normal (rhs2); > --- gcc/testsuite/g++.dg/torture/pr104679.C.jj2022-02-24 > 17:48:08.309959261 +0100 > +++ gcc/testsuite/g++.dg/torture/pr104679.C 2022-02-24 17:48:00.226071655 > +0100 > @@ -0,0 +1,22 @@ > +// PR middle-end/104679 > +// { dg-do compile } > + > +struct A { ~A (); }; > +void foo (double, long); > +void bar (); > +double a; > +long b; > + > +void > +baz () > +{ > + foo (a, b); > + if (a == 0.0) > +; > + else > +while (a > 0.0) > + { > +A c; > +bar (); > + } > +} > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)
[committed] warning-control: Comment spelling fix
Hi! This fixes a spelling mistake I found while looking at warning-control implementation. Committed as obvious. 2022-02-25 Jakub Jelinek * warning-control.cc (get_nowarn_spec): Comment spelling fix. --- gcc/warning-control.cc.jj 2022-01-17 18:05:04.861306618 +0100 +++ gcc/warning-control.cc 2022-02-25 10:21:44.231071154 +0100 @@ -98,7 +98,7 @@ get_nowarn_spec (const_tree expr) return nowarn_map ? nowarn_map->get (loc) : NULL; } -/* Return the no-warning bitmap for stateemt STMT. */ +/* Return the no-warning bitmap for statement STMT. */ static nowarn_spec_t * get_nowarn_spec (const gimple *stmt) Jakub
Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote: > I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent" > a creative location for the artificial stmt it will of course > affect other stmts/expressions using that location. > > > I think it will work. > > Yes, I think so. OTOH the uninit pass does > > /* Avoid warning if we've already done so or if the warning has been > suppressed. */ > if (((warning_suppressed_p (context, OPT_Wuninitialized) > || (gimple_assign_single_p (context) > && get_no_uninit_warning (gimple_assign_rhs1 (context) > || (var && get_no_uninit_warning (var)) > || (var_name_str > && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized))) > return; > > that's a mightly complicated way to test and I'm not sure we get > to the bit on the stmt reliably. So maybe TREE_NO_WARNING on the > reference (or making sure it has UNKNOWN_LOCATION and using > suppress_warning on it) is a better idea after all... So SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION); suppress_warning (tmp_dst, OPT_Wuninitialized); with a comment explaing why we do that? LGTM. Jakub
Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
On Fri, 25 Feb 2022, Jakub Jelinek wrote: > On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote: > > I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent" > > a creative location for the artificial stmt it will of course > > affect other stmts/expressions using that location. > > > > > I think it will work. > > > > Yes, I think so. OTOH the uninit pass does > > > > /* Avoid warning if we've already done so or if the warning has been > > suppressed. */ > > if (((warning_suppressed_p (context, OPT_Wuninitialized) > > || (gimple_assign_single_p (context) > > && get_no_uninit_warning (gimple_assign_rhs1 (context) > > || (var && get_no_uninit_warning (var)) > > || (var_name_str > > && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized))) > > return; > > > > that's a mightly complicated way to test and I'm not sure we get > > to the bit on the stmt reliably. So maybe TREE_NO_WARNING on the > > reference (or making sure it has UNKNOWN_LOCATION and using > > suppress_warning on it) is a better idea after all... > > So > SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION); > suppress_warning (tmp_dst, OPT_Wuninitialized); > with a comment explaing why we do that? Yes. Depending on the context tmp_dst will already reliably have UNKNOWN_LOCATION? If tmp_dst might be a plain decl we have to be more careful of course. Richard. > LGTM. > > Jakub
Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
On Fri, Feb 25, 2022 at 11:12:13AM +0100, Richard Biener wrote: > On Fri, 25 Feb 2022, Jakub Jelinek wrote: > > > On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote: > > > I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent" > > > a creative location for the artificial stmt it will of course > > > affect other stmts/expressions using that location. > > > > > > > I think it will work. > > > > > > Yes, I think so. OTOH the uninit pass does > > > > > > /* Avoid warning if we've already done so or if the warning has been > > > suppressed. */ > > > if (((warning_suppressed_p (context, OPT_Wuninitialized) > > > || (gimple_assign_single_p (context) > > > && get_no_uninit_warning (gimple_assign_rhs1 (context) > > > || (var && get_no_uninit_warning (var)) > > > || (var_name_str > > > && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized))) > > > return; > > > > > > that's a mightly complicated way to test and I'm not sure we get > > > to the bit on the stmt reliably. So maybe TREE_NO_WARNING on the > > > reference (or making sure it has UNKNOWN_LOCATION and using > > > suppress_warning on it) is a better idea after all... > > > > So > > SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION); > > suppress_warning (tmp_dst, OPT_Wuninitialized); > > with a comment explaing why we do that? > > Yes. Depending on the context tmp_dst will already reliably have > UNKNOWN_LOCATION? If tmp_dst might be a plain decl we have to > be more careful of course. tmp_dst is: tree dst = build2_loc (buf->loc, MEM_REF, atype, buf->base, build_int_cst (buf->alias_type, off)); ... tmp_dst = unshare_expr (dst); We could build2 the dst and set its location only after unshare_expr too, but I guess clearing it with a comment on why we do that is better. Jakub
Re: [PATCH] i386: Use a new temp slot kind for splitter to floatdi2_i387_with_xmm [PR104674]
On Fri, Feb 25, 2022 at 10:33 AM Jakub Jelinek wrote: > > Hi! > > As mentioned in the PR, the following testcase is miscompiled for similar > reasons as the already fixed PR78791 - we use SLOT_TEMP slots in various > places during expansion and during expansion we can guarantee that the > lifetime of those temporary slot doesn't overlap. But the following > splitter uses SLOT_TEMP too and in between expansion and split1 there is > a possibility that something extends the lifetime of SLOT_TEMP created > slots across an instruction that will be split by this splitter. > > The following patch fixes it by using a new temp slot kind to make sure > it doesn't reuse a SLOT_TEMP that could be live across the instruction. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2022-02-24 Jakub Jelinek > > PR target/104674 > * config/i386/i386.h (enum ix86_stack_slot): Add SLOT_FLOATxFDI_387. > * config/i386/i386.md (splitter to floatdi2_i387_with_xmm): Use > SLOT_FLOATxFDI_387 rather than SLOT_TEMP. > > * gcc.target/i386/pr104674.c: New test. OK. Thanks, Uros. > > --- gcc/config/i386/i386.h.jj 2022-01-18 11:58:59.118988685 +0100 > +++ gcc/config/i386/i386.h 2022-02-24 13:47:02.809289843 +0100 > @@ -2414,6 +2414,7 @@ enum ix86_stack_slot >SLOT_CW_FLOOR, >SLOT_CW_CEIL, >SLOT_STV_TEMP, > + SLOT_FLOATxFDI_387, >MAX_386_STACK_LOCALS > }; > > --- gcc/config/i386/i386.md.jj 2022-02-12 11:17:35.14347 +0100 > +++ gcc/config/i386/i386.md 2022-02-24 13:35:57.832561968 +0100 > @@ -5412,9 +5412,8 @@ (define_split > && can_create_pseudo_p ()" >[(const_int 0)] > { > - emit_insn (gen_floatdi2_i387_with_xmm > -(operands[0], operands[1], > - assign_386_stack_local (DImode, SLOT_TEMP))); > + rtx s = assign_386_stack_local (DImode, SLOT_FLOATxFDI_387); > + emit_insn (gen_floatdi2_i387_with_xmm (operands[0], operands[1], s)); >DONE; > }) > > --- gcc/testsuite/gcc.target/i386/pr104674.c.jj 2022-02-24 13:45:31.630561330 > +0100 > +++ gcc/testsuite/gcc.target/i386/pr104674.c2022-02-24 13:45:21.278705687 > +0100 > @@ -0,0 +1,31 @@ > +/* PR target/104674 */ > +/* { dg-do run { target sse2_runtime } } */ > +/* { dg-options "-O2 -msse2 -mfpmath=sse" } */ > + > +__attribute__((noipa)) double > +bar (double x, double y) > +{ > + return x + y; > +} > + > +__attribute__((noipa)) double > +foo (long long x) > +{ > + long long a = x / 1000; > + int b = x % 1000; > + double s = (double) a; > + double n = (double) b / 1e7; > + double t = s + n; > + if (t == s + 1.0) > +t = bar (t, s); > + return t; > +} > + > +int > +main () > +{ > + long long n = 88; > + n = n * 1000; > + if (foo (n) != 88.0) > +__builtin_abort (); > +} > > Jakub >
[PATCH] tree-optimization/103037 - PRE simplifying valueized expressions
This fixes a long-standing issue in PRE where we track valueized expressions in our expression sets that we use for PHI translation, code insertion but also feed into match-and-simplify via vn_nary_simplify. But that's not what is expected from vn_nary_simplify or match-and-simplify which assume we are simplifying with operands available at the point of the expression so they can use contextual information on the SSA names like ranges. While the VN side was updated to ensure this with the rewrite to RPO VN, thereby removing all workarounds that nullified such contextual info on all SSA names, the PRE side still suffers from this. The following patch tries to apply minimal surgery at this point and makes PRE track un-valueized expressions in the expression sets but only for the NARY kind (both NAME and CONSTANT do not suffer from this issue), leaving the REFERENCE kind alone. The REFERENCE kind is important when trying to remove the workarounds still in place in compute_avail for code hoisting, but that's a separate issue and we have a working workaround in place. Doing this comes at the cost of duplicating the VN IL on the PRE side for NARY and eventually some extra overhead for translated expressions that is difficult to assess. Bootstrapped and tested on x86_64-unknown-linux-gnu. 2022-02-25 Richard Biener PR tree-optimization/103037 * tree-ssa-sccvn.h (alloc_vn_nary_op_noinit): Declare. (vn_nary_length_from_stmt): Likewise. (init_vn_nary_op_from_stmt): Likewise. (vn_nary_op_compute_hash): Likewise. * tree-ssa-sccvn.cc (alloc_vn_nary_op_noinit): Export. (vn_nary_length_from_stmt): Likewise. (init_vn_nary_op_from_stmt): Likewise. (vn_nary_op_compute_hash): Likewise. * tree-ssa-pre.cc (pre_expr_obstack): New obstack. (get_or_alloc_expr_for_nary): Pass in the value-id to use, (re-)compute the hash value and if the expression is not found allocate it from pre_expr_obstack. (phi_translate_1): Do not insert the NARY found in the VN tables but build a PRE expression from the valueized NARY with the value-id we eventually found. (find_or_generate_expression): Assert we have an entry for constant values. (compute_avail): Insert not valueized expressions into EXP_GEN using the value-id from the VN tables. (init_pre): Allocate pre_expr_obstack. (fini_pre): Free pre_expr_obstack. * gcc.dg/torture/pr103037.c: New testcase. --- gcc/testsuite/gcc.dg/torture/pr103037.c | 23 + gcc/tree-ssa-pre.cc | 43 + gcc/tree-ssa-sccvn.cc | 19 ++- gcc/tree-ssa-sccvn.h| 4 +++ 4 files changed, 65 insertions(+), 24 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr103037.c diff --git a/gcc/testsuite/gcc.dg/torture/pr103037.c b/gcc/testsuite/gcc.dg/torture/pr103037.c new file mode 100644 index 000..8b3bb1e4c8b --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr103037.c @@ -0,0 +1,23 @@ +/* { dg-do run } */ + +static inline const unsigned short * +min(unsigned short *d, const unsigned short *e) +{ + return *e < *d ? e : d; +} + +unsigned short __attribute__((noipa)) +test(unsigned short arr, unsigned short val) +{ + unsigned short tem = 1; + unsigned short tem2 = *min(&arr, &tem); + return tem2 / (arr ? arr : val); +} + +int +main() +{ + if (test (2, 2) != 0) +__builtin_abort(); + return 0; +} diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc index 7c2a52e1850..4138511cab4 100644 --- a/gcc/tree-ssa-pre.cc +++ b/gcc/tree-ssa-pre.cc @@ -323,6 +323,7 @@ static unsigned int next_expression_id; static vec expressions; static hash_table *expression_to_id; static vec name_to_id; +static obstack pre_expr_obstack; /* Allocate an expression id for EXPR. */ @@ -433,7 +434,7 @@ get_or_alloc_expr_for_name (tree name) /* Given an NARY, get or create a pre_expr to represent it. */ static pre_expr -get_or_alloc_expr_for_nary (vn_nary_op_t nary, +get_or_alloc_expr_for_nary (vn_nary_op_t nary, unsigned value_id, location_t loc = UNKNOWN_LOCATION) { struct pre_expr_d expr; @@ -442,6 +443,7 @@ get_or_alloc_expr_for_nary (vn_nary_op_t nary, expr.kind = NARY; expr.id = 0; + nary->hashcode = vn_nary_op_compute_hash (nary); PRE_EXPR_NARY (&expr) = nary; result_id = lookup_expression_id (&expr); if (result_id != 0) @@ -450,8 +452,10 @@ get_or_alloc_expr_for_nary (vn_nary_op_t nary, result = pre_expr_pool.allocate (); result->kind = NARY; result->loc = loc; - result->value_id = nary->value_id; - PRE_EXPR_NARY (result) = nary; + result->value_id = value_id ? value_id : get_next_value_id (); + PRE_EXPR_NARY (result) += alloc_vn_nary_op_noinit (nary->length, &pre_expr_obstack); + memcpy (PRE_EXPR_NARY (result), nary, sizeof_vn_nary_op (nary->length));
[PATCH] s390: Change SET rtx_cost handling.
Hi, the IF_THEN_ELSE detection currently prevents us from properly costing register-register moves which causes the lower-subreg pass to assume that a VR-VR move is as expensive as two GPR-GPR moves. This patch adds handling for SETs containing REGs as well as MEMs and is inspired by the aarch64 implementation. Bootstrapped and regtested on z900 up to z15. Is it OK? Regards Robin -- gcc/ChangeLog: * config/s390/s390.cc (s390_address_cost): Declare. (s390_hard_regno_nregs): Declare. (s390_rtx_costs): Add handling for REG and MEM in SET. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New test. >From 8c4c6f029dbf0c9db12c2877189a5ec0ce0a9c89 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Thu, 3 Feb 2022 12:50:04 +0100 Subject: [PATCH] s390: Change SET rtx_cost handling. The IF_THEN_ELSE detection currently prevents us from properly costing register-register moves which causes the lower-subreg pass to assume that a VR-VR move is as expensive as two GPR-GPR moves. This patch adds handling for SETs containing REGs as well as MEMs and is inspired by the aarch64 implementation. gcc/ChangeLog: * config/s390/s390.cc (s390_address_cost): Declare. (s390_hard_regno_nregs): Declare. (s390_rtx_costs): Add handling for REG and MEM in SET. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New test. --- gcc/config/s390/s390.cc | 140 +- .../vector/vec-sum-across-no-lower-subreg-1.c | 17 +++ 2 files changed, 118 insertions(+), 39 deletions(-) create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc index d2af6d8813d..e647c90ab29 100644 --- a/gcc/config/s390/s390.cc +++ b/gcc/config/s390/s390.cc @@ -429,6 +429,14 @@ struct s390_address bytes on a z10 (or higher) CPU. */ #define PREDICT_DISTANCE (TARGET_Z10 ? 384 : 2048) +static int +s390_address_cost (rtx addr, machine_mode mode ATTRIBUTE_UNUSED, + addr_space_t as ATTRIBUTE_UNUSED, + bool speed ATTRIBUTE_UNUSED); + +static unsigned int +s390_hard_regno_nregs (unsigned int regno, machine_mode mode); + /* Masks per jump target register indicating which thunk need to be generated. */ static GTY(()) int indirect_branch_prez10thunk_mask = 0; @@ -3619,50 +3627,104 @@ s390_rtx_costs (rtx x, machine_mode mode, int outer_code, case MEM: *total = 0; return true; - case SET: - { - /* Without this a conditional move instruction would be - accounted as 3 * COSTS_N_INSNS (set, if_then_else, - comparison operator). That's a bit pessimistic. */ + { + rtx dest = SET_DEST (x); + rtx src = SET_SRC (x); - if (!TARGET_Z196 || GET_CODE (SET_SRC (x)) != IF_THEN_ELSE) - return false; + switch (GET_CODE (src)) + { + case IF_THEN_ELSE: + { + /* Without this a conditional move instruction would be +accounted as 3 * COSTS_N_INSNS (set, if_then_else, +comparison operator). That's a bit pessimistic. */ + + if (!TARGET_Z196) + return false; + + rtx cond = XEXP (src, 0); + if (!CC_REG_P (XEXP (cond, 0)) || !CONST_INT_P (XEXP (cond, 1))) + return false; + + /* It is going to be a load/store on condition. Make it +slightly more expensive than a normal load. */ + *total = COSTS_N_INSNS (1) + 2; + + rtx dst = SET_DEST (src); + rtx then = XEXP (src, 1); + rtx els = XEXP (src, 2); + + /* It is a real IF-THEN-ELSE. An additional move will be +needed to implement that. */ + if (!TARGET_Z15 + && reload_completed + && !rtx_equal_p (dst, then) + && !rtx_equal_p (dst, els)) + *total += COSTS_N_INSNS (1) / 2; + + /* A minor penalty for constants we cannot directly handle. */ + if ((CONST_INT_P (then) || CONST_INT_P (els)) + && (!TARGET_Z13 || MEM_P (dst) + || (CONST_INT_P (then) && !satisfies_constraint_K (then)) + || (CONST_INT_P (els) && !satisfies_constraint_K (els + *total += COSTS_N_INSNS (1) / 2; + + /* A store on condition can only handle register src operands. */ + if (MEM_P (dst) && (!REG_P (then) || !REG_P (els))) + *total += COSTS_N_INSNS (1) / 2; + + return true; + } + default: + break; + } - rtx cond = XEXP (SET_SRC (x), 0);
Re: [PATCH] tree-optimization/103037 - PRE simplifying valueized expressions
On Fri, 25 Feb 2022, Richard Biener wrote: > This fixes a long-standing issue in PRE where we track valueized > expressions in our expression sets that we use for PHI translation, > code insertion but also feed into match-and-simplify via > vn_nary_simplify. But that's not what is expected from vn_nary_simplify > or match-and-simplify which assume we are simplifying with operands > available at the point of the expression so they can use contextual > information on the SSA names like ranges. While the VN side was > updated to ensure this with the rewrite to RPO VN, thereby removing > all workarounds that nullified such contextual info on all SSA names, > the PRE side still suffers from this. > > The following patch tries to apply minimal surgery at this point > and makes PRE track un-valueized expressions in the expression sets > but only for the NARY kind (both NAME and CONSTANT do not suffer > from this issue), leaving the REFERENCE kind alone. The REFERENCE > kind is important when trying to remove the workarounds still in > place in compute_avail for code hoisting, but that's a separate issue > and we have a working workaround in place. > > Doing this comes at the cost of duplicating the VN IL on the PRE side > for NARY and eventually some extra overhead for translated expressions > that is difficult to assess. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. I've attached the wrong version - the following is the cleaned up and minimal variant I tested and will push. Richard. 2022-02-25 Richard Biener PR tree-optimization/103037 * tree-ssa-sccvn.h (alloc_vn_nary_op_noinit): Declare. (vn_nary_length_from_stmt): Likewise. (init_vn_nary_op_from_stmt): Likewise. (vn_nary_op_compute_hash): Likewise. * tree-ssa-sccvn.cc (alloc_vn_nary_op_noinit): Export. (vn_nary_length_from_stmt): Likewise. (init_vn_nary_op_from_stmt): Likewise. (vn_nary_op_compute_hash): Likewise. * tree-ssa-pre.cc (pre_expr_obstack): New obstack. (get_or_alloc_expr_for_nary): Pass in the value-id to use, (re-)compute the hash value and if the expression is not found allocate it from pre_expr_obstack. (phi_translate_1): Do not insert the NARY found in the VN tables but build a PRE expression from the valueized NARY with the value-id we eventually found. (find_or_generate_expression): Assert we have an entry for constant values. (compute_avail): Insert not valueized expressions into EXP_GEN using the value-id from the VN tables. (init_pre): Allocate pre_expr_obstack. (fini_pre): Free pre_expr_obstack. * gcc.dg/torture/pr103037.c: New testcase. --- gcc/testsuite/gcc.dg/torture/pr103037.c | 23 +++ gcc/tree-ssa-pre.cc | 52 ++--- gcc/tree-ssa-sccvn.cc | 11 ++ gcc/tree-ssa-sccvn.h| 4 ++ 4 files changed, 68 insertions(+), 22 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr103037.c diff --git a/gcc/testsuite/gcc.dg/torture/pr103037.c b/gcc/testsuite/gcc.dg/torture/pr103037.c new file mode 100644 index 000..8b3bb1e4c8b --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr103037.c @@ -0,0 +1,23 @@ +/* { dg-do run } */ + +static inline const unsigned short * +min(unsigned short *d, const unsigned short *e) +{ + return *e < *d ? e : d; +} + +unsigned short __attribute__((noipa)) +test(unsigned short arr, unsigned short val) +{ + unsigned short tem = 1; + unsigned short tem2 = *min(&arr, &tem); + return tem2 / (arr ? arr : val); +} + +int +main() +{ + if (test (2, 2) != 0) +__builtin_abort(); + return 0; +} diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc index 7c2a52e1850..d6c83a72dd8 100644 --- a/gcc/tree-ssa-pre.cc +++ b/gcc/tree-ssa-pre.cc @@ -323,6 +323,7 @@ static unsigned int next_expression_id; static vec expressions; static hash_table *expression_to_id; static vec name_to_id; +static obstack pre_expr_obstack; /* Allocate an expression id for EXPR. */ @@ -430,18 +431,23 @@ get_or_alloc_expr_for_name (tree name) return result; } -/* Given an NARY, get or create a pre_expr to represent it. */ +/* Given an NARY, get or create a pre_expr to represent it. Assign + VALUE_ID to it or allocate a new value-id if it is zero. Record + LOC as the original location of the expression. */ static pre_expr -get_or_alloc_expr_for_nary (vn_nary_op_t nary, +get_or_alloc_expr_for_nary (vn_nary_op_t nary, unsigned value_id, location_t loc = UNKNOWN_LOCATION) { struct pre_expr_d expr; pre_expr result; unsigned int result_id; + gcc_assert (value_id == 0 || !value_id_constant_p (value_id)); + expr.kind = NARY; expr.id = 0; + nary->hashcode = vn_nary_op_compute_hash (nary); PRE_EXPR_NARY (&expr) = nary; result_id = lookup_expression_id (&expr); if (res
Re: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.
On Thu, 24 Feb 2022, Qing Zhao wrote: > > > > On Feb 24, 2022, at 4:16 AM, Richard Biener wrote: > > > > On Sat, 19 Feb 2022, Qing Zhao wrote: > > > >> Hi, > >> > >> This is the 2nd patch for fixing pr102276. > >> > >> Adding -Wtrivial-auto-var-init and update documentation. > >> > >> Adding a new warning option -Wtrivial-auto-var-init to report cases when > >> -ftrivial-auto-var-init cannot initialize the auto variable. At the same > >> time, update documentation for -ftrivial-auto-var-init to connect it with > >> the new warning option -Wtrivial-auto-var-init, and add documentation > >> for -Wtrivial-auto-var-init. > >> > >> Bootstraped and regression tested on both x86 and aarch64. > >> > >> Okay for committing? > >> > >> thanks. > >> > >> Qing. > >> > >> == > >> From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001 > >> From: Qing Zhao > >> Date: Fri, 18 Feb 2022 15:53:15 + > >> Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update > >> documentation. > >> > >> Adding a new warning option -Wtrivial-auto-var-init to report cases when > >> -ftrivial-auto-var-init cannot initialize the auto variable. At the same > >> time, update documentation for -ftrivial-auto-var-init to connect it with > >> the new warning option -Wtrivial-auto-var-init, and add documentation > >> for -Wtrivial-auto-var-init. > >> > >> 2022-02-18 Qing Zhao > >> gcc/ChangeLog: > >> > >>* common.opt (-Wtrivial-auto-var-init): New option. > >>* doc/invoke.texi (-Wtrivial-auto-var-init): Document new option. > >>(-ftrivial-auto-var-init): Update option; > >>* gimplify.cc (maybe_warn_switch_unreachable): Rename... > >>(maybe_warn_switch_unreachable_and_auto_init): ...to this. > >>(gimplify_switch_expr): Call new function. > >> > >> gcc/testsuite/ChangeLog: > >> > >>* gcc.dg/auto-init-pr102276-3.c: New test. > >>* gcc.dg/auto-init-pr102276-4.c: New test. > >> --- > >> gcc/common.opt | 4 + > >> gcc/doc/invoke.texi | 14 ++- > >> gcc/gimplify.cc | 100 +++- > >> gcc/testsuite/gcc.dg/auto-init-pr102276-3.c | 40 > >> gcc/testsuite/gcc.dg/auto-init-pr102276-4.c | 40 > >> 5 files changed, 175 insertions(+), 23 deletions(-) > >> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c > >> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c > >> > >> diff --git a/gcc/common.opt b/gcc/common.opt > >> index c21e5273ae3..22c95dbfa49 100644 > >> --- a/gcc/common.opt > >> +++ b/gcc/common.opt > >> @@ -801,6 +801,10 @@ Wtrampolines > >> Common Var(warn_trampolines) Warning > >> Warn whenever a trampoline is generated. > >> > >> +Wtrivial-auto-var-init > >> +Common Var(warn_trivial_auto_var_init) Warning Init(0) > >> +Warn about where -ftrivial-auto-var-init cannot initialize the auto > >> variable. > >> + > > > > Warn about cases where ... initialize a variable. > > Okay. > > > > >> Wtype-limits > >> Common Var(warn_type_limits) Warning EnabledBy(Wextra) > >> Warn if a comparison is always true or always false due to the limited > >> range of the data type. > >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > >> index e1a00c80307..c61a5b4b4a5 100644 > >> --- a/gcc/doc/invoke.texi > >> +++ b/gcc/doc/invoke.texi > >> @@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}. > >> -Wswitch -Wno-switch-bool -Wswitch-default -Wswitch-enum @gol > >> -Wno-switch-outside-range -Wno-switch-unreachable -Wsync-nand @gol > >> -Wsystem-headers -Wtautological-compare -Wtrampolines -Wtrigraphs @gol > >> --Wtsan -Wtype-limits -Wundef @gol > >> +-Wtrivial-auto-var-init -Wtsan -Wtype-limits -Wundef @gol > >> -Wuninitialized -Wunknown-pragmas @gol > >> -Wunsuffixed-float-constants -Wunused @gol > >> -Wunused-but-set-parameter -Wunused-but-set-variable @gol > >> @@ -6953,6 +6953,14 @@ This warning is enabled by default for C and C++ > >> programs. > >> Warn when @code{__sync_fetch_and_nand} and @code{__sync_nand_and_fetch} > >> built-in functions are used. These functions changed semantics in GCC 4.4. > >> > >> +@item -Wtrivial-auto-var-init > >> +@opindex Wtrivial-auto-var-init > >> +@opindex Wno-trivial-auto-var-init > >> +Warn when @code{-ftrivial-auto-var-init} cannot initialize the automatic > >> +variable. A common situation is an automatic variable that is declared > >> +between the controlling expression and the first case lable of a > >> @code{switch} > >> +statement. > >> + > >> @item -Wunused-but-set-parameter > >> @opindex Wunused-but-set-parameter > >> @opindex Wno-unused-but-set-parameter > >> @@ -12314,6 +12322,10 @@ initializer as uninitialized, > >> @option{-Wuninitialized} and > >> warning messages on such automatic variables. > >> With this option, GCC will also initialize any padding of automatic > >> variables > >> that have structure or union types to zeroes. > >> +H
[committed] arc: Fail conditional move expand patterns
If the movcc comparison is not valid it triggers an assert in the current implementation. This behavior is not needed as we can FAIL the movcc expand pattern. gcc/ * config/arc/arc.cc (gen_compare_reg): Return NULL_RTX if the comparison is not valid. * config/arc/arc.md (movsicc): Fail if comparison is not valid. (movdicc): Likewise. (movsfcc): Likewise. (movdfcc): Likewise. Signed-off-by: Claudiu Zissulescu --- gcc/config/arc/arc.cc | 3 ++- gcc/config/arc/arc.md | 25 - 2 files changed, 22 insertions(+), 6 deletions(-) diff --git a/gcc/config/arc/arc.cc b/gcc/config/arc/arc.cc index 8cc173519ab..c27ba99eb60 100644 --- a/gcc/config/arc/arc.cc +++ b/gcc/config/arc/arc.cc @@ -2256,7 +2256,8 @@ gen_compare_reg (rtx comparison, machine_mode omode) cmode = GET_MODE (x); if (cmode == VOIDmode) cmode = GET_MODE (y); - gcc_assert (cmode == SImode || cmode == SFmode || cmode == DFmode); + if (cmode != SImode && cmode != SFmode && cmode != DFmode) +return NULL_RTX; if (cmode == SImode) { if (!register_operand (x, SImode)) diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md index ace3cb70424..39b358052c1 100644 --- a/gcc/config/arc/arc.md +++ b/gcc/config/arc/arc.md @@ -1618,8 +1618,11 @@ (define_expand "movsicc" (match_operand:SI 2 "nonmemory_operand" "") (match_operand:SI 3 "register_operand" "")))] "" - "operands[1] = gen_compare_reg (operands[1], VOIDmode);") - + " + operands[1] = gen_compare_reg (operands[1], VOIDmode); + if (operands[1] == NULL_RTX) +FAIL; + ") (define_expand "movdicc" [(set (match_operand:DI 0 "dest_reg_operand" "") @@ -1627,7 +1630,11 @@ (define_expand "movdicc" (match_operand:DI 2 "nonmemory_operand" "") (match_operand:DI 3 "register_operand" "")))] "" - "operands[1] = gen_compare_reg (operands[1], VOIDmode);") + " + operands[1] = gen_compare_reg (operands[1], VOIDmode); + if (operands[1] == NULL_RTX) +FAIL; + ") (define_expand "movsfcc" @@ -1636,7 +1643,11 @@ (define_expand "movsfcc" (match_operand:SF 2 "nonmemory_operand" "") (match_operand:SF 3 "register_operand" "")))] "" - "operands[1] = gen_compare_reg (operands[1], VOIDmode);") + " + operands[1] = gen_compare_reg (operands[1], VOIDmode); + if (operands[1] == NULL_RTX) +FAIL; + ") (define_expand "movdfcc" [(set (match_operand:DF 0 "dest_reg_operand" "") @@ -1644,7 +1655,11 @@ (define_expand "movdfcc" (match_operand:DF 2 "nonmemory_operand" "") (match_operand:DF 3 "register_operand" "")))] "" - "operands[1] = gen_compare_reg (operands[1], VOIDmode);") + " + operands[1] = gen_compare_reg (operands[1], VOIDmode); + if (operands[1] == NULL_RTX) +FAIL; + ") (define_insn "*movsicc_insn" [(set (match_operand:SI 0 "dest_reg_operand" "=w,w") -- 2.35.1
Re: [committed] arc: Fail conditional move expand patterns
> If the movcc comparison is not valid it triggers an assert in the > current implementation. This behavior is not needed as we can FAIL > the movcc expand pattern. In case of a MODE_CC comparison you can also just return it as described here https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154 or here: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590639.html If there already is a "CC comparison" the backend does not need to create one and ifcvt can make use of this, creating better sequences. Regards Robin
Re: [PATCH] Check if loading const from mem is faster
Richard Biener writes: > On Fri, 25 Feb 2022, Jiufu Guo wrote: > >> Richard Biener writes: >> >> > On Thu, 24 Feb 2022, Jiufu Guo wrote: >> > >> >> Jiufu Guo via Gcc-patches writes: >> >> >> >> > Segher Boessenkool writes: >> >> > >> >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote: >> >> >>> I'm assuming we're always dealing with >> >> >>> >> >> >>> (set (reg:MODE ..) ) >> >> >>> >> >> >>> here and CSE is not substituting into random places of an >> >> >>> instruction(?). I don't know what 'rtx_cost' should evaluate >> >> >>> to for a constant, if it should implicitely evaluate the cost >> >> >>> of putting the result into a register for example. >> >> >> >> >> >> rtx_cost is no good here (and in most places). rtx_cost should be 0 >> >> >> for anything that is used as input in a machine instruction -- but you >> >> >> need much more context to determine that. insn_cost is much simpler >> >> >> and >> >> >> much easier to use. >> >> >> >> >> >>> Using RTX_COST with SET and 1 at least looks no worse than using >> >> >>> your proposed new target hook and comparing it with the original >> >> >>> unfolded src (again with SET and 1). >> >> >> >> >> >> It is required to generate valid instructions no matter what, before >> >> >> the pass has finished that is. On all more modern architectures it is >> >> >> futile to think you can usefully consider the cost of an RTL expression >> >> >> and derive a real-world cost of the generated code from that. >> >> > >> >> > Thanks Segher for pointing out these! Here is another reason that I >> >> > did not use rtx_cost: in a few passes, there are codes to check the >> >> > constants and store them in constant pool. I'm thinking to integerate >> >> > those codes in a consistent way. >> >> >> >> Hi Segher, Richard! >> >> >> >> I'm thinking the way like: For a constant, >> >> 1. if the constant could be used as an immediate for the >> >> instruction, then retreated as an operand; >> >> 2. otherwise if the constant can not be stored into a >> >> constant pool, then handle through instructions; >> >> 3. if it is faster to access constant from pool, then emit >> >> constant as data(.rodata); >> >> 4. otherwise, handle the constant by instructions. >> >> >> >> And to store the constant into a pool, besides force_const_mem, >> >> create reference (toc) may be needed on some platforms. >> >> >> >> For this particular issue in CSE, there is already code that >> >> tries to put constant into a pool (invoke force_const_mem). >> >> While the code is too late. So, we may check the constant >> >> earlier and store it into constant pool if profitable. >> >> >> >> And another thing as Segher pointed out, CSE is doing too >> >> much work. It may be ok to separate the constant handling >> >> logic from CSE. >> > >> > Not sure - CSE just is value numbering, I don't see that it does >> > more than that. Yes, it might have developed "heuristics" over >> > the years what to CSE and to what and where to substitute and >> > where not. But in the end it does just value numbering. >> > >> >> >> >> I update a new version patch as follow (did not seprate CSE): >> > >> > How is the new target hook better in any way compared to rtx_cost >> > or insn_cost? It looks like a total hack. >> > >> > I suppose the actual way of materializing a constant is done >> > behind GCCs back and not exposed anywhere? But instead you >> > claim the constants are valid when they actually are not? >> > Isn't the problem then that the rs6000 backend lies? >> >> Hi Richard, >> >> Thanks for your comments and sugguestions! >> >> Materializing a constant should be done behind GCC. >> On rs6000, in expand pass, during emit_move, the constant is >> checked and store into constant pool if necessary. >> Some other platforms are doing a similar thing, e.g. >> ix86_expand_vector_move, alpha_expand_mov,... >> mips_legitimize_const_move. >> >> But, it does not as we expect, force_const_mem is also >> exposed other places (not only ira/reload for stack reference). >> >> CSE is one place, for example, CSE first retrieve the constant >> from insn's equal, but it also tries to put constant into >> pool for some condition (the condition was introduced at >> early age of cse.c, and it is rare to run into in trunk). >> In some aspects, IMHO, this seems not a great work of CSE. >> >> And this is how the 'invalid(or say slow)' constant occurs. >> e.g. before cse: >> 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47] >> REG_EQUAL 0x100803004101001 >> after cse: >> 7: r119:DI=0x100803004101001 >> REG_EQUAL 0x100803004101001 >> >> As you pointed out: we can also avoid this transformation through >> rtx_cost/insn_cost by estimating the COST more accurately for >> "r119:DI=0x100803004101001". (r119:DI=0x100803004101001 will not >> be a real final instruction.) > > At which point does this become the final instruction? I suppose > CSE belives this constant is legitimate and the ins
Re: [PATCH] Check if loading const from mem is faster
On Fri, 25 Feb 2022, Jiufu Guo wrote: > Richard Biener writes: > > > On Fri, 25 Feb 2022, Jiufu Guo wrote: > > > >> Richard Biener writes: > >> > >> > On Thu, 24 Feb 2022, Jiufu Guo wrote: > >> > > >> >> Jiufu Guo via Gcc-patches writes: > >> >> > >> >> > Segher Boessenkool writes: > >> >> > > >> >> >> On Wed, Feb 23, 2022 at 02:02:59PM +0100, Richard Biener wrote: > >> >> >>> I'm assuming we're always dealing with > >> >> >>> > >> >> >>> (set (reg:MODE ..) ) > >> >> >>> > >> >> >>> here and CSE is not substituting into random places of an > >> >> >>> instruction(?). I don't know what 'rtx_cost' should evaluate > >> >> >>> to for a constant, if it should implicitely evaluate the cost > >> >> >>> of putting the result into a register for example. > >> >> >> > >> >> >> rtx_cost is no good here (and in most places). rtx_cost should be 0 > >> >> >> for anything that is used as input in a machine instruction -- but > >> >> >> you > >> >> >> need much more context to determine that. insn_cost is much simpler > >> >> >> and > >> >> >> much easier to use. > >> >> >> > >> >> >>> Using RTX_COST with SET and 1 at least looks no worse than using > >> >> >>> your proposed new target hook and comparing it with the original > >> >> >>> unfolded src (again with SET and 1). > >> >> >> > >> >> >> It is required to generate valid instructions no matter what, before > >> >> >> the pass has finished that is. On all more modern architectures it > >> >> >> is > >> >> >> futile to think you can usefully consider the cost of an RTL > >> >> >> expression > >> >> >> and derive a real-world cost of the generated code from that. > >> >> > > >> >> > Thanks Segher for pointing out these! Here is another reason that I > >> >> > did not use rtx_cost: in a few passes, there are codes to check the > >> >> > constants and store them in constant pool. I'm thinking to integerate > >> >> > those codes in a consistent way. > >> >> > >> >> Hi Segher, Richard! > >> >> > >> >> I'm thinking the way like: For a constant, > >> >> 1. if the constant could be used as an immediate for the > >> >> instruction, then retreated as an operand; > >> >> 2. otherwise if the constant can not be stored into a > >> >> constant pool, then handle through instructions; > >> >> 3. if it is faster to access constant from pool, then emit > >> >> constant as data(.rodata); > >> >> 4. otherwise, handle the constant by instructions. > >> >> > >> >> And to store the constant into a pool, besides force_const_mem, > >> >> create reference (toc) may be needed on some platforms. > >> >> > >> >> For this particular issue in CSE, there is already code that > >> >> tries to put constant into a pool (invoke force_const_mem). > >> >> While the code is too late. So, we may check the constant > >> >> earlier and store it into constant pool if profitable. > >> >> > >> >> And another thing as Segher pointed out, CSE is doing too > >> >> much work. It may be ok to separate the constant handling > >> >> logic from CSE. > >> > > >> > Not sure - CSE just is value numbering, I don't see that it does > >> > more than that. Yes, it might have developed "heuristics" over > >> > the years what to CSE and to what and where to substitute and > >> > where not. But in the end it does just value numbering. > >> > > >> >> > >> >> I update a new version patch as follow (did not seprate CSE): > >> > > >> > How is the new target hook better in any way compared to rtx_cost > >> > or insn_cost? It looks like a total hack. > >> > > >> > I suppose the actual way of materializing a constant is done > >> > behind GCCs back and not exposed anywhere? But instead you > >> > claim the constants are valid when they actually are not? > >> > Isn't the problem then that the rs6000 backend lies? > >> > >> Hi Richard, > >> > >> Thanks for your comments and sugguestions! > >> > >> Materializing a constant should be done behind GCC. > >> On rs6000, in expand pass, during emit_move, the constant is > >> checked and store into constant pool if necessary. > >> Some other platforms are doing a similar thing, e.g. > >> ix86_expand_vector_move, alpha_expand_mov,... > >> mips_legitimize_const_move. > >> > >> But, it does not as we expect, force_const_mem is also > >> exposed other places (not only ira/reload for stack reference). > >> > >> CSE is one place, for example, CSE first retrieve the constant > >> from insn's equal, but it also tries to put constant into > >> pool for some condition (the condition was introduced at > >> early age of cse.c, and it is rare to run into in trunk). > >> In some aspects, IMHO, this seems not a great work of CSE. > >> > >> And this is how the 'invalid(or say slow)' constant occurs. > >> e.g. before cse: > >> 7: r119:DI=[unspec[`*.LC0',%r2:DI] 47] > >> REG_EQUAL 0x100803004101001 > >> after cse: > >> 7: r119:DI=0x100803004101001 > >> REG_EQUAL 0x100803004101001 > >> > >> As you pointed out: we can also avoid this transformation thr
[PATCH][pushed] testsuite: Fix ASAN error [PR104687]
Installed as obvious. Martin PR testsuite/104687 gcc/testsuite/ChangeLog: * gcc.dg/lto/20090717_0.c: Fix asan error. --- gcc/testsuite/gcc.dg/lto/20090717_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/lto/20090717_0.c b/gcc/testsuite/gcc.dg/lto/20090717_0.c index fe13bdf78d4..41558df7a7c 100644 --- a/gcc/testsuite/gcc.dg/lto/20090717_0.c +++ b/gcc/testsuite/gcc.dg/lto/20090717_0.c @@ -1,4 +1,4 @@ struct variable { const char *string; }; -struct variable table[] = { }; +struct variable table[] = { { 0 } }; -- 2.35.1
[PATCH] rtl-optimization/104686 - speed up conflict iteration
The following replaces /* Skip bits that are zero. */ for (; (word & 1) == 0; word >>= 1) bit_num++; idioms in ira-int.h in the attempt to speedup update_conflict_hard_regno_costs which we're bound on in PR104686. The trick is to use ctz_hwi here which should pay off even with dense bitmaps on architectures that have HW support for this. For the PR in question this speeds up compile-time from 31s to 24s for me. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Richard. 2022-02-25 Richard Biener PR rtl-optimization/104686 * ira-int.h (minmax_set_iter_cond): Use ctz_hwi to elide loop skipping bits that are zero. (ira_object_conflict_iter_cond): Likewise. --- gcc/ira-int.h | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/gcc/ira-int.h b/gcc/ira-int.h index 957604b22e9..f42a314fa7f 100644 --- a/gcc/ira-int.h +++ b/gcc/ira-int.h @@ -764,8 +764,9 @@ minmax_set_iter_cond (minmax_set_iterator *i, int *n) } /* Skip bits that are zero. */ - for (; (i->word & 1) == 0; i->word >>= 1) -i->bit_num++; + int off = ctz_hwi (i->word); + i->bit_num += off; + i->word >>= off; *n = (int) i->bit_num + i->start_val; @@ -1379,8 +1380,9 @@ ira_object_conflict_iter_cond (ira_object_conflict_iterator *i, } /* Skip bits that are zero. */ - for (; (word & 1) == 0; word >>= 1) - bit_num++; + int off = ctz_hwi (word); + bit_num += off; + word >>= off; obj = ira_object_id_map[bit_num + i->base_conflict_id]; i->bit_num = bit_num + 1; -- 2.34.1
Re: [PATCH][V3][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
> On Feb 25, 2022, at 2:38 AM, Richard Biener wrote: > > On Fri, 25 Feb 2022, Qing Zhao wrote: > >> Hi, Jakub and Richard: >> >> This is the 3rd version of the patch, the major change compared to the >> previous version are: >> >> 1. Add warning_enabled_at guard before “suppress_warning” >> 2. Add location to the call to __builtin_clear_padding for auto init. >> >> The patch has been bootstrapped and regress tested on both x86 and aarch64. >> Okay for trunk? >> >> Thanks. >> >> Qing >> >> == >> From 8314ded4ca0f59c5a3ec431c9c3768fcaf2a0949 Mon Sep 17 00:00:00 2001 >> From: Qing Zhao >> Date: Thu, 24 Feb 2022 22:38:38 + >> Subject: [PATCH] Suppress uninitialized warnings for new created uses from >> __builtin_clear_padding folding [PR104550] >> >> __builtin_clear_padding(&object) will clear all the padding bits of the >> object. >> actually, it doesn't involve any use of an user variable. Therefore, users do >> not expect any uninitialized warning from it. It's reasonable to suppress >> uninitialized warnings for all new created uses from __builtin_clear_padding >> folding. >> >> PR middle-end/104550 >> >> gcc/ChangeLog: >> >> * gimple-fold.cc (clear_padding_flush): Suppress warnings for new >> created uses. >> * gimplify.cc (gimple_add_padding_init_for_auto_var): Set >> location for new created call to __builtin_clear_padding. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.dg/auto-init-pr104550-1.c: New test. >> * gcc.dg/auto-init-pr104550-2.c: New test. >> * gcc.dg/auto-init-pr104550-3.c: New test. >> --- >> gcc/gimple-fold.cc | 11 ++- >> gcc/gimplify.cc | 1 + >> gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++ >> gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++ >> gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++ >> 5 files changed, 43 insertions(+), 1 deletion(-) >> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c >> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c >> create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c >> >> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc >> index 16f02c2d098d..67b4963ffd96 100644 >> --- a/gcc/gimple-fold.cc >> +++ b/gcc/gimple-fold.cc >> @@ -62,6 +62,7 @@ along with GCC; see the file COPYING3. If not see >> #include "attribs.h" >> #include "asan.h" >> #include "diagnostic-core.h" >> +#include "diagnostic.h" >> #include "intl.h" >> #include "calls.h" >> #include "tree-vector-builder.h" >> @@ -4379,7 +4380,15 @@ clear_padding_flush (clear_padding_struct *buf, bool >> full) >>else >> { >>src = make_ssa_name (type); >> - g = gimple_build_assign (src, unshare_expr (dst)); >> + tree tmp_dst = unshare_expr (dst); >> + /* The folding introduces a read from the tmp_dst, we should >> + prevent uninitialized warning analysis from issuing warning >> + for such fake read. */ >> + if (warning_enabled_at (buf->loc, OPT_Wuninitialized) >> + || warning_enabled_at (buf->loc, >> + OPT_Wmaybe_uninitialized)) >> +suppress_warning (tmp_dst, OPT_Wuninitialized); >> + g = gimple_build_assign (src, tmp_dst); > > So what about just gimple_set_no_warning (g, true); ? (sorry for > the ping-pong between us three...) This didn’t work. The small testing case still failed. This is due to in tree-ssa-uninit.cc, it checks get_no_uninit_warning (RHS), not for the whole stmt. We can update tree-sea-uninit.cc to check the whole stmt, but I am not sure whether doing this might introduce other issue. Qing > >>gimple_set_location (g, buf->loc); >>gsi_insert_before (buf->gsi, g, GSI_SAME_STMT); >>tree mask = native_interpret_expr (type, >> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc >> index f570daa015a5..977cf458f858 100644 >> --- a/gcc/gimplify.cc >> +++ b/gcc/gimplify.cc >> @@ -1823,6 +1823,7 @@ gimple_add_padding_init_for_auto_var (tree decl, bool >> is_vla, >> >> gimple *call = gimple_build_call (fn, 2, addr_of_decl, >> build_one_cst (TREE_TYPE (addr_of_decl))); >> + gimple_set_location (call, EXPR_LOCATION (decl)); > > I believe EXPR_LOCATION (decl) is bogus, 'decl's have DECL_LOCATION, > EXPR_LOCATION here will just return UNKNOWN_LOCATION. > >> gimplify_seq_add_stmt (seq_p, call); >> } >> >> diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c >> b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c >> new file mode 100644 >> index ..a08110c3a170 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c >> @@ -0,0 +1,10 @@ >> +/* PR 104550*/ >> +/* { dg-do compile } */ >> +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern
Re: [PATCH 2/2][middle-end/102276] Adding -Wtrivial-auto-var-init and update documentation.
> On Feb 25, 2022, at 6:43 AM, Richard Biener wrote: > > On Thu, 24 Feb 2022, Qing Zhao wrote: > >> >> >>> On Feb 24, 2022, at 4:16 AM, Richard Biener wrote: >>> >>> On Sat, 19 Feb 2022, Qing Zhao wrote: >>> Hi, This is the 2nd patch for fixing pr102276. Adding -Wtrivial-auto-var-init and update documentation. Adding a new warning option -Wtrivial-auto-var-init to report cases when -ftrivial-auto-var-init cannot initialize the auto variable. At the same time, update documentation for -ftrivial-auto-var-init to connect it with the new warning option -Wtrivial-auto-var-init, and add documentation for -Wtrivial-auto-var-init. Bootstraped and regression tested on both x86 and aarch64. Okay for committing? thanks. Qing. == From 4346890b8f4258489c4841f1992ba3ce816d7689 Mon Sep 17 00:00:00 2001 From: Qing Zhao Date: Fri, 18 Feb 2022 15:53:15 + Subject: [PATCH 2/2] Adding -Wtrivial-auto-var-init and update documentation. Adding a new warning option -Wtrivial-auto-var-init to report cases when -ftrivial-auto-var-init cannot initialize the auto variable. At the same time, update documentation for -ftrivial-auto-var-init to connect it with the new warning option -Wtrivial-auto-var-init, and add documentation for -Wtrivial-auto-var-init. 2022-02-18 Qing Zhao gcc/ChangeLog: * common.opt (-Wtrivial-auto-var-init): New option. * doc/invoke.texi (-Wtrivial-auto-var-init): Document new option. (-ftrivial-auto-var-init): Update option; * gimplify.cc (maybe_warn_switch_unreachable): Rename... (maybe_warn_switch_unreachable_and_auto_init): ...to this. (gimplify_switch_expr): Call new function. gcc/testsuite/ChangeLog: * gcc.dg/auto-init-pr102276-3.c: New test. * gcc.dg/auto-init-pr102276-4.c: New test. --- gcc/common.opt | 4 + gcc/doc/invoke.texi | 14 ++- gcc/gimplify.cc | 100 +++- gcc/testsuite/gcc.dg/auto-init-pr102276-3.c | 40 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c | 40 5 files changed, 175 insertions(+), 23 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-3.c create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr102276-4.c diff --git a/gcc/common.opt b/gcc/common.opt index c21e5273ae3..22c95dbfa49 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -801,6 +801,10 @@ Wtrampolines Common Var(warn_trampolines) Warning Warn whenever a trampoline is generated. +Wtrivial-auto-var-init +Common Var(warn_trivial_auto_var_init) Warning Init(0) +Warn about where -ftrivial-auto-var-init cannot initialize the auto variable. + >>> >>> Warn about cases where ... initialize a variable. >> >> Okay. >> >>> Wtype-limits Common Var(warn_type_limits) Warning EnabledBy(Wextra) Warn if a comparison is always true or always false due to the limited range of the data type. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index e1a00c80307..c61a5b4b4a5 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -399,7 +399,7 @@ Objective-C and Objective-C++ Dialects}. -Wswitch -Wno-switch-bool -Wswitch-default -Wswitch-enum @gol -Wno-switch-outside-range -Wno-switch-unreachable -Wsync-nand @gol -Wsystem-headers -Wtautological-compare -Wtrampolines -Wtrigraphs @gol --Wtsan -Wtype-limits -Wundef @gol +-Wtrivial-auto-var-init -Wtsan -Wtype-limits -Wundef @gol -Wuninitialized -Wunknown-pragmas @gol -Wunsuffixed-float-constants -Wunused @gol -Wunused-but-set-parameter -Wunused-but-set-variable @gol @@ -6953,6 +6953,14 @@ This warning is enabled by default for C and C++ programs. Warn when @code{__sync_fetch_and_nand} and @code{__sync_nand_and_fetch} built-in functions are used. These functions changed semantics in GCC 4.4. +@item -Wtrivial-auto-var-init +@opindex Wtrivial-auto-var-init +@opindex Wno-trivial-auto-var-init +Warn when @code{-ftrivial-auto-var-init} cannot initialize the automatic +variable. A common situation is an automatic variable that is declared +between the controlling expression and the first case lable of a @code{switch} +statement. + @item -Wunused-but-set-parameter @opindex Wunused-but-set-parameter @opindex Wno-unused-but-set-parameter @@ -12314,6 +12322,10 @@ initializer as uninitialized, @option{-Wuninitialized} and warning messages on such automatic variables. With this option, GCC will also initialize any padding of automatic
Re: [PATCH] rtl-optimization/104686 - speed up conflict iteration
On 2022-02-25 09:14, Richard Biener wrote: The following replaces /* Skip bits that are zero. */ for (; (word & 1) == 0; word >>= 1) bit_num++; idioms in ira-int.h in the attempt to speedup update_conflict_hard_regno_costs which we're bound on in PR104686. The trick is to use ctz_hwi here which should pay off even with dense bitmaps on architectures that have HW support for this. For the PR in question this speeds up compile-time from 31s to 24s for me. It is a really significant improvement. Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk? Yes. Thank you for working on this PR, Richard. 2022-02-25 Richard Biener PR rtl-optimization/104686 * ira-int.h (minmax_set_iter_cond): Use ctz_hwi to elide loop skipping bits that are zero. (ira_object_conflict_iter_cond): Likewise.
Re: [PATCH] s390: Change SET rtx_cost handling.
On 2/25/22 12:38, Robin Dapp wrote: > Hi, > > the IF_THEN_ELSE detection currently prevents us from properly costing > register-register moves which causes the lower-subreg pass to assume > that a VR-VR move is as expensive as two GPR-GPR moves. > > This patch adds handling for SETs containing REGs as well as MEMs and is > inspired by the aarch64 implementation. > > Bootstrapped and regtested on z900 up to z15. Is it OK? > > Regards > Robin > > -- > > gcc/ChangeLog: > > * config/s390/s390.cc (s390_address_cost): Declare. > (s390_hard_regno_nregs): Declare. > (s390_rtx_costs): Add handling for REG and MEM in SET. > > gcc/testsuite/ChangeLog: > > * gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New > test. Ok. Thanks Andreas
[committed] testsuite: Move pr104540.C test to g++.target/i386/
On Wed, Feb 23, 2022 at 07:45:20PM -0300, Alexandre Oliva via Gcc-patches wrote: > PR middle-end/104540 > * g++.dg/PR104540.C: New. Both -mforce-drap and -mstackrealign options are x86 specific. Tested on x86_64-linux, i686-linux and powerpc64le-linux, committed to trunk. 2022-02-25 Jakub Jelinek * g++.dg/pr104540.C: Move to ... * g++.target/i386/pr104540.C: ... here. diff --git a/gcc/testsuite/g++.dg/pr104540.C b/gcc/testsuite/g++.target/i386/pr104540.C similarity index 100% rename from gcc/testsuite/g++.dg/pr104540.C rename to gcc/testsuite/g++.target/i386/pr104540.C Jakub
[PATCH] match.pd: Further complex simplification fixes [PR104675]
Hi! Mark mentioned in the PR further 2 simplifications that also ICE with complex types. For these, eventually (but IMO GCC 13 materials) we could support it for vector types if it would be uniform vector constants. Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs and we can't use bit_and etc. for complex type. Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux, ok for trunk? 2022-02-25 Jakub Jelinek Marc Glisse PR tree-optimization/104675 * match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1): Restrict simplifications to INTEGRAL_TYPE_P. * gcc.dg/pr104675-3.c : New test. --- gcc/match.pd.jj 2022-02-25 10:55:08.0 +0100 +++ gcc/match.pd2022-02-25 11:48:04.730110154 +0100 @@ -731,7 +731,7 @@ (define_operator_list SYNC_FETCH_AND_AND /* Simplify (unsigned t * 2)/2 -> unsigned t & 0x7FFF. */ (simplify (trunc_div (mult @0 integer_pow2p@1) @1) - (if (TYPE_UNSIGNED (TREE_TYPE (@0))) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0))) (bit_and @0 { wide_int_to_tree (type, wi::mask (TYPE_PRECISION (type) - wi::exact_log2 (wi::to_wide (@1)), @@ -740,7 +740,7 @@ (define_operator_list SYNC_FETCH_AND_AND /* Simplify (unsigned t / 2) * 2 -> unsigned t & ~1. */ (simplify (mult (trunc_div @0 integer_pow2p@1) @1) - (if (TYPE_UNSIGNED (TREE_TYPE (@0))) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0))) (bit_and @0 (negate @1 /* Simplify (t * 2) / 2) -> t. */ --- gcc/testsuite/gcc.dg/pr104675-3.c.jj2022-02-25 11:24:43.111619659 +0100 +++ gcc/testsuite/gcc.dg/pr104675-3.c 2022-02-25 11:26:35.406056600 +0100 @@ -0,0 +1,29 @@ +/* PR tree-optimization/104675 */ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +_Complex unsigned int +foo (_Complex unsigned int x) +{ + return (x / 2) * 2; +} + +_Complex unsigned int +bar (_Complex unsigned int x) +{ + return (x * 2) / 2; +} + +_Complex unsigned int +baz (_Complex unsigned int x) +{ + _Complex unsigned int y = x / 2; + return y * 2; +} + +_Complex unsigned int +qux (_Complex unsigned int x) +{ + _Complex unsigned int y = x * 2; + return y / 2; +} Jakub
[committed] rs6000: Use rs6000_emit_move in movmisalign expander [PR104681]
Hi! The following testcase ICEs, because for some strange reason it decides to use movmisaligntf during expansion where the destination is MEM and source is CONST_DOUBLE. For normal mov expanders the rs6000 backend uses rs6000_emit_move to ensure that if one operand is a MEM, the other is a REG and a few other things, but for movmisalign nothing enforced this. The middle-end documents that movmisalign shouldn't fail, so we can't force that through predicates or condition on the expander. Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux (the latter with -m32/-m64 testing), preapproved in the PR by Segher, committed to trunk (so far). 2022-02-25 Jakub Jelinek PR target/104681 * config/rs6000/vector.md (movmisalign): Use rs6000_emit_move. * g++.dg/opt/pr104681.C: New test. --- gcc/config/rs6000/vector.md.jj 2022-01-11 23:11:21.974296000 +0100 +++ gcc/config/rs6000/vector.md 2022-02-25 12:36:53.763307488 +0100 @@ -1519,7 +1519,10 @@ (define_expand "movmisalign" [(set (match_operand:VEC_N 0 "nonimmediate_operand") (match_operand:VEC_N 1 "any_operand"))] "VECTOR_MEM_VSX_P (mode) && TARGET_ALLOW_MOVMISALIGN" - "") +{ + rs6000_emit_move (operands[0], operands[1], mode); + DONE; +}) ;; Vector shift right in bits. Currently supported ony for shift ;; amounts that can be expressed as byte shifts (divisible by 8). --- gcc/testsuite/g++.dg/opt/pr104681.C.jj 2022-02-25 12:38:39.419835845 +0100 +++ gcc/testsuite/g++.dg/opt/pr104681.C 2022-02-25 12:35:30.137472275 +0100 @@ -0,0 +1,19 @@ +// PR target/104681 +// { dg-do compile } +// { dg-options "-O2" } + +void bar (); +struct A { + A (bool) : a(7.0L), b(0) {} + long double a; + long b; +}; +struct B { + void foo () { c = bar; } + A c; +}; +struct C { + void baz (); + B d; +}; +void C::baz () { d.foo (); } Jakub
Re: [PATCH] match.pd: Further complex simplification fixes [PR104675]
> Am 25.02.2022 um 18:58 schrieb Jakub Jelinek via Gcc-patches > : > > Hi! > > Mark mentioned in the PR further 2 simplifications that also ICE > with complex types. > For these, eventually (but IMO GCC 13 materials) we could support it > for vector types if it would be uniform vector constants. > Currently integer_pow2p is true only for INTEGER_CSTs and COMPLEX_CSTs > and we can't use bit_and etc. for complex type. > > Bootstrapped/regtested on powerpc64le-linux and powerpc64-linux, ok > for trunk? Ok. Richard > 2022-02-25 Jakub Jelinek >Marc Glisse > >PR tree-optimization/104675 >* match.pd (t * 2U / 2 -> t & (~0 / 2), t / 2U * 2 -> t & ~1): >Restrict simplifications to INTEGRAL_TYPE_P. > >* gcc.dg/pr104675-3.c : New test. > > --- gcc/match.pd.jj2022-02-25 10:55:08.0 +0100 > +++ gcc/match.pd2022-02-25 11:48:04.730110154 +0100 > @@ -731,7 +731,7 @@ (define_operator_list SYNC_FETCH_AND_AND > /* Simplify (unsigned t * 2)/2 -> unsigned t & 0x7FFF. */ > (simplify > (trunc_div (mult @0 integer_pow2p@1) @1) > - (if (TYPE_UNSIGNED (TREE_TYPE (@0))) > + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0))) > (bit_and @0 { wide_int_to_tree >(type, wi::mask (TYPE_PRECISION (type) > - wi::exact_log2 (wi::to_wide (@1)), > @@ -740,7 +740,7 @@ (define_operator_list SYNC_FETCH_AND_AND > /* Simplify (unsigned t / 2) * 2 -> unsigned t & ~1. */ > (simplify > (mult (trunc_div @0 integer_pow2p@1) @1) > - (if (TYPE_UNSIGNED (TREE_TYPE (@0))) > + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0))) > (bit_and @0 (negate @1 > > /* Simplify (t * 2) / 2) -> t. */ > --- gcc/testsuite/gcc.dg/pr104675-3.c.jj2022-02-25 11:24:43.111619659 > +0100 > +++ gcc/testsuite/gcc.dg/pr104675-3.c2022-02-25 11:26:35.406056600 +0100 > @@ -0,0 +1,29 @@ > +/* PR tree-optimization/104675 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +_Complex unsigned int > +foo (_Complex unsigned int x) > +{ > + return (x / 2) * 2; > +} > + > +_Complex unsigned int > +bar (_Complex unsigned int x) > +{ > + return (x * 2) / 2; > +} > + > +_Complex unsigned int > +baz (_Complex unsigned int x) > +{ > + _Complex unsigned int y = x / 2; > + return y * 2; > +} > + > +_Complex unsigned int > +qux (_Complex unsigned int x) > +{ > + _Complex unsigned int y = x * 2; > + return y / 2; > +} > >Jakub >
Re: [PATCH][V2][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
Hi, After more study of all the discussion so far and the corresponding code for suppress_warning, I think the following suggestion Should be the best approach right now for this issue: > SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION); > suppress_warning (tmp_dst, OPT_Wuninitialized); > with a comment explaing why we do that? The reason is: After “SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION)”, 152 /* Enable, or by default disable, a warning for the expression. 153The wildcard OPT of -1 controls all warnings. */ 154 155 void 156 suppress_warning (tree expr, opt_code opt /* = all_warnings */, 157 bool supp /* = true */) 158 { 159 if (opt == no_warning) 160 return; 161 162 const location_t loc = get_location (expr); 163 164 if (!RESERVED_LOCATION_P (loc)) 165 supp = suppress_warning_at (loc, opt, supp) || supp; 166 set_no_warning_bit (expr, supp); 167 } Suppress_warning will NOT call “suppress_warning_at” to involve any operation on the hash tables. It just simply call “set_no_warning_bit” to set the no_warning bit for the MEM_REF expr. And later during the routine “maybe_warn_operand” in tree-sea-uninit.cc, “get_no_uninit_warning” will also Simply check the no_warning bit of the MEM_REF to see whether the warning need to be issued. This resolved all the concerns we have so far. I will prepare the patch based on this approach. Let me know your opinion. Qing > On Feb 25, 2022, at 4:04 AM, Jakub Jelinek via Gcc-patches > wrote: > > On Fri, Feb 25, 2022 at 10:31:50AM +0100, Richard Biener wrote: >> I think it's used as fallback for UNKNOWN_LOCATION, but if we "invent" >> a creative location for the artificial stmt it will of course >> affect other stmts/expressions using that location. >> >>> I think it will work. >> >> Yes, I think so. OTOH the uninit pass does >> >> /* Avoid warning if we've already done so or if the warning has been >> suppressed. */ >> if (((warning_suppressed_p (context, OPT_Wuninitialized) >>|| (gimple_assign_single_p (context) >>&& get_no_uninit_warning (gimple_assign_rhs1 (context) >> || (var && get_no_uninit_warning (var)) >> || (var_name_str >> && warning_suppressed_p (var_def_stmt, OPT_Wuninitialized))) >>return; >> >> that's a mightly complicated way to test and I'm not sure we get >> to the bit on the stmt reliably. So maybe TREE_NO_WARNING on the >> reference (or making sure it has UNKNOWN_LOCATION and using >> suppress_warning on it) is a better idea after all... > > So > SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION); > suppress_warning (tmp_dst, OPT_Wuninitialized); > with a comment explaing why we do that? > LGTM. > > Jakub >
[PATCH] c++: ICE with attribute on enumerator [PR104667]
When processing a template, the enumerators we build don't have a type yet. But is_late_template_attribute is not prepared to see a _DECL without a type, so we crash on enum tree_code code = TREE_CODE (type); (I found that we don't give the "is deprecated" warning for the enumerator 'f' in the test. Reported as PR104682.) Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11? PR c++/104667 gcc/cp/ChangeLog: * decl2.cc (is_late_template_attribute): Cope with a decl without a type. gcc/testsuite/ChangeLog: * g++.dg/ext/attrib64.C: New test. --- gcc/cp/decl2.cc | 2 +- gcc/testsuite/g++.dg/ext/attrib64.C | 11 +++ 2 files changed, 12 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/ext/attrib64.C diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc index 2e58419ea51..dc7710660d0 100644 --- a/gcc/cp/decl2.cc +++ b/gcc/cp/decl2.cc @@ -1300,7 +1300,7 @@ is_late_template_attribute (tree attr, tree decl) /* We can't apply any attributes to a completely unknown type until instantiation time. */ - enum tree_code code = TREE_CODE (type); + enum tree_code code = type ? TREE_CODE (type) : ERROR_MARK; if (code == TEMPLATE_TYPE_PARM || code == BOUND_TEMPLATE_TEMPLATE_PARM || code == TYPENAME_TYPE) diff --git a/gcc/testsuite/g++.dg/ext/attrib64.C b/gcc/testsuite/g++.dg/ext/attrib64.C new file mode 100644 index 000..4a4505fc4b2 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/attrib64.C @@ -0,0 +1,11 @@ +// PR c++/104667 +// { dg-do compile } + +template struct A { + enum E { // { dg-warning "only applies to function types" } +e __attribute__ ((access(read_only))), +f __attribute__ ((deprecated)) + }; +}; + +A a; base-commit: ae3c4e521dd0b66db712639298cd08331d62f315 -- 2.35.1
[PATCH] c++: Lost deprecated/unavailable attr in class tmpl [PR104682]
[ Most likely a GCC 13 patch, but I'm posting it now so that I don't lose it. ] When looking into the other PR I noticed that we fail to give a warning for a deprecated enumerator when the enum is in a class template. This only happens when the attribute doesn't have an argument. The reason is that when we tsubst_enum, we create a new enumerator: build_enumerator (DECL_NAME (decl), value, newtag, DECL_ATTRIBUTES (decl), DECL_SOURCE_LOCATION (decl)); but DECL_ATTRIBUTES (decl) is null when the attribute was provided without an argument -- in that case it simply melts into a tree flag. handle_deprecated_attribute has: if (!args) *no_add_attrs = true; so the attribute isn't retained and we lose it when tsubsting. Same thing when the attribute is on the enum itself. Attribute unavailable is a similar case, but it's different in that it can be a late attribute whereas "deprecated" can't: is_late_template_attribute has /* But some attributes specifically apply to templates. */ && !is_attribute_p ("abi_tag", name) && !is_attribute_p ("deprecated", name) && !is_attribute_p ("visibility", name)) return true; else return false; which looks strange, but attr-unavailable-9.C tests that we don't error when the attribute is applied on a template. Bootstrapped/regtested on x86_64-pc-linux-gnu. PR c++/104682 gcc/cp/ChangeLog: * cp-tree.h (build_enumerator): Adjust. * decl.cc (finish_enum): Make it return the new decl. * pt.cc (tsubst_enum): Propagate TREE_DEPRECATED and TREE_UNAVAILABLE. gcc/testsuite/ChangeLog: * g++.dg/ext/attr-unavailable-10.C: New test. * g++.dg/ext/attr-unavailable-11.C: New test. * g++.dg/warn/deprecated-17.C: New test. * g++.dg/warn/deprecated-18.C: New test. --- gcc/cp/cp-tree.h | 2 +- gcc/cp/decl.cc| 4 +- gcc/cp/pt.cc | 17 +++-- .../g++.dg/ext/attr-unavailable-10.C | 22 +++ .../g++.dg/ext/attr-unavailable-11.C | 22 +++ gcc/testsuite/g++.dg/warn/deprecated-17.C | 35 ++ gcc/testsuite/g++.dg/warn/deprecated-18.C | 37 +++ 7 files changed, 133 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/g++.dg/ext/attr-unavailable-10.C create mode 100644 gcc/testsuite/g++.dg/ext/attr-unavailable-11.C create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-17.C create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-18.C diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 37d462fca6e..80994e94793 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -6833,7 +6833,7 @@ extern void xref_basetypes(tree, tree); extern tree start_enum (tree, tree, tree, tree, bool, bool *); extern void finish_enum_value_list (tree); extern void finish_enum(tree); -extern void build_enumerator (tree, tree, tree, tree, location_t); +extern tree build_enumerator (tree, tree, tree, tree, location_t); extern tree lookup_enumerator (tree, tree); extern bool start_preparsed_function (tree, tree, int); extern bool start_function (cp_decl_specifier_seq *, diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 7b48b56231b..7f80f9d4d7a 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -16409,7 +16409,7 @@ finish_enum (tree enumtype) Apply ATTRIBUTES if available. LOC is the location of NAME. Assignment of sequential values by default is handled here. */ -void +tree build_enumerator (tree name, tree value, tree enumtype, tree attributes, location_t loc) { @@ -16611,6 +16611,8 @@ incremented enumerator value is too large for %")); /* Add this enumeration constant to the list for this type. */ TYPE_VALUES (enumtype) = tree_cons (name, decl, TYPE_VALUES (enumtype)); + + return decl; } /* Look for an enumerator with the given NAME within the enumeration diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 70f02db8757..8fb17349ee1 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -26944,9 +26944,8 @@ tsubst_enum (tree tag, tree newtag, tree args) for (e = TYPE_VALUES (tag); e; e = TREE_CHAIN (e)) { tree value; - tree decl; + tree decl = TREE_VALUE (e); - decl = TREE_VALUE (e); /* Note that in a template enum, the TREE_VALUE is the CONST_DECL, not the corresponding INTEGER_CST. */ value = tsubst_expr (DECL_INITIAL (decl), @@ -26958,8 +26957,14 @@ tsubst_enum (tree tag, tree newtag, tree args) /* Actually build the enumerator itself. Here we're assuming that enumerators can't have dependent attributes. */ - build_enumerator (DECL_NAME (decl), value, newtag
[PATCH][V4][middle-end/104550]Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding
Hi, This is the 4th version based on the discussion so far. The major change is: > SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION); > suppress_warning (tmp_dst, OPT_Wuninitialized); > with a comment explaing why we do that. The patch has been bootstrapped and regress tested on both x86 and aarch64. Okay for trunk? Thanks. Qing = >From 276975e60827942f0dd4043ce5f52e600327d6a8 Mon Sep 17 00:00:00 2001 From: Qing Zhao Date: Thu, 24 Feb 2022 22:38:38 + Subject: [PATCH] Suppress uninitialized warnings for new created uses from __builtin_clear_padding folding [PR104550] __builtin_clear_padding(&object) will clear all the padding bits of the object. actually, it doesn't involve any use of an user variable. Therefore, users do not expect any uninitialized warning from it. It's reasonable to suppress uninitialized warnings for all new created uses from __builtin_clear_padding folding. PR middle-end/104550 gcc/ChangeLog: * gimple-fold.cc (clear_padding_flush): Suppress warnings for new created uses. gcc/testsuite/ChangeLog: * gcc.dg/auto-init-pr104550-1.c: New test. * gcc.dg/auto-init-pr104550-2.c: New test. * gcc.dg/auto-init-pr104550-3.c: New test. --- gcc/gimple-fold.cc | 12 +++- gcc/testsuite/gcc.dg/auto-init-pr104550-1.c | 10 ++ gcc/testsuite/gcc.dg/auto-init-pr104550-2.c | 11 +++ gcc/testsuite/gcc.dg/auto-init-pr104550-3.c | 11 +++ 4 files changed, 43 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-1.c create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-2.c create mode 100644 gcc/testsuite/gcc.dg/auto-init-pr104550-3.c diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index 16f02c2d098d..c9179abb27ed 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -4379,7 +4379,17 @@ clear_padding_flush (clear_padding_struct *buf, bool full) else { src = make_ssa_name (type); - g = gimple_build_assign (src, unshare_expr (dst)); + tree tmp_dst = unshare_expr (dst); + /* The folding introduces a read from the tmp_dst, we should +prevent uninitialized warning analysis from issuing warning +for such fake read. In order to suppress warning only for +this expr, we should set the location of tmp_dst to +UNKNOWN_LOCATION first, then suppress_warning will call +set_no_warning_bit to set the no_warning flag only for +tmp_dst. */ + SET_EXPR_LOCATION (tmp_dst, UNKNOWN_LOCATION); + suppress_warning (tmp_dst, OPT_Wuninitialized); + g = gimple_build_assign (src, tmp_dst); gimple_set_location (g, buf->loc); gsi_insert_before (buf->gsi, g, GSI_SAME_STMT); tree mask = native_interpret_expr (type, diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c new file mode 100644 index ..a08110c3a170 --- /dev/null +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-1.c @@ -0,0 +1,10 @@ +/* PR 104550*/ +/* { dg-do compile } */ +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */ +struct vx_audio_level { + int has_monitor_level : 1; +}; + +void vx_set_monitor_level() { + struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } */ +} diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c new file mode 100644 index ..2c395b32d322 --- /dev/null +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-2.c @@ -0,0 +1,11 @@ +/* PR 104550 */ +/* { dg-do compile } */ +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=zero" } */ +struct vx_audio_level { + int has_monitor_level : 1; +}; + +void vx_set_monitor_level() { + struct vx_audio_level info; + __builtin_clear_padding (&info); /* { dg-bogus "info" "is used uninitialized" } */ +} diff --git a/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c new file mode 100644 index ..9893e37f12d8 --- /dev/null +++ b/gcc/testsuite/gcc.dg/auto-init-pr104550-3.c @@ -0,0 +1,11 @@ +/* PR 104550 */ +/* { dg-do compile } */ +/* { dg-options "-O -Wuninitialized -ftrivial-auto-var-init=pattern" } */ +struct vx_audio_level { + int has_monitor_level : 1; +}; + +void vx_set_monitor_level() { + struct vx_audio_level info; /* { dg-bogus "info" "is used uninitialized" } */ + __builtin_clear_padding (&info); /* { dg-bogus "info" "is used uninitialized" } */ +} -- 2.27.0