Martin, The following is the patch to fix the issues you raised in the previous email, let me know if I still miss anything:
Thanks a lot. Qing ===== From 14524a228b4b41b4eaaa2497455725e075126c2c Mon Sep 17 00:00:00 2001 From: Qing Zhao <qing.z...@oracle.com> Date: Mon, 26 Jul 2021 15:46:59 +0000 Subject: [PATCH] Fix tree-sra.c issue raised by Martin Jambor --- gcc/testsuite/gcc.dg/auto-init-sra-1.c | 24 ++++++++++++++++++++++++ gcc/testsuite/gcc.dg/auto-init-sra-2.c | 24 ++++++++++++++++++++++++ gcc/tree-sra.c | 8 ++++++-- 3 files changed, 53 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/auto-init-sra-1.c create mode 100644 gcc/testsuite/gcc.dg/auto-init-sra-2.c diff --git a/gcc/testsuite/gcc.dg/auto-init-sra-1.c b/gcc/testsuite/gcc.dg/auto-init-sra-1.c new file mode 100644 index 000000000000..88fd66678f29 --- /dev/null +++ b/gcc/testsuite/gcc.dg/auto-init-sra-1.c @@ -0,0 +1,24 @@ +/* Verify that SRA total scalarization will not be confused by padding + and also not confused by auto initialization. */ +/* { dg-do compile } */ +/* { dg-options "-O1 --param sra-max-scalarization-size-Ospeed=16 -fdump-tree-release_ssa -ftrivial-auto-var-init=zero" } */ + +struct S +{ + int i; + unsigned short f1; + char f2; + unsigned short f3, f4; +}; + + +int foo (struct S *p) +{ + struct S l; + + l = *p; + l.i++; + *p = l; +} + +/* { dg-final { scan-tree-dump-times "l;" 0 "release_ssa" } } */ diff --git a/gcc/testsuite/gcc.dg/auto-init-sra-2.c b/gcc/testsuite/gcc.dg/auto-init-sra-2.c new file mode 100644 index 000000000000..d260f5ae934e --- /dev/null +++ b/gcc/testsuite/gcc.dg/auto-init-sra-2.c @@ -0,0 +1,24 @@ +/* Verify that SRA total scalarization will not be confused by padding + and also not confused by auto initialization. */ +/* { dg-do compile } */ +/* { dg-options "-O1 --param sra-max-scalarization-size-Ospeed=16 -fdump-tree-release_ssa -ftrivial-auto-var-init=pattern" } */ + +struct S +{ + int i; + unsigned short f1; + char f2; + unsigned short f3, f4; +}; + + +int foo (struct S *p) +{ + struct S l; + + l = *p; + l.i++; + *p = l; +} + +/* { dg-final { scan-tree-dump-times "l;" 0 "release_ssa" } } */ diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index d1280e5f8848..bebba4deaf94 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -1195,8 +1195,10 @@ build_access_from_expr (tree expr, gimple *stmt, bool write) { /* This means the aggregate is accesses as a whole in a way other than an assign statement and thus cannot be removed even if we had a scalar - replacement for everything. */ - if (cannot_scalarize_away_bitmap) + replacement for everything. However, when the STMT is a call to + DEFERRED_INIT, we should not set this bit. */ + if (cannot_scalarize_away_bitmap + && !gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)) bitmap_set_bit (cannot_scalarize_away_bitmap, DECL_UID (access->base)); return true; } @@ -4167,6 +4169,8 @@ sra_modify_deferred_init (gimple *stmt, gimple_stmt_iterator *gsi) tree arg0_repl = TYPE_SIZE_UNIT (TREE_TYPE (lhs_repl)); gimple_call_set_arg (stmt, 0, arg0_repl); sra_stats.deferred_init++; + gcc_assert (!lhs_access->first_child); + return SRA_AM_MODIFIED; } if (lhs_access->first_child) -- 2.27.0 > On Jul 26, 2021, at 10:25 AM, Qing Zhao via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > HI, Martin, > > Thank you for the comments and suggestions on tree-sra.c changes. > >>> ******Compare with the 4th version, the following are the major changes: >>> >>> 1. delete the code for handling "grp_to_be_debug_replaced" since they are >>> not needed per Martin Jambor's suggestion. >> >> sorry if I did not make myself clear in my last email, but the deferred >> init calls should not result into setting any bits in >> cannot_scalarize_away_bitmap in the SRA pass, otherwise you'll get >> different optimization with and without -ftrivial-auto-var-init. > > It’s my bad that I missed this part of your comments… > >> >> So you either need to change build_access_from_expr like I described in >> my email > > Is the following the change you suggested previously: > > [opc@qinzhao-ol8u3-x86 gcc]$ git diff tree-sra.c > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index d1280e5f8848..c2597b705169 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -1195,8 +1195,10 @@ build_access_from_expr (tree expr, gimple *stmt, bool > write) > { > /* This means the aggregate is accesses as a whole in a way other than > an > assign statement and thus cannot be removed even if we had a scalar > - replacement for everything. */ > - if (cannot_scalarize_away_bitmap) > + replacement for everything. However, when the STMT is a call to > + DEFERRED_INIT, we should not set this bit. */ > + if (cannot_scalarize_away_bitmap > + && !gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)) > bitmap_set_bit (cannot_scalarize_away_bitmap, DECL_UID (access->base)); > return true; > } > > >> or add the following to your patch, which is probably slightly >> mor efficient (but it has been only very mildly tested). >> >> >> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c >> index d1280e5f884..602b0fb3a2d 100644 >> --- a/gcc/tree-sra.c >> +++ b/gcc/tree-sra.c >> @@ -1395,7 +1395,12 @@ scan_function (void) >> >> t = gimple_call_lhs (stmt); >> if (t && !disqualify_if_bad_bb_terminating_stmt (stmt, t, NULL)) >> - ret |= build_access_from_expr (t, stmt, true); >> + { >> + if (gimple_call_internal_p (stmt, IFN_DEFERRED_INIT)) >> + ret |= !!build_access_from_expr_1 (t, stmt, true); >> + else >> + ret |= build_access_from_expr (t, stmt, true); >> + } >> break; > > Thanks for the patch, but I don’t quite understand the above change: > > When the call is IFN_DEFERRED_INIT, you used build_access_from_expr_1 instead > of build_access_from_expr to avoid setting “cannot_scalarize_away_bitmap” bit. > But why adding “!” To this call? > > >> >> case GIMPLE_ASM: >> >> >> >> And unfortunately I just spotted another potential problem in: >> >>> +static enum assignment_mod_result >>> +sra_modify_deferred_init (gimple *stmt, gimple_stmt_iterator *gsi) >>> +{ >>> + tree lhs = gimple_call_lhs (stmt); >>> + tree init_type = gimple_call_arg (stmt, 1); >>> + tree is_vla = gimple_call_arg (stmt, 2); >>> + >>> + struct access *lhs_access = get_access_for_expr (lhs); >>> + if (!lhs_access) >>> + return SRA_AM_NONE; >>> + >>> + location_t loc = gimple_location (stmt); >>> + >>> + if (lhs_access->grp_to_be_replaced) >>> + { >>> + tree lhs_repl = get_access_replacement (lhs_access); >>> + gimple_call_set_lhs (stmt, lhs_repl); >>> + tree arg0_repl = TYPE_SIZE_UNIT (TREE_TYPE (lhs_repl)); >>> + gimple_call_set_arg (stmt, 0, arg0_repl); >>> + sra_stats.deferred_init++; >> >> I think you want to add: >> >> gcc_assert (!lhs_access->first_child); >> return SRA_AM_MODIFIED; > > Okay. >> >> here, otherwise you risk that (in a case of a single-field structure) >> the call you have just modified here... >> >>> + } >>> + >>> + if (lhs_access->first_child) >>> + generate_subtree_deferred_init (lhs_access->first_child, >>> + init_type, is_vla, gsi, loc); >>> + if (lhs_access->grp_covered) >>> + { >>> + unlink_stmt_vdef (stmt); >>> + gsi_remove (gsi, true); >> >> ...will be deleted here. > > I see. Thanks a lot for spotting this issue. >> >>> + release_defs (stmt); >>> + return SRA_AM_REMOVED; >>> + } >>> + >>> + return SRA_AM_MODIFIED; >>> +} >> >> Sorry again for spotting this late and perhaps mis-communicating about >> the cannot_scalarize_away_bitmap issue earlier. Apart from these two >> things, I believe the tree-sra.c bits are fin. > > Thank you for the detailed review. > > Qing >> >> Martin >> >> >> >> >>> 2. for Pattern init, call __builtin_clear_padding after the call to >>> .DEFERRED_INIT to initialize the paddings to zeroes; >>> 3. for partially or fully initialized auto variables, call >>> __builtin_clear_padding before the real initialization to initialize >>> the paddings to zeroes. >>> 4. Update the documentation with padding initialization to zeroes. >>> 5. in order to reuse __builtin_clear_padding for auto init purpose, add one >>> more dummy argument to indiciate whether it's for auto init or not, >>> if for auto init, do not emit error messages to avoid confusing users. >>> 6. Add new testing cases to verify padding initializations. >>> 7. rename some of the old testing cases to make the file name reflecting >>> the testing purpose per Kees Cook's suggestions. >>> >>> ******Please see version 4 at: >>> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574642.html >>> >>> ******ChangeLog is: >>> gcc/ChangeLog: >>> >>> 2021-07-23 qing zhao <qing.z...@oracle.com> >>> >>> * builtins.c (expand_builtin_memset): Make external visible. >>> * builtins.h (expand_builtin_memset): Declare extern. >>> * common.opt (ftrivial-auto-var-init=): New option. >>> * doc/extend.texi: Document the uninitialized attribute. >>> * doc/invoke.texi: Document -ftrivial-auto-var-init. >>> * flag-types.h (enum auto_init_type): New enumerated type >>> auto_init_type. >>> * gimple-fold.c (clear_padding_type): Add one new parameter. >>> (clear_padding_union): Likewise. >>> (clear_padding_emit_loop): Likewise. >>> (clear_type_padding_in_mask): Likewise. >>> (gimple_fold_builtin_clear_padding): Handle this new parameter. >>> * gimplify.c (gimple_add_init_for_auto_var): New function. >>> (maybe_with_size_expr): Forword declaration. >>> (build_deferred_init): New function. >>> (gimple_add_padding_init_for_auto_var): New function. >>> (gimplify_decl_expr): Add initialization to automatic variables per >>> users' requests. >>> (gimplify_call_expr): Add one new parameter for call to >>> __builtin_clear_padding. >>> (gimplify_modify_expr_rhs): Add padding initialization before >>> gimplify_init_constructor. >>> * internal-fn.c (INIT_PATTERN_VALUE): New macro. >>> (expand_DEFERRED_INIT): New function. >>> * internal-fn.def (DEFERRED_INIT): New internal function. >>> * tree-cfg.c (verify_gimple_call): Verify calls to .DEFERRED_INIT. >>> * tree-sra.c (generate_subtree_deferred_init): New function. >>> (sra_modify_deferred_init): Likewise. >>> (sra_modify_function_body): Handle calls to DEFERRED_INIT specially. >>> * tree-ssa-structalias.c (find_func_aliases_for_call): Likewise. >>> * tree-ssa-uninit.c (warn_uninit): Handle calls to DEFERRED_INIT >>> specially. >>> (check_defs): Likewise. >>> (warn_uninitialized_vars): Likewise. >>> * tree-ssa.c (ssa_undefined_value_p): Likewise. >>> >>> gcc/c-family/ChangeLog: >>> >>> 2021-07-23 qing zhao <qing.z...@oracle.com> >>> >>> * c-attribs.c (handle_uninitialized_attribute): New function. >>> (c_common_attribute_table): Add "uninitialized" attribute.