On Thu, Jun 4, 2020 at 5:34 AM Jiufu Guo <guoji...@linux.ibm.com> wrote: > > Jiufu Guo <guoji...@linux.ibm.com> writes: > > Hi, > > Patch is updated a little according to comments. > Please see if this is ok to commit.
OK with a proper ChangeLog after bootstrap / testing. It's also OK to backport this to the GCC 10 branch if powerpc folks want to restore GIMPLE cunroll behavior for their target. Thanks, Richard. > > diff --git a/gcc/common.opt b/gcc/common.opt > index 4464049fc1f..570e2aa53c8 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -2856,6 +2856,10 @@ funroll-all-loops > Common Report Var(flag_unroll_all_loops) Optimization > Perform loop unrolling for all loops. > > +funroll-completely-grow-size > +Undocumented Var(flag_cunroll_grow_size) Init(2) Optimization > +; Internal undocumented flag, allow size growth during complete unrolling > + > ; Nonzero means that loop optimizer may assume that the induction variables > ; that control loops do not overflow and that the loops with nontrivial > ; exit condition are not infinite > diff --git a/gcc/toplev.c b/gcc/toplev.c > index 96316fbd23b..e1010077ddb 100644 > --- a/gcc/toplev.c > +++ b/gcc/toplev.c > @@ -1474,6 +1474,11 @@ process_options (void) > if (flag_unroll_all_loops) > flag_unroll_loops = 1; > > + /* Allow cunroll to grow size accordingly. */ > + if (flag_cunroll_grow_size == AUTODETECT_VALUE) > + flag_cunroll_grow_size > + = flag_unroll_loops || flag_peel_loops || optimize >= 3; > + > /* web and rename-registers help when run after loop unrolling. */ > if (flag_web == AUTODETECT_VALUE) > flag_web = flag_unroll_loops; > diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c > index 8ab6ab3330c..298ab215530 100644 > --- a/gcc/tree-ssa-loop-ivcanon.c > +++ b/gcc/tree-ssa-loop-ivcanon.c > @@ -1603,9 +1603,8 @@ pass_complete_unroll::execute (function *fun) > re-peeling the same loop multiple times. */ > if (flag_peel_loops) > peeled_loops = BITMAP_ALLOC (NULL); > - unsigned int val = tree_unroll_loops_completely (flag_unroll_loops > - || flag_peel_loops > - || optimize >= 3, true); > + unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size, > + true); > if (peeled_loops) > { > BITMAP_FREE (peeled_loops); > > BR, > Jiufu > > > > Richard Biener <richard.guent...@gmail.com> writes: > > > >> On Tue, Jun 2, 2020 at 4:10 AM Jiufu Guo <guoji...@linux.ibm.com> wrote: > >>> > >>> Jiufu Guo <guoji...@linux.ibm.com> writes: > >>> > >>> Hi, > >>> > >>> I updated the patch just a little accordinlgy. Thanks! > >>> > >>> diff --git a/gcc/common.opt b/gcc/common.opt > >>> index 4464049fc1f..570e2aa53c8 100644 > >>> --- a/gcc/common.opt > >>> +++ b/gcc/common.opt > >>> @@ -2856,6 +2856,10 @@ funroll-all-loops > >>> Common Report Var(flag_unroll_all_loops) Optimization > >>> Perform loop unrolling for all loops. > >>> > >>> +funroll-completely-grow-size > >>> +Common Undocumented Var(flag_cunroll_grow_size) Init(2) Optimization > >>> +; Internal undocumented flag, allow size growth during complete unrolling > >>> + > >>> ; Nonzero means that loop optimizer may assume that the induction > >>> variables > >>> ; that control loops do not overflow and that the loops with nontrivial > >>> ; exit condition are not infinite > >>> diff --git a/gcc/toplev.c b/gcc/toplev.c > >>> index 96316fbd23b..8d52358efdd 100644 > >>> --- a/gcc/toplev.c > >>> +++ b/gcc/toplev.c > >>> @@ -1474,6 +1474,10 @@ process_options (void) > >>> if (flag_unroll_all_loops) > >>> flag_unroll_loops = 1; > >>> > >>> + /* Allow cunroll to grow size accordingly. */ > >>> + if (flag_cunroll_grow_size == AUTODETECT_VALUE) > >>> + flag_cunroll_grow_size = flag_unroll_loops || flag_peel_loops; > >>> + > >> > >> (*) > >> > >>> /* web and rename-registers help when run after loop unrolling. */ > >>> if (flag_web == AUTODETECT_VALUE) > >>> flag_web = flag_unroll_loops; > >>> diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c > >>> index 8ab6ab3330c..298ab215530 100644 > >>> --- a/gcc/tree-ssa-loop-ivcanon.c > >>> +++ b/gcc/tree-ssa-loop-ivcanon.c > >>> @@ -1603,9 +1603,8 @@ pass_complete_unroll::execute (function *fun) > >>> re-peeling the same loop multiple times. */ > >>> if (flag_peel_loops) > >>> peeled_loops = BITMAP_ALLOC (NULL); > >>> - unsigned int val = tree_unroll_loops_completely (flag_unroll_loops > >>> - || flag_peel_loops > >>> - || optimize >= 3, > >>> true); > >>> + unsigned int val = tree_unroll_loops_completely > >>> (flag_cunroll_grow_size, > >> > >> With -O3 -fno-peel-loops this does not have the same effect anymore. > >> So above (*) you'd need to check || optimize >= 3 as well. > > > > Oh, yes. Thanks for your kindly review! > > > > BR, > > Jiufu > > > >> > >>> + true); > >>> if (peeled_loops) > >>> { > >>> BITMAP_FREE (peeled_loops); > >>> > >>> BR, > >>> Jiufu > >>> > >>> > Richard Biener <richard.guent...@gmail.com> writes: > >>> > > >>> > >>> >>> >> From: Jiufu Guo <guoji...@linux.ibm.com> > >>> >>> >> > >>> >>> >> Currently GIMPLE complete unroller(cunroll) is checking > >>> >>> >> flag_unroll_loops and flag_peel_loops to see if allow size growth. > >>> >>> >> Beside affects curnoll, flag_unroll_loops also controls RTL > >>> >>> >> unroler. > >>> >>> >> To have more freedom to control cunroll and RTL unroller, this > >>> >>> >> patch > >>> >>> >> introduces flag_cunroll_grow_size. With this patch, we can control > >>> >>> >> cunroll and RTL unroller indepently. > >>> >>> >> > >>> >>> >> Bootstrap/regtest pass on powerpc64le. OK for trunk? And backport > >>> >>> >> to > >>> >>> >> GCC10 after week? > >>> >>> >> > >>> >>> >> > >>> >>> >> +funroll-completely-grow-size > >>> >>> >> +Var(flag_cunroll_grow_size) Init(2) > >>> >>> >> +; Control cunroll to allow size growth during complete unrolling > >>> >>> >> + > >>> >>> > > >>> >> > >>> >> It won't work without adjusting the awk scripts. So go with > >>> >> > >>> >> funroll-completely-grow-size > >>> >> Undocumented Optimization Var(flag_cunroll_grow_size) > >>> >> EnabledBy(funroll-loops || fpeel-loops) > >>> >> ; ... > >>> >> > >>> > EnabledBy(funroll-loops || fpeel-loops) does not works as we expected: > >>> > "-funroll-loops -fno-peel-loops" turns off flag_cunroll_grow_size. > >>> > > >>> > Through "EnabledBy", a flag can be turned, and also can be turned off by > >>> > the "EnabledBy option", only if the flag is not specifed through commond > >>> > line. > >>> > > >>> >> and enable it at O3+. AUTODETECT_VALUE doesn't make sense for > >>> >> an option not supposed to be set by users? > >>> >> > >>> > > >>> > global_options_set.x_flagxxx can be used to check if option is set by > >>> > user. But it does not work well here neither, because we also care of > >>> > if the flag is override by OPTION_OPTIMIZATION_TABLE or > >>> > OPTION_OVERRIDE. > >>> > > >>> > AUTODETECT_VALUE(value is 2) is used for some flags like flag_web, > >>> > flag_rename_registers, flag_var_tracking, flag_tree_cselim... > >>> > And this way could be used to check if the flag is effective(on/off) > >>> > either explicit set by command line or implicit set through > >>> > OPTION_OVERRIDE or OPTION_OPTIMIZATION_TABLE. > >>> > So, I use it here. > >>> > >>>