"Bin.Cheng" <amker.ch...@gmail.com> writes: > On Tue, Aug 15, 2017 at 6:33 PM, Richard Sandiford > <richard.sandif...@linaro.org> wrote: >> Richard Biener <richard.guent...@gmail.com> writes: >>> On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <bin.ch...@arm.com> wrote: >>>> Hi, >>>> This patch fixes PR81832. Root cause for the ICE is: >>>> 1) Loop has distributed inner loop. >>>> 2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's >>>> header. >>>> 3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect >>>> thus >>>> not eliminated. >>>> >>>> Given pass_ch_vect copies loop header to enable more vectorization, we >>>> should >>>> skip loop in this case because distributed inner loop means this loop can >>>> not >>>> be vectorized anyway. One point to mention is name >>>> inner_loop_distributed_p >>>> is a little misleading. The name indicates that each basic block is >>>> checked, >>>> but the patch only checks loop's header for simplicity/efficiency's >>>> purpose. >>>> Any comment? >>> >>> My comment would be to question pass_ch_vect placement -- what was the >>> reason to place it so late? >>> >>> I also see GRAPHITE runs inbetween loop distribution and vectorization -- >>> what prevents GRAPHITE from messing up things here? Or autopar? >>> >>> The patch itself shows should_duplicate_loop_header_p should >>> handle this IFN specially (somehow all IFNs are considered "inexpensive"). >>> >>> So can you please adjust should_duplicate_loop_header_p instead and/or >>> gimple_inexpensive_call_p? Since we have IFNs for stuff like EXP10 I'm not >>> sure if that default is so good. >> >> I think the default itself is OK: we only use IFNs for libm functions >> like exp10 if an underlying optab exists (unlike __builtin_exp10, which >> is useful as the non-errno setting version of exp10), so the target must >> have something that it thinks is worth open-coding. Also, we currently >> treat all MD built-ins as "simple" and thus "inexpensive", so the IFN >> handling is consistent with that. >> >> Maybe there are some IFNs that are worth special-casing as expensive, >> but IMO doing that to solve this problem would be a bit hacky. It seems >> like "inexpensive" should be more of a cost thing than a correctness thing. > Hi all,
> This is updated patch. It only adds a single check on IFN_LOOP_DIST_ALIAS > in function should_duplicate_loop_header_p. Thanks. > As for gimple_inexpensive_call_p, I think it's natural to consider > functions like IFN_LOOP_VECTORIZED and IFN_LOOP_DIST_ALIAS as > expensive because they are only meant to indicate temporary > arrangement of optimizations and are never used in code generation. > I will send a standalone patch for that. Is that enough to consider them expensive though? To me, "expensive" should mean that they cost a lot in terms of size or speed (whichever is most important in context). Both functions are really cheap in that sense, since they eventually expand to constants. Thanks, Richard > Another thing is this patch doesn't check IFN_LOOP_VECTORIZED because > it's impossible to have it with current order of passes. Bootstrap > and test ongoing. Further comments? > > Thanks, > bin > 2017-08-15 Bin Cheng <bin.ch...@arm.com> > > PR tree-optimization/81832 > * tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Don't > copy loop header which has IFN_LOOP_DIST_ALIAS call. > > gcc/testsuite > 2017-08-15 Bin Cheng <bin.ch...@arm.com> > > PR tree-optimization/81832 > * gcc.dg/tree-ssa/pr81832.c: New test. > >> >> Thanks, >> Richard >> >>> Thanks, >>> Richard. >>> >>>> Bootstrap and test on x86_64. >>>> >>>> Thanks, >>>> bin >>>> 2017-08-15 Bin Cheng <bin.ch...@arm.com> >>>> >>>> PR tree-optimization/81832 >>>> * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function. >>>> (pass_ch_vect::process_loop_p): Call above function. >>>> >>>> gcc/testsuite >>>> 2017-08-15 Bin Cheng <bin.ch...@arm.com> >>>> >>>> PR tree-optimization/81832 >>>> * gcc.dg/tree-ssa/pr81832.c: New test. > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c > b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c > new file mode 100644 > index 0000000..893124e > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c > @@ -0,0 +1,22 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O3" } */ > + > +int a, b, *c; > +void d(void) > +{ > + int **e; > + for(;;) > + for(int f = 1; f <= 6; f++) > + { > + b = 0; > + if(a) > +g: > + while(a++); > + if (**e); > + else > + { > + *c = a; > + goto g; > + } > + } > +} > diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c > index 14cc6d8d..6bb0220 100644 > --- a/gcc/tree-ssa-loop-ch.c > +++ b/gcc/tree-ssa-loop-ch.c > @@ -119,7 +119,10 @@ should_duplicate_loop_header_p (basic_block header, > struct loop *loop, > continue; > > if (gimple_code (last) == GIMPLE_CALL > - && !gimple_inexpensive_call_p (as_a <gcall *> (last))) > + && (!gimple_inexpensive_call_p (as_a <gcall *> (last)) > + /* IFN_LOOP_DIST_ALIAS means that inner loop is distributed > + at current loop's header. Don't copy in this case. */ > + || gimple_call_internal_p (last, IFN_LOOP_DIST_ALIAS))) > { > if (dump_file && (dump_flags & TDF_DETAILS)) > fprintf (dump_file,