On Wed, Aug 16, 2017 at 11:22 AM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Tue, Aug 15, 2017 at 6:33 PM, Richard Sandiford > <richard.sandif...@linaro.org> wrote: >> Richard Biener <richard.guent...@gmail.com> writes: >>> On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <bin.ch...@arm.com> wrote: >>>> Hi, >>>> This patch fixes PR81832. Root cause for the ICE is: >>>> 1) Loop has distributed inner loop. >>>> 2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's >>>> header. >>>> 3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect >>>> thus >>>> not eliminated. >>>> >>>> Given pass_ch_vect copies loop header to enable more vectorization, we >>>> should >>>> skip loop in this case because distributed inner loop means this loop can >>>> not >>>> be vectorized anyway. One point to mention is name >>>> inner_loop_distributed_p >>>> is a little misleading. The name indicates that each basic block is >>>> checked, >>>> but the patch only checks loop's header for simplicity/efficiency's >>>> purpose. >>>> Any comment? >>> >>> My comment would be to question pass_ch_vect placement -- what was the >>> reason to place it so late? >>> >>> I also see GRAPHITE runs inbetween loop distribution and vectorization -- >>> what prevents GRAPHITE from messing up things here? Or autopar? >>> >>> The patch itself shows should_duplicate_loop_header_p should >>> handle this IFN specially (somehow all IFNs are considered "inexpensive"). >>> >>> So can you please adjust should_duplicate_loop_header_p instead and/or >>> gimple_inexpensive_call_p? Since we have IFNs for stuff like EXP10 I'm not >>> sure if that default is so good. >> >> I think the default itself is OK: we only use IFNs for libm functions >> like exp10 if an underlying optab exists (unlike __builtin_exp10, which >> is useful as the non-errno setting version of exp10), so the target must >> have something that it thinks is worth open-coding. Also, we currently >> treat all MD built-ins as "simple" and thus "inexpensive", so the IFN >> handling is consistent with that. >> >> Maybe there are some IFNs that are worth special-casing as expensive, >> but IMO doing that to solve this problem would be a bit hacky. It seems >> like "inexpensive" should be more of a cost thing than a correctness thing. > Hi all, > This is updated patch. It only adds a single check on IFN_LOOP_DIST_ALIAS > in function should_duplicate_loop_header_p. As for gimple_inexpensive_call_p, > I think it's natural to consider functions like IFN_LOOP_VECTORIZED and > IFN_LOOP_DIST_ALIAS as expensive because they are only meant to indicate > temporary arrangement of optimizations and are never used in code generation. > I will send a standalone patch for that. Another thing is this patch > doesn't check > IFN_LOOP_VECTORIZED because it's impossible to have it with current order > of passes. > Bootstrap and test ongoing. Further comments?
Ok. Thanks, Richard. > Thanks, > bin > 2017-08-15 Bin Cheng <bin.ch...@arm.com> > > PR tree-optimization/81832 > * tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Don't > copy loop header which has IFN_LOOP_DIST_ALIAS call. > > gcc/testsuite > 2017-08-15 Bin Cheng <bin.ch...@arm.com> > > PR tree-optimization/81832 > * gcc.dg/tree-ssa/pr81832.c: New test. > >> >> Thanks, >> Richard >> >>> Thanks, >>> Richard. >>> >>>> Bootstrap and test on x86_64. >>>> >>>> Thanks, >>>> bin >>>> 2017-08-15 Bin Cheng <bin.ch...@arm.com> >>>> >>>> PR tree-optimization/81832 >>>> * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function. >>>> (pass_ch_vect::process_loop_p): Call above function. >>>> >>>> gcc/testsuite >>>> 2017-08-15 Bin Cheng <bin.ch...@arm.com> >>>> >>>> PR tree-optimization/81832 >>>> * gcc.dg/tree-ssa/pr81832.c: New test.