Richard,

I attached updated patch.
You asked me to explain why I did changes for renaming.
If we do not change PHI arguments for inner loop header we can get the
following IR:
source outer loop:
  <bb 5>: outer-loop header
  # i_45 = PHI <0(4), i_18(9)>
  # .MEM_17 = PHI <.MEM_4(D)(4), .MEM_44(9)>

  <bb 6>:inner-loop header
  # .MEM_46 = PHI <.MEM_44(7), .MEM_17(5)>

after duplication we have (without argument correction):
  <bb 12>:
  # i_74 = PHI <i_64(13), 0(17)>
  # .MEM_73 = PHI <.MEM_97(13), .MEM_4(D)(17)>

  <bb 15>:
  # .MEM_63 = PHI <.MEM_17(12), .MEM_97(16)>

and later we get verifier error:
test1.c:20:6: error: definition in block 6 does not dominate use in block 10
 void foo (int n)
      ^
for SSA_NAME: .MEM_17 in statement:
.MEM_63 = PHI <.MEM_17(10), .MEM_97(14)>

and you can see that we need to rename MEM_17 argument for out-coming
edge to MEM_73 since
MEM_17 was converted to MEM_73 in outer-loop header.

This explains my simple fix in rename_variables_in_bb.
Note also that loop distribution is not performed for outer loops.

I also did a change in slpeel_can_duplicate_loop_p to simplify check.

Any comments will be appreciated.

Yuri.

2015-06-17 15:24 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
> On Tue, Jun 16, 2015 at 4:12 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>> Thanks a lot Richard for your review.
>>
>> I presented updated patch which is not gated by force_vectorize.
>> I added test on outer-loop in vect_enhance_data_refs_alignment
>> and it returns false for it because we can not improve dr alighment
>> through outer-loop peeling in general. So I assume that only
>> versioning for alignment can be applied for targets do not support
>> unaligned memory access.
>
> @@ -998,7 +998,12 @@
>    gimple stmt = DR_STMT (dr);
>    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>    tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> +  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>
> +  /* Peeling of outer loops can't improve alignment.  */
> +  if (loop_vinfo && LOOP_VINFO_LOOP (loop_vinfo)->inner)
> +    return false;
> +
>
> but this looks wrong.  It depends on the context (DRs in the outer loop
> can improve alignment by peeling the outer loop and we can still
> peel the inner loop for alignment).
>
> So IMHO the correct place to amend is vect_enhance_data_refs_alignment
> (which it seems doesn't consider peeling the inner loop).
>
> I'd say you should simply add
>
>      || loop->inner)
>
> to the
>
>   /* Check if we can possibly peel the loop.  */
>   if (!vect_can_advance_ivs_p (loop_vinfo)
>       || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
>     do_peeling = false;
>
> check.
>
>> I did not change tests for outer loops in slpeel_can_duplicate_loop_p
>> as you proposed since it is not called outside vectorization.
>
> There is still no reason for this complex condition, so please remove it.
>
> _Please_ also generate diffs with -p, it is very tedious to see patch hunks
> without a function name context.
>
> You didn't explain why you needed the renaming changes as I don't
> remember needing it when using the code from loop distribution.
>
>> I also noticed one not-resolved issue with outer-loop peeling - we don't
>> consider remainder for possible vectorization of inner-loop as we can see
>> on the following example:
>>
>>   for (i = 0; i < n; i++) {
>>     diff = 0;
>>     for (j = 0; j < M; j++) {
>>       diff += in[j+i]*coeff[j];
>>     }
>>     out[i] = diff;
>>   }
>>
>> Is it worth to fix it?
>
> You mean vectorizing the inner loop in the niter peel epilogue loop
> of the outer loop?  Possibly yes.
>
> Thanks,
> Richard.
>
>> 2015-06-16  Yuri Rumyantsev  <ysrum...@gmail.com>
>>
>> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
>> to allow renaming of PHI arguments on edges incoming from outer
>> loop header, add corresponding check before start PHI iterator.
>> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
>> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
>> with true force_vectorize.  Set-up dominator for outer loop too.
>> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
>> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
>> was marked with force_vectorize and has restricted cfg.
>> * tree-vect-data-refs.c (vector_alignment_reachable_p): Alignment can
>> not be reachable for outer loops.
>> (vect_enhance_data_refs_alignment): Add test on true value of
>> do_peeling.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.dg/vect/vect-outer-simd-2.c: New test.
>>
>> 2015-06-09 16:26 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>> On Mon, Jun 8, 2015 at 12:27 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>>>> Hi All,
>>>>
>>>> Here is a simple fix which allows duplication of outer loops to
>>>> perform peeling for number of iterations if outer loop is marked with
>>>> pragma omp simd.
>>>>
>>>> Bootstrap and regression testing did not show any new failures.
>>>> Is it OK for trunk?
>>>
>>> Hmm, I don't remember needing to adjust rename_variables_in_bb
>>> when teaching loop distibution to call slpeel_tree_duplicate_to_edge_cfg
>>> on non-innermost loops...  (I just copied, I never called
>>> slpeel_can_duplicate_loop_p though).
>>>
>>> So - you should just remove the loop->inner condition from
>>> slpeel_can_duplicate_loop_p as it is used by non-vectorizer
>>> code as well (yeah, I never merged the nested loop support
>>> for loop distribution...).
>>>
>>> Index: tree-vect-loop.c
>>> ===================================================================
>>> --- tree-vect-loop.c    (revision 224100)
>>> +++ tree-vect-loop.c    (working copy)
>>> @@ -1879,6 +1879,10 @@
>>>        return false;
>>>      }
>>>
>>> +  /* Peeling for alignment is not supported for outer-loop vectorization.  
>>> */
>>> +  if (LOOP_VINFO_LOOP (loop_vinfo)->inner)
>>> +    LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) = 0;
>>>
>>> I think you can't simply do this - if vect_enhance_data_refs_alignment
>>> decided to peel for alignment then it has adjusted the DRs alignment
>>> info already.  So instead of the above simply disallow peeling for
>>> alignment in vect_enhance_data_refs_alignment?  Thus add
>>> || ->inner to
>>>
>>>   /* Check if we can possibly peel the loop.  */
>>>   if (!vect_can_advance_ivs_p (loop_vinfo)
>>>       || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
>>>     do_peeling = false;
>>>
>>> ?
>>>
>>> I also can't see why the improvement has to be gated on force_vect,
>>> it surely looks profitable to enable more outer loop vectorization in
>>> general, no?
>>>
>>> How do the cost model calculations end up with peeling the outer loop
>>> for niter?
>>>
>>> On targets which don't support unaligned accesses we're left with
>>> versioning for alignment.  Isn't peeling for alignment better there?
>>> Thus only disallow peeling for alignment if there is no unhandled
>>> alignment?
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> ChangeLog:
>>>>
>>>> 2015-06-08  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>
>>>> * tree-vect-loop-manip.c (rename_variables_in_bb): Add argument
>>>> to allow renaming of PHI arguments on edges incoming from outer
>>>> loop header, add corresponding check before start PHI iterator.
>>>> (slpeel_tree_duplicate_loop_to_edge_cfg): Introduce new bool
>>>> variable DUPLICATE_OUTER_LOOP and set it to true for outer loops
>>>> with true force_vectorize.  Set-up dominator for outer loop too.
>>>> Pass DUPLICATE_OUTER_LOOP as argument to rename_variables_in_bb.
>>>> (slpeel_can_duplicate_loop_p): Allow duplicate of outer loop if it
>>>> was marked with force_vectorize and has restricted cfg.
>>>> * tre-vect-loop.c (vect_analyze_loop_2): Prohibit alignment peeling
>>>> for outer loops.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>> * gcc.dg/vect/vect-outer-simd-2.c: New test.

Attachment: patch.1.3
Description: Binary data

Reply via email to