on 2020/6/3 5:27 PM, Richard Biener wrote:
> On Wed, 3 Jun 2020, Kewen.Lin wrote:
> 
>> on 2020/6/3 3:07 PM, Richard Biener wrote:
>>> On Wed, 3 Jun 2020, Kewen.Lin wrote:
>>>
>>>> Hi Richi,
>>>>

snip ...

>>>>>
>>>>> I'd just mention there are other targets that have the choice between
>>>>> the above forms.  Since IVOPTs itself does not perform the unrolling
>>>>> the IL it produces is the same, correct?
>>>>>
>>>> Yes.  Before this patch, IVOPTs doesn't consider the impact of unrolling,
>>>> it only models things based on what it sees.  In effect it assumes that
>>>> later RTL unrolling won't happen.
>>>>
>>>> With this patch, since the IV choice probably changes, the IL can probably
>>>> change.  The typical difference with this patch is:
>>>>
>>>>   vect__1.7_15 = MEM[symbol: x, index: ivtmp.19_22, offset: 0B];
>>>> vs.
>>>>   vect__1.7_15 = MEM[base: _29, offset: 0B];
>>>
>>> So we're asking IVOPTS "if we were unrolling this loop would you make
>>> a different IV choice?" thus I wonder why we need so much complexity
>>> here?  
>>
>> I would describe it more like "we are going to unroll this loop with
>> unroll factor uf in RTL, would you consider this variable when modeling?"
>>
>> In most cases a single iteration is representative of the unrolled
>> body, so it doesn't matter whether unrolling is considered.  That isn't
>> true for the case here: the expected reg_offset iv cand reduces the iv
>> cand step cost, which leads to the difference.
>>
>>> That is, if we can classify the loop as being possibly unrolled
>>> we could evaluate IVOPTs IV choice (and overall cost) on the original
>>> loop and in a second run on the original loop with fake IV uses
>>> added with extra offset.  If the overall IV cost is similar we'll
>>> take the unroll friendly choice; if the costs are way different
>>> (I wouldn't expect this to be the case ever?) I'd side with the
>>> IV choice when not unrolling (and mark the loop as to be not unrolled).
>>>
>>
>> Could you elaborate on it a bit?  I guess it won't estimate the unroll
>> factor here, just guess whether the loop is to be unrolled or not?  The
>> second run with fake IV uses added with extra offset sounds like scaling
>> up the iv group cost by uf.
> 
> From your example above the D-form (MEM[symbol: x, index: ivtmp.19_22, 
> offset: 0B]) is preferable since in the unrolled variant we have
> the same address but with a different constant offset for the unroll
> copies while the second form would have to update the 'base' IV.
> 
> Thus I think the difference in IV cost and decision should already
> show up if we, for each USE add a USE with an added constant offset.
> This might be what your patch does with that extra flag on the USEs,
> I was suggesting to model the USEs more explicitly, simulating a
> 2-way unroll.  I think in the end I'll defer to Bin here who knows
> the code best.
> 
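
To make the point about constant offsets concrete, here is a minimal
sketch, not taken from the patch.  It assumes the usual PowerPC meaning
of the terms (D-form is register plus immediate displacement, X-form is
register plus index register) and a single array; the function names and
the update counting are hypothetical, purely for illustration.

```python
def d_form_addresses(base, step, uf, trips):
    """One pointer IV per array; each unrolled copy only differs by a
    constant displacement, so the IV is updated once per unrolled body."""
    addrs, iv_updates = [], 0
    ptr = base
    for _ in range(trips):
        for k in range(uf):
            addrs.append(ptr + k * step)  # reg + immediate displacement
        ptr += uf * step                  # single IV update per body
        iv_updates += 1
    return addrs, iv_updates


def x_form_addresses(base, step, uf, trips):
    """Index IV; each unrolled copy needs its own index value, so the
    index register is bumped once per copy."""
    addrs, iv_updates = [], 0
    idx = 0
    for _ in range(trips):
        for _ in range(uf):
            addrs.append(base + idx)      # reg + index register
            idx += step                   # IV update per unrolled copy
            iv_updates += 1
    return addrs, iv_updates
```

Both functions produce the same address stream; only the number of IV
updates per unrolled body differs (one for the displacement variant, uf
for the indexed one in this simplified single-array model).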

Thanks for your further explanation!  As you propose, we would introduce
more iv use groups with the step added.  Take the example here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547128.html
Imagine that initially cand iv 4, which leads to x-form, wins; it's the
original iv and has iv-group cost 1 against the address group.
Even if we introduce one more group (2-way unrolling), that iv still
wins, since pulling the address iv in takes 5 (15 for three).  Probably
we can introduce more groups according to uf here.
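
A toy calculation of that comparison, as a rough sketch only.  The
numbers 1 and 5 are the ones quoted above; everything else is an
assumption made for illustration: that the original iv has no extra
candidate cost, that the address iv has group cost 0 once it is pulled
in, and that costs simply add up per group.  The helper names are
hypothetical and this is not how IVOPTs itself computes costs.

```python
def total_cost(cand_cost, group_cost, n_groups):
    # Assumed additive model: one-off candidate cost plus a per-group cost.
    return cand_cost + group_cost * n_groups


def first_uf_where_address_iv_wins(max_uf=16):
    """Smallest number of simulated address groups (one per unroll copy)
    at which the address iv becomes strictly cheaper than the original iv."""
    for uf in range(1, max_uf + 1):
        orig_iv = total_cost(cand_cost=0, group_cost=1, n_groups=uf)  # assumed cand cost 0
        addr_iv = total_cost(cand_cost=5, group_cost=0, n_groups=uf)  # assumed group cost 0
        if addr_iv < orig_iv:
            return uf
    return None
```

Under these assumptions, two groups give 2 for the original iv against 5
for the address iv, so simulating only a 2-way unroll would not change
the choice.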

OK.  Looking forward to Bin's comments.

>>> Thus I'd err on the side of not unrolling but leave the ultimate choice
>>> of whether to unroll to RTL unless IV cost makes that prohibitive.
>>>
>>> Even without X- or D- form addressing modes the IV choice may differ
>>> and I think we don't need extra knobs for the unroller but instead
>>> can decide to set the existing n_unroll to zero (force not unroll)
>>> when costs say it would be bad?
>>
>> Yes, even without x- or d-form addressing, the difference probably comes
>> from the compare type IV use for the loop ending, and maybe from more
>> cases which I am not aware of.  But I don't see people caring about it;
>> the impact is probably small.
>>
>> IIUC what you stated here looks like using ivopts information for the
>> unrolling factor decision.  I think this is a separate direction; do we
>> have this kind of case where ivopts costs can foresee the unrolling?
>>
>> Now the unroll factor estimation can be used by other optimization passes
>> that want to know the future unrolling factor decision.  As discussed, it
>> sounds like a good idea to override n_unroll, with some benchmarking.
> 
> I didn't suggest to use IVOPTs to determine the unroll factor.  In
> fact your patch looks like it does this?  Instead I wanted to make
> IVOPTs choose a set of IVs that is best for a blend of both worlds - use
> D-form when it doesn't hurt the not unrolled code [much], and X-form
> when the D-form is way worse (for whatever reason) and signal that
> to the unroller (but we could choose to not do that).
> 

Sorry for my misunderstanding!  Nice, we are heading in the same direction.  :)

> The real issue is of course that we're applying the IV decision to a
> loop that is not final.
> 

Exactly.

BR,
Kewen
