On Sun, May 11, 2014 at 2:49 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Thu, May 8, 2014 at 5:08 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Tue, May 6, 2014 at 6:44 PM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Tue, May 6, 2014 at 10:39 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>>> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
>>>> <richard.guent...@gmail.com> wrote:
>>
>>>> Hi,
>>>> I split the patch into two and updated the test case.
>>>> The patches pass bootstrap/tests on x86/x86_64, also pass test on arm 
>>>> cortex-m3.
>>>> Is it OK?
>>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> PATCH 1:
>>>>
>>>> 2014-05-06  Bin Cheng  <bin.ch...@arm.com>
>>>>
>>>>     * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
>>>>     core part of address expressions.
>>>
>>> No gcc/ in the changelog
>>>
>>> Simplify that by using aff_combination_add_cst:
>>>
>>> +      if (TREE_CODE (core) == MEM_REF)
>>> +       {
>>> +         aff_combination_add_cst (comb, mem_ref_offset (core));
>>> +         core = TREE_OPERAND (core, 0);
>>>
>>> patch 1 is ok with that change.
>>
>> Installed with below change because of wide-int merge:
>> -      core = build_fold_addr_expr (core);
>> +      if (TREE_CODE (core) == MEM_REF)
>> +    {
>> +      aff_combination_add_cst (comb, wi::to_widest (TREE_OPERAND (core, 
>> 1)));
>> +      core = TREE_OPERAND (core, 0);
>>
>>>
>>>> PATCH 2:
>>>>
>>>> 2014-05-06  Bin Cheng  <bin.ch...@arm.com>
>>>>
>>>>     * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>>>     (alloc_iv): Lower base expressions containing ADDR_EXPR.
>>>
>>> So this "lowers" addresses(?) which are based on &<not-decl>,
>>> like &a[0] + 4 but not &a + 4.  I question this odd choice.  ISTR
>> &a+4 is already in its lowered form, what we want to handle is &EXPR +
>> 4, in which EXPR is MEM_REF/ARRAY_REF, etc..
>>
>>> when originally introducing address lowering (in rev. 204497) I was
>>> concerned about the cost?
>> Yes, you did.  I still think the iv number is relative small for each
>> loop, thus the change won't cause compilation time issue considering
>> the existing tree<->affine transformation in ivopt.
>> I would like to collect more accurate time information for ivopt in
>> gcc bootstrap.  Should I use "-ftime-report" then add all time slices
>> in TV_TREE_LOOP_IVOPTS category?  Is there any better solutions?
>> Thanks.
>>
>>>
>>> Now I wonder why we bother to convert the lowered form back
>>> to trees to store it in ->base and not simply keep (and always
>>> compute) the affine expansion only.  Thus, change struct iv
>>> to have aff_tree base instead of tree base?
>>>
>>> Can you see what it takes to do such change?  At the point
>>> we replace uses we go into affine representation again anyway.
>> Good idea, I may have a look.
> After going through work flow of ivopt, I think it's practical to make
> such change and can help compilation time.  Ivopt calls
> tree_to_aff_combination to convert iv->base into affine_tree whenever
> it tries to represent an iv_use by an iv_cand,  i.e., N*M times for
> computing cost of each iv_use represented by each iv_cand, and M times
> for rewriting each iv_use.  Here, N and M stands for number of iv_use
> and iv_cand.  Also strip_offset (which is a recursive function) is
> called for iv->base also at complexity of O(NM), so it may be improved
> too.
> To make a start, it's possible to store both tree and affine_tree of
> base in struct iv, and use either of them whenever needed.
>
> It seems to me there are two potential issues of this change.  First,
> though affine_tree of base is stored, we can't operate on it directly,
> which means we have to copy it before using it.

You'd use it as unchanged operand to some aff_* function to avoid
actual copying of course.

>  Second, ivopt
> sometimes computes affine_tree of base in different type other than
> TREE_TYPE(base), we may need to do additional calls to
> aff_combination_convert to get wanted type.  All these will introduce
> additional computation and compromise benefit of the change.

Sure.

> At last, I did some experiments on how many additional calls to
> tree_to_aff_combination are introduced by this patch.  The data of
> bootstrap shows it's less than 3% comparing to calls made by current
> implementation of ivopt, which confirms that this patch shouldn't have
> issue of compilation time.
>
> Any comments?

I'd see the use of affines as making the algorithmic structure of
IVOPTs clearer and more consistent (also with possibly using
the _expand variants later to get even more simplifications).

I don't want to force you to do this change but I thought it may
help further changes (as you seem to be doing a lot of IVOPTs
changes lately).

So - the patch is ok as-is then, but please consider refactoring
IVOPTs code when you do changes.  It's current structure
is somewhat awkward.

Thanks,
Richard.

> Thanks,
> bin
>
>
> --
> Best Regards.

Reply via email to