on 2019/4/27 上午11:44, Bin.Cheng wrote:
> On Fri, Apr 26, 2019 at 2:44 PM Kewen.Lin <[email protected]> wrote:
>> +
>> + /* zero cost use makes it easier to select memory based iv cand
>> + for replacement of non memory based iv and its use. But if
>> + the setup sequence are too costly, loop iv analysis can NOT
>> + easily figure out it's finite, it's possible to stop the
>> + low-overhead loop transformation and get unexpected code. */
>> + if (use->zero_cost_p && cand->iv->base_object && !use->iv->base_object
>> + && elim_cost.cost >= 30)
>> + dont_elim_p = true;
> No, we'd like to avoid such things in general. The conditions look
> like a hack to me. elim_cost is compared to express_cost, adding
> another check on it at different place isn't really good, especially
> 30 itself is a magic number. It's most likely improvement in some
> cases, deterioration in others.
>
Yes, I agree it's too hacky and unacceptable as a formal fix. And I
tried to investigate more whether it's a general issue but never got
exposed.
> Also it punishes one pass (IVOPTs here) because of other pass' (RTL)
> problem. It does't mean we can't do such transformations, only it has
> to be as precise/conservative as possible. For example, if RTL loop
> iv is improved to handle the case in the future, who would remember to
> come back and adjust this?
>
Good question!
> GCC lacks the capability passing information to later passes. Gimple
> analyzer worked hard collecting various information but discards it
> entering RTL or earlier. Other examples are like runtime alias
> information, non-wrapping information for specific operations, etc.
> IMHO, this is what needs to be done. As for this case, it could be
> finite loop info, or non-wrapping info of the iv_var's increment
> operation. By passing more information, RTL passes can be simplified
> too.
>
Thanks for the information! Is there any under development work for this?
That would be fine if we can pass down those information to downstream
passes based on upcoming feature.
> Thanks,
> bin
>> +
>> /* The bound is a loop invariant, so it will be only computed
>> once. */
>> elim_cost.cost = adjust_setup_cost (data, elim_cost.cost);
>> @@ -5184,7 +5195,7 @@ determine_group_iv_cost_cond (struct ivopts_data *data,
>> express_cost += bound_cost;
>>
>> /* Choose the better approach, preferring the eliminated IV. */
>> - if (elim_cost <= express_cost)
>> + if (elim_cost <= express_cost && !dont_elim_p)
>> {
>>
>>
>> I was thinking whether this zero cost change just exposed
>> an existing problem, then this kind of check should be for all
>> cases not only for zero cost use, similar to
>> expression_expensive_p? But why doesn't it happen before?
>> Need more investigation.
>>
>>>
>>>> Btw, this is for GCC10.
>>>
>>> *Phew* ;-)
>>>
>>>
>>> Some trivial comments:
>>>
>>>> +static bool
>>>> +invalid_insn_for_doloop_p (struct loop *loop)
>>>> +{
>>>> + basic_block *body = get_loop_body (loop);
>>>> + unsigned num_nodes = loop->num_nodes;
>>>> + gimple_stmt_iterator gsi;
>>>> + unsigned i;
>>>> +
>>>> + for (i = 0; i < num_nodes; i++)
>>>
>>> for (unsigned i = 0; i < num_nodes; i++)
>>>
>>> (and maybe you can just say loop->num_nodes here; I don't know if that
>>> generates worse code, or if that even matters).
>>
>> Good idea, will fix it.
>>
>>>
>>>> + if (dump_file && (dump_flags & TDF_DETAILS))
>>>> + fprintf (
>>>> + dump_file,
>>>> + "predict doloop failure due to finding computed jump.\n");
>>>
>>> We don't normally end lines in (. There are other solutions to why you
>>> did that here; you can use string pasting, to break the string into two,
>>> or factor out (some part of) the loop body to reduce indentation.
>>>
>>
>> Will adjust it.
>>
>>>
>>> Segher
>>>
>>
>