On Tue, Jan 26, 2016 at 1:51 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Tue, Jan 26, 2016 at 12:56 PM, Bernd Schmidt <bernds_...@t-online.de> 
> wrote:
>> On 01/26/2016 10:48 AM, Bin.Cheng wrote:
>>>
>>> Yes, I moved whole loop pass (also the pass_web) after combine and it
>>> worked.  A combine pass before loop-invariant can fix this problem.
>>> Below passes are currently between loop transform and combine:
>>>
>>>        NEXT_PASS (pass_web);
>>>        NEXT_PASS (pass_rtl_cprop);
>>>        NEXT_PASS (pass_cse2);
>>>        NEXT_PASS (pass_rtl_dse1);
>>>        NEXT_PASS (pass_rtl_fwprop_addr);
>>>        NEXT_PASS (pass_inc_dec);
>>>        NEXT_PASS (pass_initialize_regs);
>>>        NEXT_PASS (pass_ud_rtl_dce);
>>>
>>> I think pass_web needs to be after loop transform because it's used to
>>> handle unrolled register live range.
>>> pass_fwprop_addr and pass_inc_dec should stay where they are now.  And
>>> putting pass_inc_dec before loop unroll may be helpful to keep more
>>> auto increment addressing mode chosen by IVO.
>>> We should not need to duplicate pass_initialize_regs.
>>> So what about pass_rtl_cprop, cse2 and dse1?  Should these passes be
>>> duplicated after loop transform so that loop-transformed code can be
>>> cleaned up?
>>
>>
>> Hard to tell in advance - you might want to experiment a bit; set up a large
>> collection of input files and see what various approaches do to code
>> generation. In any case, I suspect this is gcc-7 material (unfortunately).
>
> Quite surprising, there is only one new failure after moving loop
> transforms (only along with pass_web).  Yes, it is GCC7 change, will
For the record, the only new failure is loop-9.c.  Given the insn
patterns below:

Loop:
   23: r106:DF=[`*.LC0']
      REG_EQUAL 1.84241999999999990222931955941021442413330078125e+1
   24: [r101:DI]=r106:DF

If combine runs before loop_invariant, it merges insn 23 into insn 24,
resulting in the insns below:

   23: NOTE_INSN_DELETED
   24: [r101:DI]=1.84241999999999990222931955941021442413330078125e+1

But we can't support the floating-point constant, so insn 24 will be
split by reload anyway:

   35: xmm0:DF=[`*.LC0']
   24: [di:DI]=xmm0:DF

This time both instructions are in the loop: loop_invariant has already
run by then, so the constant load is never hoisted.  That is the
regression.

Combine may need some changes if we run it before loop_invariant.
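
To make the pattern concrete, here is a minimal C sketch of the kind of
loop that would give insns like 23 and 24.  It is a hypothetical
reduction, not the actual loop-9.c source: the function names fill and
fill_hoisted are made up for illustration, and the constant 18.4242 is
only chosen to resemble the REG_EQUAL value above.

```c
#include <stddef.h>

/* Hypothetical reduction, not the actual loop-9.c source.
   The store writes the same constant on every iteration, so the
   load of that constant is loop-invariant.  */
void
fill (double *a, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    a[i] = 18.4242;
}

/* What hoisting the invariant load amounts to at the source level:
   the constant is materialized once, before the loop.  */
void
fill_hoisted (double *a, size_t n)
{
  const double c = 18.4242;
  size_t i;

  for (i = 0; i < n; i++)
    a[i] = c;
}
```

Roughly speaking, when loop_invariant sees the separate load it can
produce code shaped like fill_hoisted.  When combine has already folded
the constant into the store, there is nothing left to hoist, and the
load that reload reintroduces stays inside the loop.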

Thanks,
bin

> revisit this to see if it makes substantial code generation
> difference.  Anyway, for the bug itself, there might be simple fix in
> x86 backend according to newest update.
>
> Thanks,
> bin
