On Tue, Jan 26, 2016 at 1:51 PM, Bin.Cheng <[email protected]> wrote:
> On Tue, Jan 26, 2016 at 12:56 PM, Bernd Schmidt <[email protected]>
> wrote:
>> On 01/26/2016 10:48 AM, Bin.Cheng wrote:
>>>
>>> Yes, I moved whole loop pass (also the pass_web) after combine and it
>>> worked. A combine pass before loop-invariant can fix this problem.
>>> Below passes are currently between loop transform and combine:
>>>
>>> NEXT_PASS (pass_web);
>>> NEXT_PASS (pass_rtl_cprop);
>>> NEXT_PASS (pass_cse2);
>>> NEXT_PASS (pass_rtl_dse1);
>>> NEXT_PASS (pass_rtl_fwprop_addr);
>>> NEXT_PASS (pass_inc_dec);
>>> NEXT_PASS (pass_initialize_regs);
>>> NEXT_PASS (pass_ud_rtl_dce);
>>>
>>> I think pass_web needs to be after loop transform because it's used to
>>> handle unrolled register live range.
>>> pass_fwprop_addr and pass_inc_dec should stay where they are now. And
>>> putting pass_inc_dec before loop unroll may be helpful to keep more
>>> auto increment addressing mode chosen by IVO.
>>> We should not need to duplicate pass_initialize_regs.
>>> So what's about pass_rtl_cprop, cse2 and dse1. Should these pass be
>>> duplicated after loop transform thus loop transformed code can be
>>> cleaned up?
>>
>>
>> Hard to tell in advance - you might want to experiment a bit; set up a large
>> collection of input files and see what various approaches do to code
>> generation. In any case, I suspect this is gcc-7 material (unfortunately).
>
> Quite surprising, there is only one new failure after moving loop
> transforms (only along with pass_web). Yes, it is GCC7 change, will
For the record, the only new failure is loop-9.c. Given the insn
patterns below:
Loop:
  23: r106:DF=[`*.LC0']
      REG_EQUAL 1.84241999999999990222931955941021442413330078125e+1
  24: [r101:DI]=r106:DF
Insn 23 is merged into insn 24 by combine if combine runs before
loop_invariant, resulting in the insns below:
  23: NOTE_INSN_DELETED
  24: [r101:DI]=1.84241999999999990222931955941021442413330078125e+1
But the floating-point constant isn't supported as a store operand, so
insn 24 gets split again by reload anyway:
  35: xmm0:DF=[`*.LC0']
  24: [di:DI]=xmm0:DF
This time both instructions are in the loop, causing a regression.
Combine may need some changes if we run it before loop_invariant.
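
To make the regression concrete at the source level, here is a minimal C
sketch. It is my assumption of the general shape of such a loop, not the
actual source of loop-9.c, and the function names are hypothetical. The
first function is the kind of loop where the constant load and the store
both end up inside the loop after reload; the second is a hand-written
version of what loop_invariant is expected to achieve by hoisting the load.

```c
#include <stddef.h>

/* Hypothetical reduction: the same double constant is stored on every
   iteration.  Conceptually, each iteration needs the constant loaded
   from the constant pool and then stored through the pointer.  */
void fill_in_loop(double *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = 18.4242;
}

/* Hand-hoisted equivalent: the constant is read once before the loop,
   so only the store remains in the loop body.  This mirrors the effect
   of moving the constant-pool load out of the loop.  */
void fill_hoisted(double *a, size_t n)
{
    const double c = 18.4242;
    for (size_t i = 0; i < n; i++)
        a[i] = c;
}
```

Both functions compute the same result. The difference that matters here
is only where the load of the constant ends up in the generated RTL, which
depends on whether loop_invariant still sees a separate load insn to move.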
Thanks,
bin
> revisit this to see if it makes substantial code generation
> difference. Anyway, for the bug itself, there might be simple fix in
> x86 backend according to newest update.
>
> Thanks,
> bin