http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59536
--- Comment #10 from bin.cheng <amker.cheng at gmail dot com> ---
The offending loop, before IVOPT, looks like this:
<bb 350>:
# var_index_1889 = PHI <1(924), var_index_983(923)>
# var_index.250_1269 = PHI <1(924), var_index.250_1959(923)>
if (var_index.250_1269 < _1237)
goto <bb 351>;
else
goto <bb 885>;
<bb 351>:
loopi_952 = MEM[(const struct vec *)pretmp_2270].m_vecdata[var_index.250_1269];
_947 = loopi_952->num;
if (_947 == pretmp_2268)
goto <bb 352>;
else
goto <bb 923>;
<bb 923>:
var_index_983 = var_index_1889 + 1;
var_index.250_1959 = (unsigned int) var_index_983;
goto <bb 350>;
<bb 924>:
goto <bb 350>;
The patch recognizes that var_index.250_1269 is an iv with {1, 1}_loop, and so
rewrites the loop into:
<bb 350>:
# var_index_1889 = PHI <1(924), var_index_983(923)>
# ivtmp.1067_1968 = PHI <ivtmp.1067_696(924), ivtmp.1067_884(923)>
var_index.250_1269 = (unsigned int) var_index_1889;
if (var_index_1889 != _958)
goto <bb 351>;
else
goto <bb 885>;
<bb 351>:
_111 = (void *) ivtmp.1067_1968;
loopi_952 = MEM[base: _111, offset: 0B];
ivtmp.1067_884 = ivtmp.1067_1968 + 4;
_947 = loopi_952->num;
if (_947 == pretmp_2268)
goto <bb 352>;
else
goto <bb 923>;
<bb 923>:
var_index_983 = var_index_1889 + 1;
goto <bb 350>;
<bb 924>:
_1542 = pretmp_2270 + 12;
ivtmp.1067_696 = (unsigned int) _1542;
_958 = (int) _1237;
goto <bb 350>;
The transformation looks good and takes advantage of the post-increment
addressing mode for the memory access "MEM[base: _111, offset: 0B]".
The loop is expanded into RTL like this:
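As a rough source-level picture of what the two GIMPLE shapes above compute,
here is a minimal C sketch. It is only an illustration: struct loop_s,
find_by_index and find_by_pointer are hypothetical names, not GCC code, and the
sketch assumes the bound is at least 1 so that replacing the "<" exit test with
"!=" is safe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the element type that loopi_952 points to. */
struct loop_s { int num; };

/* Shape before IVOPT: a signed counter plus an unsigned copy that is used
   both for the exit test and for indexing m_vecdata. */
int find_by_index(struct loop_s *const *vecdata, unsigned n, int wanted)
{
    assert(vecdata != NULL);
    int var_index = 1;
    unsigned var_index_u = 1;
    while (var_index_u < n) {
        struct loop_s *loopi = vecdata[var_index_u];
        if (loopi->num == wanted)
            return var_index;
        var_index = var_index + 1;
        var_index_u = (unsigned) var_index;
    }
    return -1;
}

/* Shape after IVOPT: the unsigned copy is gone, the exit test compares the
   signed counter against a precomputed bound, and the element is loaded
   through a pointer that is stepped right after the load. */
int find_by_pointer(struct loop_s *const *vecdata, unsigned n, int wanted)
{
    assert(vecdata != NULL && n >= 1);          /* assumption: bound >= 1 */
    struct loop_s *const *ivtmp = vecdata + 1;  /* plays the role of ivtmp.1067_696 */
    int bound = (int) n;                        /* plays the role of _958 */
    for (int var_index = 1; var_index != bound; var_index++) {
        struct loop_s *loopi = *ivtmp++;        /* load, then post-increment */
        if (loopi->num == wanted)
            return var_index;
    }
    return -1;
}
```

Both functions skip element 0 and return the same index for the same input;
the second form is the one that maps naturally onto a post-increment load.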
4438: L4438:
1814: NOTE_INSN_BASIC_BLOCK 352
1815: r626:SI=r817:SI
1816: cc0=cmp(r817:SI,r492:SI)
1817: pc={(cc0==0)?L4244:pc}
REG_BR_PROB 900
1818: NOTE_INSN_BASIC_BLOCK 353
1819: r490:SI=[r829:SI]
1820: r829:SI=r829:SI+0x4
1821: cc0=cmp([r490:SI],r864:SI)
1822: pc={(cc0!=0)?L4435:pc}
...
4435: L4435:
4436: NOTE_INSN_BASIC_BLOCK 952
4437: r817:SI=r817:SI+0x1
4439: pc=L4438
4440: barrier
4441: L4441:
4442: NOTE_INSN_BASIC_BLOCK 953
4443: r829:SI=r865:SI+0xc
4444: r492:SI=r621:SI
44: r817:SI=0x1
4445: pc=L4438
Insns 1819 and 1820 are then combined by the auto-inc-dec pass into:
1819: r490:SI=[r829:SI++]
REG_INC r829:SI
1821: cc0=cmp([r490:SI],r864:SI)
REG_DEAD r490:SI
1822: pc={(cc0!=0)?L4435:pc}
REG_BR_PROB 9550
The problem comes from reload, which puts both r490 and r829 into %a0 (reg 8?)
and generates the code below:
1819: %a0:SI=[%a0:SI++]
REG_INC %a0:SI
1821: cc0=cmp([%a0:SI],%d2:SI)
1822: pc={(cc0!=0)?L4435:pc}
REG_BR_PROB 9550
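Insn 1819 is now bogus, because %a0 is both the destination of the load and the
post-incremented address register, and it triggers an assertion in cselib.
The IRA dump shows r829 assigned to reg 8 and r490 assigned to reg 2: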
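To illustrate why a post-increment load whose destination is also its address
register has no single sensible result, here is a toy C model. It is an
assumption-laden sketch, not a description of m68k hardware or of GCC's RTL
semantics: struct machine and both helper functions are hypothetical, and the
two functions differ only in which register write happens last.

```c
#include <assert.h>
#include <stdint.h>

enum { NREGS = 16, MEMWORDS = 8 };

/* Toy machine: a register file and a small word-addressed memory that is
   accessed with byte addresses. */
struct machine {
    uint32_t reg[NREGS];
    uint32_t mem[MEMWORDS];
};

/* reg[dst] = [reg[base]++], with the increment written back last. */
void load_postinc_inc_last(struct machine *m, int dst, int base)
{
    uint32_t addr = m->reg[base];
    assert(addr % 4 == 0 && addr / 4 < MEMWORDS);
    m->reg[dst] = m->mem[addr / 4];
    m->reg[base] = addr + 4;
}

/* The same insn, with the loaded value written back last. */
void load_postinc_load_last(struct machine *m, int dst, int base)
{
    uint32_t addr = m->reg[base];
    assert(addr % 4 == 0 && addr / 4 < MEMWORDS);
    m->reg[base] = addr + 4;
    m->reg[dst] = m->mem[addr / 4];
}
```

With distinct registers (the pre-reload form, r490 and r829) both orderings
leave the same state. With dst == base (the post-reload form, %a0 for both) one
ordering leaves the incremented address in the register and the other leaves
the loaded value, so one of the two results is always lost.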
Popping a1119(r829,l0: a921(r829,l17)) -- assign reg 8
Popping a1122(r1111,l0: a924(r1111,l17)) -- assign reg 8
Popping a1120(r494,l0: a922(r494,l17)) -- assign reg 9
Popping a1147(r1054,l0: a1006(r1054,l15)) -- assign reg 8
Popping a1157(r490,l0: a1124(r490,l17: a959(r490,l18))) -- assign reg 2
But the reload dump shows:
Reloads for insn # 1819
Reload 0: reload_in (SI) = (post_inc:SI (reg:SI 829 [ ivtmp.1067 ]))
reload_out (SI) = (post_inc:SI (reg:SI 829 [ ivtmp.1067 ]))
ADDR_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 1), inc by 4
reload_in_reg: (post_inc:SI (reg:SI 829 [ ivtmp.1067 ]))
reload_reg_rtx: (reg:SI 8 %a0)
Reload 1: reload_out (SI) = (reg/v/f:SI 490 [ loopi ])
GENERAL_REGS, RELOAD_FOR_OUTPUT (opnum = 0), optional
reload_out_reg: (reg/v/f:SI 490 [ loopi ])
Reload 2: reload_in (SI) = (mem/f:SI (post_inc:SI (reg:SI 829 [ ivtmp.1067 ])) [4 MEM[base: _111, offset: 0B]+0 S4 A16])
GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 1), optional
reload_in_reg: (mem/f:SI (post_inc:SI (reg:SI 829 [ ivtmp.1067 ])) [4 MEM[base: _111, offset: 0B]+0 S4 A16])
So I am not sure whether this is a bug in reload for m68k, or whether IVOPT is
doing something tricky and wrong.
Thanks,
bin