On 8/1/23 12:17, Vineet Gupta wrote:
Hi Jeff,
As discussed this morning, I'm sending over dumps for the optim of DF
const -0.0 (PR/110748) [1]
For rv64gc_zbs build, IRA is undoing the split which eventually leads
to ICE in final pass.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110748#c15
void znd(double *d) { *d = -0.0; }
*split1*
(insn 10 3 11 2 (set (reg:DI 136)
(const_int [0x8000000000000000])) "neg.c":4:5 -1
(insn 11 10 0 2 (set (mem:DF (reg:DI 135) [1 *d_2(D)+0 S8 A64])
(subreg:DF (reg:DI 136) 0)) "neg.c":4:5 -1
*ira*
(insn 11 9 12 2 (set (mem:DF (reg:DI 135) [1 *d_2(D)+0 S8 A64])
(const_double:DF -0.0 [-0x0.0p+0])) "neg.c":4:5 190
{*movdf_hardfloat_rv64}
(expr_list:REG_DEAD (reg:DI 135)
For the working case, the large const is not involved and not subject
to IRA playing foul.
I investigated this some more. So IRA update_equiv_regs () has code
identifying potential replacements: if a reg is referenced exactly
twice: set once and used once.
if (REG_N_REFS (regno) == 2
&& (rtx_equal_p (replacement, src)
|| ! equiv_init_varies_p (src))
&& NONJUMP_INSN_P (insn)
&& equiv_init_movable_p (PATTERN (insn), regno))
reg_equiv[regno].replace = 1;
}
And combine_and_move_insns () does the replacement, undoing the split1
above.
As a hack if I disable this code, the ICE above goes away and we get the
right output.
bseti a5,zero,63
sd a5,0(a0)
ret
In fact this is the reason for many more split1 being undone. See the
suboptimal codegen for large const for Andrew Pinski's test case [1]
long long f(void)
{
unsigned t = 0x101_0101;
long long t1 = t;
long long t2 = ((unsigned long long )t) << 32;
asm("":"+r"(t1));
return t1 | t2;
}
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620413.html
Again if I use the hacked compiler the redundant immediate goes away.
upstream | ira hack |
li a0,0x101_0000 | li a5,0x1010000 |
addi a0,a0,0x101 | addi a5,a5,0x101 |
li a5,0x101_0000 | mv a0,a5 |
slli a0,a0,32 | slli a5,a5,32 |
addi a5,a5,0x101 | or a0,a0,a5 |
or a0,a5,a0 | ret |
ret | |
This code has been there since 2009, when ira.c was created so it
obviously per design/expected.
I'm wondering (naively) if there is some way to tune this - for a given
backend. In general it would make sense to do the replacement, but not
if the cost changes (e.g. consts could be embedded in x86 insn freely,
but not for RISC-V where this is costly and if something is split, it
might been intentional.
Thx,
-Vineet