On 2022/11/23 00:44, Xi Ruoyao wrote:
> While I still can't fully understand the immediate load issue and how
> this patch fix it, I've tested this patch (alongside the prefetch
> instruction patch) with bootstrap-ubsan.  And the compiled result of
> imm-load1.c seems OK.
> And it's doing correct thing for Glibc "improved generic string
> functions" patch, producing some really tight loop now.

While debugging, I found that moving the immediate-load instructions out of the loop is done by the loop2_invariant optimization pass.

One of the conditions for moving an instruction out of the loop is that its destination register must not be set more than once. Before this patch, the generated sequence looked like this:

(insn 12 11 13 3 (set (reg:DI 90)
        (const_int 16842752 [0x1010000])) "test.c":13:12 discrim 1 131 {*movdi_64bit}
     (nil))
(insn 13 12 14 3 (set (reg:DI 91)
        (ior:DI (reg:DI 90)
            (const_int 257 [0x101]))) "test.c":13:12 discrim 1 88 {iordi3}
     (expr_list:REG_DEAD (reg:DI 90)
        (expr_list:REG_EQUAL (const_int 16843009 [0x1010101])
            (nil))))

(insn 14 13 15 3 (set (reg:DI 91)
        (ior:DI (zero_extend:DI (subreg:SI (reg:DI 91) 0))
            (const_int 282578783305728 [0x1010100000000]))) "test.c":13:12 discrim 1 150 {lu32i_d}
     (expr_list:REG_EQUAL (const_int 282578800148737 [0x1010101010101])
        (nil)))
(insn 15 14 17 3 (set (reg:DI 91)
        (ior:DI (and:DI (reg:DI 91)
                (const_int 4503599627370495 [0xfffffffffffff]))
            (const_int 72057594037927936 [0x100000000000000]))) "test.c":13:12 discrim 1 151 {lu52i_d}
     (expr_list:REG_EQUAL (const_int 72340172838076673 [0x101010101010101])
        (nil)))

Therefore, the last two instructions (lu32i_d and lu52i_d) do not meet the condition for being moved out of the loop, because each of them sets reg 91 again.
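
To make the dump easier to follow, here is a small C sketch that replays insns 12 to 15 with plain integer arithmetic. The function name build_constant and the variable names are only illustrative (they are not from the patch); the constants and the intermediate values in the comments are taken from the RTL and its REG_EQUAL notes above.

```c
#include <stdint.h>

/* Hypothetical helper: mimics the four-insn sequence from the dump.
   r91 is written three times, which is what blocks loop2_invariant. */
uint64_t build_constant(void)
{
    /* insn 12 (*movdi_64bit): r90 = 0x1010000 */
    uint64_t r90 = 0x1010000ULL;

    /* insn 13 (iordi3): first write to r91, value 0x1010101 */
    uint64_t r91 = r90 | 0x101ULL;

    /* insn 14 (lu32i_d): zero-extend the low 32 bits, then OR in
       bits 32..51; second write to r91, value 0x1010101010101 */
    r91 = (uint64_t)(uint32_t)r91 | 0x1010100000000ULL;

    /* insn 15 (lu52i_d): keep the low 52 bits, then OR in the top
       bits; third write to r91, value 0x101010101010101 */
    r91 = (r91 & 0xfffffffffffffULL) | 0x100000000000000ULL;

    return r91;
}
```

Calling build_constant() yields 0x0101010101010101, the value named in the final REG_EQUAL note.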

But because of how our instructions are implemented, I now delay splitting the immediate load until after loop2_invariant, so this problem is avoided.
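
For context, a loop of the following shape is the kind of code where this matters. This is only a sketch under my own assumptions, not the actual test.c or imm-load1.c: the function name count_words_with_zero_byte is made up, and the body uses the well-known "has a zero byte" bit trick often seen in word-at-a-time string code. Both 64-bit constants are loop-invariant, so ideally they are materialized once before the loop rather than on every iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: count the 64-bit words that contain at least
   one zero byte.  The two wide constants do not change inside the
   loop, so they are candidates for loop-invariant motion. */
size_t count_words_with_zero_byte(const uint64_t *p, size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t x = p[i];

        /* Nonzero exactly when some byte of x is zero. */
        if ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL)
            hits++;
    }

    return hits;
}
```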
