On Mon, Dec 20, 2021 at 11:44:08AM -0800, H.J. Lu wrote:
> The problem is in
>
> (define_memory_constraint "TARGET_MEM_CONSTRAINT"
> "Matches any valid memory."
> (and (match_code "mem")
> (match_test "memory_address_addr_space_p (GET_MODE (op), XEXP (op, 0),
> MEM_ADDR_SPACE (op))")))
>
> define_register_constraint allows LRA to convert the operand to the form
> '(mem (reg X))', where X is a base register. I am testing the v2 patch with
If you mean replacing an immediate with a MEM containing that immediate,
isn't that often the right thing though?
I mean, if the register pressure is high and options are either spill some
register, set it to immediate, use it in one instruction and then fill the
spilled register (i.e. 2 memory loads), compared to a MEM use on the
arithmetic instruction one MEM seems cheaper to me. With -fPIC and the
cst needing runtime relocation slightly less so of course.
The code due to ivopts is trying to have something like
size_t a = (size_t) &tunable_list;
size_t b = 0xffffffffffffffa8 - a;
size_t c = x + b;
and for that cst - &symbol one needs actually 2 registers, one to hold the
constant and one to hold the (%rip) based address.
(insn 790 789 791 111 (set (reg:DI 292)
(const_int -88 [0xffffffffffffffa8])) "dl-tunables.c":304:15 76
{*movdi_internal}
(nil))
(insn 791 790 792 111 (set (reg:DI 293)
(symbol_ref:DI ("tunable_list") [flags 0x2] <var_decl 0x7f3460aa9cf0
tunable_list>)) "dl-tunables.c":304:15 76 {*movdi_internal}
(nil))
(insn 792 791 793 111 (parallel [
(set (reg:DI 291)
(minus:DI (reg:DI 292)
(reg:DI 293)))
(clobber (reg:CC 17 flags))
]) "dl-tunables.c":304:15 299 {*subdi_1}
(nil))
(insn 793 792 794 111 (parallel [
(set (reg:DI 294)
(plus:DI (reg:DI 291)
(reg:DI 198 [ ivtmp.176 ])))
(clobber (reg:CC 17 flags))
]) "dl-tunables.c":304:15 226 {*adddi_1}
(nil))
It would be smarter to rewrite the above into a lea 88+tunable_list(%rip),
%temp1
and use a subtraction instead of addition in the last insn above, or of
course in the particular case even consider the following 2 instructions
that do:
(insn 794 793 795 111 (set (reg:DI 296)
(symbol_ref:DI ("tunable_list") [flags 0x2] <var_decl 0x7f3460aa9cf0
tunable_list>)) "dl-tunables.c":304:15 76 {*movdi_internal}
(nil))
(insn 795 794 796 111 (parallel [
(set (reg:DI 295 [ cur ])
(plus:DI (reg:DI 294)
(reg:DI 296)))
(clobber (reg:CC 17 flags))
]) "dl-tunables.c":304:15 226 {*adddi_1}
(nil))
and find out that &tuneble_list - &tuneble_list is 0 and we don't need it at
all. Guess we don't figure that out due to the cast of one of those
addresses to size_t and the other one used in POINTER_PLUS_EXPR as normal
pointer.
Jakub