On Mon, Dec 20, 2021 at 11:44:08AM -0800, H.J. Lu wrote:
> The problem is in
>
> (define_memory_constraint "TARGET_MEM_CONSTRAINT"
>   "Matches any valid memory."
>   (and (match_code "mem")
>        (match_test "memory_address_addr_space_p (GET_MODE (op), XEXP (op, 0),
>                                                  MEM_ADDR_SPACE (op))")))
>
> define_register_constraint allows LRA to convert the operand to the form
> '(mem (reg X))', where X is a base register.  I am testing the v2 patch with

If you mean replacing an immediate with a MEM containing that immediate,
isn't that often the right thing though?
I mean, if the register pressure is high and the options are either to spill
some register, set it to the immediate, use it in one instruction and then
fill the spilled register (i.e. 2 memory loads), or to use a MEM operand on
the arithmetic instruction, one MEM seems cheaper to me.
With -fPIC and the cst needing runtime relocation slightly less so of
course.

The code due to ivopts is trying to have something like
  size_t a = (size_t) &tunable_list;
  size_t b = 0xffffffffffffffa8 - a;
  size_t c = x + b;
and for that cst - &symbol one actually needs 2 registers, one to hold the
constant and one to hold the (%rip) based address.

(insn 790 789 791 111 (set (reg:DI 292)
        (const_int -88 [0xffffffffffffffa8])) "dl-tunables.c":304:15 76 {*movdi_internal}
     (nil))
(insn 791 790 792 111 (set (reg:DI 293)
        (symbol_ref:DI ("tunable_list") [flags 0x2] <var_decl 0x7f3460aa9cf0 tunable_list>)) "dl-tunables.c":304:15 76 {*movdi_internal}
     (nil))
(insn 792 791 793 111 (parallel [
            (set (reg:DI 291)
                (minus:DI (reg:DI 292)
                    (reg:DI 293)))
            (clobber (reg:CC 17 flags))
        ]) "dl-tunables.c":304:15 299 {*subdi_1}
     (nil))
(insn 793 792 794 111 (parallel [
            (set (reg:DI 294)
                (plus:DI (reg:DI 291)
                    (reg:DI 198 [ ivtmp.176 ])))
            (clobber (reg:CC 17 flags))
        ]) "dl-tunables.c":304:15 226 {*adddi_1}
     (nil))

It would be smarter to rewrite the above into a
  lea 88+tunable_list(%rip), %temp1
and use a subtraction instead of an addition in the last insn above, or of
course in this particular case even consider the following 2 instructions:

(insn 794 793 795 111 (set (reg:DI 296)
        (symbol_ref:DI ("tunable_list") [flags 0x2] <var_decl 0x7f3460aa9cf0 tunable_list>)) "dl-tunables.c":304:15 76 {*movdi_internal}
     (nil))
(insn 795 794 796 111 (parallel [
            (set (reg:DI 295 [ cur ])
                (plus:DI (reg:DI 294)
                    (reg:DI 296)))
            (clobber (reg:CC 17 flags))
        ]) "dl-tunables.c":304:15 226 {*adddi_1}
     (nil))

and find out that &tunable_list - &tunable_list is 0 and we don't need it at
all.  I guess we don't figure that out because one of those addresses is
cast to size_t while the other one is used in a POINTER_PLUS_EXPR as a
normal pointer.

	Jakub
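
For reference, the cancellation described above can be checked with a small
C sketch.  This is an illustration only, not code from glibc or GCC: the
tunable_list array is a hypothetical stand-in (only its address matters),
the function names are made up, and (size_t) -88 is used in place of the
literal 0xffffffffffffffa8 so the sketch does not depend on a 64-bit size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real object; only its address is used. */
char tunable_list[256];

/* Mirrors insns 790-795: the symbol address is subtracted once and
   then added back, so it cancels in modular size_t arithmetic. */
size_t cur_as_emitted(size_t x)
{
    size_t a = (size_t) &tunable_list;
    size_t b = (size_t) -88 - a;        /* cst - &symbol (insn 792) */
    size_t c = x + b;                   /* insn 793 */
    return c + (size_t) &tunable_list;  /* insns 794-795 */
}

/* What remains once &tunable_list - &tunable_list is folded to 0. */
size_t cur_simplified(size_t x)
{
    return x - 88;
}

/* Asserts that both forms agree for a given x. */
void check_cancellation(size_t x)
{
    assert(cur_as_emitted(x) == cur_simplified(x));
}
```

Calling check_cancellation with any x should pass, since unsigned
arithmetic wraps and the address term drops out regardless of where the
object ends up in memory.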