On Wed, Aug 13, 2025 at 12:35 AM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Wed, Aug 13, 2025 at 2:35 PM Hongtao Liu <crazy...@gmail.com> wrote:
> >
> > On Tue, Aug 12, 2025 at 10:02 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On Tue, Aug 12, 2025 at 06:47:54AM -0700, H.J. Lu wrote:
> > > > On Mon, Aug 11, 2025 at 11:13 PM Hongtao Liu <crazy...@gmail.com> wrote:
> > > > >
> > > > > On Mon, Aug 4, 2025 at 11:33 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Aug 04, 2025 at 02:57:39PM +0800, Hongtao Liu wrote:
> > > > > > > > > > > +  rtx_insn *before = nullptr;
> > > > > > > > > > > +  rtx_insn *after = nullptr;
> > > > > > > > > > > +  if (insn == BB_HEAD (bb))
> > > > > > > > > > > +    before = insn;
> > > > > > > > > > > +  else
> > > > > > > > > > > +    after = insn ? PREV_INSN (insn) : BB_END (bb);
> > > > > > > > > > > +
> > > > > > > > > > > +  /* TLS_GD and TLS_LD_BASE instructions are normal functions which
> > > > > > > > > > > +     clobber caller-saved registers.  TLSDESC instructions are special
> > > > > > > > > > > +     functions which only clobber RAX.  If any registers clobbered by
> > > > > > > > > > > +     the TLS instruction are live in this basic block, we must insert
> > > > > > > > > > > +     the TLS instruction after all live registers clobbered by the TLS
> > > > > > > > > > > +     instruction are dead.  */
> > > > > > > > > > > +
> > > > > > > > > > > +  auto_bitmap live_caller_saved_regs;
> > > > > > > > > > > +  bitmap in = df_live ? DF_LIVE_IN (bb) : DF_LR_IN (bb);
> > > > > > > > > > > +
> > > > > > > > > > > +  bool flags_live_p = bitmap_bit_p (in, FLAGS_REG);
> > > > > > > > > > > +
> > > > > > > > > > > +  unsigned int i;
> > > > > > > > > > > +
> > > > > > > > > > > +  /* Get all live caller-saved registers.  */
> > > > > > > > > > > +  if (kind == X86_CSE_TLSDESC)
> > > > > > > > > > > +    {
> > > > > > > > > > > +      if (bitmap_bit_p (in, AX_REG))
> > > > > > > > > > > +        bitmap_set_bit (live_caller_saved_regs, AX_REG);
> > > > > > > > > >
> > > > > > > > > > And we don't need to check for those hard registers here
> > > > > > > > > > and below?
> > > > > > > > > >
> > > > > > > > > TLS_GD and TLS_LD_BASE instructions are normal functions which
> > > > > > > > > clobber caller-saved registers.  TLSDESC instructions are special
> > > > > > > > > functions which only clobber RAX.  live_caller_saved_regs captures
> > > > > > > > > live caller-saved registers for these TLS instructions.
> > > > > > > >
> > > > > > > I notice those insns are CALL_INSN, and for ABI, rax/rdi/rsi is
> > > > > > > caller_saved registers, so even we explicitly use (clobber (reg: RAX))
> > > > > >
> > > > > > Since a single TLS call will clobber caller-saved registers, it must
> > > > > > be placed where all caller-saved registers are dead.  Otherwise, the
> > > > > > TLS call will clobber some live registers.
> > > > > I saw in legitimize_tls_address, it doesn't check the liveness of
> > > > > RAX/RDI(or call clobber registers) even with explicit use of RDI/RAX.
> > > > > I'm wondering if we can reuse that code to do similar things.
> > > > >
> > > > Before RA, hard registers are only used as scratch registers:
> > > >
> > > > insn 20 set rax ....
> > > > insn 21 set r300 rax   # rax is dead after this.
> > > >
> > > > Now we are inserting another rax usage in the same basic block.
> > > >
> > > > set rax ...
> > > > set r400 rax
> > > >
> > > > we must insert them where rax isn't alive.  We can't insert them
> > > > between insn 20 and insn 21.  My patch inserts them where all
> > > > caller-saved registers are dead.
> I c, but for X86_CSE_TLSDESC, there's no need to check RAX since RA
> allocates the dest as RAX and there's no reload issue for that.
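
For readers following the thread, the placement constraint discussed above (insert the call only at a point where none of the registers it clobbers are live) can be sketched in a toy model. This is only an illustration under stated assumptions, not the code in the patch: ToyInsn, regs, live_before_each, first_safe_point and the two example blocks are all hypothetical names, pseudo registers are ignored, and the clobber set used for the TLS_GD case is reduced to the three hard registers named in this thread. The register numbers (ax = 0, si = 4, di = 5) are the ones printed in the RTL dump.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Toy model only: none of these names are GCC APIs.
constexpr std::size_t kNumHardRegs = 8;
constexpr std::size_t kAX = 0, kSI = 4, kDI = 5;  // numbers as printed in the RTL dump
using RegSet = std::bitset<kNumHardRegs>;

struct ToyInsn {
  RegSet uses;  // hard registers the instruction reads
  RegSet defs;  // hard registers the instruction writes
};

// Build a register set from a list of hard register numbers.
RegSet regs(std::initializer_list<std::size_t> numbers) {
  RegSet result;
  for (std::size_t r : numbers) {
    assert(r < kNumHardRegs);
    result.set(r);
  }
  return result;
}

// live[p] is the set of hard registers live just before insns[p];
// live[insns.size()] is the live-out set of the block.
std::vector<RegSet> live_before_each(const std::vector<ToyInsn>& insns,
                                     const RegSet& live_out) {
  std::vector<RegSet> live(insns.size() + 1);
  live[insns.size()] = live_out;
  for (std::size_t i = insns.size(); i-- > 0;)
    live[i] = (live[i + 1] & ~insns[i].defs) | insns[i].uses;
  return live;
}

// Earliest point p (meaning "insert before insns[p]") at which no register
// in `clobbers` is live.  Returns insns.size() + 1 if no such point exists.
std::size_t first_safe_point(const std::vector<ToyInsn>& insns,
                             const RegSet& live_out, const RegSet& clobbers) {
  const std::vector<RegSet> live = live_before_each(insns, live_out);
  for (std::size_t p = 0; p < live.size(); ++p)
    if ((live[p] & clobbers).none())
      return p;
  return live.size();
}

// Hard-register view of the pr81501-4a.c basic block quoted in this thread.
std::vector<ToyInsn> pr81501_like_block() {
  return {
      {regs({kDI}), RegSet{}},  // insn 73: copy di into a pseudo
      {regs({kSI}), RegSet{}},  // insn 74: copy si into a pseudo
      {RegSet{}, RegSet{}},     // insn 9: pseudo-only move
      {RegSet{}, regs({kDI})},  // insn 10: set di
      {regs({kDI}), RegSet{}},  // call_insn 11: uses di
  };
}

// The "insn 20 / insn 21" scratch-register pattern from this thread.
std::vector<ToyInsn> scratch_rax_block() {
  return {
      {RegSet{}, regs({kAX})},  // insn 20: set rax
      {regs({kAX}), RegSet{}},  // insn 21: copy rax into a pseudo
  };
}
```

In this model, a call that clobbers ax, si and di gets point 2 in the pr81501-4a.c block, i.e. right after insn 74 and before insn 9, while in the scratch-register block the point between insn 20 and insn 21 is rejected because rax is live there.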

I am testing this change on top of my patch:

@@ -3829,13 +3828,9 @@ ix86_place_single_tls_call (rtx dest, rtx val, x86_cse_kind kind,
   unsigned int i;

-  /* Get all live caller-saved registers.  */
-  if (kind == X86_CSE_TLSDESC)
-    {
-      if (bitmap_bit_p (in, AX_REG))
-        bitmap_set_bit (live_caller_saved_regs, AX_REG);
-    }
-  else
+  /* Get all live caller-saved registers for TLS_GD and TLS_LD_BASE
+     instructions.  */
+  if (kind != X86_CSE_TLSDESC)
     for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
       if (call_used_regs[i] && !fixed_regs[i]

I will check it in after testing.

Thanks.

> Ok for others.
> >
> > I'll try to rewrite ix86_place_single_tls_call to see if it's possible.
> > will get back to you later.
> >
> > >
> > > gcc.target/i386/pr81501-4a.c is an example.  We want to insert
> > >
> > > (call_insn/u 81 74 4 2 (parallel [
> > >             (set (reg:DI 0 ax)
> > >                 (call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr")) [0 S1 A8])
> > >                     (const_int 0 [0])))
> > >             (unspec:DI [
> > >                     (symbol_ref:DI ("foo") [flags 0x10] <var_decl 0x7fb4f0dd6e40 foo>)
> > >                     (reg/f:DI 7 sp)
> > >                 ] UNSPEC_TLS_GD)
> > >             (clobber (reg:DI 5 di))
> > >         ]) "pr81501-4a.c":26:1 1657 {*tls_global_dynamic_64_di}
> > >      (expr_list:REG_EH_REGION (const_int -2147483648 [0xffffffff80000000])
> > >         (nil))
> > >     (nil))
> > >
> > > in a basic block with
> > >
> > > (note 7 0 73 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
> > > (insn 73 7 74 2 (set (reg:SI 117 [ n ])
> > >         (reg:SI 5 di [ n ])) "pr81501-4a.c":26:1 100 {*movsi_internal}
> > >      (expr_list:REG_DEAD (reg:SI 5 di [ n ])
> > >         (nil)))
> > > (insn 74 73 4 2 (set (reg/f:DI 118 [ caller_foop ])
> > >         (reg:DI 4 si [ caller_foop ])) "pr81501-4a.c":26:1 99 {*movdi_internal}
> > >      (expr_list:REG_DEAD (reg:DI 4 si [ caller_foop ])
> > >         (nil)))
> > > (note 4 74 9 2 NOTE_INSN_FUNCTION_BEG)
> > > (insn 9 4 10 2 (set (reg/f:DI 105)
> > >         (symbol_ref/f:DI ("*.LC0") [flags 0x2] <var_decl 0x7fffe99fb5f0 *.LC0>)) "pr81501-4a.c":30:3 99 {*movdi_internal}
> > >      (nil))
> > > (insn 10 9 11 2 (set (reg:DI 5 di)
> > >         (reg/f:DI 105)) "pr81501-4a.c":30:3 99 {*movdi_internal}
> > >      (expr_list:REG_DEAD (reg/f:DI 105)
> > >         (expr_list:REG_EQUAL (symbol_ref/f:DI ("*.LC0") [flags 0x2] <var_decl 0x7fffe99fb5f0 *.LC0>)
> > >             (nil))))
> > > (call_insn 11 10 12 2 (call (mem:QI (symbol_ref:DI ("bar3") [flags 0x41] <function_decl 0x7fffe99cdb00 bar3>) [0 bar3 S1 A8])
> > >         (const_int 0 [0])) "pr81501-4a.c":30:3 1469 {*call}
> > >      (expr_list:REG_DEAD (reg:DI 5 di)
> > >         (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar3") [flags 0x41] <function_decl 0x7fffe99cdb00 bar3>)
> > >             (nil)))
> > >     (expr_list:DI (use (reg:DI 5 di))
> > >         (nil)))
> > >
> > > We must place it after both RAX and RDI are dead, which is after
> > >
> > > (insn 74 73 4 2 (set (reg/f:DI 118 [ caller_foop ])
> > >         (reg:DI 4 si [ caller_foop ])) "pr81501-4a.c":26:1 99 {*movdi_internal}
> > >      (expr_list:REG_DEAD (reg:DI 4 si [ caller_foop ])
> > >         (nil)))
> > >
> > >
> > > H.J.
> > >
> > >
> > --
> > BR,
> > Hongtao
> >
>
> --
> BR,
> Hongtao

--
H.J.