On Tue, Mar 6, 2012 at 8:10 PM, H.J. Lu <hjl.to...@gmail.com> wrote: > On Tue, Mar 6, 2012 at 7:37 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >> On Mon, Mar 5, 2012 at 9:11 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >>> On Sun, Mar 4, 2012 at 11:47 PM, Uros Bizjak <ubiz...@gmail.com> wrote: >>>> On Mon, Mar 5, 2012 at 4:53 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >>>> >>>>> and compiler does generate the same output. i386.c also has >>>>> >>>>> xasm = "jmp\t%A0"; >>>>> xasm = "call\t%A0"; >>>>> >>>>> for calls. There are no separate indirect call patterns. For x32, >>>>> only indirect register calls have to be in DImode. The direct call >>>>> should be in Pmode (SImode). >>>> >>>> Direct call just expects label to some abolute address that is assumed >>>> to fit in 32 bits (see constant_call_address_operand_p). >>>> >>>> call and jmp insn expect word_mode operands, so please change >>>> ix86_expand_call and call patterns in the same way as jump >>>> instructions above. >>>> >>>>> Since x86-64 hardware always zero-extends upper 32bits of 64bit >>>>> registers when loading its lower 32bits, it is safe and easier to just >>>>> to output 64bit registers for %A than zero-extend it by hand for all >>>>> jump/call patterns. >>>> >>>> No, the instruction expects word_mode operands, so we have to extend >>>> values to expected mode. I don't think that patching at insn output >>>> time is acceptable. >>> >>> You are right. I found a testcase to show problem: >>> >>> struct foo >>> { >>> void (*f) (void); >>> int i; >>> }; >>> >>> void >>> __attribute__ ((noinline)) >>> bar (struct foo x) >>> { >>> x.f (); >>> } >>> >>> "x" is passed in RDI and the uppper 32bits of RDI is "int i". >>> >> > > Hi, > > This is what I come up with. This patch enforces word_mode for > non-SYMBOL_REF call/jmp address while allowing Pmode for > SYMBOL_REF call/jmp address. I added W for word_mode > used on indirect branches and added C for Pmode/word_mode > used on calls. For calls, I added check for SYMBOL_REF address > or address in word_mode since non-SYMBOL_REF address must > be in word_mode. Tested on Linux/x86-64. > > OK for trunk?
(define_insn_and_split "*sibcall_value_vzeroupper" [(set (match_operand 0 "" "") - (call (mem:QI (match_operand:P 1 "sibcall_insn_operand" "Uz")) + (call (mem:QI (match_operand:C 1 "sibcall_insn_operand" "Uz")) (match_operand 2 "" ""))) (unspec [(match_operand 3 "const_int_operand" "")] UNSPEC_CALL_NEEDS_VZEROUPPER)] - "TARGET_VZEROUPPER && SIBLING_CALL_P (insn)" + "TARGET_VZEROUPPER + && SIBLING_CALL_P (insn) + && ((GET_CODE (operands[1]) == SYMBOL_REF + && GET_MODE (operands[1]) == Pmode) + || (GET_CODE (operands[1]) != SYMBOL_REF + && GET_MODE (operands[1]) == word_mode))" "#" "&& reload_completed" [(const_int 0)] Why does this patterh have different insn constraitn than all others? BTW - please do not split existing "TARGET_VZEROUPPER && SIBLING_CALL_P (insn)" lines ... Uros.