On Thu, May 10, 2018 at 2:50 PM, Peryt, Sebastian <sebastian.pe...@intel.com> wrote: > Hi Uros, > > Updated patch attached, please find comments below. > >> -----Original Message----- >> From: Uros Bizjak [mailto:ubiz...@gmail.com] >> Sent: Wednesday, May 9, 2018 1:47 PM >> To: Peryt, Sebastian <sebastian.pe...@intel.com> >> Cc: gcc-patches@gcc.gnu.org; Kirill Yukhin <kirill.yuk...@gmail.com> >> Subject: Re: [PATCH][i386] Adding WAITPKG instructions >> >> On Tue, May 8, 2018 at 1:34 PM, Peryt, Sebastian <sebastian.pe...@intel.com> >> wrote: >> > Hi, >> > >> > This patch adds support for WAITPKG instructions. >> > >> > Is it ok for trunk and after few day for backport to GCC-8? >> > > (Removed) >> > >> > >> >> + case IX86_BUILTIN_UMONITOR: >> + arg0 = CALL_EXPR_ARG (exp, 0); >> + op0 = expand_normal (arg0); >> + if (!REG_P (op0)) >> + op0 = ix86_zero_extend_to_Pmode (op0); >> + >> + emit_insn (ix86_gen_umonitor (op0)); >> + return 0; >> >> Please see how movdir64b handles its address operand. Also, do not use global >> ix86_gen_monitor, just expand directly in the same way as movdir64b. >> > > Fixed. > >> + case IX86_BUILTIN_UMWAIT: >> + case IX86_BUILTIN_TPAUSE: >> + rtx eax, edx, op1_lo, op1_hi; >> + arg0 = CALL_EXPR_ARG (exp, 0); >> + arg1 = CALL_EXPR_ARG (exp, 1); >> + op0 = expand_normal (arg0); >> + op1 = expand_normal (arg1); >> + eax = gen_rtx_REG (SImode, AX_REG); >> + edx = gen_rtx_REG (SImode, DX_REG); >> + if (!REG_P (op0)) >> + op0 = copy_to_mode_reg (SImode, op0); >> + if (!REG_P (op1)) >> + op1 = copy_to_mode_reg (DImode, op1); >> + op1_lo = gen_lowpart (SImode, op1); >> + op1_hi = expand_shift (RSHIFT_EXPR, DImode, op1, >> + GET_MODE_BITSIZE (SImode), 0, 1); >> + op1_hi = convert_modes (SImode, DImode, op1_hi, 1); >> + emit_move_insn (eax, op1_lo); >> + emit_move_insn (edx, op1_hi); >> + emit_insn (fcode == IX86_BUILTIN_UMWAIT >> + ? gen_umwait (op0, eax, edx) >> + : gen_tpause (op0, eax, edx)); >> + >> + /* Return current CF value. */ >> + op3 = gen_rtx_REG (CCCmode, FLAGS_REG); >> + target = gen_rtx_LTU (QImode, op3, const0_rtx); >> + >> + return target; >> >> For the above code, please see how xsetbv expansion and patterns are handling >> their input operands. There should be two patterns, one for 32bit and the >> other >> for 64bit targets. The patterns will need to set FLAGS_REG, otherwise the >> test >> will be removed. >> > > I copied what is done for xsetbv expansion and most likely I found some bug > in GCC. > The problem is that when I use 3 arguments and compile as 64bit version upper > part > of rax is not cleared. It doesn't appear when I'm using 2 or 4 function > arguments. > Most likely error is caused by the fact that rdx is used both as an input for > function and > argument in instruction.
There is no need to clear upper parts of 64bit register. As specified in the ISA (and modelled with RTX pattern), the instruction (e.g. tpause) reads only lower 32 bits from %rax and %rdx. Implicitly, the instruction should ignore upper 32 bits by itself, so we can use SUBREGs. If this is not the case, we need to use DImode input arguments in RTX pattern and explicitly emit zero-extension insns to clear upper 32 bits of input arguments. Uros.