On Tue, Oct 20, 2020 at 10:04 PM Qing Zhao <qing.z...@oracle.com> wrote:
> +/* Check whether the register REGNO should be zeroed on X86.
> +   When ALL_SSE_ZEROED is true, all SSE registers have been zeroed
> +   together, no need to zero it again.
> +   Stack registers (st0-st7) and mm0-mm7 are aliased with each other.
> +   very hard to be zeroed individually, don't zero individual st or
> +   mm registgers at this time. */
> +
> +static bool
> +zero_call_used_regno_p (const unsigned int regno,
> +                        bool all_sse_zeroed)
> +{
> +  return GENERAL_REGNO_P (regno)
> +         || (!all_sse_zeroed && SSE_REGNO_P (regno))
> +         || MASK_REGNO_P (regno);
> +}
> +
> +/* Return the machine_mode that is used to zero register REGNO. */
> +
> +static machine_mode
> +zero_call_used_regno_mode (const unsigned int regno)
> +{
> +  /* NB: We only need to zero the lower 32 bits for integer registers
> +     and the lower 128 bits for vector registers since destination are
> +     zero-extended to the full register width. */
> +  if (GENERAL_REGNO_P (regno))
> +    return SImode;
> +  else if (SSE_REGNO_P (regno))
> +    return V4SFmode;
> +  else
> +    return HImode;
> +}
> +
> +/* Generate a rtx to zero all vector registers togetehr if possible,
> +   otherwise, return NULL. */
> +
> +static rtx
> +zero_all_vector_registers (HARD_REG_SET need_zeroed_hardregs)
> +{
> +  if (!TARGET_AVX)
> +    return NULL;
> +
> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +    if ((IN_RANGE (regno, FIRST_SSE_REG, LAST_SSE_REG)
> +         || (TARGET_64BIT
> +             && (REX_SSE_REGNO_P (regno)
> +                 || (TARGET_AVX512F && EXT_REX_SSE_REGNO_P (regno)))))
> +        && !TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
> +      return NULL;
> +
> +  return gen_avx_vzeroall ();
> +}
> +
> +/* Generate a rtx to zero all st and mm registers togetehr if possible,
> +   otherwise, return NULL. */
> +
> +static rtx
> +zero_all_st_mm_registers (HARD_REG_SET need_zeroed_hardregs)
> +{
> +  if (!TARGET_MMX)
> +    return NULL;
> +
> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +    if ((STACK_REGNO_P (regno) || MMX_REGNO_P (regno))
> +        && !TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
> +      return NULL;
> +
> +  return gen_mmx_emms ();
>
>
> emms is not clearing any register, it only loads x87FPUTagWord with
> FFFFH. So I think, the above is useless, as far as register clearing
> is concerned.
>
>
> Thanks for the info.
>
> So, for mm and st registers, should we clear them, and how?
>
>
> I don't know.
>
> Please note that %mm and %st share the same register file, and
> touching %mm registers will block access to %st until emms is emitted.
> You can't just blindly load 0 to %st registers, because the register
> file can be in MMX mode and vice versa. For 32bit targets, function
> can also return a value in the %mm0.
>
>
> If data flow determine that %mm0 does not return a value at the return, can
> we clear all the %st as following:
>
> emms
> mov %st0, 0
> mov %st1, 0
> mov %st2, 0
> mov %st3, 0
> mov %st4, 0
> mov %st5, 0
> mov %st6, 0
> mov %st7, 0

The i386 ABI says:

-- q --
The CPU shall be in x87 mode upon entry to a function. Therefore,
every function that uses the MMX registers is required to issue an
emms or femms instruction after using MMX registers, before returning
or calling another function.
-- /q --

(The above requirement slightly contradicts its own ABI, since we have
3 MMX argument registers and an MMX return register, so the CPU
obviously can't be in x87 mode at all function boundaries).

So, assuming that the first sentence is not deliberately vague w.r.t.
function exit, emms should not be needed. However, we are dealing with
x87 stack registers that have their own set of peculiarities. It is
not possible to load a random register in the way you show.
Also, the stack should be either empty or one (two in the case of a
complex value return) level deep at function return. I think you want
a series of 8 or 7 (6) fldz insns, followed by a series of fstp insns,
to clear the stack and mark the stack slots empty.

Uros.
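
The difference between emms and the fldz/fstp sequence suggested above can be illustrated with a toy C model of the x87 register file. This is only a sketch under stated assumptions, not GCC code and not an emulator: struct x87_model and the model_* helpers are hypothetical names invented for this illustration, each slot is modelled as a double plus an "empty" tag, and only the case of an empty stack at function return (eight fldz) is covered.

```c
#include <stdbool.h>

/* Toy model of the x87 register file (hypothetical, illustration only). */
struct x87_model {
    double   slot[8];   /* physical registers R0..R7 */
    bool     empty[8];  /* true when the tag marks the slot as empty */
    unsigned top;       /* TOP field: physical index of st(0) */
};

/* emms: tag every slot as empty; the register contents are untouched. */
static void model_emms(struct x87_model *f)
{
    for (int i = 0; i < 8; i++)
        f->empty[i] = true;
}

/* fldz: push +0.0. Fails (stack overflow) if the target slot is in use. */
static bool model_fldz(struct x87_model *f)
{
    unsigned new_top = (f->top + 7) & 7;   /* TOP - 1, modulo 8 */
    if (!f->empty[new_top])
        return false;
    f->top = new_top;
    f->slot[new_top] = 0.0;
    f->empty[new_top] = false;
    return true;
}

/* fstp %st(0): pop the stack. The old top is tagged empty, data stays. */
static bool model_fstp_st0(struct x87_model *f)
{
    if (f->empty[f->top])
        return false;                      /* stack underflow */
    f->empty[f->top] = true;
    f->top = (f->top + 1) & 7;             /* TOP + 1, modulo 8 */
    return true;
}

/* Eight fldz followed by eight fstp. Assumes the stack is empty on entry. */
static bool model_zero_all(struct x87_model *f)
{
    for (int i = 0; i < 8; i++)
        if (!model_fldz(f))
            return false;
    for (int i = 0; i < 8; i++)
        if (!model_fstp_st0(f))
            return false;
    return true;
}

/* True when every slot holds zero and is tagged empty. */
static bool model_all_zero_and_empty(const struct x87_model *f)
{
    for (int i = 0; i < 8; i++)
        if (!f->empty[i] || f->slot[i] != 0.0)
            return false;
    return true;
}
```

In this model, model_emms alone leaves stale values in the slots, while model_zero_all overwrites all eight slots and leaves them tagged empty with TOP back where it started. It also fails on the first push if any slot is still in use, which is the reason a register cannot simply be loaded at random.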