On Wed, Oct 21, 2020 at 4:45 PM Qing Zhao <qing.z...@oracle.com> wrote:
>
> On Oct 21, 2020, at 3:03 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Wed, Oct 21, 2020 at 9:18 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Tue, Oct 20, 2020 at 10:04 PM Qing Zhao <qing.z...@oracle.com> wrote:
>
> +/* Check whether the register REGNO should be zeroed on X86.
> +   When ALL_SSE_ZEROED is true, all SSE registers have been zeroed
> +   together, no need to zero it again.
> +   Stack registers (st0-st7) and mm0-mm7 are aliased with each other
> +   and are very hard to zero individually, so don't zero individual
> +   st or mm registers at this time.  */
> +
> +static bool
> +zero_call_used_regno_p (const unsigned int regno,
> +                        bool all_sse_zeroed)
> +{
> +  return GENERAL_REGNO_P (regno)
> +         || (!all_sse_zeroed && SSE_REGNO_P (regno))
> +         || MASK_REGNO_P (regno);
> +}
> +
> +/* Return the machine_mode that is used to zero register REGNO.  */
> +
> +static machine_mode
> +zero_call_used_regno_mode (const unsigned int regno)
> +{
> +  /* NB: We only need to zero the lower 32 bits for integer registers
> +     and the lower 128 bits for vector registers since the destinations
> +     are zero-extended to the full register width.  */
> +  if (GENERAL_REGNO_P (regno))
> +    return SImode;
> +  else if (SSE_REGNO_P (regno))
> +    return V4SFmode;
> +  else
> +    return HImode;
> +}
> +
> +/* Generate a rtx to zero all vector registers together if possible,
> +   otherwise, return NULL.  */
> +
> +static rtx
> +zero_all_vector_registers (HARD_REG_SET need_zeroed_hardregs)
> +{
> +  if (!TARGET_AVX)
> +    return NULL;
> +
> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +    if ((IN_RANGE (regno, FIRST_SSE_REG, LAST_SSE_REG)
> +         || (TARGET_64BIT
> +             && (REX_SSE_REGNO_P (regno)
> +                 || (TARGET_AVX512F && EXT_REX_SSE_REGNO_P (regno)))))
> +        && !TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
> +      return NULL;
> +
> +  return gen_avx_vzeroall ();
> +}
> +
> +/* Generate a rtx to zero all st and mm registers together if possible,
> +   otherwise, return NULL.  */
> +
> +static rtx
> +zero_all_st_mm_registers (HARD_REG_SET need_zeroed_hardregs)
> +{
> +  if (!TARGET_MMX)
> +    return NULL;
> +
> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +    if ((STACK_REGNO_P (regno) || MMX_REGNO_P (regno))
> +        && !TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
> +      return NULL;
> +
> +  return gen_mmx_emms ();
>
> emms is not clearing any register, it only loads x87FPUTagWord with
> FFFFH. So I think the above is useless, as far as register clearing
> is concerned.
>
> Thanks for the info.
>
> So, for mm and st registers, should we clear them, and how?
>
> I don't know.
>
> Please note that %mm and %st share the same register file, and
> touching %mm registers will block access to %st until emms is emitted.
> You can't just blindly load 0 to %st registers, because the register
> file can be in MMX mode and vice versa. For 32-bit targets, a function
> can also return a value in %mm0.
>
> If data flow determines that %mm0 does not return a value at the return,
> can we clear all the %st as follows:
>
> emms
> mov %st0, 0
> mov %st1, 0
> mov %st2, 0
> mov %st3, 0
> mov %st4, 0
> mov %st5, 0
> mov %st6, 0
> mov %st7, 0
>
> The i386 ABI says:
>
> -- q --
> The CPU shall be in x87 mode upon entry to a function.
> Therefore, every function that uses the MMX registers is required to
> issue an emms or femms instruction after using MMX registers, before
> returning or calling another function.
> -- /q --
>
> (The above requirement slightly contradicts its own ABI, since we have
> 3 MMX argument registers and MMX return register, so the CPU obviously
> can't be in x87 mode at all function boundaries).
>
> So, assuming that the first sentence is not deliberately vague w.r.t.
> function exit, emms should not be needed. However, we are dealing with
> x87 stack registers that have their own set of peculiarities. It is
> not possible to load a random register in the way you show. Also, the
> stack should be either empty or one (two in case of complex value
> return) levels deep at the function return. I think you want a series
> of 8 or 7(6) fldz insns, followed by a series of fstp insns to clear
> the stack and mark stack slots empty.
>
> Something like this:
>
> --cut here--
> #include <stdio.h>
>
> long double
> __attribute__ ((noinline))
> test (long double a, long double b)
> {
>   long double r = a + b;
>
>   asm volatile ("fldz; \
>                  fldz; \
>                  fldz; \
>                  fldz; \
>                  fldz; \
>                  fldz; \
>                  fldz; \
>                  fstp %%st(0); \
>                  fstp %%st(0); \
>                  fstp %%st(0); \
>                  fstp %%st(0); \
>                  fstp %%st(0); \
>                  fstp %%st(0); \
>                  fstp %%st(0)" : : "X"(r));
>   return r;
> }
>
> int
> main ()
> {
>   long double a = 1.1, b = 1.2;
>
>   long double c = test (a, b);
>
>   printf ("%Lf\n", c);
>
>   return 0;
> }
> --cut here--
>
> Okay, so,
>
> 1. First compute how many st registers need to be zeroed, num_of_zeroed_st
> 2. Then issue (8 - num_of_zeroed_st) fldz to push 0 to the stack to clear all
>    the dead stack slots;
> 3. Then issue (8 - num_of_zeroed_st) fstp %st(0) to pop the stack and empty
>    the stack.
>
> Is the above understanding correct?

Yes.

> Another thought is:
>
> Looks like it’s very complicated to use the st/mm register set correctly, so
> I assume that this set of registers might be very hard to be used correctly
> by the attacker.
> Right?

Correct, but "very hard to be used" depends on how determined the attacker is.

Uros.
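
The fldz/fstp scheme discussed in this thread can be sketched as a small standalone C helper. This is only an illustration, not code from the patch: the name emit_x87_clear_sequence, the X87_STACK_DEPTH macro, and the text-buffer interface are all hypothetical, and the helper only writes the instruction text instead of generating RTL. It assumes the x87 stack holds 0, 1, or 2 live return slots at function exit, as described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the x87 register stack has eight slots (st0-st7).  */
#define X87_STACK_DEPTH 8

/* Hypothetical helper, not part of the patch.  Write the clearing
   sequence for an x87 stack that holds LIVE_SLOTS live return values
   (0, 1, or 2) into BUF as text, one instruction per line.  Return
   the number of fldz instructions written, or -1 if BUF is too
   small.  */
static int
emit_x87_clear_sequence (int live_slots, char *buf, size_t size)
{
  assert (live_slots >= 0 && live_slots <= 2);

  int n = X87_STACK_DEPTH - live_slots;
  size_t used = 0;

  if (size == 0)
    return -1;
  buf[0] = '\0';

  /* Pass 0 pushes N zeros over the dead slots; pass 1 pops them
     again, which marks those slots empty and leaves the live return
     value(s) on top of the stack.  */
  for (int pass = 0; pass < 2; pass++)
    for (int i = 0; i < n; i++)
      {
        const char *insn = pass == 0 ? "fldz\n" : "fstp %st(0)\n";
        int len = snprintf (buf + used, size - used, "%s", insn);

        if (len < 0 || (size_t) len >= size - used)
          return -1;
        used += (size_t) len;
      }

  return n;
}
```

Calling it with live_slots set to 1 writes seven fldz lines followed by seven fstp %st(0) lines, the same counts as the inline asm in the test program above.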