i386: Honor 64-bit atomicity in 32-bit mode

Peter Maydell Fri, 05 May 2023 06:29:04 -0700

On Wed, 3 May 2023 at 08:18, Richard Henderson
<[email protected]> wrote:
>
> Use the fpu to perform 64-bit loads and stores.
>
> Signed-off-by: Richard Henderson <[email protected]>



> @@ -2091,7 +2095,20 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
>              datalo = datahi;
>              datahi = t;
>          }
> -        if (h.base == datalo || h.index == datalo) {
> +        if (h.atom == MO_64) {
> +            /*
> +             * Atomicity requires that we use a single 8-byte load.
> +             * For simplicity and code size, always use the FPU for this.
> +             * Similar insns using SSE/AVX are merely larger.

I'm surprised there's no performance penalty for throwing old-school
FPU insns into what is presumably otherwise code that's only
using modern SSE.

> +             * Load from memory in one go, then store back to the stack,
> +             * from whence we can load into the correct integer regs.
> +             */
> +            tcg_out_modrm_sib_offset(s, OPC_ESCDF + h.seg, ESCDF_FILD_m64,
> +                                     h.base, h.index, 0, h.ofs);
> +            tcg_out_modrm_offset(s, OPC_ESCDF, ESCDF_FISTP_m64, TCG_REG_ESP, 0);
> +            tcg_out_modrm_offset(s, movop, datalo, TCG_REG_ESP, 0);
> +            tcg_out_modrm_offset(s, movop, datahi, TCG_REG_ESP, 4);
> +        } else if (h.base == datalo || h.index == datalo) {
>              tcg_out_modrm_sib_offset(s, OPC_LEA, datahi,
>                                       h.base, h.index, 0, h.ofs);
>              tcg_out_modrm_offset(s, movop + h.seg, datalo, datahi, 0);

I assume the caller has arranged that the top of the stack
is trashable at this point?
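
For anyone following along, here is a rough portable C model of the data
flow in the new branch, as I read it from the hunk above. It is only a
sketch: RegPair and model_ld64 are made-up names, not anything in tcg, and
memcpy stands in for the FILD/FISTP pair without providing any atomicity
itself. The round trip through the FPU is exact because the x87 extended
format has a 64-bit significand, so every 64-bit integer is representable.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the datalo/datahi register pair. */
typedef struct {
    uint32_t lo;   /* models datalo */
    uint32_t hi;   /* models datahi */
} RegPair;

/*
 * Illustrative model only (not QEMU code): mirrors the three steps the
 * patch emits for a 64-bit load on a 32-bit host.
 */
static RegPair model_ld64(const void *guest_addr)
{
    uint8_t stack_slot[8];   /* stands in for the 8 bytes at 0(%esp) */
    RegPair r;

    /*
     * FILD m64 from guest memory, then FISTP m64 to 0(%esp).
     * On real hardware this is one 8-byte access; memcpy here only
     * models the data movement, not the atomicity.
     */
    memcpy(stack_slot, guest_addr, sizeof(stack_slot));

    /* Two 32-bit loads from the stack slot, at offsets 0 and 4. */
    memcpy(&r.lo, stack_slot, sizeof(r.lo));
    memcpy(&r.hi, stack_slot + 4, sizeof(r.hi));
    return r;
}
```

The model also makes the question above concrete: the 8 bytes at 0(%esp)
are overwritten as scratch space, so whatever lives there must be dead at
this point.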

Reviewed-by: Peter Maydell <[email protected]>

-- PMM

Re: [PATCH v4 52/57] tcg/i386: Honor 64-bit atomicity in 32-bit mode