On Wed, 3 May 2023 at 08:10, Richard Henderson
<[email protected]> wrote:
>
> Notice when Intel or AMD have guaranteed that vmovdqa is atomic.
> The new variable will also be used in generated code.
>
> Signed-off-by: Richard Henderson <[email protected]>
> ---
>  include/qemu/cpuid.h      | 18 ++++++++++++++++++
>  tcg/i386/tcg-target.h     |  1 +
>  tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++++++++++
>  3 files changed, 46 insertions(+)
>
> diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
> index 1451e8ef2f..35325f1995 100644
> --- a/include/qemu/cpuid.h
> +++ b/include/qemu/cpuid.h
> @@ -71,6 +71,24 @@
>  #define bit_LZCNT       (1 << 5)
>  #endif
>
> +/*
> + * Signatures for different CPU implementations as returned from Leaf 0.
> + */
> +
> +#ifndef signature_INTEL_ecx
> +/* "Genu" "ineI" "ntel" */
> +#define signature_INTEL_ebx     0x756e6547
> +#define signature_INTEL_edx     0x49656e69
> +#define signature_INTEL_ecx     0x6c65746e
> +#endif
> +
> +#ifndef signature_AMD_ecx
> +/* "Auth" "enti" "cAMD" */
> +#define signature_AMD_ebx       0x68747541
> +#define signature_AMD_edx       0x69746e65
> +#define signature_AMD_ecx       0x444d4163
> +#endif
> @@ -4024,6 +4025,32 @@ static void tcg_target_init(TCGContext *s)
>              have_avx512dq = (b7 & bit_AVX512DQ) != 0;
>              have_avx512vbmi2 = (c7 & bit_AVX512VBMI2) != 0;
>          }
> +
> +        /*
> +         * The Intel SDM has added:
> +         *   Processors that enumerate support for Intel® AVX
> +         *   (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
> +         *   guarantee that the 16-byte memory operations performed
> +         *   by the following instructions will always be carried
> +         *   out atomically:
> +         *   - MOVAPD, MOVAPS, and MOVDQA.
> +         *   - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
> +         *   - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded
> +         *     with EVEX.128 and k0 (masking disabled).
> +         *   Note that these instructions require the linear addresses
> +         *   of their memory operands to be 16-byte aligned.
> +         *
> +         * AMD has provided an even stronger guarantee that processors
> +         * with AVX provide 16-byte atomicity for all cachable,
> +         * naturally aligned single loads and stores, e.g. MOVDQU.
> +         *
> +         * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688
> +         */
> +        if (have_avx1) {
> +            __cpuid(0, a, b, c, d);
> +            have_atomic16 = (c == signature_INTEL_ecx ||
> +                             c == signature_AMD_ecx);
> +        }

If the signature is 3 words, why are we only checking one here?

thanks
-- PMM
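
For illustration only, a minimal sketch of what comparing all three
signature words could look like. The helper names
vendor_has_atomic16_guarantee() and vendor_string() are hypothetical and
not part of the patch; the constants are copied from the quoted cpuid.h
hunk, and the b/c/d arguments are assumed to be the EBX/ECX/EDX values
returned by CPUID leaf 0.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the quoted include/qemu/cpuid.h hunk. */
#define signature_INTEL_ebx 0x756e6547 /* "Genu" */
#define signature_INTEL_edx 0x49656e69 /* "ineI" */
#define signature_INTEL_ecx 0x6c65746e /* "ntel" */
#define signature_AMD_ebx   0x68747541 /* "Auth" */
#define signature_AMD_edx   0x69746e65 /* "enti" */
#define signature_AMD_ecx   0x444d4163 /* "cAMD" */

/*
 * Hypothetical helper: true only when all three leaf-0 words match
 * either the Intel or the AMD signature.
 */
static bool vendor_has_atomic16_guarantee(uint32_t b, uint32_t c, uint32_t d)
{
    bool intel = b == signature_INTEL_ebx &&
                 d == signature_INTEL_edx &&
                 c == signature_INTEL_ecx;
    bool amd = b == signature_AMD_ebx &&
               d == signature_AMD_edx &&
               c == signature_AMD_ecx;

    return intel || amd;
}

/*
 * Hypothetical helper: rebuild the 12-character vendor string.
 * The words are stored low byte first, in EBX, EDX, ECX order.
 */
static void vendor_string(uint32_t b, uint32_t c, uint32_t d, char out[13])
{
    const uint32_t words[3] = { b, d, c };

    for (int w = 0; w < 3; w++) {
        for (int i = 0; i < 4; i++) {
            out[w * 4 + i] = (char)((words[w] >> (8 * i)) & 0xff);
        }
    }
    out[12] = '\0';
}
```

In the quoted hunk, the b, c and d values would come from the existing
__cpuid(0, a, b, c, d) call, so a check along these lines would not need
an extra CPUID invocation.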
