On Wed, Apr 27, 2016 at 4:26 PM, Ilya Enkovich <enkovich....@gmail.com> wrote:
>>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
>>> >> > regs instead of memory.
>>> >> >
>>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast
>>> >> > -mtune-ctrl=general_regs_sse_spill.
>>> >> > I did not find any code differences.
>>> >> >
>>> >> > Looking at the below code to enable this tune, mmx ISA needs to be
>>> >> > turned off.
>>> >> >
>>> >> > static reg_class_t
>>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
>>> >> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
>>> >> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>>> >> >     return ALL_SSE_REGS;
>>> >> >   return NO_REGS;
>>> >> > }
>>> >> >
>>> >> > All processor variants enable MMX by default and why we need to
>>> >> > switch off mmx?
>>> >>
>>> >> That really looks weird to me. I ran SPEC2006 on Ofast + LTO with
>>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>>> >>
>>> >> SPEC2006INT : +0.30%
>>> >> SPEC2006FP  : +0.60%
>>> >> SPEC2006ALL : +0.48%
>>> >>
>>> >> Which is quite surprising for disabling a hardware feature hardly
>>> >> used anywhere now.
>>> >
>>> > As I said without mmx (-mno-mmx), the tune
>>> > X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>>> > Not sure if there are any other reason.
>>>
>>> Surely that should be the main reason I see performance gain.
>>> So I want to ask the same question as you did: why does this important
>>> performance feature requires disabled MMX. This restriction exists from the
>>> very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
>>> trunk) and no comments on why we have this restriction.
>>
>> I was told by Uros, that using TARGET_MMX is to prevent intreg <-> MMX
>> moves that clobber stack registers.
>
> ix86_spill_class is supposed to return a register class to be used
> to store general purpose registers. It returns ALL_SSE_REGS which
> doesn't intersect with MMX_REGS class. So I don't see why
> intreg <-> MMX moves may appear. And if those moves appear we should
> fix it, not disable the whole feature.
>
> @Uros, do you have a comment here?

Looking at the implementation of ix86_spill_class, TARGET_MMX check
really looks too restrictive. However, we need to check TARGET_SSE2
and TARGET_INTERUNIT_MOVES instead, otherwise movq xmm <-> intreg
pattern gets disabled.

This change should be OK then, but just in case, SSE2 enabled
-mfpmath=i387 32bit SPEC run should uncover unwanted MMX instructions.

Uros.
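
A rough, untested sketch of what the relaxed check could look like. This
is a standalone mock-up, not a patch against the i386 backend: the
lowercase flag variables, the reduced enums, and the names
spill_class_sketch and integer_class_p are stand-ins invented for
illustration, and target_interunit_moves only mirrors the
TARGET_INTERUNIT_MOVES wording above rather than an exact macro name.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's target flags.  In GCC these are
   macros derived from -march/-mtune, not plain variables.  */
static bool target_sse2 = true;
static bool target_64bit = true;
static bool target_general_regs_sse_spill = true;
static bool target_interunit_moves = true;

/* Reduced mock-ups of the GCC enums, just enough to compile alone.  */
enum reg_class { NO_REGS, GENERAL_REGS, MMX_REGS, ALL_SSE_REGS };
enum machine_mode { QImode, SImode, DImode };

/* Mock of INTEGER_CLASS_P: only GENERAL_REGS counts as integer here.  */
static bool
integer_class_p (enum reg_class rclass)
{
  return rclass == GENERAL_REGS;
}

/* Same shape as ix86_spill_class, with the !TARGET_MMX test replaced
   by SSE2 and inter-unit-move checks (assumed form of the change).  */
static enum reg_class
spill_class_sketch (enum reg_class rclass, enum machine_mode mode)
{
  if (target_sse2 && target_general_regs_sse_spill
      && target_interunit_moves
      && (mode == SImode || (target_64bit && mode == DImode))
      && rclass != NO_REGS && integer_class_p (rclass))
    return ALL_SSE_REGS;
  return NO_REGS;
}
```

With all flags set, spill_class_sketch (GENERAL_REGS, SImode) yields
ALL_SSE_REGS regardless of any MMX setting; clearing
target_interunit_moves makes it fall back to NO_REGS, i.e. spilling to
memory.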