On Wed, Apr 27, 2016 at 4:26 PM, Ilya Enkovich <enkovich....@gmail.com> wrote:

>>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE regs
>>> >> > instead of memory.
>>> >> >
>>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast
>>> >> > -mtune-ctrl=general_regs_sse_spill.
>>> >> > I did not find any code differences.
>>> >> >
>>> >> > Looking at the code below, the MMX ISA needs to be turned off to
>>> >> > enable this tune.
>>> >> >
>>> >> > static reg_class_t
>>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && !TARGET_MMX
>>> >> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
>>> >> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>>> >> >     return ALL_SSE_REGS;
>>> >> >   return NO_REGS;
>>> >> > }
>>> >> >
>>> >> > All processor variants enable MMX by default, so why do we need to
>>> >> > switch MMX off?
>>> >>
>>> >> That really looks weird to me.  I ran SPEC2006 on Ofast + LTO with
>>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>>> >>
>>> >> SPEC2006INT     :    +0.30%
>>> >> SPEC2006FP      :    +0.60%
>>> >> SPEC2006ALL     :    +0.48%
>>> >>
>>> >> Which is quite surprising for disabling a hardware feature hardly
>>> >> used anywhere now.
>>> >
>>> > As I said, without MMX (-mno-mmx) the tune
>>> > X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>>> > Not sure if there is any other reason.
>>>
>>> Surely that should be the main reason I see a performance gain.
>>> So I want to ask the same question as you did: why does this important
>>> performance feature require MMX to be disabled?  This restriction has
>>> existed since X86_TUNE_GENERAL_REGS_SSE_SPILL was introduced (at least in
>>> trunk), and there are no comments explaining why we have it.
>>
>> I was told by Uros that the TARGET_MMX check is there to prevent
>> intreg <-> MMX moves that clobber stack registers.
>
> ix86_spill_class is supposed to return a register class to be used
> to store general purpose registers.  It returns ALL_SSE_REGS which
> doesn't intersect with MMX_REGS class.  So I don't see why
> intreg <-> MMX moves may appear.  And if those moves appear we should
> fix it, not disable the whole feature.
>
> @Uros, do you have a comment here?

Looking at the implementation of ix86_spill_class, the TARGET_MMX check
really looks too restrictive. However, we need to check TARGET_SSE2
and TARGET_INTERUNIT_MOVES instead; otherwise the movq xmm <-> intreg
pattern gets disabled.
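
As a rough illustration only, the condition suggested above could be
modelled in standalone C roughly as follows. This is a sketch, not the
GCC source: the struct, the enums and the name spill_class_model are
hypothetical stand-ins for the real target macros and types, and
interunit_moves assumes a single flag as named in this mail.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's target flags; not the real macros.  */
struct target_flags_model
{
  bool general_regs_sse_spill; /* models TARGET_GENERAL_REGS_SSE_SPILL */
  bool sse2;                   /* models TARGET_SSE2 */
  bool interunit_moves;        /* models TARGET_INTERUNIT_MOVES (assumed) */
  bool is_64bit;               /* models TARGET_64BIT */
  bool mmx;                    /* models TARGET_MMX; deliberately unused */
};

/* Simplified models of reg_class_t and machine_mode.  */
enum reg_class_model { MODEL_NO_REGS, MODEL_GENERAL_REGS, MODEL_ALL_SSE_REGS };
enum mode_model { MODEL_SImode, MODEL_DImode, MODEL_OTHER_MODE };

/* Return the class to spill RCLASS into, or MODEL_NO_REGS for memory.
   The MMX flag is not consulted; SSE2 and inter-unit moves are.  */
static enum reg_class_model
spill_class_model (const struct target_flags_model *t,
                   enum reg_class_model rclass, enum mode_model mode)
{
  if (t->general_regs_sse_spill
      && t->sse2
      && t->interunit_moves
      && (mode == MODEL_SImode || (t->is_64bit && mode == MODEL_DImode))
      && rclass == MODEL_GENERAL_REGS)
    return MODEL_ALL_SSE_REGS;
  return MODEL_NO_REGS;
}
```

In this model an MMX-enabled target still gets SSE spills, while a
target without inter-unit moves falls back to memory.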

This change should be OK then, but just in case, an SSE2-enabled,
32-bit SPEC run with -mfpmath=i387 should uncover any unwanted MMX
instructions.

Uros.
