On Jul 2, 2014, at 9:27 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Wed, Jul 02, 2014 at 09:21:25AM -0700, Andi Kleen wrote:
>> Ilya Enkovich <enkovich....@gmail.com> writes:
>> 
>>> Silvermont processors have a penalty for instructions having 4+ bytes of
>>> prefixes (including escape bytes in opcode).  This situation happens
>>> when a REX prefix is used in SSE4 instructions.  This patch tries to
>>> avoid such a situation by preferring xmm0-xmm7 usage over xmm8-xmm15 in
>>> those instructions.  I achieved it by adding a new tuning flag and new
>>> alternatives affected by tuning.
>> 
>> Why make it a tuning flag? Shouldn't this help unconditionally for code
>> size everywhere? Or is there some drawback? 
> 
> I don't think it will make code smaller.  If you already have some value in
> an xmm8..xmm15 register, then not allowing those registers directly on SSE4
> insns just means reloading it and larger code.

I can’t help but think a better way to do this is to describe the REX
registers as more expensive, then let the register allocator prefer
the cheaper registers.  You then leave them all as valid, which I think is
better than making half of the registers disappear.  Is it really cheaper to
spill and reload those than to use a REX register?
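
To make the idea concrete, here is a toy sketch of cost-based preference.
It is not GCC code and does not use any GCC interface; the names xmm_cost
and pick_xmm, the penalize_rex flag, and the cost of 1 are all made up for
illustration.  It only shows that xmm8-xmm15 can stay valid candidates while
being chosen last, so a spill is needed only when nothing at all is free.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_XMM = 16 };

/* Hypothetical per-register cost: xmm8-xmm15 need a REX prefix, so charge
   one extra unit when the tuning penalizes long prefix sequences. */
int xmm_cost(int regno, bool penalize_rex)
{
    assert(regno >= 0 && regno < NUM_XMM);
    return (penalize_rex && regno >= 8) ? 1 : 0;
}

/* Return the cheapest free register, or -1 if none is free (the caller
   would then have to spill).  All 16 registers remain valid candidates;
   ties are broken in favor of the lowest register number. */
int pick_xmm(const bool in_use[NUM_XMM], bool penalize_rex)
{
    int best = -1;
    for (int r = 0; r < NUM_XMM; r++) {
        if (in_use[r])
            continue;
        if (best < 0 || xmm_cost(r, penalize_rex) < xmm_cost(best, penalize_rex))
            best = r;
    }
    return best;
}
```

With xmm0-xmm7 all occupied, pick_xmm still hands out xmm8 instead of
forcing a spill; as soon as one of the low registers is free again, it is
preferred.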
