Mike Stump <mikest...@comcast.net> writes:

>>>> Silvermont processors have a penalty for instructions having 4+ bytes of
>>>> prefixes (including escape bytes in the opcode).  This situation happens
>>>> when a REX prefix is used in SSE4 instructions.  This patch tries to
>>>> avoid that situation by preferring xmm0-xmm7 over xmm8-xmm15 in
>>>> those instructions.  I achieved it by adding a new tuning flag and new
>>>> alternatives affected by tuning.
>>> 
>>> Why make it a tuning flag? Shouldn't this help unconditionally for code
>>> size everywhere? Or is there some drawback? 
>> 
>> I don't think it will make code smaller.  If you already have some value in
>> an xmm8..xmm15 register, then not allowing those registers directly on SSE4
>> insns just means reloads and larger code.
>
> I can’t help but think a better way to do this is to explain the costs
> of the REX registers as being more expensive, then let the register
> allocator prefer the cheaper registers.  You then leave them all as
> valid, which I think is better than disappearing 1/2 of the registers.

Yes, that sounds like a much better strategy.

BTW, I thought GCC already did that for the integer registers to
avoid unnecessary prefixes, but maybe I misremember. Copying Honza.
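
To make the prefix count in the quoted patch description concrete, here is a
rough sketch that counts legacy prefixes, a REX prefix, and opcode escape
bytes for a 64-bit-mode encoding. The function name count_prefix_bytes is
made up for illustration, and the counting rule is only my reading of the
description above, not something taken from the patch. It ignores VEX/EVEX
encodings entirely.

```python
# Legacy prefixes: operand-size, address-size, lock, repne/rep, segment overrides.
LEGACY_PREFIXES = {0x66, 0x67, 0xF0, 0xF2, 0xF3,
                   0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65}


def count_prefix_bytes(encoding: bytes) -> int:
    """Count legacy prefixes + REX + opcode escape bytes (hypothetical helper).

    Assumes a 64-bit-mode, non-VEX instruction encoding.
    """
    i = 0
    # Any number of legacy prefixes may come first.
    while i < len(encoding) and encoding[i] in LEGACY_PREFIXES:
        i += 1
    # In 64-bit mode, 0x40..0x4F directly before the opcode is a REX prefix.
    if i < len(encoding) and 0x40 <= encoding[i] <= 0x4F:
        i += 1
    # Escape bytes: 0F, optionally followed by 38 or 3A (three-byte opcode maps).
    if i < len(encoding) and encoding[i] == 0x0F:
        i += 1
        if i < len(encoding) and encoding[i] in (0x38, 0x3A):
            i += 1
    return i


# pmulld xmm0, xmm1  ->  66 0F 38 40 C1     (no REX needed)
low = bytes.fromhex("660f3840c1")
# pmulld xmm8, xmm1  ->  66 44 0F 38 40 C1  (REX.R selects xmm8)
high = bytes.fromhex("66440f3840c1")

print(count_prefix_bytes(low), count_prefix_bytes(high))
```

With SSE4 instructions in the 0F 38 / 0F 3A opcode maps, the 66 prefix plus
the two escape bytes already make three, so any REX prefix pushes the
instruction to four.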

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only
