On Fri, Aug 3, 2012 at 3:53 PM, Richard Earnshaw <rearn...@arm.com> wrote:
> On 03/08/12 13:49, Mans Rullgard wrote:
>> I have noticed gcc has a preference for generating UXTB instructions
>> when an AND with #255 would do the same thing.  This is bad, because
>> on A9 UXTB has two cycles latency compared to one cycle for AND.  On
>> A8 both instructions have one cycle latency.
>>
>
> UXTB on the other hand is a 16-bit instruction, whereas AND is a 32-bit one.
>
> Of the cores I'm aware of, only A9 has this performance anomaly.

While you are at it, please also consider blacklisting the UXTAB
instruction variants when tuning for Cortex-A9, unless optimizing for
size.
I was fairly confident that I had filed a feature request in the gcc
bugzilla about this, but apparently I never did. My bad.
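
To make the equivalence concrete, here is a rough C model of the
instructions in question (unrotated forms only). The function names are
made up for illustration and are not gcc internals. The AND + ADD pair is
just the obvious two-instruction expansion of UXTAB; this sketch only
shows that the results match and says nothing about which sequence is
faster on a given core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "UXTB Rd, Rm": zero-extend the low byte of Rm. */
uint32_t uxtb_model(uint32_t rm)
{
    return (uint32_t)(uint8_t)rm;
}

/* Hypothetical model of "AND Rd, Rm, #255". */
uint32_t and255_model(uint32_t rm)
{
    return rm & 255u;
}

/* Hypothetical model of "UXTAB Rd, Rn, Rm": Rn plus the zero-extended
   low byte of Rm, with the usual 32-bit wraparound. */
uint32_t uxtab_model(uint32_t rn, uint32_t rm)
{
    return rn + (uint32_t)(uint8_t)rm;
}

/* The same result computed as "AND tmp, Rm, #255" then "ADD Rd, Rn, tmp". */
uint32_t and_add_model(uint32_t rn, uint32_t rm)
{
    uint32_t tmp = rm & 255u;
    return rn + tmp;
}

/* Compare the models on a handful of sample values; returns 1 if every
   pair agrees. This is a spot check, not an exhaustive proof. */
int models_agree(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x7Fu, 0x80u, 0xFFu, 0x100u, 0x12345678u, 0xFFFFFFFFu
    };
    const unsigned n = sizeof(samples) / sizeof(samples[0]);

    for (unsigned i = 0; i < n; i++) {
        if (uxtb_model(samples[i]) != and255_model(samples[i]))
            return 0;
        for (unsigned j = 0; j < n; j++) {
            if (uxtab_model(samples[i], samples[j]) !=
                and_add_model(samples[i], samples[j]))
                return 0;
        }
    }
    return 1;
}
```

The size trade-off Richard mentions still applies: in Thumb-2 the
replacement sequences use 32-bit encodings, which is why the request is
limited to the case where we are not optimizing for size.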

-- 
Best regards,
Siarhei Siamashka
