Wilco Dijkstra writes:
Hi Richard,
>> That tune is only used by an obsolete core. I ran the memcpy and memset
>> benchmarks from Optimized Routines on xgene-1 with and without LDP/STP.
>> There is no measurable penalty for using LDP/STP. I'm not sure why it was
>> ever added given it does not do anything useful. I'll po
Wilco Dijkstra writes:
Hi Richard,
>> +#define MAX_SET_SIZE(speed) (speed ? 256 : 96)
>
> Since this isn't (AFAIK) a standard macro, there doesn't seem to be
> any need to put it in the header file. It could just go at the head
> of aarch64.cc instead.
Sure, I've moved it in v4.
>> + if (len <= 24 || (aarch64_tune_p