Re: [PATCH v4] AArch64: Cleanup memset expansion

2024-02-01 Thread Richard Sandiford
Wilco Dijkstra writes:
> Hi Richard,
>
>>> That tune is only used by an obsolete core. I ran the memcpy and memset
>>> benchmarks from Optimized Routines on xgene-1 with and without LDP/STP.
>>> There is no measurable penalty for using LDP/STP. I'm not sure why it was
>>> ever added given it does

Re: [PATCH v4] AArch64: Cleanup memset expansion

2024-01-30 Thread Wilco Dijkstra
Hi Richard,

>> That tune is only used by an obsolete core. I ran the memcpy and memset
>> benchmarks from Optimized Routines on xgene-1 with and without LDP/STP.
>> There is no measurable penalty for using LDP/STP. I'm not sure why it was
>> ever added given it does not do anything useful. I'll po

Re: [PATCH v4] AArch64: Cleanup memset expansion

2024-01-10 Thread Richard Sandiford
Wilco Dijkstra writes:
> Hi Richard,
>
>>> +#define MAX_SET_SIZE(speed) (speed ? 256 : 96)
>>
>> Since this isn't (AFAIK) a standard macro, there doesn't seem to be
>> any need to put it in the header file. It could just go at the head
>> of aarch64.cc instead.
>
> Sure, I've moved it in v4.
>
>>

Re: [PATCH v4] AArch64: Cleanup memset expansion

2024-01-09 Thread Wilco Dijkstra
Hi Richard,

>> +#define MAX_SET_SIZE(speed) (speed ? 256 : 96)
>
> Since this isn't (AFAIK) a standard macro, there doesn't seem to be
> any need to put it in the header file.  It could just go at the head
> of aarch64.cc instead.

Sure, I've moved it in v4.

>> +  if (len <= 24 || (aarch64_tune_p
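
For readers skimming the thread, the header-versus-source point about MAX_SET_SIZE can be illustrated with a minimal sketch. Only the macro name and the 256/96 values come from the quoted patch; the helper can_expand_inline and its parameters are hypothetical and are not taken from aarch64.cc. The macro argument is parenthesized here as a defensive habit, which the quoted line does not do.

```c
#include <stdbool.h>

/* Defined at the top of the one source file that uses it, rather than in
   a shared header, since no other translation unit needs it.
   Values as quoted in the thread: 256 when optimizing for speed, else 96.  */
#define MAX_SET_SIZE(speed) ((speed) ? 256 : 96)

/* Hypothetical helper (an assumption for illustration only): decide
   whether a memset of LEN bytes is small enough to expand inline.  */
static bool
can_expand_inline (unsigned int len, bool speed_p)
{
  return len <= (unsigned int) MAX_SET_SIZE (speed_p);
}
```

Under these assumptions, can_expand_inline (128, true) is true, while can_expand_inline (128, false) is false because 128 exceeds the size-optimized limit of 96.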