Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2019-09-17 Thread Richard Henderson
On 9/17/19 6:55 AM, Wilco Dijkstra wrote:
> Hi Kyrill,
>
>>> When you select a CPU the goal is that we optimize and schedule for that
>>> specific microarchitecture. That implies using atomics that work best for
>>> that core rather than outlining them.
>>
>> I think we want to go ahead with this

Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2019-09-17 Thread Wilco Dijkstra
Hi Kyrill,

>> When you select a CPU the goal is that we optimize and schedule for that
>> specific microarchitecture. That implies using atomics that work best for
>> that core rather than outlining them.
>
> I think we want to go ahead with this framework to enable the portable
> deployment of L

Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2019-09-17 Thread Kyrill Tkachov
On 9/16/19 12:58 PM, Wilco Dijkstra wrote:

Hi Richard,

>> So what is the behaviour when you explicitly select a specific CPU?
>
> Selecting a specific cpu selects the specific architecture that the cpu
> supports, does it not?  Thus the architecture example above still applies.
>
> Unless I

Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2019-09-16 Thread Wilco Dijkstra
Hi Richard,

>> So what is the behaviour when you explicitly select a specific CPU?
>
> Selecting a specific cpu selects the specific architecture that the cpu
> supports, does it not?  Thus the architecture example above still applies.
>
> Unless I don't understand what distinction that you're mak

Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2019-09-14 Thread Richard Henderson
On 9/5/19 10:35 AM, Wilco Dijkstra wrote:
> Agreed. I've got a couple of general comments:
>
> * The option name -matomic-ool sounds too abbreviated. I think eg.
> -moutline-atomics is more descriptive and user friendlier.

Changed.

> * Similarly the exported __aa64_have_atomics variable could be

Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2019-09-05 Thread Wilco Dijkstra
Hi Richard,

>What I have not done, but is now a possibility, is to use a custom
>calling convention for the out-of-line routines. I now only clobber
>2 (or 3, for TImode) temp regs and set a return value.

This would be a great feature to have since it reduces the overhead of outlinin

Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2019-09-05 Thread Kyrill Tkachov
Hi Richard,

On 11/1/18 9:46 PM, Richard Henderson wrote:
From: Richard Henderson

Changes since v2:
  * Committed half of the patch set.
  * Split inline TImode support from out-of-line patches.
  * Removed the ST out-of-line functions, to match inline.
  * Moved the out-of-line functions to as

Re: [PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2018-11-11 Thread Richard Henderson
Ping.

On 11/1/18 10:46 PM, Richard Henderson wrote:
> From: Richard Henderson
>
> Changes since v2:
> * Committed half of the patch set.
> * Split inline TImode support from out-of-line patches.
> * Removed the ST out-of-line functions, to match inline.
> * Moved the out-of-line function

[PATCH, AArch64, v3 0/6] LSE atomics out-of-line

2018-11-01 Thread Richard Henderson
From: Richard Henderson

Changes since v2:
* Committed half of the patch set.
* Split inline TImode support from out-of-line patches.
* Removed the ST out-of-line functions, to match inline.
* Moved the out-of-line functions to assembly.

What I have not done, but is now a possibility, is
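
For readers skimming the thread, the general idea behind an out-of-line atomic helper can be sketched in C roughly as follows. This is only an illustration, not the code from the patch set (which, per the cover letter above, implements the helpers in assembly). The names demo_lse_state, demo_lse_available and demo_ldadd4, the argument order, and the lazy probe are all hypothetical. The capability check assumes a Linux AArch64 target and uses getauxval(AT_HWCAP); on any other target the flag simply stays false. Both branches use C11 atomics as stand-ins for the actual instruction sequences.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>               /* getauxval, AT_HWCAP */
#ifndef HWCAP_ATOMICS
#define HWCAP_ATOMICS (1UL << 8)    /* LSE bit in AT_HWCAP on AArch64 Linux */
#endif
#endif

/* Hypothetical cached probe result: 0 = not probed, 1 = no LSE, 2 = LSE. */
static _Atomic int demo_lse_state;

/* Probe once and cache. A race here is benign: every thread that probes
   computes and stores the same value. */
static bool demo_lse_available(void)
{
    int state = atomic_load_explicit(&demo_lse_state, memory_order_relaxed);
    if (state == 0) {
        state = 1;
#if defined(__aarch64__) && defined(__linux__)
        if (getauxval(AT_HWCAP) & HWCAP_ATOMICS)
            state = 2;
#endif
        atomic_store_explicit(&demo_lse_state, state, memory_order_relaxed);
    }
    return state == 2;
}

/* Hypothetical out-of-line helper: atomically add val to *ptr and return
   the previous value, choosing the code path at run time. */
uint32_t demo_ldadd4(uint32_t val, _Atomic uint32_t *ptr)
{
    assert(ptr != NULL);

    if (demo_lse_available()) {
        /* Stand-in for the single-instruction LSE path. */
        return atomic_fetch_add_explicit(ptr, val, memory_order_seq_cst);
    }

    /* Stand-in for the load-exclusive/store-exclusive retry loop.
       On failure, old is refreshed with the current value of *ptr. */
    uint32_t old = atomic_load_explicit(ptr, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(ptr, &old, old + val,
                                                  memory_order_seq_cst,
                                                  memory_order_relaxed)) {
        /* retry */
    }
    return old;
}
```

A caller would replace an inline atomic add with a call such as demo_ldadd4(1, &counter). The trade-off discussed in the thread is the cost of that call and flag check against being able to ship one binary that runs on cores both with and without LSE.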