Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-07-09 Thread Manolis Tsamis
On Mon, Jul 8, 2024 at 6:41 PM Andi Kleen wrote:
>
> > I have added a target hook for this in v4 of this patch. The hook
> > receives all the information about the stores, the load, the estimated
> > sequence cost and whether we expect to eliminate the load. With this
> > information the target sh

Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-07-08 Thread Jeff Law
On 7/8/24 6:58 AM, Manolis Tsamis wrote:
This is still hard to tell. In some cases I have observed either improvement or regressions in benchmarks, which are highly susceptible to costing and the specific store-forwarding penalties of the CPU. I have seen cases where the store-forwarding ins

Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-07-08 Thread Andi Kleen
> I have added a target hook for this in v4 of this patch. The hook
> receives all the information about the stores, the load, the estimated
> sequence cost and whether we expect to eliminate the load. With this
> information the target should be able to make an informed decision.
>
> What you men

Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-07-08 Thread Manolis Tsamis
On Thu, Jun 13, 2024 at 7:18 PM Andi Kleen wrote:
>
> Manolis Tsamis writes:
> >
> > Assembly like this can appear with bitfields or type punning / unions.
> > On stress-ng when running the cpu-union microbenchmark the following
> > speedups have been observed.
> >
> > Neoverse-N1: +2

Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-06-14 Thread Jeff Law
On 6/13/24 5:32 AM, Manolis Tsamis wrote:
This pass detects cases of expensive store forwarding and tries to avoid them by reordering the stores and using suitable bit insertion sequences. For example it can transform this:

    strb    w2, [x1, 1]
    ldr     x0, [x1]      # Expensive sto
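
For readers skimming the thread, the pattern under discussion can be sketched at the source level. This is only an illustration, not code from the patch: the union type `punned_t` and both function names are hypothetical, and the bit positions assume a little-endian target. The first function writes one byte and then reads the enclosing 64-bit word, which is the kind of narrow-store, wide-load pair that can compile to the `strb` / `ldr` sequence quoted above. The second is a hand-written version of the idea described in the patch summary: load the wide value first, do the store, then insert the byte into the register copy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration; not taken from the patch. */
typedef union {
    uint64_t whole;
    uint8_t  bytes[8];
} punned_t;

/* Narrow store immediately followed by a wide load of the same memory.
   On many CPUs the load cannot be forwarded from the byte store. */
uint64_t set_byte1_via_memory(punned_t *p, uint8_t v)
{
    p->bytes[1] = v;
    return p->whole;
}

/* Same result, but the wide load happens before the store and the new
   byte is inserted in a register. Assumes little-endian, where bytes[1]
   occupies bits 8..15 of `whole`. */
uint64_t set_byte1_via_bit_insert(punned_t *p, uint8_t v)
{
    uint64_t x = p->whole;                  /* load first */
    p->bytes[1] = v;                        /* memory is still updated */
    x &= ~((uint64_t)0xff << 8);            /* clear bits 8..15 */
    x |= (uint64_t)v << 8;                  /* insert the new byte */
    return x;
}

/* Small self-check: both variants agree on a little-endian machine. */
void self_check(void)
{
    punned_t a = { .whole = 0x1122334455667788ULL };
    punned_t b = a;
    assert(set_byte1_via_memory(&a, 0xAB) == set_byte1_via_bit_insert(&b, 0xAB));
    assert(a.whole == b.whole);
}
```

Whether a given compiler actually emits the problematic sequence for the first function, and whether the second form is faster, depends on the target and its costs, which is the question the rest of the thread is debating.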

Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-06-13 Thread Jeff Law
On 6/13/24 10:10 AM, Andi Kleen wrote:
Manolis Tsamis writes:
Assembly like this can appear with bitfields or type punning / unions. On stress-ng when running the cpu-union microbenchmark the following speedups have been observed.

Neoverse-N1:      +29.4%
Intel Coffeelake: +13.1%

Re: [PATCH v3] Target-independent store forwarding avoidance.

2024-06-13 Thread Andi Kleen
Manolis Tsamis writes:
>
> Assembly like this can appear with bitfields or type punning / unions.
> On stress-ng when running the cpu-union microbenchmark the following speedups
> have been observed.
>
> Neoverse-N1:      +29.4%
> Intel Coffeelake: +13.1%
> AMD 5950X:        +17.5%

It seems