Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Toon Moene
On 9/4/24 12:55, Jan Hubicka wrote: On 9/3/24 15:07, Jan Hubicka wrote: Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Richard Biener
On Wed, Sep 4, 2024 at 12:56 PM Jan Hubicka wrote: On 9/3/24 15:07, Jan Hubicka wrote: Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when th

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Jan Hubicka
On 9/3/24 15:07, Jan Hubicka wrote: Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by shuffles

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Toon Moene
On 9/3/24 15:07, Jan Hubicka wrote: Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by shuffles result in a higher load band

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-03 Thread Richard Biener
On Tue, Sep 3, 2024 at 3:07 PM Jan Hubicka wrote: Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by shuffles result

Zen5 tuning part 2: disable gather and scatter

2024-09-03 Thread Jan Hubicka
Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by shuffles result in a higher load bandwidth." however the situation seems to