Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-27 Thread Kyrylo Tkachov
> On 25 Oct 2024, at 15:25, Richard Sandiford wrote: > > Kyrylo Tkachov writes: >>> On 25 Oct 2024, at 13:46, Richard Sandiford >>> wrote: >>> >>> Kyrylo Tkachov writes: Thank you for the suggestions! I’m trying them out now. >> + if (rotamnt % BITS_PER_UNIT != 0) >> +

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-25 Thread Richard Sandiford
Kyrylo Tkachov writes: >> On 25 Oct 2024, at 13:46, Richard Sandiford >> wrote: >> >> Kyrylo Tkachov writes: >>> Thank you for the suggestions! I’m trying them out now. >>> > + if (rotamnt % BITS_PER_UNIT != 0) > +return NULL_RTX; > + machine_mode qimode; > + if (!qimod

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-25 Thread Kyrylo Tkachov
> On 25 Oct 2024, at 13:46, Richard Sandiford wrote: > > Kyrylo Tkachov writes: >> Thank you for the suggestions! I’m trying them out now. >> + if (rotamnt % BITS_PER_UNIT != 0) +return NULL_RTX; + machine_mode qimode; + if (!qimode_for_vec_perm (mode).exists (&qimo

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-25 Thread Richard Sandiford
Kyrylo Tkachov writes: > Thank you for the suggestions! I’m trying them out now. > >>> + if (rotamnt % BITS_PER_UNIT != 0) >>> +return NULL_RTX; >>> + machine_mode qimode; >>> + if (!qimode_for_vec_perm (mode).exists (&qimode)) >>> +return NULL_RTX; >>> + >>> + vec_perm_builder builder

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-25 Thread Kyrylo Tkachov
Thank you for the suggestions! I’m trying them out now. > On 24 Oct 2024, at 21:11, Richard Sandiford wrote: > > Kyrylo Tkachov writes: >> Hi Richard, >> >>> On 23 Oct 2024, at 11:30, Richard Sandiford >>> wrote: >>> >>> Kyrylo Tkachov writes: Hi all, Some vector rotate ope

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-25 Thread Richard Sandiford
Kyrylo Tkachov writes: > Hi Richard, > >> On 23 Oct 2024, at 11:30, Richard Sandiford >> wrote: >> >> Kyrylo Tkachov writes: >>> Hi all, >>> >>> Some vector rotate operations can be implemented in a single instruction >>> rather than using the fallback SHL+USRA sequence. >>> In particular, when

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-25 Thread Kyrylo Tkachov
Hi Richard, > On 23 Oct 2024, at 11:30, Richard Sandiford wrote: > > Kyrylo Tkachov writes: >> Hi all, >> >> Some vector rotate operations can be implemented in a single instruction >> rather than using the fallback SHL+USRA sequence. >> In particular, when the rotate amount is half the bitwidth

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-23 Thread Richard Sandiford
tate_as_vec_perm which checks whether the rotation amount is suitable and tries to generate the permutation if so. Thanks, Richard > From e97509382b6bb755336ec4aa220fabd968e69502 Mon Sep 17 00:00:00 2001 > From: Kyrylo Tkachov > Date: Wed, 16 Oct 2024 04:10:08 -0700 > Subject: [PATCH 4
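The check-then-permute idea in this reply can be illustrated outside the compiler. The sketch below is not the GCC implementation. `rotate_byte_perm` and `apply_byte_perm` are hypothetical names, and it assumes 8-bit bytes and little-endian byte order within each element. It refuses rotate amounts that are not a whole number of bytes and otherwise fills in the byte selector that performs a left rotate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only, not GCC code).
   Fill SEL with byte indices such that dst[i] = src[sel[i]] rotates every
   ELT_BYTES-wide element left by ROT_BITS.  Assumes 8-bit bytes and
   little-endian byte order within each element.  Returns false when the
   rotate amount is not a whole number of bytes.  */
static inline bool
rotate_byte_perm (unsigned elt_bytes, unsigned nelts, unsigned rot_bits,
                  uint8_t *sel)
{
  if (elt_bytes == 0 || rot_bits % 8 != 0)
    return false;

  unsigned k = (rot_bits / 8) % elt_bytes;   /* rotate amount in bytes */
  for (unsigned e = 0; e < nelts; e++)
    for (unsigned j = 0; j < elt_bytes; j++)
      /* Result byte j of a left rotate comes from source byte j - k.  */
      sel[e * elt_bytes + j]
        = (uint8_t) (e * elt_bytes + (j + elt_bytes - k) % elt_bytes);
  return true;
}

/* Apply a byte selector: dst[i] = src[sel[i]].  */
static inline void
apply_byte_perm (const uint8_t *src, const uint8_t *sel, size_t nbytes,
                 uint8_t *dst)
{
  for (size_t i = 0; i < nbytes; i++)
    dst[i] = src[sel[i]];
}
```

For two 4-byte elements rotated by 8 bits the selector is {3, 0, 1, 2, 7, 4, 5, 6}. Rotating by 16 bits gives {2, 3, 0, 1, 6, 7, 4, 5}, which swaps the 16-bit halves of each 32-bit element.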

[PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-22 Thread Kyrylo Tkachov
Hi all,

Some vector rotate operations can be implemented in a single instruction rather than using the fallback SHL+USRA sequence. In particular, when the rotate amount is half the bitwidth of the element, we can use a REV64, REV32 or REV16 instruction. This patch adds this transformation in the recent
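To see why a rotate by half the element width needs no shifts, here is a scalar sketch in C. It is not taken from the patch, and the function names are made up for illustration. Rotating an element by half its width just swaps its two halves, which is the per-element effect a REV* instruction has when it reverses half-width units inside each full-width container.

```c
#include <stdint.h>

/* Rotate left by half the element width, written with shifts.  */
static inline uint16_t rotl16_by_8 (uint16_t x)
{
  return (uint16_t) ((x << 8) | (x >> 8));
}

static inline uint32_t rotl32_by_16 (uint32_t x)
{
  return (x << 16) | (x >> 16);
}

static inline uint64_t rotl64_by_32 (uint64_t x)
{
  return (x << 32) | (x >> 32);
}

/* The same operation written as "swap the two halves", i.e. a pure
   rearrangement of sub-elements with no arithmetic on them.  */
static inline uint16_t swap_halves16 (uint16_t x)
{
  uint16_t lo = (uint16_t) (x & 0xff);
  uint16_t hi = (uint16_t) (x >> 8);
  return (uint16_t) ((lo << 8) | hi);
}

static inline uint32_t swap_halves32 (uint32_t x)
{
  uint32_t lo = x & 0xffffu;
  uint32_t hi = x >> 16;
  return (lo << 16) | hi;
}
```

For example, 0x11223344 rotated left by 16 is 0x33441122, the same value obtained by exchanging the halves 0x1122 and 0x3344.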