Re: [PATCH] vect: Add a popcount fallback.

2023-08-09 Thread Richard Biener via Gcc-patches
On Wed, Aug 9, 2023 at 12:23 PM Robin Dapp wrote: > > > We seem to be looking at promotions of the call argument, lhs_type > > is the same as the type of the call LHS. But the comment mentions .POPCOUNT > > and the following code also handles others, so maybe handling should be > > moved. Also w

Re: [PATCH] vect: Add a popcount fallback.

2023-08-09 Thread Robin Dapp via Gcc-patches
> We seem to be looking at promotions of the call argument, lhs_type > is the same as the type of the call LHS. But the comment mentions .POPCOUNT > and the following code also handles others, so maybe handling should be > moved. Also when we look to vectorize popcount (x) instead of popcount((T)

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Presumably this is an alternative to the approach Juzhe posted a week
> or two ago and ultimately dropped?

Yeah, I figured having a generic fallback could help more targets. We can still have a better expander if we see the need.

Regards
Robin

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Jeff Law via Gcc-patches
On 8/8/23 02:55, Robin Dapp via Gcc-patches wrote: Looks reasonable to me - I couldn't read from above whether you did testing on riscv and thus verified the runtime correctness of the fallback? If not may I suggest to force matching the pattern on a target you can test for this purpose? I t

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Richard Biener via Gcc-patches
On Tue, Aug 8, 2023 at 3:06 PM Robin Dapp wrote: > > > Hmm, the conversion should be a separate statement so I wonder > > why it would go wrong? > > It is indeed. Yet, lhs_type is the lhs type of the conversion > and not the call and consequently we compare the precision of > the converted type w

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Hmm, the conversion should be a separate statement so I wonder > why it would go wrong? It is indeed. Yet, lhs_type is the lhs type of the conversion and not the call and consequently we compare the precision of the converted type with the popcount input. So we should probably rather do someth

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Richard Biener via Gcc-patches
On Tue, Aug 8, 2023 at 1:37 PM Robin Dapp wrote: > > > Well, not sure how VECT_COMPARE_COSTS can help here, we either > > get the pattern or vectorize the original function. There's no special > > handling > > for popcount in vectorizable_call so all special cases are handled via > > patterns.

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Well, not sure how VECT_COMPARE_COSTS can help here, we either > get the pattern or vectorize the original function. There's no special > handling > for popcount in vectorizable_call so all special cases are handled via > patterns. > I was thinking of popcounthi via popcountsi and zero-extend
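For readers following the "popcounthi via popcountsi and zero-extend" remark, here is a minimal scalar sketch of why the widening has to be a zero-extension. It is not taken from the patch: the function names are hypothetical, it uses the GCC/Clang builtin `__builtin_popcount` as a stand-in for a 32-bit popcount, and it assumes `unsigned int` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit popcount done through a 32-bit popcount.
   Assumes unsigned int is 32 bits, as on common GCC targets. */
static inline unsigned popcount16_via_32(uint16_t x)
{
    /* Zero-extension: the upper 16 bits are 0 and add nothing to the count. */
    uint32_t wide = (uint32_t)x;
    return (unsigned)__builtin_popcount(wide);
}

/* Deliberately wrong variant, for contrast only: sign-extension copies the
   sign bit into the upper 16 bits, which inflates the count for negative
   inputs. */
static inline unsigned popcount16_sign_extended(int16_t x)
{
    uint32_t wide = (uint32_t)(int32_t)x;
    return (unsigned)__builtin_popcount(wide);
}
```

With these definitions, `popcount16_via_32(0xFFFF)` yields 16, while `popcount16_sign_extended(-1)` yields 32, so counting in a wider mode is only safe when the extra bits are known to be zero.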

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Richard Biener via Gcc-patches
On Tue, Aug 8, 2023 at 10:55 AM Robin Dapp wrote: > > > Looks reasonable to me - I couldn't read from above whether you did > > testing on riscv and thus verified the runtime correctness of the fallback? > > If not may I suggest to force matching the pattern on a target you can > > test for this p

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Looks reasonable to me - I couldn't read from above whether you did > testing on riscv and thus verified the runtime correctness of the fallback? > If not may I suggest to force matching the pattern on a target you can > test for this purpose? I tested on riscv (manually and verified the run tes

Re: [PATCH] vect: Add a popcount fallback.

2023-08-07 Thread Richard Biener via Gcc-patches
On Mon, Aug 7, 2023 at 10:20 PM Robin Dapp via Gcc-patches wrote: > > Hi, > > This patch adds a fallback when the backend does not provide a popcount > implementation. The algorithm is the same one libgcc uses, as well as > match.pd for recognizing a popcount idiom. __builtin_ctz and __builtin_f
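As a point of reference for the algorithm mentioned above, the following is a scalar sketch of the well-known bit-parallel popcount, which I assume is the scheme the message refers to. It is not the patch's vectorizer code; the function name is hypothetical and the sketch is fixed to 32-bit inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar illustration of the bit-parallel popcount scheme.
   Each step sums adjacent bit fields of doubling width. */
static inline uint32_t popcount32_fallback(uint32_t x)
{
    /* Every 2-bit field now holds the number of set bits it contained. */
    x = x - ((x >> 1) & 0x55555555u);
    /* Add neighbouring 2-bit counts into 4-bit fields. */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    /* Add neighbouring 4-bit counts into bytes (each byte is at most 8). */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    /* The multiply accumulates all four byte counts into the top byte. */
    return (x * 0x01010101u) >> 24;
}
```

For example, `popcount32_fallback(0xFFFFFFFFu)` returns 32 and `popcount32_fallback(0x80000001u)` returns 2. The sequence uses only shifts, masks, adds, and one multiply, which is why it is attractive as a generic fallback when no popcount instruction is available.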