Re: [PATCH] vect: Improve vectorization for small-trip-count loops using subvectors

Pengfei Li Fri, 09 May 2025 09:46:32 -0700

Hi Richard Biener,

As Richard Sandiford has already addressed your questions in another email, I
just want to add a few points below.


> That said, we already have unmasked ABS in the IL:
> 
>   vect__1.6_15 = .MASK_LOAD (&a, 16B, { -1, -1, -1, -1, -1, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, ... }, { 0, ... });
>   vect__2.7_16 = ABSU_EXPR <vect__1.6_15>;
>   vect__3.8_17 = VIEW_CONVERT_EXPR<vector([8,8]) short int>(vect__2.7_16);
>   .MASK_STORE (&a, 16B, { -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, ... }, vect__3.8_17); [tail call]
> 
> so what's missing here?  I suppose having a constant masked ABSU here
> would allow RTL expansion to select a fixed-size mode?

Before implementing this patch, I tried the approach you suggested. I
eventually decided not to pursue it, for two reasons:

1) Constant-masked operations do identify the inactive lanes, but they don't
model whether we need to care about those lanes. For some operations (mainly
floating-point) that may trap, we can't simply use the upper iteration bound
for the fixed-size mode. This is why I added a `could_trap` parameter to the
target hook I implemented. The `could_trap` information is available in
GIMPLE, but so far I haven't figured out how, or whether, we can get it from
RTL.

2) Transforming unmasked operations into masked operations for this purpose
seems to add unnecessary complexity in GIMPLE. I'm not sure whether it has any
side effects or could lead to unexpected performance regressions in some cases.

> And the vectorizer could simply use the existing
> related_vector_mode hook instead?

Thanks for pointing that out. I'm not familiar with that hook, but I'll take a
look to see if there's anything I can reuse or build upon.

Thanks,
Pengfei
