[llvm-bugs] [Bug 150197] Unoptimized header masks mixed with VP intrinsics may have different lengths during EVL tail folding

LLVM Bugs via llvm-bugs Wed, 23 Jul 2025 02:32:03 -0700

Issue	150197
Summary	Unoptimized header masks mixed with VP intrinsics may have different lengths during EVL tail folding
Labels	backend:RISC-V, vectorizers
Assignees
Reporter	lukel97

    As spotted by @Mel-Chen in this review comment: https://github.com/llvm/llvm-project/pull/149981#discussion_r2224826250


Consider an EVL tail-folded loop with a VF of 4 and a trip count of 5. With EVL tail folding, the loop may execute as two iterations, one with EVL=3 and one with EVL=2.
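For illustration, here is a minimal sketch of one way a 3 + 2 split can arise. It assumes a hypothetical EVL policy that splits the remaining elements evenly when fewer than `2 * VF` remain. As far as I understand it, the RISC-V vector spec permits `vsetvli` to behave this way, but the function name and the policy itself are assumptions for the example, not something LLVM computes.

```python
import math


def evenly_split_evls(trip_count, vf):
    """Hypothetical EVL policy (an assumption, not LLVM's code).

    - remaining <= vf:           take everything that is left
    - vf < remaining < 2 * vf:   take ceil(remaining / 2)
    - remaining >= 2 * vf:       take a full vector of vf lanes
    """
    remaining = trip_count
    evls = []
    while remaining > 0:
        if remaining <= vf:
            evl = remaining
        elif remaining < 2 * vf:
            evl = math.ceil(remaining / 2)
        else:
            evl = vf
        evls.append(evl)
        remaining -= evl
    return evls


print(evenly_split_evls(5, 4))
```

Whatever the policy, the only properties the rest of this report relies on are that each EVL is at most VF and that the EVLs sum to the trip count.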

A header mask will come in with the form `icmp ule wide-canonical-iv, backedge-tc`.

Most recipes are converted to VP intrinsics that use EVL in `optimizeMaskToEVL`. This should really be thought of as an optimisation, but consider a recipe that isn't handled yet, or that slips through, and so still uses the header mask.

With a trip count of 5, `backedge-tc` is 4. In this example the wide canonical IV starts at `[0, 1, 2, 3]` and advances by VF on each iteration.

On the first iteration, the mask will look like:

`[0, 1, 2, 3] <= 4 = [T, T, T, T]` 

However, the recipes that were optimized to VP intrinsics will have an EVL of 3, so effectively a mask of `[T, T, T, F]`.

On the second iteration, the mask will look like:

`[4, 5, 6, 7] <= 4 = [T, F, F, F]`

But the VP intrinsics will have an EVL of 2, so effectively a mask of `[T, T, F, F]`.
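The two iterations above can be reproduced with a small simulation. This is only a sketch of the arithmetic, not of any VPlan code: the function names are made up for the example, and the wide canonical IV is assumed to advance by VF each iteration, as in the masks shown above.

```python
VF = 4
TRIP_COUNT = 5
BACKEDGE_TC = TRIP_COUNT - 1  # backedge-taken count
EVLS = [3, 2]                 # per-iteration EVLs from the example


def header_mask(iteration):
    """Models `icmp ule wide-canonical-iv, backedge-tc` for one iteration."""
    base = iteration * VF  # assumed: canonical IV steps by VF
    return [base + lane <= BACKEDGE_TC for lane in range(VF)]


def implied_evl_mask(evl):
    """Lanes a VP intrinsic with the given EVL actually processes."""
    return [lane < evl for lane in range(VF)]


def fmt(mask):
    return "[" + ", ".join("T" if m else "F" for m in mask) + "]"


for i, evl in enumerate(EVLS):
    hm, em = header_mask(i), implied_evl_mask(evl)
    status = "match" if hm == em else "MISMATCH"
    print(f"iteration {i}: header mask {fmt(hm)}, EVL={evl} mask {fmt(em)} -> {status}")
```

Both iterations disagree: the header mask enables 4 lanes and then 1, while the VP intrinsics process 3 lanes and then 2.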

We need to convert the header masks to something of the form `icmp ult step-vector, EVL`. Otherwise, a recipe processes a different number of elements per iteration depending on whether or not it was converted to a VP intrinsic.
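As a rough check of the proposed form, the sketch below models `icmp ult step-vector, EVL` for the same VF=4, trip count 5 example. The helper names are hypothetical and the EVLs are taken from the example rather than computed.

```python
VF = 4
TRIP_COUNT = 5
EVLS = [3, 2]  # per-iteration EVLs from the example


def step_vector(vf):
    """Models the step vector <0, 1, ..., vf-1>."""
    return list(range(vf))


def evl_based_header_mask(evl, vf=VF):
    """Models `icmp ult step-vector, EVL`."""
    return [lane < evl for lane in step_vector(vf)]


def active_lanes(mask):
    return sum(mask)


masks = [evl_based_header_mask(evl) for evl in EVLS]
for i, (evl, mask) in enumerate(zip(EVLS, masks)):
    print(f"iteration {i}: EVL={evl}, active lanes={active_lanes(mask)}")
print("total elements processed:", sum(active_lanes(m) for m in masks))
```

With this form, a recipe that still uses the header mask enables exactly EVL lanes on every iteration, so it agrees with the VP intrinsics and the total across iterations equals the trip count.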

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
