Re: New optabs and IFN required for early break [bikeshed]

Robin Dapp Thu, 11 Sep 2025 05:03:31 -0700

So AVX512 has vcompressp{d,s} and vexpandp{d,s} (but nothing for smaller
integer element types).  Those could be used for this but they have
a vector result (and element zero would be the first active).


But don't you possibly want the last inactive as well, depending on
whether this is a peeled or non-peeled exit?  We could shift the mask
by one for either case.
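
A minimal scalar sketch of the mask shift, assuming a bitmask in which
bit i stands for lane i.  The helper names are hypothetical and are not
GCC functions.  The model treats "first active lane is lane zero" as
having no preceding lane in this vector; a real implementation would
have to handle that case separately.

```c
#include <stdint.h>

/* Hypothetical scalar model: bit i of MASK is set when lane i is active.
   Return the index of the first active lane, or -1 if no bit is set.  */
static int
first_active_lane (uint64_t mask)
{
  for (int i = 0; i < 64; i++)
    if ((mask >> i) & 1)
      return i;
  return -1;
}

/* Return the index of the lane just before the first active one by
   shifting the mask down by one lane.  Return -1 if lane 0 is already
   active (no preceding lane) or if no lane is active at all.  */
static int
lane_before_first_active (uint64_t mask)
{
  if (mask & 1)
    return -1;
  return first_active_lane (mask >> 1);
}
```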

vcompresspd is not available on AVX2 or SSE4.1; using a vector-vector
permute to get element 'i' % nunits to lane zero would be another
possibility.  For elements that are not float or double sized we need
something like this as well.

I do wonder whether we want to have the compress/expand as actual
optabs when we use them.  Having an extract_first (without _active,
following extract_last_optab) is probably OK to abstract this to
some extent.  extract_last doesn't specify what happens if no
bit is set in the mask, fold_extract_last seems to be the same
but with an else value - I wonder whether we should canonicalize
those and thus have an else value for extract_first.
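
To make the else-value question concrete, here is a scalar sketch of
what an extract_first with an else value could compute.  This is an
assumption for illustration only, not the definition of any existing
optab; the function name and signature are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of an extract_first with an else value:
   return the element of VEC in the first lane whose MASK bit is set,
   or ELSE_VAL when no mask bit is set.  */
static int
extract_first_else (const bool *mask, const int *vec, size_t nunits,
                    int else_val)
{
  for (size_t i = 0; i < nunits; i++)
    if (mask[i])
      return vec[i];
  return else_val;
}
```

With an else value the all-false mask case has a defined result, which
is the part the text above says extract_last leaves unspecified.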

Without having read the rest yet, riscv has a vcompress for all element
sizes with similar semantics.  It also still needs to extract element
zero.
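
A rough scalar sketch of that two-step sequence, compress followed by
extracting element zero.  It only models the packing behaviour; the
names, the MAX_UNITS bound and the else value are assumptions for
illustration and do not correspond to any target instruction's exact
tail handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_UNITS 64  /* Arbitrary bound for this model only.  */

/* Pack the active elements of SRC to the front of DST and return how
   many were packed.  Elements of DST past that count are left
   untouched in this model.  */
static size_t
compress_model (const bool *mask, const int *src, int *dst, size_t nunits)
{
  size_t n = 0;
  for (size_t i = 0; i < nunits; i++)
    if (mask[i])
      dst[n++] = src[i];
  return n;
}

/* Compress, then read element zero: this yields the first active
   element, or ELSE_VAL if no lane was active.  */
static int
first_active_via_compress (const bool *mask, const int *src, size_t nunits,
                           int else_val)
{
  int packed[MAX_UNITS];
  assert (nunits <= MAX_UNITS);
  size_t n = compress_model (mask, src, packed, nunits);
  return n > 0 ? packed[0] : else_val;
}
```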


--
Regards
Robin
