> -----Original Message-----
> From: Robin Dapp <rdapp....@gmail.com>
> Sent: Thursday, September 11, 2025 1:03 PM
> To: Richard Biener <rguent...@suse.de>; Tamar Christina
> <tamar.christ...@arm.com>
> Cc: rdapp....@gmail.com; gcc-patches@gcc.gnu.org;
> rdsandif...@googlemail.com
> Subject: Re: New optabs and IFN required for early break [bikeshed]
> 
> > So AVX512 has vcompressp{d,s} and vexpandp{d,s} (but nothing for smaller
> > integer element types).  Those could be used for this but they have
> > a vector result (and element zero would be the first active).
> >
> > But don't you possibly want the last inactive as well, dependent on
> > whether this is a peeled/not peeled exit?  We could shift the mask
> > by one for either case.
> >
> > vcompresspd is not available on AVX2 or SSE4.1, using a vector-vector
> > permute to get the element 'i' % nunits to lane zero would be another
> > possibility, also for non-float or double sized elements we need sth
> > like this.
> >
> > I do wonder whether we want to have the compress/expand as actual
> > optabs when we use them.  Having an extract_first (without _active,
> > following extract_last_optab) is probably OK to abstract this to
> > some extent.  extract_last doesn't specify what happens if no
> > bit is set in the mask, fold_extract_last seems to be the same
> > but with an else value - I wonder whether we should canonicalize
> > those and thus have an else value for extract_first.
> 
> Without having read the rest yet, riscv has a vcompress for all element sizes
> with similar semantics.  Also still needs to extract element zero.

Ah great! Does it just take a mask? Could you point me to some docs?
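
To make the question concrete, here is a rough scalar sketch in C of the
semantics I would assume for a compress-based extract_first with an else
value, as suggested above.  The names compress_model, extract_first_model
and MAX_LANES are made up for illustration; they are not GCC optabs or ISA
intrinsics, and the real instructions may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LANES 64  /* Illustrative upper bound on the vector length.  */

/* Hypothetical scalar model of a compress: copy the elements of SRC whose
   mask entry is nonzero to the front of DST, preserving their order.
   Returns the number of active elements.  */
static size_t
compress_model (const int *src, const unsigned char *mask, size_t n, int *dst)
{
  size_t k = 0;
  for (size_t i = 0; i < n; i++)
    if (mask[i])
      dst[k++] = src[i];
  return k;
}

/* Hypothetical extract_first: element zero of the compressed vector is the
   first active element.  If no mask bit is set, return ELSE_VAL (the
   assumed "else value" semantics).  */
static int
extract_first_model (const int *src, const unsigned char *mask, size_t n,
                     int else_val)
{
  int tmp[MAX_LANES];
  assert (n <= MAX_LANES);
  size_t k = compress_model (src, mask, n, tmp);
  return k ? tmp[0] : else_val;
}
```

With v = {10, 20, 30, 40} and mask = {0, 0, 1, 1} this returns 30; with an
all-zero mask it returns the else value.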

Thanks,
Tamar
> 
> --
> Regards
>  Robin
