On Tue, Jul 27, 2021 at 6:39 PM Jakub Jelinek <ja...@redhat.com> wrote:
>
> On Tue, Jul 27, 2021 at 06:33:24PM +0800, Hongtao Liu wrote:
> > > AVX2 introduced vector >> vector shifts, but unfortunately for V{2,4}DImode
> > > it only supports logical and not arithmetic shifts, only AVX512F for
> > > V8DImode or AVX512VL for V{2,4}DImode fixed that omission.
> > > Earlier in GCC12 cycle I've committed vector >> scalar arithmetic shift
> > > emulation using various sequences, this patch handles the vector >> vector
> > > case.  No need to adjust costs, the previous cost adjustment actually
> > > covers even the vector by vector shifts.
> > > The patch emits the right arithmetic V{2,4}DImode shifts using 2 logical right
> > > V{2,4}DImode shifts (once of the original operands, once of sign mask
> > > constant by the vector shift count), xor and subtraction, on each element
> > > (long long) x >> y is done as
> > > (((unsigned long long) x >> y) ^ (0x8000000000000000ULL >> y))
> > > - (0x8000000000000000ULL >> y)
> > I'm wondering whether the transformation is still correct when y > 64.
> > I guess since it's UB, the compiler can do anything.
>
> The patch is changing optabs, not something from target builtins where the
> intrinsics might make it well defined.
> In the optabs, out-of-bounds shifts (including y == 64) are UB; i386.h
> doesn't define SHIFT_COUNTS_TRUNCATED.
Thanks for the explanation, patch LGTM.
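
For reference, here is a minimal scalar sketch in C of the per-element
identity quoted above. It models a single 64-bit lane only and is not code
from the patch: the names ashr64_emulated and ashr64_selfcheck are
hypothetical. It assumes 0 <= y <= 63, matching the point that
out-of-bounds counts are UB at the optab level, and assumes a compiler such
as GCC where >> on a negative signed value is an arithmetic shift and
unsigned-to-signed conversion wraps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: emulate (long long) x >> y using only logical
   right shifts, xor and subtraction.  Valid only for 0 <= y <= 63.  */
static int64_t
ashr64_emulated (int64_t x, unsigned y)
{
  const uint64_t sign = UINT64_C (0x8000000000000000);
  /* Sign mask constant shifted logically by the same count.  */
  uint64_t m = sign >> y;
  /* Logical shift of the operand, then xor and subtract the mask to
     sign-extend from the position the sign bit was moved to.  */
  uint64_t r = (((uint64_t) x >> y) ^ m) - m;
  /* Assumption: out-of-range unsigned -> signed conversion wraps
     (implementation-defined in ISO C, documented as such by GCC).  */
  return (int64_t) r;
}

/* Hypothetical self-check against the native shift for every in-range
   count.  Assumes the native >> on negative values is arithmetic.  */
static void
ashr64_selfcheck (int64_t x)
{
  for (unsigned y = 0; y < 64; y++)
    assert (ashr64_emulated (x, y) == (x >> y));
}
```

Calling ashr64_selfcheck on a few values such as -8, 5 and INT64_MIN
exercises both signs across all 64 valid counts. The sketch deliberately
never passes y >= 64, since the scalar shifts here would be undefined too.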
>
>         Jakub
>


-- 
BR,
Hongtao
