On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:
On Saturday 10 December 2016, Marc Glisse wrote:
On Sat, 10 Dec 2016, Marc Glisse wrote:
On Sat, 10 Dec 2016, Allan Sandfeld Jensen wrote:
Replaces the definitions of the shift intrinsics with GCC extension
syntax to
allow GCC to reason about what the instructions do. Tests are added to
ensure the intrinsics still produce the right instructions, and that a
few basic optimizations now work.
I don't think we can do it in such a straightforward way. Those
intrinsics are well defined for any shift argument, while operator<< and
operator>> are considered undefined for many input values. I believe you
may need to use an unsigned type for the lhs of left shifts, and write a
<< (b & 31) to match the semantics for the rhs (I haven't checked
Intel's doc). Which might require adding a few patterns in sse.md to
avoid generating code for that AND.
Oops, apparently I got that wrong; I got confused with the scalar case. Left
shift by more than precision is well defined for vectors, but it gives 0,
it doesn't take the shift count modulo the precision. Which is even harder
to explain to gcc (maybe ((unsigned)b<=31)?a<<b:0...). Yes, the way we
model operations in gcc is often inconvenient.
There was a similar issue for LLVM (https://reviews.llvm.org/D3353).
Well, it is undefined behaviour by the C standard, but is it also undefined
inside GCC (since this specifically uses a GCC extension)? I would assume it
just produces 0.
In the optimizers, we tend to handle vectors the same as scalars. I don't
remember if we have any optimization that takes advantage of the limited
range for the second operand of shifts (mostly we disable optimizations
when there is a risk they might produce a shift amount larger than prec),
but I don't think we have specified the behavior for larger-than-prec
shifts, so it would be dangerous. Note that depending on the platform, gcc
may lower vector operations to smaller vectors or even scalars, which
makes it inconvenient to have different semantics for vectors and scalars
(not impossible, we would for instance need to teach the lowering pass
that vec1 << vec2 lowers to { ((unsigned)vec2[0]<=31)?vec1[0]<<vec2[0]:0,
((unsigned)vec2[1]<=31)?vec1[1]<<vec2[1]:0 }). I think that on some platforms,
shifting vectors uses the modulo behavior, or handles negative shift
values as shifts in the other direction; we can't look at just one target.
We could have a middle-end policy that vec << 1234 is not undefined (we
cannot assume it doesn't happen) but that we don't say what it is defined
to, so we avoid any optimization that may have it as input or output. That
might mean disabling all shift optimizations though, which would be
counterproductive.
The non-immediate case is simpler, then, because it produces
the instructions, which will inherently return the right thing.
It will only generate the instructions you expect if it hasn't already
optimized it to something else, which might not even involve a shift
anymore.
I expect reviewers (I am only commenting) will need convincing arguments
that this is safe (I could be wrong, maybe they'll find reasons that I
missed that make it obviously safe).
--
Marc Glisse