Re: [PATCH][RFC][match.pd] optimize (X & C) == N when C is power of 2

Jakub Jelinek Fri, 24 Jul 2015 01:24:02 -0700

On Fri, Jul 24, 2015 at 09:09:39AM +0100, Kyrill Tkachov wrote:
> This transformation folds (X % C) == N into
> X & ((1 << (size - 1)) | (C - 1))) == N
> for constants C and N where N is positive and C is a power of 2.
> 
> The idea is similar to the existing X % C -> X & (C - 1) transformation
> for unsigned values but this time when we have the comparison we use the
> C - 1 mask (all 1s for power of 2 N) orred with the sign bit.
> 
> At runtime, if X is positive then X & ((1 << (size - 1)) | (C - 1)))
> calculates X % C, which is compared against the positive N.
> 
> If X is negative then X & ((1 << (size - 1)) | (C - 1))) doesn't calculate
> X % C but since we also catch the set sign bit, it will never compare equal
> to N (which is positive), so we still get the right comparison result.
> 
> I don't have much experience with writing match.pd patterns, so I appreciate
> any feedback on how to write this properly.
> 
> Bootstrapped and tested on arm, aarch64, x86_64.


I think this is another case that, if at all, should be done during or right
before RTL expansion and should test rtx costs.
Because, ((1 << (size - 1)) | (C - 1))) constant might be very expensive,
while C cheap, and % might not be that expensive compared to & to offset
that.

E.g. on x86_64, for 32-bit and smaller X the constant is cheap as any other
(well, if we don't take instruction size into account), but 64-bit constant
is at least 3 times more expensive (movabsq is needed with its latency).
In the x86_64 case supposedly the divmod is still more expensive, but there
are many other targets.  On sparc64 for 64-bit constants, you might need
many instructions to create the constants, etc.

        Jakub

Re: [PATCH][RFC][match.pd] optimize (X & C) == N when C is power of 2

Reply via email to