On Wed, 9 Nov 2016, Segher Boessenkool wrote:
On Wed, Nov 09, 2016 at 10:54:53PM +0100, Marc Glisse wrote:
match.pd transforms (A&C)|(B&~C) to ((A^B)&C)^B, which is fewer
operations if C is not const (and it is not on simple tests at least,
this transform is done very early already).
Various processors have "insert" instructions that can do this, but
combine cannot build those from the xor-and-xor; in particular, it has no
chance at all of doing that if A or B are multiple instructions as well
(on PowerPC, the rl[ws]imi instructions that can do this with a rotate,
or a simple shift with appropriate C; other ISAs have similar insns).
This patch makes RTL simplify transform (xor (and (xor A B) C) B) back
to (ior (and A C) (and B ~C)) for constant C (and similar with A instead
of B for that last term).
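
As a sanity check on the identity discussed above, here is a minimal C sketch
that compares the two forms for every 8-bit A, B and C. The function names
(select_xor, select_ior, count_mismatches) are made up for illustration and do
not come from the patch or from GCC.

```c
#include <stdint.h>

/* The form match.pd produces: ((A ^ B) & C) ^ B. */
static uint8_t select_xor(uint8_t a, uint8_t b, uint8_t c)
{
    return (uint8_t)(((a ^ b) & c) ^ b);
}

/* The form the RTL simplification restores: (A & C) | (B & ~C). */
static uint8_t select_ior(uint8_t a, uint8_t b, uint8_t c)
{
    return (uint8_t)((a & c) | (b & (uint8_t)~c));
}

/* Count the (A, B, C) triples on which the two forms disagree.
   The operations are bitwise, so 8-bit operands already cover every
   per-bit combination; wider types behave the same way. */
static unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            for (unsigned c = 0; c < 256; c++)
                if (select_xor((uint8_t)a, (uint8_t)b, (uint8_t)c)
                    != select_ior((uint8_t)a, (uint8_t)b, (uint8_t)c))
                    bad++;
    return bad;
}
```

Both forms pick the bits of A where C is set and the bits of B elsewhere, so
the count should be zero.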
Would it make sense to implement this transformation in match.pd, next to
the "opposite" one, or do you need it at the RTL level because C only
becomes a constant at that stage?
It becomes a constant in the later gimple passes, but we need it in the RTL
simplifiers as well, even if you also do it in match.pd?
(assuming it is always an improvement, even though it may use the same
number of operations and one more constant)
Sure, it doesn't hurt to have it in both places. It just seems that since
the problem was caused by match.pd in your original testcase, fixing it at
that level (undoing the harm as soon as possible) would make the RTL
version less useful (though not useless). Anyway, I don't feel competent
to decide when which form is preferable, I was just curious.
(simplify
 (bit_xor:cs (bit_and:s (bit_xor:cs @0 @1) CONSTANT_CLASS_P@2) @0)
 (bit_ior (bit_and @0 (bit_not @2)) (bit_and @1 @2)))
(this handles vectors as well, I don't know if that is desired)
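
To make the pattern above concrete, here is a rough C sketch of what its two
sides compute for one constant mask. FIELD_MASK and the function names are
hypothetical, chosen only for the example; word plays the role of @0, field of
@1, and the mask of @2.

```c
#include <stdint.h>

/* Hypothetical constant mask: bits 8..11 of a 32-bit word. */
#define FIELD_MASK UINT32_C(0x00000f00)

/* Left-hand side of the pattern: ((@0 ^ @1) & @2) ^ @0. */
static uint32_t insert_xor(uint32_t word, uint32_t field)
{
    return ((word ^ field) & FIELD_MASK) ^ word;
}

/* Right-hand side of the pattern: (@0 & ~@2) | (@1 & @2). */
static uint32_t insert_ior(uint32_t word, uint32_t field)
{
    return (word & ~FIELD_MASK) | (field & FIELD_MASK);
}
```

Either function keeps word outside the mask and takes field inside it, which
is the "insert" shape mentioned earlier in the thread. For example, with
word 0x12345678 and field 0x00000900 both give 0x12345978.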
--
Marc Glisse