On Wed, Dec 17, 2014 at 09:46:44AM +0100, Marek Polacek wrote:
> This adds a transformation of (x & ~m) | (y & m), which (on GIMPLE)
> has 4 ops to x ^ ((x ^ y) & m) that has 3 ops on GIMPLE.  In fact,
> the latter is then transformed to (a ^ b) & m ^ a, which also has 3
> ops.

So why don't you transform it to ((x ^ y) & m) ^ x directly (i.e. swap
@0 with the (bit_and ...))?

BTW, the advantage of the (x & ~m) | (y & m) form is that it has fewer
dependencies, at least if the target has an andn instruction (e.g.
SPARC, Alpha, PA, MMIX and IA64 have one): only the final or depends on
the results of both the and and the andn instructions.  In the two xor
forms, the and depends on the first xor result and the second xor
depends on the and result.  So if multiple ALU units are available, the
and | andn form can use both units, while the xor forms are
unnecessarily serialized.

	Jakub
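
For reference, a minimal sketch in C of the three forms discussed above. It is not taken from the patch: the function names and the choice of uint32_t are made up for illustration, and the comments only restate the op counts and dependencies described in this thread. The check walks all 4-bit values, which is enough because every operation involved works bit by bit.

```c
#include <stdint.h>

/* (x & ~m) | (y & m): not, and, and, or on GIMPLE (4 ops).
   With an andn instruction the and and the andn are independent of
   each other; only the final or waits for both results.  */
static uint32_t merge_and_or(uint32_t x, uint32_t y, uint32_t m)
{
    return (x & ~m) | (y & m);
}

/* x ^ ((x ^ y) & m): 3 ops, but a serial chain xor -> and -> xor.  */
static uint32_t merge_xor(uint32_t x, uint32_t y, uint32_t m)
{
    return x ^ ((x ^ y) & m);
}

/* ((x ^ y) & m) ^ x: the same chain with the outer xor operands swapped.  */
static uint32_t merge_xor_swapped(uint32_t x, uint32_t y, uint32_t m)
{
    return ((x ^ y) & m) ^ x;
}

/* Compare the three forms for every 4-bit x, y and m.
   Returns the number of inputs on which they disagree.  */
static int count_mismatches(void)
{
    int bad = 0;
    for (uint32_t x = 0; x < 16; x++)
        for (uint32_t y = 0; y < 16; y++)
            for (uint32_t m = 0; m < 16; m++) {
                uint32_t r = merge_and_or(x, y, m);
                if (merge_xor(x, y, m) != r
                    || merge_xor_swapped(x, y, m) != r)
                    bad++;
            }
    return bad;
}
```

count_mismatches() returns 0, so all three forms compute the same value; they differ only in op count and in how the ops depend on each other.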