> On Oct 1, 2015, at 7:51 AM, James Greenhalgh <james.greenha...@arm.com> wrote:
>
> On Thu, Oct 01, 2015 at 03:28:22PM +0100, pins...@gmail.com wrote:
>>>
>>> On Oct 1, 2015, at 6:57 AM, James Greenhalgh <james.greenha...@arm.com>
>>> wrote:
>>>
>>>
>>> Hi,
>>>
>>> If it is cheap enough to treat a floating-point value as an integer and
>>> to do bitwise arithmetic on it (as it is for AArch64) we can rewrite:
>>>
>>>   x * copysign (1.0, y)
>>>
>>> as:
>>>
>>>   x ^ (y & (1 << sign_bit_position))
>>
>> Why not just convert it to copysign (x, y) instead and let expand choose
>> the better implementation?
>
> Because that transformation is invalid :-)
>
> let x = -1.0, y = -1.0
>
>   x * copysign (1.0, y)
>     = -1.0 * copysign (1.0, -1.0)
>     = -1.0 * -1.0
>     = 1.0
>
>   copysign (x, y)
>     = copysign (-1.0, -1.0)
>     = -1.0
>
> Or have I completely lost my maths skills :-)

No, you are correct. Note that I would rather see the copysign form at the
tree level and the integer form at the RTL level, so placing this in expand
would be better than placing it in match.pd.

Thanks,
Andrew

>
>> Also I think this can only be done for finite and non-trapping types.
>
> That may well be true, I swithered either way and went for no checks, but
> I'd happily go back on that and wrap this in something suitably restrictive
> if I need to.
>
> Thanks,
> James
>
>
>>>
>>> This patch implements that rewriting rule in match.pd, and a testcase
>>> expecting the transform.
>>>
>>> This is worth about 6% in 481.wrf for AArch64. I don't know enough
>>> about the x86 microarchitectures to know how productive this transformation
>>> is there. In Spec2006FP I didn't see any interesting results in either
>>> direction. Looking at code generation for the testcase I add, I think the
>>> x86 code generation looks worse, but I can't understand why it doesn't use
>>> a vector-side xor and load the mask vector-side. With that fixed up I think
>>> the code generation would look better - though as I say, I'm not an expert
>>> here...
>>>
>>> Bootstrapped on both aarch64-none-linux-gnu and x86_64 with no issues.
>>>
>>> OK for trunk?
>>>
>>> Thanks,
>>> James
>>>
>>> ---
>>> gcc/
>>>
>>> 2015-10-01  James Greenhalgh  <james.greenha...@arm.com>
>>>
>>>    * match.pd (mult (COPYSIGN:s real_onep @0) @1): New simplifier.
>>>
>>> gcc/testsuite/
>>>
>>> 2015-10-01  James Greenhalgh  <james.greenha...@arm.com>
>>>
>>>    * gcc.dg/tree-ssa/copysign.c: New.
>>>
>>> <0001-Patch-match.pd-Add-a-simplify-rule-for-x-copysign-1..patch>
>>
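
For readers following the thread, the three forms under discussion can be
written out in plain C roughly as below. This is only an illustrative sketch,
not the match.pd pattern from the patch: the function names are made up for
the example, and it assumes double is IEEE 754 binary64 (sign bit at
position 63).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: double and uint64_t have the same size. */
static_assert(sizeof(double) == sizeof(uint64_t), "expected 64-bit double");

/* Reference form: multiply x by +1.0 or -1.0, chosen by the sign of y. */
double mul_by_sign(double x, double y)
{
    return x * copysign(1.0, y);
}

/* Integer form: x ^ (y & sign_mask), done on the bit patterns.
   Flips the sign bit of x exactly when the sign bit of y is set. */
double mul_by_sign_bits(double x, double y)
{
    const uint64_t sign_mask = UINT64_C(1) << 63;
    uint64_t xi, yi;

    memcpy(&xi, &x, sizeof xi);   /* reinterpret the bits without aliasing issues */
    memcpy(&yi, &y, sizeof yi);
    xi ^= yi & sign_mask;
    memcpy(&x, &xi, sizeof x);
    return x;
}

/* The rejected alternative: overwrites the sign of x with the sign of y. */
double plain_copysign(double x, double y)
{
    return copysign(x, y);
}
```

With x = -1.0 and y = -1.0, mul_by_sign and mul_by_sign_bits both return 1.0,
while plain_copysign returns -1.0, which is the counterexample worked through
above. The sketch says nothing about NaNs or trapping behaviour, which is the
open question about restricting the transform to finite, non-trapping types.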