------- Comment #6 from rguenther at suse dot de  2010-06-26 09:18 -------
Subject: Re:  [4.6 Regression] c-c++-common/torture/complex-sign-add.c
On Fri, 25 Jun 2010, sje at cup dot hp dot com wrote:

> ------- Comment #5 from sje at cup dot hp dot com  2010-06-25 22:47 -------
> I verified that this works in r160902 and fails in 160903.

Thanks for further investigating.

> In 160902 I get this (partial) pseudo-code:
>
>   IMAGPART_EXPR <a1> = 0.0;
>   D.1749_4 = -0.0;
>   IMAGPART_EXPR <b1> = D.1749_4;
>   D.1760_12 = IMAGPART_EXPR <a1>;
>   D.1762_14 = IMAGPART_EXPR <b1>;
>   D.1764_16 = D.1760_12 + D.1762_14;
>   IMAGPART_EXPR <c1> = D.1764_16;
>   D.1754_9 = IMAGPART_EXPR <c1>;
>   D.1755_10 = (double) D.1754_9;
>   printf (&"%f\n"[0], D.1755_10);
>
> In 160903 I get:
>
>   b1$imag_4 = -0.0;
>   c1$imag_10 = b1$imag_4 + 0.0;
>   D.1749_7 = c1$imag_10;
>   D.1750_8 = (double) D.1749_7;
>   printf (&"%f\n"[0], D.1750_8);
>
> I am not sure if it is significant that in the first one I have:
>
>   D.1764_16 = 0.0 + (- 0.0)
>
> and in the second one I have:
>
>   c1$imag_10 = (- 0.0) + 0.0
>
> I.e. the order of the -0.0 is different.

Even floating-point addition is commutative.  The difference arises
because we canonicalize 0.0 + b1$imag_4 to have the constant in
second place.

> Looking at the assembly code in 160902 I see (paraphrasing):
>
>   fmov f8 = f0
>   fneg f6 = f0
>   fadd.s f6 = f8, f6
>
> and in 160903 I see:
>
>   fneg f6 = f0
>   fadd.s f6 = f6, f0

I suppose f0 is a special register for 0.0?  Googling around finds
me references saying that f0 is treated specially wrt zero signs, at
least when used in fmadd operations.

> Changing the new code by hand to swap the arguments to fadd around does
> seem to fix things in the new code.  But, oddly enough, if I swap the
> arguments around in the old (good) code it doesn't break, I am not sure why.

Which would hint that this is exactly the problem.  Thus, with
-fsigned-zeros we cannot use the f0 register in arithmetic but have
to copy it to a regular register first?

So, does

  fneg f6 = f0
  fmov f8 = f0
  fadd.s f6 = f6, f8

work?

Richard.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44583