http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52258

             Bug #: 52258
           Summary: __builtin_isgreaterequal is sometimes signaling on ARM
    Classification: Unclassified
           Product: gcc
           Version: 4.6.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: aurel...@aurel32.net
              Host: armv7l-unknown-linux-gnueabihf
            Target: armv7l-unknown-linux-gnueabihf
             Build: armv7l-unknown-linux-gnueabihf


__builtin_isgreaterequal is supposed to be non-signaling when one of its
inputs is a qNaN. It behaves correctly when it is used on its own, but when it
is combined with another test, it sometimes raises an invalid FP exception at
-O1 and above.

For example:
#include <math.h>   /* for isnan */

int sel_fmax (double x, double y)
{
  return __builtin_isgreaterequal(x, y) || isnan(y);
}


At -O0, the corresponding assembly code is:
00000000 <sel_fmax>:
   0:   b580            push    {r7, lr}
   2:   b084            sub     sp, #16
   4:   af00            add     r7, sp, #0
   6:   ed87 0b02       vstr    d0, [r7, #8]
   a:   ed87 1b00       vstr    d1, [r7]
   e:   ed97 6b02       vldr    d6, [r7, #8]
  12:   ed97 7b00       vldr    d7, [r7]
  16:   eeb4 6b47       vcmp.f64        d6, d7
  1a:   eef1 fa10       vmrs    APSR_nzcv, fpscr
  1e:   bfac            ite     ge
  20:   2300            movge   r3, #0
  22:   2301            movlt   r3, #1
  24:   b2db            uxtb    r3, r3
  26:   f083 0301       eor.w   r3, r3, #1
  2a:   b2db            uxtb    r3, r3
  2c:   2b00            cmp     r3, #0
  2e:   d106            bne.n   3e <sel_fmax+0x3e>
  30:   ed97 0b00       vldr    d0, [r7]
  34:   f7ff fffe       bl      0 <__isnan>
                        34: R_ARM_THM_CALL      __isnan
  38:   4603            mov     r3, r0
  3a:   2b00            cmp     r3, #0
  3c:   d002            beq.n   44 <sel_fmax+0x44>
  3e:   f04f 0301       mov.w   r3, #1
  42:   e001            b.n     48 <sel_fmax+0x48>
  44:   f04f 0300       mov.w   r3, #0
  48:   4618            mov     r0, r3
  4a:   f107 0710       add.w   r7, r7, #16
  4e:   46bd            mov     sp, r7
  50:   bd80            pop     {r7, pc}
  52:   bf00            nop


At -O1, the corresponding assembly code is:
00000000 <sel_fmax>:
   0:   b508            push    {r3, lr}
   2:   eeb4 0bc1       vcmpe.f64       d0, d1
   6:   eef1 fa10       vmrs    APSR_nzcv, fpscr
   a:   da07            bge.n   1c <sel_fmax+0x1c>
   c:   eeb0 0b41       vmov.f64        d0, d1
  10:   f7ff fffe       bl      0 <__isnan>
                        10: R_ARM_THM_CALL      __isnan
  14:   3000            adds    r0, #0
  16:   bf18            it      ne
  18:   2001            movne   r0, #1
  1a:   bd08            pop     {r3, pc}
  1c:   f04f 0001       mov.w   r0, #1
  20:   bd08            pop     {r3, pc}
  22:   bf00            nop


Note how the vcmp.f64 is changed into a vcmpe.f64. vcmp.f64 raises the invalid
exception only for a signaling NaN, whereas vcmpe.f64 raises it for any NaN
operand, including a qNaN. This means that many of the FP functions in the GNU
libc raise an invalid exception where they should not, which makes FP
exceptions unusable on ARM.
