https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96371

            Bug ID: 96371
           Summary: [nvptx] frounding-math support
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Floating-point ops such as div:
...
(define_insn "div<mode>3"
  [(set (match_operand:SDFM 0 "nvptx_register_operand" "=R")
        (div:SDFM (match_operand:SDFM 1 "nvptx_register_operand" "R")
                  (match_operand:SDFM 2 "nvptx_nonmemory_operand" "RF")))]
  ""
  "%.\\tdiv%#%t0\\t%0, %1, %2;")
...
contain a '%#' directive which, according to nvptx_print_operand, has the
semantics:
...
   # -- print a rounding mode for the instruction                              
              ...
but which is hardcoded to .rn (round to nearest):
...
  else if (code == '#')
    {
      fputs (".rn", file);
      return;
    }
...

According to the GCC wiki ( https://gcc.gnu.org/wiki/FloatingPointMath ), round
to nearest is the rounding mode for div by default, but when -frounding-math is
specified, that can no longer be assumed.

The way this normally works is that a cpu has a status register describing the
current state of rounding mode.  By specifying -frounding-math, we make sure
the compiler makes no assumptions about rounding mode, such that the status
register will take effect at runtime. And at runtime, we use a libc function
from fenv.h to manipulate the status register.

Nvptx has no such status register.

Newlib has fenv.h support since version 3.2.0 (Jan 2020), but the nvptx port
has no implementation.  It could add one, implementing a fake status register
(perhaps there is another architecture that has something similar), which could
then be tested in the assembly for div<mode>3 to determine whether to execute
div.rn, div.rz, div.rm or div.rp.
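
To make the idea a bit more concrete, here is a minimal host-side sketch of
such a fake status register.  All names (fake_rounding_mode, fake_fesetround,
fake_fegetround, ptx_div_rounding_suffix) are hypothetical and only model the
selection logic; in an actual port the test would have to happen in the
generated PTX, and newlib would provide the real fesetround:

```c
#include <fenv.h>

/* Hypothetical software "status register"; nvptx has no hardware one.  */
static int fake_rounding_mode = FE_TONEAREST;

/* Hypothetical stand-in for fesetround: returns 0 on success and
   nonzero if MODE is not a known rounding mode.  */
int
fake_fesetround (int mode)
{
  switch (mode)
    {
    case FE_TONEAREST:
    case FE_TOWARDZERO:
    case FE_DOWNWARD:
    case FE_UPWARD:
      fake_rounding_mode = mode;
      return 0;
    default:
      return 1;
    }
}

/* Hypothetical stand-in for fegetround.  */
int
fake_fegetround (void)
{
  return fake_rounding_mode;
}

/* Return the PTX rounding modifier that div would have to use for the
   current fake rounding mode.  */
const char *
ptx_div_rounding_suffix (void)
{
  switch (fake_rounding_mode)
    {
    case FE_TOWARDZERO:
      return ".rz";
    case FE_DOWNWARD:
      return ".rm";
    case FE_UPWARD:
      return ".rp";
    default:
      return ".rn";
    }
}
```

After fake_fesetround (FE_UPWARD), ptx_div_rounding_suffix () yields ".rp",
which corresponds to executing div.rp instead of the hardcoded div.rn.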

The standalone implementation only supports scalar execution, so we only need a
scalar status register, but in the offloading and parallel context, each thread
may have set a different rounding mode, so we'll need thread-specific status
registers.  Perhaps that's too expensive, and we'll have to restrict fesetround
to constant arguments (which I guess will be the case anyway for typical
numerical code).

Anyway, in the absence of all this, there's no way to set the rounding mode
without fenv.h support, meaning that we can assume the default rounding mode,
as the current implementation of "div<mode>3" does.  OTOH, we don't take that
assumption any further; for instance, we don't ignore -frounding-math.

It would be nice if we'd warn about making that assumption when emitting a div
with .rn hardcoded while -frounding-math is in effect, something like:
...
Assuming fenv.h not supported, so using default rounding mode for float op.
...
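
A standalone model of what the '#' case could then look like is sketched
below.  This is not the nvptx.c code: flag_rounding_math_model is a made-up
variable standing in for the compiler's -frounding-math state, the
diagnostic is written with plain fputs rather than GCC's warning machinery,
and the function name and return value are assumptions for illustration:

```c
#include <stdio.h>

/* Stand-in for the compiler's -frounding-math state (hypothetical).  */
int flag_rounding_math_model = 0;

/* Model of the '#' operand code: always print ".rn" to OUT, but write a
   diagnostic to DIAG when -frounding-math is in effect.  Returns 1 if a
   warning was issued, 0 otherwise.  */
int
print_rounding_modifier (FILE *out, FILE *diag)
{
  int warned = 0;

  if (flag_rounding_math_model)
    {
      fputs ("warning: assuming fenv.h not supported, so using default"
	     " rounding mode for float op\n", diag);
      warned = 1;
    }

  fputs (".rn", out);
  return warned;
}
```

In a real implementation one would probably want to issue the warning once
per translation unit rather than once per emitted instruction.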

Or, we could just error out when -frounding-math is specified, or when
encountering a float op with -frounding-math in effect, or some such.
