On 13.01.2017 18:53, Jason Ekstrand wrote:
On Fri, Jan 13, 2017 at 8:43 AM, Marek Olšák <[email protected]> wrote:

    On Fri, Jan 13, 2017 at 5:25 PM, Jason Ekstrand <[email protected]> wrote:
    > On Fri, Jan 13, 2017 at 4:05 AM, Marek Olšák <[email protected]> wrote:
    >>
    >> On Fri, Jan 13, 2017 at 3:37 AM, Ilia Mirkin <[email protected]> wrote:
    >> > On Thu, Jan 12, 2017 at 9:13 PM, Jason Ekstrand <[email protected]> wrote:
    >> >> Unless, of course, it's controlled by the same hardware bit...
    >> >> Clearly, we can give you abs on rsq without denorm flushing
    >> >> (easy shader hacks) but not the other way around.
    >> >
    >> > OK, so somehow I missed that earlier. However there's an interesting
    >> > section in the PRM:
    >> >
    >> > https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-skl-vol07-3d_media_gpgpu.pdf
    >> >
    >> > on PDF page 854, "Dismissed Legacy Behaviors" which has a list of
    >> > suggested IEEE 754 deviations for DX9. One of them is indeed that
    >> > 0 * x = 0, but another is that input NaNs be propagated with certain
    >> > exceptions. Also they suggest that RCP(0)/RSQ(0) = fmax. Interesting.
    >> >
    >> > So at this point, the zero_wins thing is pretty much blown. i965
    >> > appears to have an all-or-nothing approach, and additionally that
    >> > approach doesn't match up exactly to what NVIDIA does (or at least
    >> > I'm not aware of a clamp-everything mode).
    >> >
    >> > This will take some thought to figure out how something can be
    >> > specified so that a single spec works for both i965 and nv/amd. OTOH
    >> > we could have two different specs that just expose different things -
    >> > e.g. i965 could expose a MESA_shader_float_alt_mode or whatever which
    >> > is spec'd to do the things that the PRM says, and nv/amd have the
    >> > MESA_shader_float_zero_wins ext which does what we were talking about
    >> > earlier.
    >> >
    >> > I'm open to other suggestions too.
    >>
    >> There is also the "small" problem that it would take a non-trivial
    >> effort for us on the LLVM side. You guys can flip a switch. We can't.
    >
    >
    > Don't you have to expend that effort for ARB programs anyway?  I thought
    > they weren't supposed to generate NaN either.

    No, we don't, because st/mesa adds abs before RSQ and the driver
    implements POW as log+mul+exp, where mul follows the rule
    0*anything=0. I don't think any other opcode follows that rule though.


Ah.  That makes sense.  Do you also implement DIV as MUL+RCP?

For single-precision, yes. For double-precision, it seems we need to move away from that due to precision issues (which is itself a bit odd, since you don't seem to have encountered that?).
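
To illustrate why the MUL+RCP lowering costs precision, here is a minimal C sketch. The helper names div_via_rcp and count_mismatches are hypothetical and this is not the driver's code. It assumes IEEE 754 single precision with a correctly rounded divide and multiply, no fast-math, and it models RCP as a correctly rounded 1/b, which real hardware may not provide. The lowering rounds twice (once for the reciprocal, once for the product), so the result can land one step away from a single correctly rounded division. The same double-rounding effect exists in double precision.

```c
/* Hypothetical helper: DIV lowered to MUL + RCP, rounding after each step. */
static float div_via_rcp(float a, float b)
{
    float rcp = 1.0f / b;   /* RCP: first rounding to single precision */
    return a * rcp;         /* MUL: second rounding */
}

/*
 * Count integer pairs (a, b) in [1, n] x [1, n] for which the lowering
 * disagrees with a single correctly rounded division.
 */
static int count_mismatches(int n)
{
    int mismatches = 0;
    for (int a = 1; a <= n; a++) {
        for (int b = 1; b <= n; b++) {
            float exact = (float)a / (float)b;
            float lowered = div_via_rcp((float)a, (float)b);
            if (exact != lowered)
                mismatches++;
        }
    }
    return mismatches;
}
```

Under those assumptions, 5.0f / 3.0f and div_via_rcp(5.0f, 3.0f) differ in the last bit, while power-of-two divisors such as 8.0f / 2.0f come out exact either way.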

Nicolai

 If so,
the two of those should take care of NaN getting generated in the
shader.  We'd still have to do something about inf and maybe denorms.

_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
