Re: RISC-V: Use convert instructions instead of calling library functions

Jeff Law Wed, 20 Mar 2024 11:54:50 -0700



On 3/19/24 10:23 AM, Palmer Dabbelt wrote:

On Mon, 18 Mar 2024 20:50:14 PDT (-0700), jeffreya...@gmail.com wrote:



On 3/18/24 3:09 AM, Jivan Hakobyan wrote:

As RV has round instructions it is reasonable to use them instead of
calling the library functions.

With my patch for the following C code:
double foo(double a) {
     return ceil(a);
}

GCC generates the following ASM code (before it was tail call)
foo:
         fabs.d  fa4,fa0
         lui     a5,%hi(.LC0)
         fld     fa3,%lo(.LC0)(a5)
         flt.d   a5,fa4,fa3
         beq     a5,zero,.L3
         fcvt.l.d a5,fa0,rup

I'm not sure exactly what context this is in, but my reading of"according to the current rounding mode" means we'd usually use thedynamic rounding mode.

As Andrew W. noted, we're dealing with ceil and thus rup is the correctrounding mode to use here.

My only worry here is that when we were doing the other patterns wedecided not to do rint. I can't remember exactly why, but from readingthe docs we probably just skipped it due to the inexact handling and Zfahaving an instruction that just does this. FP stuff is always a bit ofa time sink, so at that point it probably just fell off the priority list.

rint is supposed to raise FE_INEXACT, so it's actually a good match forRISC-V fcvt semantics as they appropriately raise FE_INEXACT.

nearby* do not arise FE_INEXACT and thus would rely on the new Zfainstructions where we have ones that do not raise FE_INEXACT or theyneed to be conditional on flag_fp_int_builtin_inexact. One couldreasonably argue that when flag_fp_int_builtin_inexact is enabled that acall to nearby* ought to be converted into a call to rint*.

I'm not really an FP guy, so I usually just poke around what the otherports generate and try to figure out what's going on. arm64 has the Zfainstruction and x86 FP is complicated, so I'm not sure exactly who elseto look at for this sort of stuff. From just looking at the code,though, I think there's two issues -- I'm not really an FP person,though, so take this with a grain of salt:

Right. And the condition under which we use the new sequence forceil/round actually borrows from x86. Essentially we only use the newsequence when we've been told we don't care about FE_INEXACT or fpexceptions in general.

IIUC what we've got here doesn't actually set the inexact flag for thebounds check failure case, as we're just loading up an exact constantand doing the bounds check. We're also not clamping to INT_MAX-typevalues, but not sure if we're supposed to. I think we could fix both ofthose by adjusting the expansion to something like

The state of FE_INEXACT is a don't care here due the condition on theexpansion code.


              fabs.d  fa4,fa0
              lui     a5,%hi(.LC0)
              fld     fa3,%lo(.LC0)(a5)
              flt.d   a5,fa4,fa3
              bne     a5,zero,.L3
              mv      fa0, fa3
     .L3:
              fcvt.l.d a5,fa0,rup
              fcvt.d.l        fa4,a5
              fsgnj.d fa0,fa4,fa0
              ret

and then adjusting the constant to be an epsilon larger than INT_MAX soit'll still trigger the clamping but also inexact.

I think Jivan's sequence is more correct. It's not just INT_MAX herethat's concerning, there's a whole class of values that cause problems.

There's also a pair of changes to the ISA in 2020 that added theconversion inexact handling requirement, it was a grey area before. Idon't remember exactly what happened there, but I did remember ithappening. I don't think anyone cares all that much about theperformance of systems that target the older ISAs, so maybe we justrestrict the non-libcall expansion to ISAs that contain the new wording?

I think all this got sufficiently cleaned up. The spec is explicitabout when FE_INEXACT gets raised on the fcvt instructions. I referredto it repeatedly when analyzing Jivan's work.

We can hash through the final items in a few weeks once the trunkre-opens for development.


Jeff

Re: RISC-V: Use convert instructions instead of calling library functions

Reply via email to