On 08/06/20 18:54, Joseph Myers wrote: > The x87 floating-point emulation of the fprem and fprem1 instructions > works via conversion to and from double. This is inherently > unsuitable for a good emulation of any floatx80 operation. This patch > series adapts the softfloat floatx80_rem implementation to be suitable > for these instructions and uses it to reimplement them. > > There is an existing test for these instructions, test-i386-fprem.c, > based on comparison of output. It produces 1679695 lines of output, > and before this patch series 415422 of those lines are different on > hardware from the output produced by QEMU. Some of those differences > are because QEMU's x87 emulation does not yet produce the "denormal > operand" exception; ignoring such differences (modifying the output > from a native run not to report that exception), there are still > 398833 different lines. This patch series reduces that latter number > to 1 (that one difference being because of missing checks for > floating-point stack underflow, another global issue with the x87 > emulation), or 35517 different lines without the correction for lack > of denormal operand exception support. > > Several fixes to and new features in the softfloat support for this > operation are needed; floatx80_mod, previously present in the m68k > code only, is made generic and unified with floatx80_rem in a new > floatx80_modrem of which floatx80_mod and floatx80_rem are thin > wrappers. The only architectures using float*_rem for other formats > are arm (FPA emulation) and openrisc (instructions that have been > removed in the latest architecture version); they do not appear to > need any of the new features, and all the bugs fixed are specific to > floatx80, so no changes are made to the remainder implementation for > those formats. > > A new feature added is returning the low bits of the quotient from > floatx80_modrem, as needed for both x87 and m68k. The logic used to > determine the low 7 bits of the quotient for m68k > (target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it > looks at the result of converting the remainder to integer, the > quotient having been discarded by that point); this patch series does > not change that to use the new interface, but the m68k maintainers may > wish to do so. > > The Intel instruction set documentation leaves unspecified the exact > number of bits by which the remainder instructions reduce the operand > each time. The AMD documentation gives a specific formula, which > empirically Intel processors follow as well, and that formula is > implemented in the code. The AMD documentation also specifies that > flags other than C2 are cleared in the partial remainder case, whereas > the Intel manual is silent on that (but the processors do appear to > clear those flags); this patch implements that flag clearing, and > keeps the existing flag clearing in cases where the instructions raise > "invalid" (although it seems hardware in fact only clears some but not > all flags in that case, leaving other flags unchanged). > > The Intel manuals include an inaccurate table asserting that (finite > REM 0) should raise "divide by zero"; actually, in accordance with > IEEE semantics, it raises "invalid". The AMD manuals inaccurately say > for both fprem and fprem1 that if the exponent difference is negative, > the numerator is returned unchanged, which is correct (apart from > normalizing pseudo-denormals) for fprem but not for fprem1 (and the > old QEMU code had an incorrect optimization following the AMD manuals > for fprem1). > > Changes in version 2 of the patch series: fix comment formatting and > combine patches 6 and 7. > > Joseph Myers (6): > softfloat: merge floatx80_mod and floatx80_rem > softfloat: fix floatx80 remainder pseudo-denormal check for zero > softfloat: do not return pseudo-denormal from floatx80 remainder > softfloat: do not set denominator high bit for floatx80 remainder > softfloat: return low bits of quotient from floatx80_modrem > target/i386: reimplement fprem, fprem1 using floatx80 operations > > fpu/softfloat.c | 87 ++++++++++++++++++---- > include/fpu/softfloat.h | 3 + > target/i386/fpu_helper.c | 156 ++++++++++++--------------------------- > target/m68k/softfloat.c | 83 --------------------- > target/m68k/softfloat.h | 1 - > 5 files changed, 122 insertions(+), 208 deletions(-) >
Queued, thanks. Paolo