On 19 March 2013 21:56, Renato Golin <renato.go...@linaro.org> wrote:
> Basically, LLVM chooses to lower single-precision FMUL to NEON's VMUL.f32
> instead of VFP's version because, on some cores (A8, A5 and Apple's Swift),
> the VFP variant is really slow.
>
> This is all cool and dandy, but NEON is not IEEE 754 compliant, so the
> result is slightly different. So slightly that only one test, that was
> really pushing the boundaries (ie. going below FLT_MIN) did catch it.
>
> There are two ways we can go here:
>
> 1. Strict IEEE compatibility and *only* lower NEON's VMUL if unsafe-math is
> on. This will make generic single-prec. code slower but you can always turn
> unsafe-math on if you want more speed.
>
> 2. Continue using NEON for f32 by default and put a note somewhere that
> people should turn this option (FeatureNEONForFP) off on A5/A8 if they
> *really* care about maximum IEEE compliance.
>
> Apple already said that for Darwin, 2 is still the option of choice. Do we
> agree and ignore this issue? Or for GNU/EABI we want strict conformance by
> default?
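
For concreteness, the kind of difference described above (a product that lands below FLT_MIN) can be sketched in a few lines of Python. This is only an emulation, not LLVM or ARM code: the helper names to_f32, flush, fmul_ieee and fmul_ftz are made up for illustration, and the flush-to-zero model (subnormal inputs and results replaced by a zero of the same sign) is a simplifying assumption about how NEON behaves.

```python
import math
import struct

# Smallest positive *normal* binary32 value (C's FLT_MIN).
FLT_MIN = 2.0 ** -126


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush(x):
    """Assumed flush-to-zero model: replace a subnormal by a signed zero."""
    if x != 0.0 and abs(x) < FLT_MIN:
        return math.copysign(0.0, x)
    return x


def fmul_ieee(a, b):
    """Single-precision multiply with IEEE 754 gradual underflow.

    The product of two binary32 values is exact in binary64, so the
    only rounding happens in the final conversion back to binary32.
    """
    return to_f32(to_f32(a) * to_f32(b))


def fmul_ftz(a, b):
    """Same multiply, with subnormal inputs and results flushed to zero."""
    return flush(fmul_ieee(flush(to_f32(a)), flush(to_f32(b))))


if __name__ == "__main__":
    a, b = FLT_MIN, 0.5
    print("IEEE 754:      ", fmul_ieee(a, b))  # a subnormal, 2**-127
    print("flush-to-zero: ", fmul_ftz(a, b))   # zero
```

For ordinary operands such as 1.5 * 2.0 the two functions agree; they only diverge once an input or the result falls into the subnormal range.
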
This seems straightforward to me. You have a user-facing flag for
controlling whether you can deviate from IEEE 754 in the name of
performance (unsafe-math), so you should honour it. This has the
secondary advantage of following gcc behaviour, and the primary
advantage of not being confusing or requiring people to use
architecture-specific feature flags just to get standard fp behaviour.
Anybody actually using 32-bit floats in performance-critical code can
apply unsafe-math if it helps them.

-- PMM

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain