Re: LLVM ARM NEON VMUL.f32

Peter Maydell Wed, 20 Mar 2013 03:33:39 -0700

On 19 March 2013 21:56, Renato Golin <renato.go...@linaro.org> wrote:
> Basically, LLVM chooses to lower single-precision FMUL to NEON's VMUL.f32
> instead of VFP's version because, on some cores (A8, A5 and Apple's Swift),
> the VFP variant is really slow.
>
> This is all cool and dandy, but NEON is not IEEE 754 compliant, so the
> result is slightly different. So slightly that only one test, which was
> really pushing the boundaries (i.e. going below FLT_MIN), caught it.
>
> There are two ways we can go here:
>
> 1. Strict IEEE compatibility and *only* lower NEON's VMUL if unsafe-math is
> on. This will make generic single-prec. code slower but you can always turn
> unsafe-math on if you want more speed.
>
> 2. Continue using NEON for f32 by default and put a note somewhere that
> people should turn this option (FeatureNEONForFP) off on A5/A8 if they
> *really* care about maximum IEEE compliance.
>
> Apple already said that for Darwin, 2 is still the option of choice. Do we
> agree and ignore this issue? Or for GNU/EABI we want strict conformance by
> default?
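
For anyone wanting to reproduce the kind of difference described above, here
is a minimal sketch in C. It assumes the discrepancy comes from ARMv7 NEON
always running in flush-to-zero mode, so a product that falls below FLT_MIN
becomes zero instead of a subnormal. The helper names mul_f32 and
classify_tiny_product are made up for illustration and are not taken from the
failing test.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: multiply through volatile temporaries so the
 * compiler has to emit a real run-time multiply instead of folding
 * the product at compile time. */
float mul_f32(float a, float b)
{
    volatile float x = a;
    volatile float y = b;
    volatile float r = x * y;
    return r;
}

/* Hypothetical helper: FLT_MIN * 0.5f is exactly representable as a
 * subnormal float. An IEEE 754 multiply keeps it as a subnormal; a
 * flush-to-zero multiply (as assumed here for VMUL.f32 on ARMv7 NEON)
 * returns zero instead. */
const char *classify_tiny_product(void)
{
    float r = mul_f32(FLT_MIN, 0.5f);

    switch (fpclassify(r)) {
    case FP_SUBNORMAL:
        return "subnormal";   /* IEEE 754 gradual underflow */
    case FP_ZERO:
        return "zero";        /* result was flushed to zero */
    default:
        return "other";
    }
}
```

Calling classify_tiny_product() from a small main() and comparing builds that
lower the multiply to VFP against builds that lower it to NEON should show
whether the two paths disagree on a given core.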


This seems straightforward to me. You have a user-facing flag for
controlling whether you can deviate from IEEE 754 in the name of
performance (unsafe-math), so you should honour it. This has the
secondary advantage of following GCC's behaviour, and the primary
advantage of not being confusing or requiring people to use
architecture-specific feature flags just to get standard floating-point
behaviour.
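
As a rough sketch of that policy (option 1 in the quoted mail), the decision
could look like the following C. This is not LLVM code: the enum, the function
name and both parameters are hypothetical, and vfp_is_slow just stands in for
"the target is one of the cores where the VFP multiply is slow".

```c
#include <stdbool.h>

/* Hypothetical type: which unit a single-precision multiply goes to. */
typedef enum {
    LOWER_TO_VFP,   /* IEEE 754 compliant scalar multiply */
    LOWER_TO_NEON   /* faster on some cores, not IEEE 754 compliant */
} f32_mul_lowering;

/* Hypothetical decision function: NEON is chosen only when the user
 * has opted out of strict IEEE 754 behaviour via unsafe-math AND the
 * core is one where the VFP variant is slow. In every other case the
 * compliant VFP instruction is kept. */
f32_mul_lowering choose_f32_mul_lowering(bool unsafe_math, bool vfp_is_slow)
{
    if (unsafe_math && vfp_is_slow)
        return LOWER_TO_NEON;
    return LOWER_TO_VFP;
}
```

The point of the shape is that no architecture-specific flag appears in the
condition the user controls; only unsafe-math does.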

Anybody actually writing performance-critical code which uses 32-bit
floats can enable unsafe-math if it helps them.

-- PMM

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
