Re: Test Case to verify the support of VFPV3 and VFPV4

Jubi Taneja Tue, 09 Oct 2012 03:59:30 -0700

Hi Matt,

Thanks for sharing the information.


On Tue, Oct 9, 2012 at 3:51 PM, Matthew Gretton-Dann <
matthew.gretton-d...@linaro.org> wrote:

> On 9 October 2012 10:37, Jubi Taneja <jubitan...@gmail.com> wrote:
> > Hi All,
> >
> > I wanted to see the difference in objdump of an application where I can
> > make the difference between the VFPV3 and VFPV4 support. I tried enabling
> > the flag -mfpu=vfpv3 and -mfpu=vfpv4 for ARM Cortex A15 toolchain in my
> > test code but cannot see the difference in two objdumps.
>
> Try the following (tested against FSF GCC):
>
> /* arm-none-linux-gnueabi-gcc -mcpu=cortex-a15 -mfpu=vfpv4 -S -o-
> /tmp/fma.c -mfloat-abi=hard -O2 */
> float f(float a, float b, float c)
> {
>   return a * b + c;
> }
> /* end of fma.c */
>
> (Note that -mfloat-abi=softfp will also work in this example.  Which
> one you want to use depends on whether you have configured your system
> for hard or soft-float ABIs).
>

I checked with both -mfpu=vfpv3 and -mfpu=vfpv4, and they generate the same
assembly code: the VMLA instruction is emitted in both cases. I was wondering
whether there is a test case that would let me observe a difference between
the two objdumps.


> > According to my survey, the fused multiply and accumulate is the only
> > instruction that can create the difference in two. Can any one provide
> > the sample test code for the same? Precisely, I wish to see the
> > difference in performance for vfpv3 and vfpv4.
>
> I would be surprised if you see much difference at all.  VFPv3 has the
> VMLA (non-fused multiply-accumulate) instruction, which does an extra
> rounding-step,

Correct, I checked this.


> but I expect will have similar performance
> characteristics to VFMA.
>
Yes. Since the assembly code is the same in both cases, there cannot be any
performance difference as of now.
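
To make the extra rounding step concrete, here is a minimal host-side sketch
in C. It is not from this thread: the function names and input values are
illustrative assumptions, and it only models the rounding behaviour of a
non-fused versus a fused multiply-accumulate. It does not emit or depend on
the VMLA/VFMA instructions themselves.

```c
/* Illustrative sketch only: models rounding, not the ARM instructions.
   Function names and inputs are assumptions made for this example. */

/* VMLA-style: the product is rounded to float before the addition,
   so two roundings happen in total. */
float mla_unfused(float a, float b, float c)
{
    volatile float p = a * b;   /* volatile keeps the intermediate rounding */
    return p + c;
}

/* VFMA-style model: the product of two floats is exact in double
   (24 + 24 significand bits fit in 53), so the product itself is not
   rounded.  This agrees with a true fused operation for the inputs
   shown below, but is not guaranteed to agree for every input. */
float mla_fused_model(float a, float b, float c)
{
    double exact_product = (double)a * (double)b;
    return (float)(exact_product + (double)c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is
1 + 2^-11 + 2^-24. The non-fused version rounds that product to 1 + 2^-11
and returns 0, while the fused model keeps the low bit and returns 2^-24.
So the two forms can differ in the result even when their timing is similar.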

>
> Note that between -mfpu=vfpv3 and -mfpu=vfpv4 there is also
> -mfpu=vfpv3-fp16 which added support for loading and storing
> half-precision floating-point values.  Again this won't make a
> performance difference unless you use half-precision as your storage
> format.
>

I need to check this once.

Thanks,
Jubi

>
> Thanks,
>
> Matt
>
> --
> Matthew Gretton-Dann
> Linaro Toolchain Working Group
> matthew.gretton-d...@linaro.org
>

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
