Hi!

I've been studying the implementation of memcpy for 32-bit Arm cores in
glibc (which is also found in arm-optimized-routines and first appeared in
Linaro's cortex-strings project), and I came across a peculiar snippet:

#ifdef USE_VFP
        /* Magic dust alert!  Force VFP on Cortex-A9.  Experiments show
           that the FP pipeline is much better at streaming loads and
           stores.  This is outside the critical loop.  */
        vmov.f32        s0, s0
#endif

This seems to imply that this NOP-like instruction affects CPU state and makes
the vldr/vstr instructions that follow use different datapaths than they
otherwise would?  Can anyone shed more light on this, please?


I was able to trace the history of this code back to revision 100 in the
cortex-strings repository, where it appeared as part of a large rewrite by
Will Newton:

https://bazaar.launchpad.net/~linaro-toolchain-dev/cortex-strings/trunk/revision/100/src/linaro-a9/memcpy.S

The entire memcpy.S file in the Arm optimized-routines repository can be
found here:

https://github.com/ARM-software/optimized-routines/blob/master/string/arm/memcpy.S

Thanks!
Alexander
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-toolchain