Hi,

> Anyway, I think this explains why the non-SMS loop executes more
> quickly than GCC expects, and why the SMS loop is slower than it
> needs to be. It might be worth comparing the two loops with
> -mtune=cortex-a8.
Thanks for the detailed explanation! I see this regression on cortex-a8 as well. Also, the SMS dumps still show a delay of 9 between the accumulators when running with -mtune=cortex-a8 -mcpu=cortex-a8.

Thanks,
Revital

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain