Hi,

* regtested the vzip/vuzp patch
* looked into the big-endian build
* applied all the required patches and checked that Viterbi gets vectorized, giving a ~2x performance improvement (compiled with a cross-compiler)
* looked into the vld/vst implementation - mostly discussions with Richard
* DenBench analysis:
  - there are loops that should get vectorized with the vzip/vuzp patch, I'll check them next week
  - sad8_c (a hot function from mp4encode) needs reduction SLP (which I implemented several weeks ago) and the ability to step by an unknown stride in loop SLP - I am looking into this
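To illustrate the sad8_c item, here is a minimal sketch of the kind of loop in question. It is not the DenBench source: the name `sad8_sketch`, the signature, and the 8x8 block size are assumptions made for illustration. The eight unrolled statements accumulating into one sum are what reduction SLP would need to group, and the pointers advance by a `stride` that is only known at run time.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a sad8_c-style kernel (assumed shape, not the
   actual DenBench code): sum of absolute differences over an 8x8 block. */
unsigned sad8_sketch(const unsigned char *cur, const unsigned char *ref,
                     int stride)
{
    unsigned sad = 0;

    for (int row = 0; row < 8; row++) {
        /* Eight unrolled statements feeding a single accumulator:
           a candidate group for reduction SLP. */
        sad += abs(cur[0] - ref[0]);
        sad += abs(cur[1] - ref[1]);
        sad += abs(cur[2] - ref[2]);
        sad += abs(cur[3] - ref[3]);
        sad += abs(cur[4] - ref[4]);
        sad += abs(cur[5] - ref[5]);
        sad += abs(cur[6] - ref[6]);
        sad += abs(cur[7] - ref[7]);

        /* The step between rows is a run-time value, so the accesses in
           consecutive iterations are separated by an unknown stride. */
        cur += stride;
        ref += stride;
    }
    return sad;
}
```

Within one iteration the eight loads from each pointer are contiguous, so the difficulty is only in moving from one row to the next.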
Ira

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain