Hi, the speedup of ISPC-vectorized code is very impressive: for SSE and AVX, many of the examples show near-ideal 4x and 8x scaling with the SIMD width.
So I'm very interested in the results for 16-wide AVX512 on the common ISPC examples (like aobench and the ray tracer). In theory, AVX512 should be the ideal hardware for ISPC, with a potential 16x speedup (the only limiting factor would be memory bandwidth). Am I missing something, or are the AVX512 results not reported in the "performance" paper? Does anyone have numbers that compare the speedup (especially with respect to 8-wide AVX2), or could someone build and run the examples on an AVX512-capable machine?

Thanks & Regards,
bb3141
