There are two options. (1) File a bug and wait until we fix it. This optimization is generally the compiler's responsibility. It was actually implemented at some point, but it doesn't work in most cases, and we are planning to fix it. So one more test case and a reminder in the bug tracker are always good.
And option (2): try to use the shuffle() intrinsics to specify the desired behaviour.

Dmitry.

On Tue, Apr 9, 2019 at 7:31 AM Lyubomir Kozlovski <[email protected]> wrote:
> I'm computing 2 floats per lane and want to store them interleaved, like so:
>
>     foreach (i = 0 ... _n)
>     {
>         // do some work to compute data0, data1...
>
>         *(_b->m_data + 2 * i + 0) = data0;
>         *(_b->m_data + 2 * i + 1) = data1;
>     }
>
> Resulting in a scatter.
>
> Using C++ vector intrinsics (_mm256_unpacklo_ps, _mm256_unpackhi_ps, 2x _mm256_permute2f128_ps) I can interleave data0 and data1 and generate vector stores.
>
> Is there a way to do this in ISPC?
>
> --
> You received this message because you are subscribed to the Google Groups "Intel SPMD Program Compiler Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
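
For illustration, option (2) might look roughly like the sketch below. It is not taken from the thread: the function name interleave2_sketch and the parameters out, in0, in1 and n are hypothetical, the loads from in0/in1 stand in for "do some work to compute data0, data1", and it assumes n is a multiple of programCount so that every lane is active when shuffle() is called. It relies on the two-vector form of shuffle() from the ISPC standard library, where permutation indices 0 .. programCount-1 select from the first argument and programCount .. 2*programCount-1 select from the second. Whether this compiles to unpack/permute instructions rather than a generic permute depends on the target and compiler version, so the generated assembly is worth checking.

```ispc
// Sketch only: names are hypothetical, n is assumed to be a multiple of programCount.
export void interleave2_sketch(uniform float out[],
                               const uniform float in0[],
                               const uniform float in1[],
                               uniform int n)
{
    for (uniform int base = 0; base < n; base += programCount) {
        // Stand-ins for the per-lane results data0 and data1.
        float data0 = in0[base + programIndex];
        float data1 = in1[base + programIndex];

        // Lane i of an interleaved vector takes element i/2 from data0
        // when i is even and from data1 when i is odd.
        int half = programIndex >> 1;
        int sel  = (programIndex & 1) * programCount; // 0 -> data0, programCount -> data1

        // lo holds the interleaved first half of the gang, hi the second half.
        float lo = shuffle(data0, data1, sel + half);
        float hi = shuffle(data0, data1, sel + half + programCount / 2);

        // Both stores use a uniform base plus programIndex, so they are
        // contiguous vector stores rather than scatters.
        out[2 * base + programIndex] = lo;
        out[2 * base + programCount + programIndex] = hi;
    }
}
```

With an 8-wide gang, lo comes out as data0[0], data1[0], data0[1], data1[1], ... and hi continues from element 4, which matches the layout written by the two strided stores in the original loop. A remainder of n that does not fill a whole gang would still need separate handling, for example with the original scatter form.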
