Hi Jakub!

On 21 Mar 21:16, Jakub Jelinek wrote:
> The ix86_expand_vecop_qihi function has been adjusted for AVX512* just
> by changing i < 32 to i < 64 (where both were sometimes wasteful), but
> for !full_interleave that is even wrong, swapping the second and third
> quarter is something that works to undo AVX256 unpacks only,
> where we want
> 0,2,4,6,8,10,12,14,32,34,36,38,40,42,44,46,16,18,20,22,24,26,28,30,48,50,52,54,56,58,60,62,
> permutation. But, for AVX512 we want
> 0,2,4,6,8,10,12,14,64,66,68,70,72,74,76,78,16,18,20,22,24,26,28,30,80,82,84,86,88,90,92,94,32,34,36,38,40,42,44,46,96,98,100,102,104,106,108,110,48,50,52,54,56,58,60,62,112,114,116,118,120,122,124,126
> where the current trunk code has been producing
> 0,2,4,6,8,10,12,14,32,34,36,38,40,42,44,46,16,18,20,22,24,26,28,30,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,96,98,100,102,104,106,108,110,80,82,84,86,88,90,92,94,112,114,116,118,120,122,124,126
> instead.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

Your patch is OK. I'd only suggest adding a comment to this calculation:

+ d.perm[i] = ((i * 2) & 14) + ((i & 8) ? d.nelt : 0) + (i & ~15);

--
Thanks, K
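
P.S. To illustrate what such a comment could say, here is a minimal standalone sketch of the calculation, outside of GCC. The helper name fill_perm and the plain unsigned char array are hypothetical and used only for illustration; nelt stands in for d.nelt and is assumed to be 32 or 64. The per-term comments are my reading of the expression, not text from the patch.

```c
#include <stdio.h>

/* Hypothetical helper, not part of GCC: fill perm[0..nelt-1] with the
   selector computed by the expression from the patch.  nelt is assumed
   to be 32 (V32QI) or 64 (V64QI).  Indices >= nelt refer to the second
   permutation operand.  */
static void
fill_perm (unsigned char *perm, unsigned nelt)
{
  for (unsigned i = 0; i < nelt; ++i)
    perm[i] = ((i * 2) & 14)          /* even byte within a 16-byte lane */
              + ((i & 8) ? nelt : 0)  /* upper 8 bytes of each lane come
                                         from the second operand */
              + (i & ~15u);           /* stay in the same 16-byte lane */
}

/* Print the selector as a comma-separated list for comparison with the
   sequences quoted above.  */
static void
print_perm (unsigned nelt)
{
  unsigned char perm[64];

  fill_perm (perm, nelt);
  for (unsigned i = 0; i < nelt; ++i)
    printf ("%u%s", (unsigned) perm[i], i + 1 < nelt ? "," : "\n");
}
```

Calling print_perm (64) from a small main should give the sequence the patch description wants for AVX512, and print_perm (32) the one wanted for the 256-bit case, since each 16-byte lane is handled independently instead of swapping quarters of the whole vector.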