On Fri, Oct 17, 2014 at 04:28:12PM +0400, Kirill Yukhin wrote: > > I wonder whether for these modes it can ever be beneficial to build them > > through interleaves/concatenations etc., if it wouldn't be better to build > > them by storing all values into memory and just reading it back. > I've tried this example: > #include <immintrin.h> > > unsigned char a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, > a15, a16, a17, a18, a19, a20, a21, a22, a23, a24, a25, a26, a27, a28, a29, > a30, a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, a41, a42, a43, a44, > a45, a46, a47, a48, a49, a50, a51, a52, a53, a54, a55, a56, a57, a58, a59, > a60, a61, a62, a63; > > __m512i foo () > { > return __extension__ (__m512i)(__v64qi){ > a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, > a15, a16, a17, a18, a19, a20, a21, a22, a23, a24, a25, a26, a27, a28, > a29, > a30, a31, a32, a33, a34, a35, a36, a37, a38, a39, a40, a41, a42, a43, > a44, > a45, a46, a47, a48, a49, a50, a51, a52, a53, a54, a55, a56, a57, a58, > a59, > a60, a61, a62, a63 }; > } > > w/ and w/o -mavx512bw (and always -mavx512f). > > When, this code works, we've got 127 lines of assembly to do this init. > W/o AVX-512BW we've got > 300 lines of code (mostly on GPRs, using sal, and > etc.) > > Then I've looked into actual assembly w/ -mavx512bw and it turns out that no > AVX-512BW insn were generated, only AVX-512F (and below). Fixed iterator.
Ok, if it is shorter than copying all those into memory and reading from memory, so be it. > > > -(define_mode_iterator VI48F_512 [V16SI V16SF V8DI V8DF]) > > > +(define_mode_iterator VI48F_I12_AVX512BW > > > + [V16SI V16SF V8DI V8DF > > > + (V32HI "TARGET_AVX512BW") (V64QI "TARGET_AVX512BW")]) > > > > What does the I12 stand for? Wasn't it meant to be VI48F_512_AVX512BW > > or I512? > Actually, I am not awere of any name convention for iterators. > As far as I understand, name [more or less] for vector mode > should reflect: > - Type family of the unit: float or int > - Size of the unit: 1, 2, 4 etc. bytes > - If possible, target predicates to enable certain modes in > given iterator. > > The name is: > - Vector (V) > - I48F - contains both ints and floats of size 4 and 8 > - I12 - contains ints of size 1 and 2 > - AVX512BW - affected by the target (according to previous note - to be > removed) > > Maybe it'll be better to name it: VF48_I1248? I'll leave that to Uros, the patch is ok by me. Jakub