On 10/12/2011 09:24 AM, Jakub Jelinek wrote:
> BTW, I wonder if vector multiply expansion when one argument is VECTOR_CST
> with all elements the same shouldn't use something similar to what expand_mult
> does, not sure if in the generic code or at least in the backends.
> Testing the costs will be harder, maybe it could just test fewer algorithms
> and perhaps just count number of instructions or something similar.
> But certainly e.g. v32qi multiplication by 3 is quite costly
> (4 interleaves, 2 v16hi multiplications, 4 insns to select even from the
> two), while two vector additions (tmp = x + x; result = x + tmp;)
> would do the job.

It would certainly be a good thing to try to do this in the middle-end.

> 2011-10-12  Jakub Jelinek  <ja...@redhat.com>
>
> 	* config/i386/sse.md (vec_avx2): New mode_attr.
> 	(mulv16qi3): Macroize to cover also mulv32qi3 for
> 	TARGET_AVX2 into ...
> 	(mul<mode>3): ... this.

Ok.

r~
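
For illustration only, here is a scalar sketch of the identity behind the
quoted suggestion (tmp = x + x; result = x + tmp). It is not GCC code and
does not model the AVX2 expansion or its cost. The function names are
hypothetical, and a plain byte loop stands in for one v32qi vector, on the
assumption that each lane wraps modulo 256.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: multiply every byte by 3 using
   only two additions per element, wrapping modulo 256 like a QImode lane. */
void mul3_by_adds(const uint8_t *x, uint8_t *out, size_t n)
{
    assert(x != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        uint8_t tmp = (uint8_t)(x[i] + x[i]);   /* tmp = x + x */
        out[i] = (uint8_t)(x[i] + tmp);         /* result = x + tmp */
    }
}

/* Reference form: a direct multiplication by the constant 3. */
void mul3_direct(const uint8_t *x, uint8_t *out, size_t n)
{
    assert(x != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (uint8_t)(x[i] * 3);
}

/* Returns 1 if both forms agree for every possible byte value, else 0. */
int mul3_forms_agree(void)
{
    uint8_t x[256], a[256], b[256];
    for (int i = 0; i < 256; i++)
        x[i] = (uint8_t)i;
    mul3_by_adds(x, a, 256);
    mul3_direct(x, b, 256);
    for (int i = 0; i < 256; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Calling mul3_forms_agree() covers all 256 byte values, so it checks the
wraparound cases as well as the small ones.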