On December 8, 2017 5:45:23 PM GMT+01:00, Jan Hubicka <hubi...@ucw.cz> wrote: >> On December 8, 2017 4:51:16 PM GMT+01:00, Jan Hubicka ><hubi...@ucw.cz> wrote: >> >> >> >> This restores the vec_construct cost dependence on the vector >element >> >> count. Honza removed this (accidentially?) during the rework. >> >> >> >> Bootstrap / regtest running on x86_64-unknown-linux-gnu, ok for >> >trunk? >> > >> >Hmm, the false parameter to ix86_vec_cost is supposed to do that. >It >> >uses: >> > >> >if (!parallel) > >> > >> >return cost * GET_MODE_NUNITS (mode); > >> > >> > >> >Why it doesn't work? >> >> I see. Not exactly the same. I guess we need to see why the costs >favor a 16 element vector in it plus store over 16 scalar stores >then... > >Isn't it what you proposed as profitable for Martin Jambor's copy by >pieces >issue? I think the reason is that stores are modeled as having latency >(or >cost/2) of 4. Construction is modelled as simple sse op (latency >1)*number of >parts.
Yes. >So in this simplified model we cummulate latency of 16 stores as 16*4, >while construction plus one store as 16*1+4. But vector vs. Scalar cost somehow mess up for avx512... >Honza >> >> See the PR. >> >> > >> >Honza >> >> >> >> Thanks, >> >> Richard. >> >> >> >> 2017-12-08 Richard Biener <rguent...@suse.de> >> >> >> >> PR target/83008 >> >> * config/i386/i386.c (ix86_builtin_vectorization_cost): Restore >> >> vec_construct dependence on vector element count. >> >> >> >> Index: gcc/config/i386/i386.c >> >> >=================================================================== >> >> --- gcc/config/i386/i386.c (revision 255499) >> >> +++ gcc/config/i386/i386.c (working copy) >> >> @@ -44879,7 +44879,8 @@ ix86_builtin_vectorization_cost (enum ve >> >> ix86_cost->sse_op, true); >> >> >> >> case vec_construct: >> >> - return ix86_vec_cost (mode, ix86_cost->sse_op, false); >> >> + return (ix86_vec_cost (mode, ix86_cost->sse_op, false) >> >> + * (TYPE_VECTOR_SUBPARTS (vectype) - 1)); >> >> >> >> default: >> >> gcc_unreachable ();