https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554
--- Comment #11 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 7 Dec 2021, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554
>
> --- Comment #10 from Hongtao.liu <crazylht at gmail dot com> ---
> Got it, thanks for your detailed explanation. So there are 2 issues in this
> case: first, the x86 target didn't choose the vector size with the smallest
> cost; second, BB vectorization with gaps at the end of a load is not supported.
>
> On the other hand, if "BB vectorization with gaps at the end of a load is not
> supported", the scalar version should be cheaper than both the 128-bit and
> 256-bit vectorization. I once tried to increase the cost of vec_construct to
> make it more realistic, but the patch regressed PR101929. The current cost
> model tends to generate more vectorized code.

The cost model would really need to look at more than a single stmt.
If there is work to schedule in parallel to a vector build, then the
build really isn't that expensive. It's only when we depend on the
result and cannot proceed that it can end up being more expensive.
Remember that we are really costing on the assumption that stmts
execute one at a time, simply adding latencies. We have ideas on how
to improve on that side.
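
To make the "simply adding latencies" point concrete, here is a toy sketch in Python. It is not GCC's cost model: the statement names, the latencies, and both helper functions are hypothetical and chosen only for illustration. It compares a purely additive cost with a critical-path cost for two cases, one where independent work can overlap the vector build and one where everything waits on its result.

```python
# Toy illustration only: latencies and statement names are invented,
# and neither function reflects GCC's actual vectorizer costing.

def additive_cost(stmts):
    """Cost if stmts execute one at a time: the sum of all latencies."""
    return sum(latency for latency, _deps in stmts.values())


def critical_path_cost(stmts):
    """Cost if independent stmts can overlap: the longest dependency chain."""
    finish = {}

    def finish_time(name):
        # Memoized earliest finish time of a stmt, given its dependencies.
        if name not in finish:
            latency, deps = stmts[name]
            finish[name] = latency + max(
                (finish_time(d) for d in deps), default=0
            )
        return finish[name]

    return max(finish_time(name) for name in stmts)


# Each entry maps a stmt name to (latency, list of stmts it depends on).

# Case 1: other work is independent of the vector build and can overlap it.
overlapped = {
    "vec_construct": (4, []),
    "other_work": (4, []),
    "use": (1, ["vec_construct", "other_work"]),
}

# Case 2: the other work needs the built vector, so nothing can overlap.
dependent = {
    "vec_construct": (4, []),
    "other_work": (4, ["vec_construct"]),
    "use": (1, ["other_work"]),
}

for label, stmts in (("overlapped", overlapped), ("dependent", dependent)):
    print(label, additive_cost(stmts), critical_path_cost(stmts))
```

In this sketch the additive cost is the same for both cases, while the critical-path cost is lower only when the vector build can overlap other work. That is the distinction a per-stmt additive model cannot see.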