https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275
--- Comment #9 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
The only vectorization difference is:

+imagick_r.ltrans8.ltrans.189t.slp1:magick/distort.c:1911:18: optimized: basic block part vectorized using 16 byte vectors
+imagick_r.ltrans8.ltrans.189t.slp1:magick/distort.c:1898:17: optimized: basic block part vectorized using 16 byte vectors
+imagick_r.ltrans8.ltrans.189t.slp1:magick/distort.c:1966:30: optimized: basic block part vectorized using 16 byte vectors
+imagick_r.ltrans8.ltrans.189t.slp1:magick/distort.c:1966:30: optimized: basic block part vectorized using 16 byte vectors
+imagick_r.ltrans8.ltrans.189t.slp1:magick/distort.c:2304:24: optimized: basic block part vectorized using 16 byte vectors
+imagick_r.ltrans8.ltrans.189t.slp1:magick/distort.c:1814:44: optimized: basic block part vectorized using 16 byte vectors
+imagick_r.ltrans8.ltrans.189t.slp1:magick/distort.c:1835:44: optimized: basic block part vectorized using 16 byte vectors

magick/distort.c:1835:44: note: Cost model analysis for part in loop 0:
  Vector cost: 2004
  Scalar cost: 2152

changes to:

magick/distort.c:1835:44: note: Cost model analysis for part in loop 0:
  Vector cost: 2004
  Scalar cost: 1864

The costing differences are all of the form

-MAX_EXPR <_43, _1390> 1 times scalar_stmt costs 4 in body
-MAX_EXPR <_44, _1394> 1 times scalar_stmt costs 4 in body
+MAX_EXPR <_43, _1390> 1 times scalar_stmt costs 12 in body
+MAX_EXPR <_44, _1394> 1 times scalar_stmt costs 12 in body

which is wrong. Cost should be 4.
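
As a rough sanity check of the numbers above, the following sketch redoes the arithmetic. It rests on two assumptions that the dump does not state: that the basic-block SLP decision reduces to "vector cost below scalar cost" (the hypothetical helper `is_profitable`), and that the MAX_EXPR statements are the only ones whose cost differs between the two runs. All figures are copied from the dump for magick/distort.c:1835:44.

```python
# Figures copied from the cost model dump for magick/distort.c:1835:44.
VECTOR_COST = 2004
SCALAR_COSTS = (2152, 1864)      # the two scalar totals reported
MAX_EXPR_COSTS = (4, 12)         # per-statement MAX_EXPR scalar costs reported


def is_profitable(vector_cost, scalar_cost):
    # Assumption: a simplified stand-in for the basic-block SLP
    # profitability test, not GCC's actual implementation.
    return vector_cost < scalar_cost


# Gap between the two scalar totals, and the per-statement cost gap.
delta_total = abs(SCALAR_COSTS[0] - SCALAR_COSTS[1])
delta_per_stmt = abs(MAX_EXPR_COSTS[1] - MAX_EXPR_COSTS[0])

# Assumption: only MAX_EXPR statements changed cost, so the total gap
# divided by the per-statement gap estimates how many are in this part.
affected_stmts, remainder = divmod(delta_total, delta_per_stmt)

for scalar_cost in SCALAR_COSTS:
    print(f"scalar {scalar_cost}: vectorize = "
          f"{is_profitable(VECTOR_COST, scalar_cost)}")
print(f"total gap {delta_total}, per-statement gap {delta_per_stmt}, "
      f"statements {affected_stmts}, remainder {remainder}")
```

Under these assumptions the gap between the two scalar totals is an exact multiple of the 8-unit per-statement gap, and the vector cost of 2004 lies between the two totals, so the MAX_EXPR cost alone is enough to flip the decision for this part.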