https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118125
--- Comment #4 from Martin Jambor <jamborm at gcc dot gnu.org> ---
Redirecting the call to operator delete[](void*) to __builtin_unreachable(),
which seems to be the correct thing to do, leads to one more SLP vectorization
in the function experiencing the slow-down. Comparing the -fopt-info-optimized
output gives:

@@ -188,6 +188,7 @@
 /home/mjambor/gcc/mine/inst/include/c++/15.0.1/bits/stl_tree.h:206:27: optimized: basic block part vectorized using 16 byte vectors
 include/lac/vector.h:990:12: optimized: basic block part vectorized using 8 byte vectors
 include/lac/vector.h:979:31: optimized: basic block part vectorized using 8 byte vectors
+include/lac/solver_gmres.h:498:14: optimized: basic block part vectorized using 16 byte vectors
 include/lac/vector.h:949:3: optimized: basic block part vectorized using 8 byte vectors
 /home/mjambor/gcc/mine/inst/include/c++/15.0.1/bits/stl_tree.h:206:27: optimized: basic block part vectorized using 16 byte vectors
 include/lac/vector.h:990:12: optimized: basic block part vectorized using 8 byte vectors

I have managed to avoid that one particular SLP vectorization using
-fdbg-cnt=ipa_update_vr:2175-2175:3013-3013,vect_slp:1-61,63-9999
and was able to get the original performance back. Unfortunately, when I then
looked at SLP vectorization with all IPA-VR propagations allowed again, this
particular case was not there (but there were plenty of others).

If I can read the slp dump correctly (which is a big if), the vectorization
produced the following change:

@@ -30911,15 +26466,16 @@
   # DEBUG this => NULL
   # DEBUG i => NULL
   c_790 = *_789;
+  _272 = {c_790, c_790};
   # DEBUG c => c_790
   # DEBUG this => D#400
   # DEBUG i => D#396
   _792 = _229 + _785;
   # DEBUG this => NULL
   # DEBUG i => NULL
-  # DEBUG dummy => D__lsm0.1125_879
-  _794 = c_790 * D__lsm0.1125_879;
-  _795 = i_758 + 1;
+  # DEBUG dummy => D__lsm0.1125_214
+  _794 = D__lsm0.1125_214 * c_790;
+  _795 = i_754 + 1;
   # DEBUG D#397 => (unsigned int) _795
   # DEBUG this => D#400
   # DEBUG i => D#397
@@ -30929,13 +26485,16 @@
   # DEBUG this => NULL
   # DEBUG i => NULL
   _799 = *_798;
+  _273 = {D__lsm0.1125_214, _799};
   _800 = s_787 * _799;
   _801 = _794 + _800;
-  *_792 = _801;
-  _1111 = s_787 * D__lsm0.1125_879;
+  _1110 = D__lsm0.1125_214 * s_787;
+  _252 = {_800, _1110};
+  vect__2237.1174_737 = .VEC_FMSUBADD (_273, _272, _252);
   _805 = c_790 * _799;
-  _41 = _805 - _1111;
-  *_798 = _41;
+  _41 = _805 - _1110;
+  vectp.1176_781 = _792;
+  MEM <vector(2) double> [(double &)vectp.1176_781] = vect__2237.1174_737;
   # DEBUG i => _795
   if (_298 > _795)
     goto <bb 405>; [89.00%]
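
As an illustration of that reading, here is a small stand-alone C++ model of
the two forms. It is a sketch, not code from the benchmark or from GCC. It
makes three assumptions: that *_792 and *_798 are adjacent doubles (the single
vector(2) store suggests this), that the dummy value is the old content of the
first one, and that .VEC_FMSUBADD adds the third operand in even lanes and
subtracts it in odd lanes. All function and variable names are hypothetical.

```cpp
#include <cassert>

// Scalar form, following the "-" lines: two multiply-adds, two scalar stores.
// h[0] models *_792 (its old value plays the role of "dummy"),
// h[1] models *_798.
void rotate_scalar(double c, double s, double h[2])
{
  const double dummy = h[0];      // D__lsm0.1125_879
  const double h1 = h[1];         // _799
  h[0] = c * dummy + s * h1;      // _801 = _794 + _800
  h[1] = c * h1 - s * dummy;      // _41  = _805 - _1111
}

// Two-lane model of .VEC_FMSUBADD (a, b, acc), assuming the even lane
// adds acc and the odd lane subtracts it.
void fmsubadd2(const double a[2], const double b[2], const double acc[2],
               double out[2])
{
  out[0] = a[0] * b[0] + acc[0];
  out[1] = a[1] * b[1] - acc[1];
}

// Vectorized form, following the "+" lines: build three two-element
// vectors, do one fused operation, store both results at once.
void rotate_vector(double c, double s, double h[2])
{
  const double dummy = h[0];                     // D__lsm0.1125_214
  const double h1 = h[1];                        // _799
  const double v272[2] = {c, c};                 // _272
  const double v273[2] = {dummy, h1};            // _273
  const double v252[2] = {s * h1, dummy * s};    // _252 = {_800, _1110}
  double out[2];
  fmsubadd2(v273, v272, v252, out);              // vect__2237.1174_737
  h[0] = out[0];                                 // one vector(2) store
  h[1] = out[1];
}

// Checks that both forms agree on inputs whose results are exact in
// binary floating point, so fused and unfused rounding cannot differ.
void check_same()
{
  double a[2] = {2.0, 4.0};
  double b[2] = {2.0, 4.0};
  rotate_scalar(0.5, 0.25, a);
  rotate_vector(0.5, 0.25, b);
  assert(a[0] == b[0] && a[1] == b[1]);
}
```

Under those assumptions the two forms compute the same values, so the
difference would be in how the operands are assembled: the vector form needs
three vector constructions from scalars before the single fused operation.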