https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69041
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
            Summary|Unnecessary push/pop of     |Unnecessary push/pop of
                   |caller-save register (ecx)  |caller-save register (ecx)
                   |on 32bit with vector        |on 32bit with vector
                   |intrinsics.  Sometimes      |intrinsics.
                   |without the pop, clobbering |
                   |ebp (callee-save)           |
           Keywords|wrong-code                  |

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>clobbers the caller's ebp with the pushed value of ecx, but the esp=ebp part
>of leave cleans up after the mismatched push/pop

This is wrong. Leave does the pop.
Basically leave is `sp = ebp + 4; ebp = [ebp];`

So no wrong code. Just missed optimizations.

The removal of the stack frame happened in GCC 6 for dummy/dummy2.
The removal of vzeroupper for dummy/dummy2 happened in GCC 9.

Looks like there is still a frame pointer creation happening for add_pixdiff; I
have not looked into why though.
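
To illustrate the point about `leave` in the comment above, here is a minimal sketch that models a 32-bit frame in Python. It assumes the usual definition of `leave` as `mov esp, ebp` followed by `pop ebp`. The function name `run_epilogue`, the register dictionary, and the starting values are hypothetical and chosen only for illustration; this is not GCC output and not the code from the report.

```python
def run_epilogue(unbalanced_push: bool) -> dict:
    """Model: push ebp; mov ebp, esp; [push ecx]; leave.

    Returns the register state just before `ret` would execute.
    All starting values are arbitrary illustration values.
    """
    mem = {}  # sparse stack memory: address -> 32-bit value
    regs = {"esp": 0x1000, "ebp": 0xCAFE, "ecx": 0x1234}

    def push(value: int) -> None:
        regs["esp"] -= 4
        mem[regs["esp"]] = value

    def pop() -> int:
        value = mem[regs["esp"]]
        regs["esp"] += 4
        return value

    # Prologue: save the caller's ebp and set up the frame pointer.
    push(regs["ebp"])
    regs["ebp"] = regs["esp"]

    # The "unnecessary" push of a caller-save register, with no matching pop.
    if unbalanced_push:
        push(regs["ecx"])

    # leave == mov esp, ebp ; pop ebp
    regs["esp"] = regs["ebp"]   # discards anything pushed below the frame pointer
    regs["ebp"] = pop()         # reloads the saved caller ebp from [ebp]

    return regs


if __name__ == "__main__":
    for unbalanced in (False, True):
        state = run_epilogue(unbalanced)
        print(unbalanced, hex(state["esp"]), hex(state["ebp"]))
```

In this model the final `esp` and `ebp` are the same with or without the extra push, because `leave` reloads `ebp` from the slot the frame pointer addresses rather than from wherever `esp` happens to be. That matches the conclusion that the extra push is a missed optimization rather than wrong code.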