https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #4)
> > so the C++ FE already elides the return copy by placing 'result' in the
> > return slot while the C FE doesn't do this.
>
> That's because in C++ the language requires NRV to be performed in certain
> cases, while for C there is nothing like that and we do the tree NRV in that
> case only much later (nrv pass).
>
> Joseph, any thoughts whether it would be a valid C FE optimization that
> valid C programs can't observe?

I think we're careful on the caller side not using the destination as
return slot in aggr = foo (); already so no need to try to be clever
on the callee-side?  Fixing this might also fix some missed tail-calling.

Note in this particular case the return value is returned via xmm0/xmm2
so the extra copy we create during gimplification is even more pointless.

And I guess NRV doesn't do anything because of the CLOBBER?

  <retval> = result;
  result ={v} {CLOBBER};
  return <retval>;

or simply because

  /* If this function does not return an aggregate type in memory, then there
     is nothing to do.  */
  if (!aggregate_value_p (result, current_function_decl))
    return 0;

I guess.  Or because 'result' ends up as TREE_ADDRESSABLE for some reason!?
create_iv does this, as part of vectorization but after that we never again
do update_address_taken ... :/  I guess after late FRE would be a good time.
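
For illustration only, here is a minimal C sketch of the pattern under discussion. It is not the testcase from this bug; the names (struct vec, swap_through, caller_sees_swapped) are hypothetical. The callee builds a named local 'result' and returns it by value, and the caller does aggr = foo (...) while the callee can still read the old aggr through a pointer. That is the situation where the caller must not hand out the destination as the return slot, because the callee would then overwrite what it is still reading.

```c
#include <assert.h>

/* Hypothetical aggregate; small enough that some ABIs return it in
   registers rather than in memory.  */
struct vec { double x, y; };

/* The callee fills a named local and returns it by value.  With the
   copy elided, 'result' would live directly in the return slot.  */
static struct vec swap_through(const struct vec *p)
{
    struct vec result;
    result.x = p->y;   /* reads the caller's object ...          */
    result.y = p->x;   /* ... again after result.x was written   */
    return result;
}

/* Returns 1 if the caller observes the fully swapped value.  The call
   completes before the store to 'aggr', so a conforming compiler has
   to produce {2.0, 1.0} here.  If 'aggr' itself were used as the
   return slot and 'result' were placed in it, the second read above
   would see the already overwritten x and yield {2.0, 2.0}.  */
int caller_sees_swapped(void)
{
    struct vec aggr = { 1.0, 2.0 };
    aggr = swap_through(&aggr);
    return aggr.x == 2.0 && aggr.y == 1.0;
}
```

Calling caller_sees_swapped () from a small driver should return 1 in both C and C++ mode; the sketch only shows why the caller-side check matters and makes no claim about what GCC emits for the original testcase.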