https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64896
--- Comment #5 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Hmm, an RTL expansion issue. We optimize m_fn2 as

B D::m_fn2() const (const struct D * const this)
{
;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  MEM[(struct B *)&<retval>] = 0;
  MEM[(struct B *)&<retval> + 4B] = 0;
  MEM[(struct B *)&<retval> + 8B] = 0;
  return <retval>;
;;    succ:       EXIT

}

and we want to store

 <result_decl 0x7ffff7ff82d0 D.2359
    type <record_type 0x7ffff6c55000 B type_5 type_6 BLK
        size <integer_cst 0x7ffff6c50bd0 constant 96>
        unit size <integer_cst 0x7ffff6c50c00 constant 12>
        align 32 symtab 0 alias set 5 canonical type 0x7ffff6c55000
        fields <field_decl 0x7ffff6c438e8 m_location type <record_type 0x7ffff6c44dc8 A>
            private nonlocal decl_3 DI file /aux/hubicka/t.ii line 7 col 5
            size <integer_cst 0x7ffff6ad6e58 constant 64>
            unit size <integer_cst 0x7ffff6ad6e70 constant 8>
            align 32 offset_align 128
            offset <integer_cst 0x7ffff6ad6e88 constant 0>
            bit offset <integer_cst 0x7ffff6ad6ed0 constant 0> context <record_type 0x7ffff6c55000 B> chain <field_decl 0x7ffff6c43980 m_size>> context <translation_unit_decl 0x7ffff7ff81e0 D.1>
        full-name "class B"
        X() X(constX&) this=(X&) n_parents=0 use_template=0 interface-unknown
        pointer_to_this <pointer_type 0x7ffff6c5a738> chain <type_decl 0x7ffff6c437b8 B>>
    used ignored regdecl BLK file /aux/hubicka/t.ii line 27 col 13 size <integer_cst 0x7ffff6c50bd0 96> unit size <integer_cst 0x7ffff6c50c00 12>
    align 32 context <function_decl 0x7ffff6c561b0 m_fn2>
    (parallel:BLK [
        (expr_list:REG_DEP_TRUE (reg:DI 87 [ <retval> ])
            (const_int 0 [0]))
        (expr_list:REG_DEP_TRUE (reg:SI 88 [ <retval>+8 ])
            (const_int 8 [0x8]))
    ])>

into its DECL_RTL but somehow we end up not doing that correctly.

Without -fipa-icf we produce:

B D::m_fn2() const (const struct D * const this)
{
  struct B D.2398;

  <bb 2>:
  D.2398.m_location.m_x = 0;
  D.2398.m_location.m_y = 0;
  D.2398.m_size = 0;
  return D.2398;

}

that looks equivalent but gets compiled well.
We decide to unify m_fn1 and m_fn2 as:

virtual B F::m_fn1() const (const struct F * const this)
{
  struct B D.2396;

  <bb 2>:
  D.2396.m_location.m_x = 0;
  D.2396.m_location.m_y = 0;
  D.2396.m_size = 0;
  return D.2396;

}

B D::m_fn2() const (const struct D * const this)
{
  <bb 2>:
  <retval> = F::m_fn1 (this_2(D)); [tail call]
  return <retval>;

}

which eventually gets inlined back of course (I will teach ICF to skip thunk creation when the inline sequence is shorter). The inliner produces:

B D::m_fn2() const (const struct D * const this)
{
  struct B D.2413;

  <bb 2>:
  D.2413.m_location.m_x = 0;
  D.2413.m_location.m_y = 0;
  D.2413.m_size = 0;
  <retval> = D.2413;
  return <retval>;

}

I suppose the extra <retval> store is the problem. Implementing the wrapper by hand gives me:

B D::m_fn5() const (const struct D * const this)
{
  struct B D.2406;

  <bb 2>:
  D.2406 = D::m_fn2 (this_2(D));
  return D.2406;

}

which is a bit uncool because it adds an extra copy, and I also think it won't fly for DECL_BY_REFERENCE stuff.

Jakub/Richi, can we get the direct return to work, or shall we extend thunk generation to introduce a temporary? If so, under what conditions?

Thunk expansion already does:

          if (DECL_BY_REFERENCE (resdecl))
            {
              restmp = gimple_fold_indirect_ref (resdecl);
              if (!restmp)
                restmp = build2 (MEM_REF,
                                 TREE_TYPE (TREE_TYPE (DECL_RESULT (alias))),
                                 resdecl,
                                 build_int_cst (TREE_TYPE
                                   (DECL_RESULT (alias)), 0));
            }
          else if (!is_gimple_reg_type (restype))
            {
              restmp = resdecl;

              if (TREE_CODE (restmp) == VAR_DECL)
                add_local_decl (cfun, restmp);
              BLOCK_VARS (DECL_INITIAL (current_function_decl)) = restmp;
            }
          else
            restmp = create_tmp_reg (restype, "retval");

I suppose we may want to add a case for !DECL_BY_REFERENCE that needs a temporary "retval", too.
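For readers who do not have the testcase at hand, the source shape behind these dumps can be sketched roughly as below. This is a hypothetical reconstruction, not the testcase attached to the PR: only the names A, B, D, F, m_location, m_x, m_y, m_size, m_fn1, m_fn2 and m_fn5 are taken from the dumps. The access specifiers, the lack of any relation between D and F, and the helper is_zero are assumptions made for illustration. The size checks assume a target with 4-byte int.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reconstruction; see the assumptions listed above.
struct A {
    int m_x;
    int m_y;
};

struct B {
    A m_location;  // 8 bytes at offset 0 (the DI part of the PARALLEL)
    int m_size;    // 4 bytes at offset 8 (the SI part of the PARALLEL)
};

// Matches "unit size 12" and the "+8" offset in the dumps,
// assuming int is 4 bytes wide.
static_assert(sizeof(B) == 12, "B is expected to be 12 bytes");
static_assert(offsetof(B, m_size) == 8, "m_size is expected at offset 8");

struct F {
    // Same body as D::m_fn2, so -fipa-icf may treat the two as equivalent.
    virtual B m_fn1() const {
        B r;
        r.m_location.m_x = 0;
        r.m_location.m_y = 0;
        r.m_size = 0;
        return r;
    }
};

struct D {
    B m_fn2() const {
        B r;
        r.m_location.m_x = 0;
        r.m_location.m_y = 0;
        r.m_size = 0;
        return r;
    }

    // Hand-written wrapper: the result goes through a named temporary
    // instead of being stored straight into the return slot.
    B m_fn5() const {
        B tmp = m_fn2();
        return tmp;
    }
};

// Illustrative helper (not from the PR): true when every field is zero.
inline bool is_zero(const B& b) {
    return b.m_location.m_x == 0 && b.m_location.m_y == 0 && b.m_size == 0;
}
```

All three functions should return an all-zero B. A miscompile of the kind discussed here would presumably show up as a nonzero field in the value returned by m_fn2, though whether this sketch actually triggers it depends on the GCC revision, target and flags.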