https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103562
Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
           Priority|P3                          |P1

--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
It's very fishy and it's screwed up in the inliner:

$ cat pr103562-2.c
struct my_struct
{
  long a;
  long b;
  long c;
};

static struct my_struct deref(struct my_struct *ptr)
{
  return *ptr;
}

long get_a(struct my_struct *s)
{
  return deref(s).a;
}

...

$ gcc pr103562-2.c -O1 -c -fdump-tree-all -fdump-ipa-all-details

pr103562-2.c.082i.fnsummary contains:

long int get_a (struct my_struct * s)
{
  struct my_struct D.1989;
  long int _4;

  <bb 2> [local count: 1073741824]:
  D.1989 = deref (s_2(D)); [return slot optimization]
  _4 = D.1989.a;
  return _4;

}

struct my_struct deref (struct my_struct * ptr)
{
  <bb 2> [local count: 1073741824]:
  <retval> = *ptr_2(D);
  return <retval>;

}

pr103562-2.c.083i.inline does:

long int get_a (struct my_struct * s)
{
  struct my_struct D.1989;
  long int _4;

  <bb 2> [local count: 1073741824]:
  D.1989 = *s_2(D);
  _4 = D.1989.a;
  return _4;

}

While using JIT with the following patch:

diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
index b412eae6aa8..ccde56cdd98 100644
--- a/gcc/jit/jit-playback.c
+++ b/gcc/jit/jit-playback.c
@@ -2512,9 +2512,9 @@ make_fake_args (vec <char *> *argvec,

   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_EVERYTHING))
     {
-      ADD_ARG ("-fdump-tree-all");
-      ADD_ARG ("-fdump-rtl-all");
-      ADD_ARG ("-fdump-ipa-all");
+      ADD_ARG ("-fdump-tree-all-details");
+      ADD_ARG ("-fdump-rtl-all-details");
+      ADD_ARG ("-fdump-ipa-all-details");
     }

   /* Add "-fdump-" options for any calls to

$ gcc pr103562.c -lgccjit && ./a.out
using libgccjit 12.0.0
intermediate files written to /tmp/libgccjit-kH6M9y
get_a(&s) is 140737488346368

$ cd /tmp/libgccjit-kH6M9y

fake.c.083i.inline does:

Processing frequency deref/0
  Called by get_a/1 that is normal or hot
    Accounting size:-4.00, time:-13.00 on predicate exec:(true)
 Inlined into get_a/1 which now has 6 size
...
Updating SSA:
Registering new PHI nodes in block #0
Registering new PHI nodes in block #2
Registering new PHI nodes in block #4
Updating SSA information for statement <retval> = *ptr_2(D);
Registering new PHI nodes in block #3
Updating SSA information for statement _4 = D.88.a;
Updating SSA information for statement return _4;
...

long int get_a (struct my_struct * ptr)
{
  struct my_struct D.88;
  long int _4;

  <bb 2> [local count: 1073741824]:
<L0>:
  <retval> = *ptr_2(D);
  _4 = D.88.a;
  return _4;

}

So the inliner is for some reason responsible for that.