https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103562
Martin Liška <marxin at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
           Priority|P3                          |P1
--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
Something is very fishy here, and it goes wrong in the inliner. With a plain C test case everything is fine:
$ cat pr103562-2.c
struct my_struct { long a; long b; long c; };
static struct my_struct deref(struct my_struct *ptr) { return *ptr; }
long get_a(struct my_struct *s) { return deref(s).a; }
...
$ gcc pr103562-2.c -O1 -c -fdump-tree-all -fdump-ipa-all-details
pr103562-2.c.082i.fnsummary contains:
long int get_a (struct my_struct * s)
{
  struct my_struct D.1989;
  long int _4;

  <bb 2> [local count: 1073741824]:
  D.1989 = deref (s_2(D)); [return slot optimization]
  _4 = D.1989.a;
  return _4;
}

struct my_struct deref (struct my_struct * ptr)
{
  <bb 2> [local count: 1073741824]:
  <retval> = *ptr_2(D);
  return <retval>;
}
pr103562-2.c.083i.inline then shows the call inlined correctly, with the return slot D.1989 receiving the copy:
long int get_a (struct my_struct * s)
{
  struct my_struct D.1989;
  long int _4;

  <bb 2> [local count: 1073741824]:
  D.1989 = *s_2(D);
  _4 = D.1989.a;
  return _4;
}
When going through the JIT instead, with the following patch applied to get detailed dumps:
diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c
index b412eae6aa8..ccde56cdd98 100644
--- a/gcc/jit/jit-playback.c
+++ b/gcc/jit/jit-playback.c
@@ -2512,9 +2512,9 @@ make_fake_args (vec <char *> *argvec,
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_EVERYTHING))
     {
-      ADD_ARG ("-fdump-tree-all");
-      ADD_ARG ("-fdump-rtl-all");
-      ADD_ARG ("-fdump-ipa-all");
+      ADD_ARG ("-fdump-tree-all-details");
+      ADD_ARG ("-fdump-rtl-all-details");
+      ADD_ARG ("-fdump-ipa-all-details");
     }
   /* Add "-fdump-" options for any calls to
$ gcc pr103562.c -lgccjit && ./a.out
using libgccjit 12.0.0
intermediate files written to /tmp/libgccjit-kH6M9y
get_a(&s) is 140737488346368
$ cd /tmp/libgccjit-kH6M9y
fake.c.083i.inline does:
Processing frequency deref/0
Called by get_a/1 that is normal or hot
Accounting size:-4.00, time:-13.00 on predicate exec:(true)
Inlined into get_a/1 which now has 6 size
...
Updating SSA:
Registering new PHI nodes in block #0
Registering new PHI nodes in block #2
Registering new PHI nodes in block #4
Updating SSA information for statement <retval> = *ptr_2(D);
Registering new PHI nodes in block #3
Updating SSA information for statement _4 = D.88.a;
Updating SSA information for statement return _4;
...
long int get_a (struct my_struct * ptr)
{
  struct my_struct D.88;
  long int _4;

  <bb 2> [local count: 1073741824]:
<L0>:
  <retval> = *ptr_2(D);
  _4 = D.88.a;
  return _4;
}
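So in the JIT case the inliner leaves the callee's <retval> in place instead of remapping it to the return slot D.88. D.88 is then read uninitialized, which would explain the garbage value printed above. Why the remapping is skipped is not clear yet.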
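
For comparison, here is a minimal sketch of what the reduced C test case is expected to compute once deref is inlined. The check_get_a helper and the field values 1, 2, 3 are assumptions made for illustration; they are not taken from pr103562.c, which is not shown here.

```c
/* Stand-alone variant of the reduced test case.
   check_get_a and the sample field values are hypothetical;
   they do not come from the original reproducer. */
struct my_struct { long a; long b; long c; };

/* Returns the pointed-to struct by value, i.e. through <retval>. */
static struct my_struct deref(struct my_struct *ptr) { return *ptr; }

/* After inlining, this should read field 'a' from a copy of *s. */
long get_a(struct my_struct *s) { return deref(s).a; }

/* Returns 0 when get_a yields the stored value, 1 otherwise. */
int check_get_a(void)
{
  struct my_struct s = { 1, 2, 3 };  /* assumed sample values */
  return get_a(&s) == 1 ? 0 : 1;
}
```

Calling check_get_a() from a small driver at -O0 and -O1 only exercises the C front end path, where the dumps above look correct. It says nothing about the libgccjit path, where <retval> is left unmapped.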