https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227
Jan Hubicka <hubicka at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org,
                   |                            |mjambor at suse dot cz
   Last reconfirmed|                            |2021-11-13
             Status|UNCONFIRMED                 |NEW
          Component|tree-optimization           |ipa
     Ever confirmed|0                           |1
--- Comment #1 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
So code size is due to ipa-icf:
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ /aux/hubicka/trunk-install-fortran/bin/gfortran -Ofast -march=native -flto 548.exchange2_r/src/exchange2.F90
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ size a.out
   text    data     bss     dec     hex filename
 245442     864    6368  252674   3db02 a.out
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ /aux/hubicka/trunk-install-fortran/bin/gfortran -Ofast -march=native -flto 548.exchange2_r/src/exchange2.F90 -fno-ipa-icf -fno-ipa-modref
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ size a.out
   text    data     bss     dec     hex filename
 178982     864    6336  186182   2d746 a.out
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ /aux/hubicka/trunk-install-fortran/bin/gfortran -Ofast -march=native -flto 548.exchange2_r/src/exchange2.F90 -fno-ipa-icf
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ size a.out
   text    data     bss     dec     hex filename
 245442     864    6368  252674   3db02 a.out
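To put numbers on the comparison, here is a small Python sketch that recomputes the totals and the text-size gap from the three `size` outputs above. It is only an illustration: the dictionary keys and the helper name `text_delta` are made up for this example, and the figures are copied by hand from the transcript.

```python
# Section sizes (text, data, bss) copied from the `size a.out` runs above.
# The keys are just labels for the extra flags used in each build.
runs = {
    "default": (245442, 864, 6368),
    "-fno-ipa-icf -fno-ipa-modref": (178982, 864, 6336),
    "-fno-ipa-icf": (245442, 864, 6368),
}


def text_delta(a, b):
    """Return (bytes, percent) by which the text size of run a exceeds run b."""
    diff = runs[a][0] - runs[b][0]
    return diff, 100.0 * diff / runs[b][0]


# Recompute the "dec" column as text + data + bss for each run.
for name, (text, data, bss) in runs.items():
    print(f"{name:32s} text={text} dec={text + data + bss}")

diff, pct = text_delta("default", "-fno-ipa-icf -fno-ipa-modref")
print(f"default text is {diff} bytes ({pct:.1f}%) larger")
```

In these numbers the build with only -fno-ipa-icf has the same text, data and bss sizes as the default build; the text size drops only in the run that also passes -fno-ipa-modref.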
We end up with isra clones of digits_2:
bb/a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.6.isra.0, funcdef_no=29, decl_uid=4805, cgraph_uid=32, symbol_order=157)
bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.4.isra.0, funcdef_no=2, decl_uid=4730, cgraph_uid=3, symbol_order=159)
bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.2.isra.0, funcdef_no=6, decl_uid=4737, cgraph_uid=7, symbol_order=161)
bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.0.isra.0, funcdef_no=10, decl_uid=4741, cgraph_uid=11, symbol_order=163)
bb/a.ltrans1.ltrans.250t.optimized:;; Function rearrange.isra (__brute_force_MOD_rearrange.isra.0, funcdef_no=14, decl_uid=4744, cgraph_uid=15, symbol_order=165)
Curiously, the original has 3 clones:
a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.3, funcdef_no=2, decl_uid=4799, cgraph_uid=27, symbol_order=143)
a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.0, funcdef_no=5, decl_uid=4796, cgraph_uid=24, symbol_order=140)
a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.6, funcdef_no=5, decl_uid=4760, cgraph_uid=2, symbol_order=146)
which likely explains the code size difference. The ipa-cp decisions are the
same, so it looks like isra is affecting the inliner, which instead of producing
3 functions for clones 0,3,6 produces 4 functions for clones 0,2,4,6.
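
The clone counts can be checked mechanically from the symbol names in the dump headers quoted above. The following Python sketch does that; the list names and the helper `constprop_ids` are hypothetical, and the symbol names are pasted in by hand rather than read from the dump files.

```python
import re

# Assembler names taken from the ";; Function" headers of the build
# that has the isra clones.
ISRA_SYMBOLS = [
    "__brute_force_MOD_digits_2.constprop.6.isra.0",
    "__brute_force_MOD_digits_2.constprop.4.isra.0",
    "__brute_force_MOD_digits_2.constprop.2.isra.0",
    "__brute_force_MOD_digits_2.constprop.0.isra.0",
    "__brute_force_MOD_rearrange.isra.0",
]

# Assembler names from the original build.
ORIGINAL_SYMBOLS = [
    "__brute_force_MOD_digits_2.constprop.3",
    "__brute_force_MOD_digits_2.constprop.0",
    "__brute_force_MOD_digits_2.constprop.6",
]

CLONE_RE = re.compile(r"digits_2\.constprop\.(\d+)")


def constprop_ids(symbols):
    """Return the sorted constprop clone numbers of digits_2 found in symbols."""
    ids = set()
    for sym in symbols:
        match = CLONE_RE.search(sym)
        if match:  # skips symbols such as rearrange.isra.0
            ids.add(int(match.group(1)))
    return sorted(ids)


print("with isra clones:", constprop_ids(ISRA_SYMBOLS))
print("original:        ", constprop_ids(ORIGINAL_SYMBOLS))
```

On real dumps the same regular expression could be applied to the output of grepping for ";; Function" in the *.optimized files, assuming the header format shown above.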