https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227
Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |marxin at gcc dot gnu.org,
                   |                            |mjambor at suse dot cz
   Last reconfirmed|                            |2021-11-13
             Status|UNCONFIRMED                 |NEW
          Component|tree-optimization           |ipa
     Ever confirmed|0                           |1

--- Comment #1 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
So code size is due to ipa-icf:

hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ /aux/hubicka/trunk-install-fortran/bin/gfortran -Ofast -march=native -flto 548.exchange2_r/src/exchange2.F90 size a.out
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ size a.out
   text    data     bss     dec     hex filename
 245442     864    6368  252674   3db02 a.out
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ /aux/hubicka/trunk-install-fortran/bin/gfortran -Ofast -march=native -flto 548.exchange2_r/src/exchange2.F90 -fno-ipa-icf -fno-ipa-modref
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ size a.out
   text    data     bss     dec     hex filename
 178982     864    6336  186182   2d746 a.out
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ /aux/hubicka/trunk-install-fortran/bin/gfortran -Ofast -march=native -flto 548.exchange2_r/src/exchange2.F90 -fno-ipa-icf
hubicka@lomikamen:/aux/hubicka/trunk/build6/gcc$ size a.out
   text    data     bss     dec     hex filename
 245442     864    6368  252674   3db02 a.out

We end up with isra clones of digits_2:

bb/a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.6.isra.0, funcdef_no=29, decl_uid=4805, cgraph_uid=32, symbol_order=157)
bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.4.isra.0, funcdef_no=2, decl_uid=4730, cgraph_uid=3, symbol_order=159)
bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.2.isra.0, funcdef_no=6, decl_uid=4737, cgraph_uid=7, symbol_order=161)
bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.0.isra.0, funcdef_no=10, decl_uid=4741, cgraph_uid=11, symbol_order=163)
bb/a.ltrans1.ltrans.250t.optimized:;; Function rearrange.isra (__brute_force_MOD_rearrange.isra.0, funcdef_no=14, decl_uid=4744, cgraph_uid=15, symbol_order=165)

Curiously, the original has 3 clones:

a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.3, funcdef_no=2, decl_uid=4799, cgraph_uid=27, symbol_order=143)
a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.0, funcdef_no=5, decl_uid=4796, cgraph_uid=24, symbol_order=140)
a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.6, funcdef_no=5, decl_uid=4760, cgraph_uid=2, symbol_order=146)

which likely explains the code size difference. The ipa-cp decisions are the
same, so it looks like isra is affecting the inliner, which instead of producing
3 functions for clones 0, 3, 6 produces 4 functions for clones 0, 2, 4, 6.
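The comparison of surviving clones can be repeated mechanically. What follows is only a rough sketch, not part of GCC: it assumes the input is grep-style output over the `.optimized` dumps in the `;; Function name (asm_name, funcdef_no=...` form quoted in the comment. The names `constprop_clones`, `WITH_ISRA` and `ORIGINAL` are made up for illustration, and the sample lines are copied from the listings in the comment.

```python
import re
from collections import defaultdict

# Assembler name inside a ";; Function name (asm_name, funcdef_no=..." header.
HEADER_RE = re.compile(r";; Function \S+ \((\S+?),")
# Base symbol and clone number of a constprop clone, e.g. "foo.constprop.6".
CONSTPROP_RE = re.compile(r"^(.*?)\.constprop\.(\d+)")


def constprop_clones(lines):
    """Map each base symbol to the sorted constprop clone numbers seen in lines."""
    clones = defaultdict(set)
    for line in lines:
        header = HEADER_RE.search(line)
        if not header:
            continue  # not a function header line
        clone = CONSTPROP_RE.match(header.group(1))
        if clone:  # functions without a constprop suffix are skipped
            clones[clone.group(1)].add(int(clone.group(2)))
    return {base: sorted(nums) for base, nums in clones.items()}


# Header lines copied from the comment (hypothetical variable names).
WITH_ISRA = [
    "bb/a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.6.isra.0, funcdef_no=29, decl_uid=4805, cgraph_uid=32, symbol_order=157)",
    "bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.4.isra.0, funcdef_no=2, decl_uid=4730, cgraph_uid=3, symbol_order=159)",
    "bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.2.isra.0, funcdef_no=6, decl_uid=4737, cgraph_uid=7, symbol_order=161)",
    "bb/a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop.isra (__brute_force_MOD_digits_2.constprop.0.isra.0, funcdef_no=10, decl_uid=4741, cgraph_uid=11, symbol_order=163)",
    "bb/a.ltrans1.ltrans.250t.optimized:;; Function rearrange.isra (__brute_force_MOD_rearrange.isra.0, funcdef_no=14, decl_uid=4744, cgraph_uid=15, symbol_order=165)",
]
ORIGINAL = [
    "a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.3, funcdef_no=2, decl_uid=4799, cgraph_uid=27, symbol_order=143)",
    "a.ltrans0.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.0, funcdef_no=5, decl_uid=4796, cgraph_uid=24, symbol_order=140)",
    "a.ltrans1.ltrans.250t.optimized:;; Function digits_2.constprop (__brute_force_MOD_digits_2.constprop.6, funcdef_no=5, decl_uid=4760, cgraph_uid=2, symbol_order=146)",
]

if __name__ == "__main__":
    print("with isra:", constprop_clones(WITH_ISRA))
    print("original: ", constprop_clones(ORIGINAL))
```

On these sample lines the helper reports clones 0, 2, 4, 6 for the isra build and 0, 3, 6 for the original, matching the counts above. It only looks at the `.constprop.N` suffix, so other clone kinds (such as the plain `rearrange.isra.0`) are ignored.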