On August 22, 2017 6:38:55 PM GMT+02:00, "Richard Earnshaw (lists)" <richard.earns...@arm.com> wrote: >On 22/08/17 13:55, Kyrill Tkachov wrote: >> Hi Richard, >> [roping in more aarch64 maintainers] >> >> On 22/08/17 13:27, Richard Biener wrote: >>> On Tue, 22 Aug 2017, Uros Bizjak wrote: >>> >>>> On Tue, Aug 22, 2017 at 12:15 PM, Richard Biener ><rguent...@suse.de> >>>> wrote: >>>>> The following patch fixes PR81921 (and LTO build of libgo) which I >ran >>>>> into when trying to enable free-lang-data for non-LTO compiles. >>>>> >>>>> free-lang-data forces a DECL_FUNCTION_SPECIFIC_TARGET for all >functions >>>>> so we have them ending up with target_option_default_node >eventually >>>>> which is something ix86_can_inline_p doesn't expect (I tried >forcing >>>>> a compare of the actual options but that fails as well as we get >>>>> spurious differences in use-fpmath, the default node with -m32 >>>>> -march=x86_64 doesn't have it while non-default nodes have it). >>>>> >>>>> The patch is what I consider safe for branches, we might want to >work >>>>> on sth better (actually comparing always and fixing the fpmath >issue) >>>>> on trunk as followup. >>>>> >>>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu, ok for >trunk >>>>> and active branches? >>>>> >>>>> Note the change to the ref = false conditional isn't strictly >necessary >>>>> but it makes -flto and non-flto behave consistently. >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>> 2017-08-22 Richard Biener <rguent...@suse.de> >>>>> >>>>> PR target/81921 >>>>> * config/i386/i386.c (ix86_can_inline_p): Treat >>>>> target_option_default_node as non-existent. >>>> LGTM. >>>> >>>> Please give the patch some soaking time in the mainline before >>>> backporting it to release branches. >>> Thanks. Of course this was copied by other targets (and the x86 >>> one maybe from the default). So the following is an extended patch. >>> >>> Ok for the rs6000 and aarch64 bits? >>> >>> Thanks, >>> Richard. >>> >>> 2017-08-22 Richard Biener <rguent...@suse.de> >>> >>> PR target/81921 >>> * config/i386/i386.c (ix86_can_inline_p): Treat >>> target_option_default_node as non-existent. >>> * targhooks.c (default_target_can_inline_p): Likewise. >>> * config/aarch64/aarch64.c (aarch64_can_inline_p): Likewise. >>> * config/powerpcspe/powerpcspe.c (rs6000_can_inline_p): >Likewise. >>> * config/rs6000/rs6000.c (rs6000_can_inline_p): Likewise. >>> >>> Index: gcc/config/i386/i386.c >>> =================================================================== >>> --- gcc/config/i386/i386.c (revision 251266) >>> +++ gcc/config/i386/i386.c (working copy) >>> @@ -7507,12 +7507,12 @@ ix86_can_inline_p (tree caller, tree cal >>> tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); >>> /* If callee has no option attributes, then it is ok to >inline. */ >>> - if (!callee_tree) >>> + if (!callee_tree || callee_tree == target_option_default_node) >>> ret = true; >>> /* If caller has no option attributes, but callee does then it >>> is not ok to >>> inline. */ >>> - else if (!caller_tree) >>> + else if (!caller_tree || caller_tree == >target_option_default_node) >>> ret = false; >>> else >>> Index: gcc/testsuite/gcc.target/i386/pr81921.c >>> =================================================================== >>> --- gcc/testsuite/gcc.target/i386/pr81921.c (nonexistent) >>> +++ gcc/testsuite/gcc.target/i386/pr81921.c (working copy) >>> @@ -0,0 +1,14 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-require-effective-target lto } */ >>> +/* { dg-options "-flto -march=x86-64" } */ >>> + >>> +extern __inline int __attribute__((__gnu_inline__, >>> __always_inline__, __artificial__, target("sse2"))) >>> +_mm_loadu_si128 (int const *__P) >>> +{ >>> + return *__P; >>> +} >>> + >>> +void __attribute__((target("ssse3"))) foo (void *p) >>> +{ >>> + volatile int x = _mm_loadu_si128 (p); >>> +} >>> Index: gcc/targhooks.c >>> =================================================================== >>> --- gcc/targhooks.c (revision 251274) >>> +++ gcc/targhooks.c (working copy) >>> @@ -1447,12 +1447,12 @@ default_target_can_inline_p (tree caller >>> tree caller_opts = DECL_FUNCTION_SPECIFIC_TARGET (caller); >>> /* If callee has no option attributes, then it is ok to inline >*/ >>> - if (!callee_opts) >>> + if (!callee_opts || callee_tree == target_option_default_node) >>> ret = true; >>> /* If caller has no option attributes, but callee does then it >>> is not ok to >>> inline */ >>> - else if (!caller_opts) >>> + else if (!caller_opts || caller_tree == >target_option_default_node) >>> ret = false; >>> /* If both caller and callee have attributes, assume that if >the >>> Index: gcc/config/aarch64/aarch64.c >>> =================================================================== >>> --- gcc/config/aarch64/aarch64.c (revision 251274) >>> +++ gcc/config/aarch64/aarch64.c (working copy) >>> @@ -10074,7 +10074,7 @@ aarch64_can_inline_p (tree caller, tree >>> tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); >>> /* If callee has no option attributes, then it is ok to >inline. */ >>> - if (!callee_tree) >>> + if (!callee_tree || callee_tree == target_option_default_node) >>> return true; >>> >> >> The aarch64 changes looks ok to me FWIW (since I wrote that function) >> > >I *think* this is probably OK for AArch64, but I don't think it's safe >in general. > >Consider, for example, the arm port where we can have functions built >for ARM and functions built for Thumb. You cannot, in general, inline >either into the other, (think inline asm, or isa-specific intrinsic >operations). > >I think this approach only works at all because it's generally >meaningless to have some ISA that is the default and then construct >other functions later in the compilation that deliberately target a >smaller subset of the ISA. But such an assumption doesn't apply when >you construct functions in completely separate ISAs which are >call-compatible but otherwise distinct.
Note the patch doesn't change anything for NON-LTO, where you never see target_option_default_node. Just LTO forces those in place of NULL to make cross-TU behavior more like you suggest. Richard. >R. > >> Kyrill >> >>> struct cl_target_option *caller_opts >>> Index: gcc/config/powerpcspe/powerpcspe.c >>> =================================================================== >>> --- gcc/config/powerpcspe/powerpcspe.c (revision 251274) >>> +++ gcc/config/powerpcspe/powerpcspe.c (working copy) >>> @@ -40185,12 +40185,12 @@ rs6000_can_inline_p (tree caller, tree c >>> tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); >>> /* If callee has no option attributes, then it is ok to >inline. */ >>> - if (!callee_tree) >>> + if (!callee_tree || callee_tree == target_option_default_node) >>> ret = true; >>> /* If caller has no option attributes, but callee does then it >>> is not ok to >>> inline. */ >>> - else if (!caller_tree) >>> + else if (!caller_tree || caller_tree == >target_option_default_node) >>> ret = false; >>> else >>> Index: gcc/config/rs6000/rs6000.c >>> =================================================================== >>> --- gcc/config/rs6000/rs6000.c (revision 251274) >>> +++ gcc/config/rs6000/rs6000.c (working copy) >>> @@ -37395,12 +37395,12 @@ rs6000_can_inline_p (tree caller, tree c >>> tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); >>> /* If callee has no option attributes, then it is ok to >inline. */ >>> - if (!callee_tree) >>> + if (!callee_tree || callee_tree == target_option_default_node) >>> ret = true; >>> /* If caller has no option attributes, but callee does then it >>> is not ok to >>> inline. */ >>> - else if (!caller_tree) >>> + else if (!caller_tree || caller_tree == >target_option_default_node) >>> ret = false; >>> else >>