On RISC-V, this patch allows a call made from one function version to be
redirected to the callee's FMV clone that carries the same
"target_version" attribute. This reduces the number of PLT calls and GOT
lookups in the generated code.
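
The matching rule described above can be sketched as a small standalone
model. This is not the GCC implementation: the struct, the function name
resolve_call and the symbol strings are hypothetical and exist only for
illustration. It assumes that two versions match when their attribute
strings compare equal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one function version: the string given to its
   "target_version" attribute and the symbol emitted for that clone.  */
struct fmv_version
{
  const char *attr;
  const char *symbol;
};

/* Pick the symbol that a call should bind to.  CALLER_ATTR is the
   caller's own "target_version" string, or NULL when the caller is not
   a versioned function.  A callee clone with an equal attribute is
   called directly; any other call keeps going through DISPATCHER, which
   stands for the runtime-resolved symbol (PLT call or GOT lookup).  */
static const char *
resolve_call (const char *caller_attr, const struct fmv_version *callee,
              size_t n_versions, const char *dispatcher)
{
  assert (dispatcher != NULL);
  if (caller_attr != NULL)
    for (size_t i = 0; i < n_versions; i++)
      if (callee[i].attr != NULL
          && strcmp (caller_attr, callee[i].attr) == 0)
        return callee[i].symbol;
  return dispatcher;
}
```

In this model, given callee versions "default" and "arch=+v", a caller
versioned "arch=+v" binds directly to the "arch=+v" clone, while a caller
versioned "arch=+zbb" or an unversioned caller keeps using the dispatcher.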

AArch64 and RISC-V are the only two targets that define
TARGET_HAS_FMV_TARGET_ATTRIBUTE to 0, so this patch stops keying the
redirect on that macro and uses the newly added
TARGET_FMV_REDIRECT_CLONE macro to control the feature. For AArch64,
following the previous discussion [1], the feature stays disabled by
defining TARGET_FMV_REDIRECT_CLONE to 0.

[1] https://patchwork.sourceware.org/project/gcc/patch/2df8081a-db6f-acb3-2882-329de3223...@e124511.cambridge.arm.com/

gcc/ChangeLog:

        * config/aarch64/aarch64.h (TARGET_FMV_REDIRECT_CLONE): Disable
        TARGET_FMV_REDIRECT_CLONE for aarch64.
        * defaults.h (TARGET_FMV_REDIRECT_CLONE): New macro.
        * multiple_target.cc (redirect_to_specific_clone): Support RISC-V
        FMV redirect.
        (ipa_target_clone): Check TARGET_FMV_REDIRECT_CLONE.

Signed-off-by: Yangyu Chen <c...@cyyself.name>
---
A problem was raised in the previous discussion [1]: this behavior
eliminates the indirection in some cases where the callee version chosen
at runtime cannot be determined statically at compile time.

I've given it some thought and concluded that this behavior is not a
cause for concern. We can simply align with the behavior of x86 and
powerpc.

The redirect only happens when the caller is compiled with the callee
versions declared with "target_version"; otherwise the call goes through
the PLT or a GOT lookup. In that case, the declaration of the callee
version is in the same compilation unit as the caller, most likely in the
same source file or in a header file included by it. The caller and the
callee are therefore tightly coupled, and a developer can easily find
them and manually add the same "target_version" attribute to both.

The elimination only happens when the caller and the callee have the
same "target_version" attribute. This ensures that the callee version is
eligible for selection.

This behavior does not ensure that, if a higher-priority callee version
were selected at runtime, a higher-priority caller version would also
have been eligible for selection. That is hard to solve, because
comparing the priorities of different versions of the caller may not be
meaningful. We can keep the behavior as is and let the developer manually
add the same "target_version" attribute to the caller and callee
functions.

[1] https://patchwork.sourceware.org/project/gcc/patch/2df8081a-db6f-acb3-2882-329de3223...@e124511.cambridge.arm.com/
---
 gcc/config/aarch64/aarch64.h |  2 ++
 gcc/defaults.h               |  7 +++++++
 gcc/multiple_target.cc       | 16 +++++++++-------
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index e8bd8c73c12..46d7a8ac55a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -1450,6 +1450,8 @@ extern enum aarch64_code_model aarch64_cmodel;
 
 #define TARGET_HAS_FMV_TARGET_ATTRIBUTE 0
 
+#define TARGET_FMV_REDIRECT_CLONE 0
+
 #define TARGET_SUPPORTS_WIDE_INT 1
 
 /* Modes valid for AdvSIMD D registers, i.e. that fit in half a Q register.  */
diff --git a/gcc/defaults.h b/gcc/defaults.h
index 16f6dc24e3b..bea3453508e 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -879,6 +879,13 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_CLONES_ATTR_SEPARATOR ','
 #endif
 
+/* Redirect function call to a function multi-versioning clone when
+   calling from a function with the same "target" or "target_version"
+   attribute.  */
+#ifndef TARGET_FMV_REDIRECT_CLONE
+#define TARGET_FMV_REDIRECT_CLONE 1
+#endif
+
 /* Select a format to encode pointers in exception handling data.  We
    prefer those that result in fewer dynamic relocations.  Assume no
    special support here and encode direct references.  */
diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
index d25277c0a93..aaf3f68be02 100644
--- a/gcc/multiple_target.cc
+++ b/gcc/multiple_target.cc
@@ -452,18 +452,20 @@ expand_target_clones (struct cgraph_node *node, bool definition)
    that we meet the target requirements for a matching callee version does not
    tell us that we won't also meet the target requirements for a higher
    priority callee version at runtime.  Since this is longstanding behaviour
-   for x86 and powerpc, we preserve it for those targets, but skip the optimisation
-   for targets that use the "target_version" attribute for multi-versioning.  */
+   for x86 and powerpc, we preserve it for those targets that have
+   TARGET_FMV_REDIRECT_CLONE defined.  */
 
 static void
 redirect_to_specific_clone (cgraph_node *node)
 {
+  static const char *fmv_attr = (TARGET_HAS_FMV_TARGET_ATTRIBUTE
+                                 ? "target" : "target_version");
   cgraph_function_version_info *fv = node->function_version ();
   if (fv == NULL)
     return;
 
-  gcc_assert (TARGET_HAS_FMV_TARGET_ATTRIBUTE);
-  tree attr_target = lookup_attribute ("target", DECL_ATTRIBUTES (node->decl));
+  gcc_assert (TARGET_FMV_REDIRECT_CLONE);
+  tree attr_target = lookup_attribute (fmv_attr, DECL_ATTRIBUTES (node->decl));
   if (attr_target == NULL_TREE)
     return;
 
@@ -474,7 +476,7 @@ redirect_to_specific_clone (cgraph_node *node)
       if (!fv2)
         continue;
 
-      tree attr_target2 = lookup_attribute ("target",
+      tree attr_target2 = lookup_attribute (fmv_attr,
                                             DECL_ATTRIBUTES (e->callee->decl));
 
       /* Function is not calling proper target clone.  */
@@ -488,7 +490,7 @@ redirect_to_specific_clone (cgraph_node *node)
           for (; fv2 != NULL; fv2 = fv2->next)
             {
               cgraph_node *callee = fv2->this_node;
-              attr_target2 = lookup_attribute ("target",
+              attr_target2 = lookup_attribute (fmv_attr,
                                                DECL_ATTRIBUTES (callee->decl));
               if (attr_target2 != NULL_TREE
                   && attribute_value_equal (attr_target, attr_target2))
@@ -515,7 +517,7 @@ ipa_target_clone (void)
   for (unsigned i = 0; i < to_dispatch.length (); i++)
     create_dispatcher_calls (to_dispatch[i]);
 
-  if (TARGET_HAS_FMV_TARGET_ATTRIBUTE)
+  if (TARGET_FMV_REDIRECT_CLONE)
     FOR_EACH_FUNCTION (node)
       redirect_to_specific_clone (node);
 
-- 
2.49.0