I recently found that target_clones functions cannot be inlined, even
when the caller has exactly the same target. However, if we use only
target attributes in C++ and let the compiler generate the IFUNC for us,
functions with the same target are inlined.
For example, the following code, compiled for x86-64 with -O3, generates
IFUNCs for foo and bar and inlines foo into bar:
```cpp
__attribute__((target("default")))
int foo(int *arr) {
  int sum = 0;
  for (int i = 0; i < 16; i++) sum += arr[i];
  return sum;
}
__attribute__((target("avx2")))
int foo(int *arr) {
  int sum = 0;
  for (int i = 0; i < 16; i++) sum += arr[i];
  return sum;
}
__attribute__((target("default")))
int bar(int *arr) {
  return foo(arr);
}
__attribute__((target("avx2")))
int bar(int *arr) {
  return foo(arr);
}
```
However, if we use the target_clones attribute instead, the functions
are not inlined:
```cpp
__attribute__((target_clones("default","avx2")))
int foo(int *arr) {
  int sum = 0;
  for (int i = 0; i < 16; i++) sum += arr[i];
  return sum;
}
__attribute__((target_clones("default","avx2")))
int bar(int *arr) {
  return foo(arr);
}
```
This behavior may hurt performance, since the target_clones functions
are never inlined. Because such calls do not go through the PLT but
directly use the version with the same target as the caller, I think it
is better to allow target_clones functions to be inlined.
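For anyone who wants to check the behavior locally, a self-contained
variant of the target_clones example might look like the sketch below.
The CLONES macro, the platform guard, and run_example are not part of
this patch; they are assumptions added only so that the snippet also
builds on targets without AVX2 or IFUNC support.

```cpp
// Sketch only: CLONES, the platform guard and run_example() are
// hypothetical names for illustration, not part of the patch.
#include <cassert>

#if defined(__x86_64__) && defined(__linux__) && defined(__GNUC__) && \
    !defined(__clang__)
// Assumes an x86-64 GNU/Linux toolchain with IFUNC support.
#define CLONES __attribute__((target_clones("default", "avx2")))
#else
// Fall back to plain functions elsewhere.
#define CLONES
#endif

CLONES
int foo(int *arr) {
  int sum = 0;
  for (int i = 0; i < 16; i++) sum += arr[i];
  return sum;
}

CLONES
int bar(int *arr) {
  return foo(arr);
}

// Calls bar() on the values 1..16 and returns the result.
int run_example() {
  int arr[16];
  for (int i = 0; i < 16; i++) arr[i] = i + 1;
  int result = bar(arr);
  assert(result == 136);  // 1 + 2 + ... + 16
  return result;
}
```

Compiling this with -O3 -S and checking whether the avx2 version of bar
still contains a call to foo should show the difference before and
after the patch; the result is the same either way, only the generated
code changes.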
gcc/ada/ChangeLog:
* gcc-interface/utils.cc (handle_target_clones_attribute):
Allow functions with target_clones attribute to be inlined.
gcc/c-family/ChangeLog:
* c-attribs.cc (handle_target_clones_attribute):
Allow functions with target_clones attribute to be inlined.
gcc/d/ChangeLog:
* d-attribs.cc (d_handle_target_clones_attribute):
Allow functions with target_clones attribute to be inlined.
Signed-off-by: Yangyu Chen <[email protected]>
---
gcc/ada/gcc-interface/utils.cc | 5 +----
gcc/c-family/c-attribs.cc | 3 ---
gcc/d/d-attribs.cc | 5 -----
3 files changed, 1 insertion(+), 12 deletions(-)
diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
index 60f36b1e50d..d010b684177 100644
--- a/gcc/ada/gcc-interface/utils.cc
+++ b/gcc/ada/gcc-interface/utils.cc
@@ -7299,10 +7299,7 @@ handle_target_clones_attribute (tree *node, tree name,
tree ARG_UNUSED (args),
int ARG_UNUSED (flags), bool *no_add_attrs)
{
/* Ensure we have a function type. */
- if (TREE_CODE (*node) == FUNCTION_DECL)
- /* Do not inline functions with multiple clone targets. */
- DECL_UNINLINABLE (*node) = 1;
- else
+ if (TREE_CODE (*node) != FUNCTION_DECL)
{
warning (OPT_Wattributes, "%qE attribute ignored", name);
*no_add_attrs = true;
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 4dd2eecbea5..f8759bb1908 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -6105,9 +6105,6 @@ handle_target_clones_attribute (tree *node, tree name,
tree ARG_UNUSED (args),
"single %<target_clones%> attribute is ignored");
*no_add_attrs = true;
}
- else
- /* Do not inline functions with multiple clone targets. */
- DECL_UNINLINABLE (*node) = 1;
}
else
{
diff --git a/gcc/d/d-attribs.cc b/gcc/d/d-attribs.cc
index 0f7ca10e017..9f67415adb1 100644
--- a/gcc/d/d-attribs.cc
+++ b/gcc/d/d-attribs.cc
@@ -788,11 +788,6 @@ d_handle_target_clones_attribute (tree *node, tree name,
tree, int,
warning (OPT_Wattributes, "%qE attribute ignored", name);
*no_add_attrs = true;
}
- else
- {
- /* Do not inline functions with multiple clone targets. */
- DECL_UNINLINABLE (*node) = 1;
- }
return NULL_TREE;
}
--
2.45.2