On 04/12/2018 07:59 PM, Jakub Jelinek wrote:
> On Thu, Apr 12, 2018 at 07:53:35PM +0200, Jakub Jelinek wrote:
>> On Thu, Apr 12, 2018 at 03:46:26PM +0200, Jan Hubicka wrote:
>>> If you make C++ inline and get the idea to use target cloning attribute on
>>> this,
>>> this will likely lead to link error if you compile multiple files because
>>> you
>>> turn comdat to non-comdat.
>>>
>>> For comdats this woudl effectivly need to become C++ abi extension and we
>>> would
>>> need to define comdat sections for these. Perhaps easiest way is to simply
>>> reject the attribute on comdats and probaby also extern functions?
>>
>> I'm not really sure we can do that, various packages in the wild are already
>> using this.
>> What is the problem with comdats and multi-versioning?
>> The question is what comdat groups we should use for the comdat resolver and
>> the versioned functions, shall the ifunc symbol be the original mangling of
>> the method (or other comdat) and the other entrypoints just be .local
>> non-weak symbols inside of the same section?
>
> Ah, but we emit the resolver only if we see a use of it. That sounds quite
> broken, resolver in each TU that uses it? Better to have one at each
> definition...
>
> Jakub
>
So after quite some time I would need some brainstorming as I'm not sure how to
fix that properly. Let's start how 'target' attribute works and I believe it
should
behave same as target_clones attribute:
$ cat mv.h
int foo (void);
void test ();
$ cat mv.cpp
#include "mv.h"
__attribute__ ((target ("default")))
int foo ()
{
return 0;
}
__attribute__ ((target ("sse4.2")))
int foo ()
{
return 1;
}
int main ()
{
__builtin_printf ("in main: %d\n", foo ());
test ();
return 0;
}
$ cat mv2.cpp
#include "mv.h"
void test()
{
int (*f) (void) = &foo;
__builtin_printf ("value: %d\n", foo ());
}
$ gcc -fdump-ipa-cgraph=/dev/stdout mv.cpp -c
_Z13_Z3foov.ifuncv/2 (int _Z3foov.ifunc()) @0x7ffff67182e0
Type: function definition analyzed alias cpp_implicit_alias
Visibility: externally_visible asm_written public comdat
comdat_group:_Z3foov.resolver one_only section:.text._Z3foov.resolver
(implicit_section) artificial
Same comdat group as: _Z3foov.resolver/6
References: _Z3foov.resolver/6 (alias)
Referring:
Availability: overwritable
First run: 0
Version info: next: _Z3foov/0 dispatcher: _Z3foov.resolver
Function flags:
Called by:
Calls:
_Z3foov.sse4.2/1 (int foo()) @0x7ffff6718170
Type: function definition analyzed
Visibility: force_output externally_visible asm_written public
Address is taken.
References:
Referring:
Availability: available
First run: 0
Version info: prev: _Z3foov/0 dispatcher: int _Z3foov.ifunc()
Function flags:
Called by:
Calls:
_Z3foov/0 (int foo()) @0x7ffff6718000
Type: function definition analyzed
Visibility: force_output externally_visible asm_written public
Address is taken.
References:
Referring:
Availability: available
First run: 0
Version info: next: _Z3foov.sse4.2/1 dispatcher: int _Z3foov.ifunc()
Function flags:
Called by:
Calls:
_Z3foov.resolver/6 (_Z3foov.resolver) @0x7ffff67188a0
Type: function definition analyzed
Visibility: externally_visible asm_written public weak comdat
comdat_group:_Z3foov.resolver one_only section:.text._Z3foov.resolver
(implicit_section) artificial
Same comdat group as: _Z13_Z3foov.ifuncv/2
References:
Referring: _Z13_Z3foov.ifuncv/2 (alias)
Availability: available
First run: 0
Function flags:
Called by:
Calls:
So note that resolver and ifunc live in a unique comdat_group. And as opposed
to target_clones, default implementation has not appended '.default' suffix.
That's
problem because an address taken returns default implementation as seen here:
$ gcc mv.cpp mv2.cpp && ./a.out
in main: 1
value: 0
Which is the same problem is fixed for target_clones in this release cycle. So
it's broken.
Apart from that, following is broken for target attribute:
cat Crypto-target.ii
struct aaa
{
static __attribute__((target("avx512f"))) void foo() { __builtin_printf
("avx512f\n"); }
static __attribute__((target("sse"))) void foo() {__builtin_printf ("sse\n");}
static __attribute__((target("default"))) void foo() {__builtin_printf
("default\n");}
};
void (*initializer) (void) = { &aaa::foo };
int main()
{
initializer ();
// aaa::foo ();
}
$ g++ Crypto-target.ii && ./a.out
/tmp/ccJMMaDc.o:(.data+0x0): undefined reference to `_ZN3aaa3fooEv.ifunc()'
collect2: error: ld returned 1 exit status
For target_clones, I have a patch candidate that works for the test-case in
PR85329:
$ cat ~/Programming/testcases/Crypto.ii
struct a
{
__attribute__((target_clones("sse", "default"))) void operator^=(a) {}
} * b;
class c {
public:
a *d();
};
class f {
void g();
c e;
};
void f::g() { *e.d() ^= b[0]; }
$ g++ ~/Programming/testcases/Crypto.ii -c -fdump-ipa-cgraph=/dev/stdout
_ZN1aeOES_.resolver/6 (_ZN1aeOES_.resolver) @0x7ffff690d730
Type: function definition analyzed
Visibility: force_output externally_visible public weak comdat
comdat_group:_ZN1aeOES_ one_only artificial
Same comdat group as: _ZN1aeOES_/5
References: _ZN1aeOES_.sse.0/4 (addr)_ZN1aeOES_.default.1/0 (addr)
Referring: _ZN1aeOES_/5 (alias)
Availability: available
First run: 0
Function flags: body
Called by:
Calls: int __builtin_cpu_supports(const char*)/8 (can throw external) int
__builtin_cpu_init()/7 (can throw external)
_ZN1aeOES_/5 (_ZN1aeOES_) @0x7ffff690d5c0
Type: function definition analyzed alias cpp_implicit_alias
target:_ZN1aeOES_.resolver
Visibility: externally_visible public comdat comdat_group:_ZN1aeOES_ one_only
artificial
Same comdat group as: _ZN1aeOES_.resolver/6
References: _ZN1aeOES_.resolver/6 (alias)
Referring:
Availability: overwritable
First run: 0
Version info: next: _ZN1aeOES_.default.1/0 dispatcher: _ZN1aeOES_.resolver
Function flags:
Called by: void f::g()/2
Calls:
_ZN1aeOES_.sse.0/4 (void a::_ZN1aeOES_.sse.0(a)) @0x7ffff690d450
Type: function definition analyzed
Visibility: force_output no_reorder prevailing_def_ironly artificial
Address is taken.
References:
Referring: _ZN1aeOES_.resolver/6 (addr)
Availability: available
First run: 0
Version info: prev: _ZN1aeOES_.default.1/0 dispatcher: _ZN1aeOES_
Function flags: body
Called by:
Calls:
_ZN1aeOES_.default.1/0 (void a::operator^=(a)) @0x7ffff690d000
Type: function definition analyzed
Visibility: force_output no_reorder prevailing_def_ironly artificial
Address is taken.
References:
Referring: _ZN1aeOES_.resolver/6 (addr)
Availability: available
First run: 0
Version info: next: _ZN1aeOES_.sse.0/4 dispatcher: _ZN1aeOES_
Function flags: body
Called by:
Calls:
It's not ideal because _ZN1aeOES_.sse.0/4 and _ZN1aeOES_.default.1/0 don't
share the same comdat group.
But it should not break anything now, am I right? Honza?
I'll carry on during weekend after I'll get some feedback.
Thanks,
Martin