erichkeane added a comment.
In D112349#3107460 <https://reviews.llvm.org/D112349#3107460>, @ibookstein
wrote:
> I'll first explain my thought process about the representation of aliases and
> ifuncs in the IR, and why I think both aliasees and resolvers must always be
> defined; I hope I'm not completely off track and would love it if @MaskRay
> could weigh in as to whether I make sense.
> Let's start at the level of the object file; my understanding is that aliases
> apply to ELF and MachO, and ifuncs apply only to ELF. I'm not at all
> acquainted with MachO, but on the ELF side, my understanding is that:
>
> 1. Aliases are simply lowered to additional symbols with the same st_value as
> their aliasee. As long as the value of a symbol has to be concrete/numeric
> and cannot express a way to refer to another symbol, for aliases to make
> sense and have the correct semantics at this level, their aliasee must be
> defined at the IR level. Otherwise all you're left with at the object file
> level is an undefined symbol with no way to express that it 'wants to' alias
> an external symbol with some specified name. In other words, symbols are
> either undefined (st_shndx == 0, st_value meaningless) or defined (st_shndx
> != 0, st_value meaningful and holds a section offset). If we were to allow
> aliases to have undefined aliasees, they would decay to simple undefined
> symbols and lose their aliasee information.
> 2. IFuncs are lowered to specially typed symbols whose st_value is the
> resolver. In much the same way as aliases, for this to actually have any
> meaning, the resolver must be defined (because you have no way to specify
> "the value is in another castle named 'XYZ'", only "defined at offset X" or
> "undefined"). When we allow ifuncs to have undefined resolvers, they decay to
> simple undefined symbols with the additional wart of having a special symbol
> type, but the desired resolver name is lost. Concretely, as long as the
> linker doesn't throw a fuss at said wart, for the references against that
> symbol from within the object file this will behave like a simple undefined
> external function. Because in your implementation one TU will have a
> cpu_dispatch and therefore a defined resolver, it will 'win' and
> intra-EXE/intra-DSO references against the ifunc will indeed be bound against
> the return value of the resolver. If no translation unit in the EXE/DSO had
> an ifunc with the same name and a defined resolver, you'd end up with a
> peculiar undefined symbol of type ifunc in the EXE/DSO (same as the .o).
>
> It is my conclusion therefore that ifuncs with undefined resolvers behave
> exactly like function declarations (and lose the name of the resolver), as
> long as the linker is willing to accept such weird symbols.
> Therefore, at the IR level, they're representational slack at best, and don't
> do what you want (possibly binding against a differently-named resolver) at
> worst, so they should not be allowed.
>
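To make the defined/undefined distinction above concrete, here is a minimal sketch in C against the glibc `<elf.h>` definitions. The helper names (`sym_is_defined`, `sym_is_ifunc`, `sym_is_decayed_ifunc`) are hypothetical and only illustrate the argument; they are not taken from LLVM or any linker.

```c
#include <elf.h>
#include <stdbool.h>

/* A symbol is defined when its section index is not SHN_UNDEF (0);
 * only then is st_value meaningful. */
bool sym_is_defined(const Elf64_Sym *sym) {
    return sym->st_shndx != SHN_UNDEF;
}

/* IFuncs are ordinary symbol table entries with a special type. */
bool sym_is_ifunc(const Elf64_Sym *sym) {
    return ELF64_ST_TYPE(sym->st_info) == STT_GNU_IFUNC;
}

/* The "decayed" case described above: typed as an ifunc but undefined,
 * so st_value cannot name a resolver and the resolver's name is lost. */
bool sym_is_decayed_ifunc(const Elf64_Sym *sym) {
    return sym_is_ifunc(sym) && !sym_is_defined(sym);
}
```

Nothing in `Elf64_Sym` can hold "the resolver is the symbol named XYZ", which is the point of the quoted argument: an undefined entry carries a name and a type, nothing more.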
>> My understanding is the frontend's semantic rules are/were different from
>> the IR rules, which is why we implemented it that way.
>
> As I understand it, features like aliases and ifuncs consist mostly of
> vertical plumbing to expose low-level object-file semantics, and their design
> must be informed by them.
>
>> I'm not sure what you mean here? Are you suggesting that an undefined
>> resolver should instead just implement an undefined 'function' for the
>> .ifunc? This doesn't seem right? Wouldn't that cause ODR issues?
>
> As I understand it, making the symbol your current implementation calls
> "x.ifunc" a function **declaration** which gets upgraded to an ifunc with a
> defined resolver on encountering cpu_dispatch would yield the correct
> behavior.
>
>> I guess I still don't understand what the practical limitation that requires
>> ifuncs to have a defined resolver? The resolver is just a normal function,
>> so it seems to me that allowing them to have normal linking rules makes
>> sense? I personally think this is the least obtrusive change; this patch is
>> adding a limitation that didn't exist previously unnecessarily.
>
> I //think// I've addressed this in the wall of text above
>
>> It just seems odd I guess to name a function .ifunc, and not have it be an
>> ifunc? What does our linker think about that?
>
> Ah, the name is just a name :)
> As far as the linker is concerned, it encounters an object file with an
> undefined symbol (of type STT_NOTYPE) and an object file with a defined
> symbol with the same name, of type STT_GNU_IFUNC. It will bind references in
> the former against the definition in the latter.
> Here's my trying it out:
>
> itay@CTHULHU ~/tmp/ifuncdecl/tu> cat main.c
> int foo(void);
> int main() { return foo(); }
>
> itay@CTHULHU ~/tmp/ifuncdecl/tu> cat foo.c
> static int foo_impl(void) { return 42; }
> static void *foo_resolver(void) { return &foo_impl; }
> int foo(void) __attribute__((ifunc("foo_resolver")));
>
> itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 -c main.c -o main.c.o
> itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 -c foo.c -o foo.c.o
> itay@CTHULHU ~/tmp/ifuncdecl/tu> clang-14 main.c.o foo.c.o -o main
> itay@CTHULHU ~/tmp/ifuncdecl/tu> ./main
> itay@CTHULHU ~/tmp/ifuncdecl/tu [42]>
> itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU main.c.o --symbols | grep foo
> 4: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND foo
> itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU foo.c.o --symbols | grep foo
> 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo.c
> 3: 0000000000000000 16 FUNC LOCAL DEFAULT 2 foo_resolver
> 4: 0000000000000010 11 FUNC LOCAL DEFAULT 2 foo_impl
> 5: 0000000000000000 16 IFUNC GLOBAL DEFAULT 2 foo
> itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU main --symbols | grep foo
> 35: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo.c
> 36: 0000000000401150 16 FUNC LOCAL DEFAULT 13 foo_resolver
> 37: 0000000000401160 11 FUNC LOCAL DEFAULT 13 foo_impl
> 56: 0000000000401150 16 IFUNC GLOBAL DEFAULT 13 foo
> itay@CTHULHU ~/tmp/ifuncdecl/tu> llvm-readobj-14 --elf-output-style=GNU main --relocations
>
> Relocation section '.rela.dyn' at offset 0x3d0 contains 2 entries:
> Offset           Info             Type              Symbol's Value   Symbol's Name + Addend
> 0000000000403ff0 0000000100000006 R_X86_64_GLOB_DAT 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0
> 0000000000403ff8 0000000200000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
>
> Relocation section '.rela.plt' at offset 0x400 contains 1 entries:
> Offset           Info             Type               Symbol's Value   Symbol's Name + Addend
> 0000000000404018 0000000000000025 R_X86_64_IRELATIVE 401150
>
>
>
>> That's correct, these aren't 'aliases' or 'ifuncs' as far as the CFE is
>> concerned; they are multiversioned functions. The 'Aliases' and 'ifunc'
>> lists in the CFE are the AST constructs of those, not the IR constructs, so
>> there is no reason to put the multiversioned things in that list, since they
>> are implementation details. Emitting an error "invalid alias!"/etc for
>
> I see, makes sense, thanks for the explanation.
From my perspective, an ifunc is just a linkable entity, as is a resolver. If
the linker can merge symbols for the resolver (and an ifunc points to one), it
seems to me to make a ton of sense to allow the resolver to be defined in
another TU.
I guess I feel the same way about aliases: why can't I alias a function in a
different TU, so long as it is a linkable entity?
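
For reference, a minimal sketch of the alias case as it works today, assuming GCC or Clang on an ELF target. The names `real_impl` and `aliased` are hypothetical. The aliasee is defined in the same TU, so the alias becomes a second symbol with the same address:

```c
/* Aliasee: defined in this translation unit. */
int real_impl(void) { return 42; }

/* Alias: an additional symbol that shares real_impl's address. */
int aliased(void) __attribute__((alias("real_impl")));

/*
 * The cross-TU variant asked about above would look like this, and is
 * rejected by GCC and Clang because the aliasee is only declared here:
 *
 *   extern int other_tu_impl(void);
 *   int bad(void) __attribute__((alias("other_tu_impl")));
 */
```

Calling `aliased()` runs `real_impl`, and the two function addresses compare equal, which matches the "same st_value" description in the quoted text.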
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D112349