[Bug ipa/95775] Command line argument for target_clones?

yyc1992 at gmail dot com Tue, 23 Jun 2020 17:07:32 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95775


--- Comment #4 from Yichao Yu <yyc1992 at gmail dot com> ---
> Hey. My opinion is similar to Richi's. If you really want a highly optimized 
> library, you should rather use a dlopen mechanism with pre-built set of 
> options.

Well, a few things,

1. That sounds like an argument against `target_clones` and `target`. If
dlopen'ing different libraries is your recommended solution then neither of
these would be needed.
2. The solution you propose puts all the pressure on the user of the library.
That has a few problems.

   2.1. There are strictly more users than libraries (assuming the library is
used at all), so this forces more (repeated) work to be done.
   2.2. The author of the library, and to a lesser degree the builder of the
library, has the best knowledge of the set of features that can benefit the
library or are the most useful for the deployment environment. The author of
the user code of the library, who has to implement the dispatch/loading logic,
in general has much less complete knowledge of which targets to support.
   2.3. It'll be even worse for code size since this forces each user to carry
their own library, and now all data has to be duplicated as well, in addition
to the code. Also because,

3. There's no standard way of doing this AFAICT.
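
To make the burden in point 2 concrete, here is a minimal sketch of the
selection logic each user of the library would have to write under the
dlopen approach. The library file names and the `pick_variant` and
`detect_variant` helpers are hypothetical; only `__builtin_cpu_init` and
`__builtin_cpu_supports` are real GCC/Clang builtins on x86.

```c
#include <stddef.h>

/* Map detected CPU features to a prebuilt variant of the library.
 * The file names are made up for illustration. */
static const char *pick_variant(int has_avx512f, int has_avx2)
{
    if (has_avx512f)
        return "libfoo-avx512.so";
    if (has_avx2)
        return "libfoo-avx2.so";
    return "libfoo-generic.so";
}

/* Detect features at run time. Only x86 with GCC or Clang is handled
 * here; every other target falls back to the generic build. */
const char *detect_variant(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return pick_variant(__builtin_cpu_supports("avx512f"),
                        __builtin_cpu_supports("avx2"));
#else
    return pick_variant(0, 0);
#endif
}
```

The caller would still have to pass the result to `dlopen` and look up
every entry point with `dlsym`, and this has to be repeated in every program
that uses the library.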

Now (3) is really the main point.
I'm fine with any mechanism that allows multiple versions of the code to
be available as long as it requires no more effort/cost from/for the user (and
to a lesser degree the author) of the library.

If one such mechanism is provided by gcc/glibc/binutils so that library writers
don't have to invent their own loading and detection mechanism, it doesn't
cause unnecessary indirection (i.e. it is as cheap as ifunc), and it just works
for the user whether they link or dlopen, then I think it doesn't really matter
whether that's backed by one file, multiple files, or whatever one can come up
with.

Currently, the only mechanism available that fits this description AFAICT is
`target_clones`/`ifunc`. Unless there's a roadmap that I'm not aware of to
replace this mechanism with a similar one backed by multiple files, I don't
think suggesting such a mechanism is the right approach.
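
For reference, a minimal sketch of what the existing per-function mechanism
looks like from the library author's side. The `dot` function and the
`MULTIVERSION` macro are hypothetical, and the guard is an assumption that
restricts the attribute to x86-64 with glibc, where ifunc is available;
elsewhere the function is compiled as a single plain version.

```c
#include <stddef.h>
#include <stdio.h>   /* any glibc header also defines __GLIBC__ */

/* Use target_clones only where the compiler reports support for it and
 * the C library can resolve ifuncs; otherwise expand to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(target_clones) && defined(__x86_64__) && defined(__GLIBC__)
#    define MULTIVERSION __attribute__((target_clones("avx2", "default")))
#  endif
#endif
#ifndef MULTIVERSION
#  define MULTIVERSION
#endif

/* One source definition. With the attribute active the compiler emits an
 * AVX2 clone and a default clone, plus a resolver that picks one of them
 * when the symbol is bound. */
MULTIVERSION
double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

Callers just call `dot` whether they link against the library or dlopen it;
no dispatch code is needed on their side. The attribute has to be written on
each function, though, which is what the command line option requested in
this bug would avoid.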

Again, I said in the very first post that I totally agree this won't be the
method that gives absolutely the best performance, but neither is
`target_clones`. I also completely agree that this option can be misused and
that the compiler should not do it on its own before getting smarter. But this
is far from the first option that can be misused, and given how cheap memory is
and how multiple loads of the same library don't take more memory, this isn't
even close to being the worst misuse either.
