gchatelet added a comment.

In D61634#1494518 <https://reviews.llvm.org/D61634#1494518>, @alexandre.isoard 
wrote:

> I'm not convinced by the approach.
>
> Can it still recognize the loop idiom into memcpy implementation but use 
> `memmove` (as only `memcpy` has been blacklisted)?


Yes it can and it's fine, passing `src` and `dst` arguments as `__restrict` 
will ensure that `memcpy` is the function to choose in this case.
This attribute is not to be used widely but is directed towards library 
maintainer that already know what the compiler is allowed to do.

> Can it do the same for `memmove` (for the case without overlap), and end-up 
> with infinite recursion?

I fail to see how this would happen. Could you craft some pseudo code that 
would exhibit the infinite recursion?
My take on it is that if the compiler can't generate the code it's totally OK 
to fail with an error message.

The purpose of this RFC it to get some help from the compiler to generate the 
best code for some building blocks (say `copy 4 bytes`, `copy 8 bytes`) and 
assemble them in a way that maximizes something (code size, runtime for certain 
parameter distributions).
It would be useful only to libc / libm / compiler-rt implementers to build 
these functions on top of smaller functions that we know the compiler can 
produce at compile time.

I don't think we want to use this attribute to generate fully the `memcpy`. I 
added the expansion into a `rep;movsb` for variable sized memcpy only because 
it's feasible on x86. It's preferable in this case since it has the correct 
semantic and prevents a compilation error but I would be fine with a 
compilation error here.

> I have a feeling we should pick a stance:
> 
> - either not allow the compiler to lower a builtin to a call to the library; 
> (my preferred choice, but I'm biased)

From the point of view of LLVM `@llvm.memcpy` intrinsics has the semantic of 
`memcpy` it does not know if it comes from a lowering in the frontend (e.g. 
passing structures by value) or from a built in. 
I believe this is feasible although I fail to see where my proposal falls 
short. Can you show a code example where the current proposal is problematic?

> - or not allow the library to use compiler builtins (but LTO flow with the 
> runtime library *already* linked smells like trouble if we go this way).

This is currently generating very poor code because `-fno-builtin` prevents 
LLVM from understanding the copy semantic.

> The reason for my bias is that I have a multi-memcpy codegen in the compiler 
> to generate those two calls:
> 
>   memcpy(left,  in, n);
>   memcpy(right, in, n);
> 
> 
> with a single loop.

I don't quite understand how this is linked to the issue at hand. Can you 
provide more context? Pointers to code?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61634/new/

https://reviews.llvm.org/D61634



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to