[PATCH] D61634: [clang/llvm] Allow efficient implementation of libc's memory functions in C/C++

Eli Friedman via Phabricator via cfe-commits Mon, 13 May 2019 13:15:15 -0700

efriedma added a comment.

My main blocker is that I want to make sure we're moving in the right 
direction: towards LLVM IR with clear semantics, towards straightforward rules 
for writing freestanding C code, and towards solutions which behave 
appropriately for all targets.  There's clearly a problem here that's worth 
solving, but I want to make sure we're adding small features that interact with 
existing features in an obvious way.  The patch as written is adding a new 
attribute that changes the semantics of C and LLVM IR in a subtle way that 
interacts with existing optimizations and lowering in ways that are complex and 
hard to understand.


I don't want to mix general restrictions on calling C library functions, with 
restrictions that apply specifically to memcpy: clang generally works on the 
assumption that a "memcpy" symbol is available in freestanding environments, 
and we don't really want to change that.

With -fno-builtin, specifically, we currently apply the restriction that 
optimizations will not introduce memcpy calls that would not exist with 
optimization disabled.  This is sort of artificial, and and maybe a bit 
confusing, but it seems to work well enough in practice.  gcc does something 
similar.

I don't really want to add an attribute that has a different meaning from 
-fno-builtin.  An attribute that has the same meaning is a lot less 
problematic: it reduces potential confusion, and solves a related problem that 
-fno-builtin currently doesn't really work with LTO, because we don't encode it 
into the IR.

Your current patch is using the "AlwaysInline" flag for 
SelectionDAG::getMemcpy, which forces a memcpy to be lowered to an inline 
implementation.  In general, this flag is only supported for constant-size 
memcpy calls; otherwise, on targets where EmitTargetCodeForMemcpy does not have 
some special lowering, like the x86 "rep;movs", you'll hit an assertion 
failure.  And we don't really want to add an implementation of variable-length 
memcpy for a lot of targets; it's very inefficient on targets which don't have 
unaligned memory access.  You could try to narrowly fix this to only apply 
"AlwaysInline" if the size is a constant integer, but there's a related 
problem: even if a memcpy is constant size in C, optimizations don't promise to 
preserve that, in general.  We could theoretically add such a promise, I guess, 
but that's awkward: it would conditionally apply to llvm.memcpy where the 
parameter is already const, so it's not something we can easily verify.

If we added a new builtin `llvm.memcpy.inline`, or something like that, the end 
result ends up being simpler in many ways. It has obvious rules for both 
generating it and lowering it, which don't depend on attributes in the parent 
function.  We could mark the size as "immarg", so we don't have to add special 
optimization rules for it.   And if we have it, we have a "complete" solution 
for writing memcpy implementations in C: we make `__builtin_memcpy`, or a new 
`__builtin_inline_memcpy`, produce it, and it can be combined with -fno-builtin 
to ensure we don't produce any calls to the "real" memcpy.  The downside of a 
new builtin, of course, is that we now have two functions with similar 
semantics, and potentially have to fix a bunch of places to check for both of 
them.

MemCpyOpt has been mentioned (the pass which transforms memcpy-like loops into 
llvm.memcpy).  If we want it to perform some transform in circumstances where 
the call "memcpy" isn't available, we have to make sure to restrict it based on 
the target: in the worst case, on some targets without unaligned memory access, 
it could just act as a low-quality loop unroller.  This applies no matter how 
the result is actually represented in IR.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61634/new/

https://reviews.llvm.org/D61634



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D61634: [clang/llvm] Allow efficient implementation of libc's memory functions in C/C++

Reply via email to