https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #28 from Rich Felker <bugdal at aerifal dot cx> ---
> No, that is not a reasonable fix, because it severely pessimizes common code 
> for a theoretical only problem.

It pessimizes the code far less than a call to memmove would, since memmove
necessarily contains a comparable branch plus others that are unnecessary here.

I also disagree that it's severe. On basically any machine with branch
prediction, the branch will be predicted correctly all the time and has
basically zero cost. On the other hand, the branches in memmove could go
different ways depending on the caller, so it's much more
machine-capability-dependent whether they can be predicted.
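
For concreteness, here is a minimal sketch of the kind of guard being
discussed, assuming the compiler lowers a large aggregate assignment to a
memcpy call. The names struct big and assign_big are hypothetical and only
model what the emitted code would do; this is not actual GCC output.

```c
#include <string.h>

/* Hypothetical aggregate, large enough that a compiler would plausibly
   call memcpy rather than inline the copy. */
struct big { char bytes[4096]; };

/* Hypothetical model of the code emitted for "*dst = *src" with the guard.
   When dst == src the branch skips the call, so memcpy never sees
   identical (exactly overlapping) arguments. */
static void assign_big(struct big *dst, const struct big *src)
{
    if (dst != src)
        memcpy(dst, src, sizeof *dst);
}
```

In nearly all callers the pointers differ, so the comparison always goes the
same way, which is why it should predict well.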

In some sense the optimal thing to do is "nothing", just assuming it would be
hard to write a memcpy that fails on src==dest. However, at the very least this
precludes hardened memcpy trapping on src==dest, which might be a useful
hardening feature (or rather on a range test for overlapping, which would
happen to also catch exact overlap). So it would be nice if it were fixed.
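
As a rough illustration of that hardening idea, the following is a sketch of
a memcpy that traps on any overlap. The names ranges_overlap and
hardened_memcpy are made up for this example, the byte loop stands in for a
real optimized copy, and comparing addresses via uintptr_t assumes a flat
address space; no particular libc is claimed to work this way.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Nonzero if [d, d+n) and [s, s+n) share at least one byte.
   Exact overlap (d == s, n > 0) is just the distance-zero case. */
static int ranges_overlap(const void *d, const void *s, size_t n)
{
    uintptr_t a = (uintptr_t)d, b = (uintptr_t)s;
    if (n == 0)
        return 0;
    return a < b ? b - a < n : a - b < n;
}

/* Hypothetical hardened copy: abort instead of copying overlapping ranges. */
static void *hardened_memcpy(void *d, const void *s, size_t n)
{
    unsigned char *dp = d;
    const unsigned char *sp = s;

    if (ranges_overlap(d, s, n))
        abort();
    while (n--)
        *dp++ = *sp++;
    return d;
}
```

With an implementation like this, an unguarded memcpy(p, p, n) emitted for a
self-assignment would abort, which is the case the fix would avoid.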

FWIW, I don't think single branches are relevant to overall performance in
cases where the compiler is doing something reasonable by emitting a call to
memcpy to implement assignment. If the object is small enough that the branch
is relevant, the call overhead is an even bigger deal, and the compiler should
be inlining loads/stores to perform the assignment.
