[Bug tree-optimization/65752] Too strong optimizations int -> pointer casts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65752

--- Comment #59 from post+gcc at ralfj dot de ---
> With the C provenance proposal this example is undefined since 'a' is not
> exposed (its address is not converted to an integer).

However, from what I can tell, GCC's behavior does not change if we insert '(uintptr_t) &a;' at the beginning of the function. That change should be sufficient to make the example well-defined again.
[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

post+gcc at ralfj dot de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |post+gcc at ralfj dot de

--- Comment #23 from post+gcc at ralfj dot de ---
> Is glibc community ready to provide such guarantee?

This is indeed a key question here I think. Currently GCC makes assumptions that *even the libc produced by the same project* does not document as stable guarantees. That's rather dissonant. The GNU project should at least within itself come to a proper conclusion on the question of whether memcpy should be UB or not when both pointers are equal. Right now we have everyone pointing at everyone else, and users are left in the rain with their valgrind errors.

Ideally, of course, the C standard would be updated so that, slowly but steadily, the memcpy contract comes to match reality. That will take a while, but given that this issue was filed 16 years ago (!), there clearly would have been enough time. (If someone does take this on, please join forces with the clang people who are interested in getting C updated: https://reviews.llvm.org/D86993#4585590).

But GNU controls glibc so there's not really any excuse for not updating those docs, I think? glibc making such a move would be a great step towards convincing valgrind and the C committee that memcpy should have defined behavior when both pointers are equal.
[Bug c/112449] New: Arithmetic operations can produce signaling NaNs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

            Bug ID: 112449
           Summary: Arithmetic operations can produce signaling NaNs
           Product: gcc
           Version: 13.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: post+gcc at ralfj dot de
  Target Milestone: ---

According to the IEEE 754 specification, the output of an arithmetic operation can never be a signaling NaN. However, GCC performs optimizations that turn `x * 1.0` into just `x`, and if `x` is a signaling NaN, that means that the multiplication will now (seem to) return a signaling NaN. (Proof of GCC performing that transformation: https://godbolt.org/z/scPhn1d8s)

It is very common for C compilers to violate this IEEE 754 requirement, but it does open the door to a great many questions. Since GCC evidently does not implement the original IEEE 754 semantics, it would be great to have some documentation on what exactly GCC *does* implement, and in particular under which conditions operations are allowed to return a signaling NaN.

So currently, GCC is either buggy because it violates the IEEE 754 spec, or there's a documentation bug in that the actual floating point spec GCC intends to implement is not documented. At least, all I was able to find is https://gcc.gnu.org/wiki/FloatingPointMath, which just says "does not care about signalling NaNs". (I hope this does not mean that any arithmetic operation may arbitrarily produce signaling NaNs. That would be an issue for operations which are sensitive to the difference between quiet NaN and signaling NaN, such as `pow`.)

As a point of comparison, LLVM recently added this to their documentation to answer these kinds of questions: https://llvm.org/docs/LangRef.html#behavior-of-floating-point-nan-values. (That PR was authored by me but received input from a lot of people.)

LLVM goes further than just documenting signaling vs quiet NaN there, since in practice there's some critical code that would break if arithmetic operations returned NaNs with arbitrary bits in their payload (specifically, that would break NaN boxing as performed by some JavaScript engines, or at least make it a lot less efficient since engines would have to re-normalize NaNs after every single operation -- which to my knowledge, they don't actually do in practice).
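The quiet/signaling distinction the report relies on can be checked directly on bit patterns. The following is a minimal sketch, assuming an IEEE 754 binary32 `float` and the common encoding in which a set top fraction bit marks a quiet NaN (x86, ARM; some targets such as MIPS and SH use the opposite convention). All helper names are hypothetical. Whether `times_one_bits` returns a signaling pattern for a signaling input depends on whether the compiler folds `x * 1.0f` to `x`, so no particular result is claimed for that case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: binary32 float, quiet bit set means "quiet NaN". */
static_assert(sizeof(float) == sizeof(uint32_t), "binary32 float assumed");

#define F32_EXP_MASK  UINT32_C(0x7f800000)  /* all-ones exponent: Inf or NaN */
#define F32_FRAC_MASK UINT32_C(0x007fffff)  /* nonzero fraction: NaN */
#define F32_QUIET_BIT UINT32_C(0x00400000)  /* top fraction bit */

/* Hypothetical helper: reinterpret a float as its bit pattern. */
uint32_t f32_to_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: build a float from a bit pattern. */
float f32_from_bits(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Classify the bit pattern itself, so that no floating-point
   operation gets a chance to quiet the value first. */
int bits_are_snan(uint32_t u) {
    int is_nan = (u & F32_EXP_MASK) == F32_EXP_MASK
              && (u & F32_FRAC_MASK) != 0;
    return is_nan && (u & F32_QUIET_BIT) == 0;
}

/* The expression from the report: multiply by one and inspect
   the bits of the result. */
uint32_t times_one_bits(uint32_t in) {
    float x = f32_from_bits(in);
    return f32_to_bits(x * 1.0f);
}
```

Calling `bits_are_snan(times_one_bits(UINT32_C(0x7fa00000)))` at different optimization levels is one way to observe whether the multiplication was actually performed or folded away.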
[Bug c/112449] Arithmetic operations can produce signaling NaNs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #5 from post+gcc at ralfj dot de ---
> See
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fsignaling-nans

That's unrelated. That's about whether operations on signaling NaNs can trap. I am asking when operations can output a signaling NaN.

So, for code like

  float x = y <op> z;
  return is_signaling_nan(x);

when can that code return `true`? Normal IEEE semantics would say "never". And yet if "z" is the constant 1, <op> is `*`, and "y" is a signaling NaN, then this evidently can output a signaling NaN. I would hope the answer is "this can output a signaling NaN only if one of the inputs is a signaling NaN", but is that documented anywhere?

> Note mips and sh and a few other targets have the quiet bit meaning the
> opposite.

I know. LLVM is currently buggy on those targets.

> GCC does document some of this on
> https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Floating-point-implementation.html
> but not the signaling nan part.

This seems to list a bunch of implementation-defined aspects of C? To my knowledge, my question is not implementation-defined. C (with the annex for floating-point arithmetic) requires the above operations to always return "false". GCC violates the C spec here (since it defines __STDC_IEC_559__, declaring support for the annex), and it'd be good to know how far it is going in that violation.
[Bug c/112449] Arithmetic operations can produce signaling NaNs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #6 from post+gcc at ralfj dot de ---
Hm, OTOH the C standard says

> The expressions 1×x, x/1, and x are equivalent (on IEC 60559 machines, among others).

So, it seems like when they say "The +, -, *, and / operators provide the IEC 60559 add, subtract, multiply, and divide operations.", they don't quite mean that.

This seems internally inconsistent in the C standard, since C also permits `pow(1, sNaN)` to behave differently from `pow(1, qNaN)` -- and in fact they do behave differently in GNU's libm. So on the one hand `pow(1, x * y)` must always be `1`, but on the other hand it can return a NaN when `x` is an sNaN and `y` is `1`?
[Bug c/112449] Arithmetic operations can produce signaling NaNs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #7 from post+gcc at ralfj dot de ---
I guess the idea is that by passing a signaling NaN to a float operation, I am already entering unspecified behavior, so it's okay for that float operation to violate its usual contract and return a signaling NaN?
[Bug c/112449] Arithmetic operations can produce signaling NaNs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #10 from post+gcc at ralfj dot de ---
The standard says

> This annex does not require the full support for signaling NaNs specified in
> IEC 60559. This annex uses the term NaN, unless explicitly qualified, to
> denote quiet NaNs. Where specification of signaling NaNs is not provided, the
> behavior of signaling NaNs is implementation-defined (either treated as an
> IEC 60559 quiet NaN or treated as an IEC 60559 signaling NaN).

I have no idea how that allows a situation where the *output* of an operation becomes signaling -- that can't usually happen no matter whether the inputs are signaling or quiet. But that seems to be the common interpretation.

Still, it seems important that `pow(1.0, 0.0/0.0)` returns `1.0` and not a NaN. That's what the `pow` docs say. So for this there must be a guarantee that `0.0/0.0` is a quiet NaN.
[Bug c/112449] Arithmetic operations can produce signaling NaNs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112449

--- Comment #12 from post+gcc at ralfj dot de ---
> GCC will not create an sNaN out of nowhere.

That's the part I was hoping for. :) I just don't think it obviously follows from any docs (the C standard or GCC docs).
[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #30 from post+gcc at ralfj dot de ---
There have been several assertions above that a certain way to solve this either has no performance cost at all or severe performance cost. That sounds like we are missing data -- ideally, someone would benchmark the actual cost of emitting that branch. It seems kind of pointless to just make assertions about the impact of this change without real data.

> On the other hand, expecting the libc memcpy to make this check greatly
> pessimizes every reasonable small use of memcpy with a gratuitous branch for
> what is undefined behavior and should never appear in any valid program.

I don't think this is true. As far as I can see, the performance impact of having memcpy support the src==dest case is zero -- the assembly generated by the current implementations already supports that case. (At least I have not seen any evidence to the contrary.) No new check in memcpy is required.
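The branch whose cost is being debated can be pictured at source level. This is a minimal sketch, assuming a hypothetical `struct blob` and hypothetical function names; a compiler would emit the equivalent check in generated code rather than in C source. C permits `*dst = *src` when both pointers refer to the same object (exact overlap), which is the case that can end up as a `memcpy` call with equal pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical aggregate, large enough that a compiler may choose
   to lower its assignment to a memcpy call. */
struct blob { unsigned char bytes[256]; };

/* Plain struct assignment: valid C even when dst == src. */
void blob_assign(struct blob *dst, const struct blob *src) {
    *dst = *src;
}

/* The same copy with the equality test made explicit, so memcpy
   is never called with identical pointers. */
void blob_assign_guarded(struct blob *dst, const struct blob *src) {
    assert(dst != NULL && src != NULL);
    if (dst != src)
        memcpy(dst, src, sizeof *dst);
}
```

Benchmarking `blob_assign_guarded` against an unguarded `memcpy` call on representative workloads would be one way to gather the data the comment asks for.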
[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #41 from post+gcc at ralfj dot de ---
> This entitles a compiler to emit asm equivalent to if (src==dest)
> system("rm -rf /") if it likes.

No it does not. restrict causes UB if the two pointers are used to access the same memory. It has nothing to do with whether the pointers are equal. So it would have to be "if (src==dest && n>0)" and the compiler would have to first prove that "n>0" implies that later accesses through both pointers occur at offset 0 (and at least one of them is a write).

But it's still UB to call this the way GCC does, that much I agree with.

> Our memcpy is not written in asm but in C, and it has the restrict qualifier
> on src and dest.

The question is, does that qualifier help? If you remove it, does the generated assembly change in any way, does the performance change? If not, it clearly doesn't matter and can just be removed. If yes, then yeah compilers clearly shouldn't call this with identical ranges.

Basically, compiler devs are claiming that libc can support the src==dest case "for free", without any noticeable cost to other uses of the function. libc devs are claiming that compilers can insert a branch that tests for equality, without any noticeable cost. Both of these are testable hypotheses. I'm not a compiler dev nor a libc dev, I just want to make sure that my compiler and my libc use the same contract when talking to each other -- but I hope someone who is a compiler dev or a libc dev can go and actually test these hypotheses, rather than just speculating about it as has been happening so far.
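The difference between equal pointers and overlapping accesses can be sketched with a byte-wise copy. `copy_bytes` is a hypothetical function with a memcpy-like signature, written only for illustration; it is not the musl or glibc implementation. With `n == 0` the loop body never runs, so nothing is accessed through either pointer even when `dst == src`. With `n > 0` and `dst == src`, the same byte is read through `s` and written through `d`, which is the situation the `restrict` rules are concerned with.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-wise copy with restrict-qualified parameters. */
void *copy_bytes(void *restrict dst, const void *restrict src, size_t n) {
    assert(dst != NULL && src != NULL);
    unsigned char *d = dst;          /* based on dst */
    const unsigned char *s = src;    /* based on src */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];                 /* one read via s, one write via d */
    return dst;
}
```

Calling `copy_bytes(buf, buf, 0)` performs no access at all, so equality of the pointers alone cannot be what triggers undefined behavior.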
[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #43 from post+gcc at ralfj dot de ---
That is not my reading of the standard, but absent a proper (formal, mathematical) spec I guess it is hard to tell.

With your reading, "if ((uintptr_t)src == 0x400400)" is UB, since changing the "src" argument to a different copy located at that address would change the execution. I strongly doubt that is the intent of the standard.
[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #48 from post+gcc at ralfj dot de ---
> Note, clang makes the same assumption apparently (while MSVC emits rep movs
> inline and ICC either that, or calls _intel_fast_memcpy).

MSVC does the same thing as clang and GCC, if godbolt is to be trusted: https://rust.godbolt.org/z/o7TevfvcY
[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #49 from post+gcc at ralfj dot de ---
Even glibc itself seems to use `restrict`: https://codebrowser.dev/glibc/glibc/string/string.h.html#43

So the compiler building glibc might inadvertently rely on the memory written through dst and the memory read through src being disjoint, making even the perfectly-overlapping case UB (unless the implementation has a guard somewhere that skips the copy when src==dst, but I was not able to find such a guard).

(The implementation at https://codebrowser.dev/glibc/glibc/string/memcpy.c.html does not have the `restrict`, but it includes the string.h header and I think the compiler is allowed to apply attributes from the declaration to the definition. Or, alternatively, it might even be UB to have `restrict` in one place and not the other: "All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined" [C23 §6.2.7.2] and "For two qualified types to be compatible, both shall have the identically qualified version of a compatible type; the order of type qualifiers within a list of specifiers or qualifiers does not affect the specified type" [C23 §6.7.3.11].)
[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #51 from post+gcc at ralfj dot de ---
Oh great, I love it when one part of the C standard just adds exceptions to statements made elsewhere. It's almost as if the authors want this to be as hard to understand as possible...

That then raises the question of which version of the signature is actually used for building (and optimizing) the function: the one in the declaration or the one in the definition. Does the standard have an answer to that?
[Bug middle-end/32667] block copy with exact overlap is expanded as memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667

--- Comment #52 from post+gcc at ralfj dot de ---
For the point discussed earlier with the `restrict` in the musl memcpy, I had another look at the definition of `restrict` and it's not entirely clear to me any more that there is UB here. The restrict rules only apply to objects that are "also modified (by any means)".

Now the question is, does "*X = *X" modify the object? Is a write always a modification, or only if the stored representation changes, or only if the stored value changes? If it requires a representation change, then "memcpy(x, x, n)" does not modify anything, and hence there is no UB from "restrict".