[Bug libstdc++/110045] New: std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 Bug ID: 110045 Summary: std::normal_distribution (and likely others) give wrong min() and max() values Product: gcc Version: 12.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: gccbugs at elkpod dot com Target Milestone: --- >From my (non-expert) reading of the C++ spec (I'm using https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf for the current 2023-05-10 draft), the min() and max() methods for std::normal_distribution are returning incorrect values. "28.5.3.6 Random number distribution requirements" says that "A class D meets the requirements of a random number distribution if the expressions shown in Table 97 are valid and have the indicated semantics", and that in Table 97 "x" is a "(possibly const) value[s] of D", and that "x.min()" "Returns glb", and that "x.max()" "Returns lub", and that "glb and lub are values of T respectively corresponding to the greatest lower bound and the least upper bound on the values potentially returned by d's operator(), as determined by the current values of d's parameters". By my reading, this means that if I have "std::normal_distribution norm {0, 1};", then "norm.max()" should return the least upper bound on the values potentially returned by norm's operator(). It does not seem to do that. For example, the highest value I was able to get norm() to generate was 0x1.ff13ccp+2 (== 7.985583). norm.max() reports 0x1.fep+127 (== 340282346638528859811704183484516925440.00). While the values returned are technically valid bounds, they do not seem to be the least upper or greatest lower. I could find no Generator outputs which produced a value of 0x1.fep+127 from std::normal_distribution. It is possible that my interpretation of the spec is wrong, and that somehow the min/max values should encompass all possible instantiations of std::normal_distribution, or some other loophole may exist. I could not find a better forum to find the answer to that; sorry. I have looked at the implementation of std::normal_distribution, and written the following code to generate what I believe to be the actual extreme values that are "potentially returned by norm's operator()". Reproduction code: #include #include #include int32_t offsets[8] = { -1, +0, +0, -1, +0, +1, +1, +0 }; class SimpleGen { using result_type = uint32_t; public: result_type val, ctr = 0; static constexpr result_type min() { return 0; } static constexpr result_type max() { return 0xff; } result_type operator()() { val = 0x80 + offsets[(ctr++)%8]; printf("\tG 0x%06x\n", val); return val; } }; int main(void) { SimpleGen gen; std::normal_distribution norm {0, 1}; for (int i = 0; i < 8; i++) { float r = norm(gen); printf("%d %f %a\n", i, r, r); } printf("\n%f %a\n%f %a\n", norm.min(), norm.min(), norm.max(), norm.max()); } Build output: $ g++-12.2 -Wall -Wextra -o normdist2 normdist2.c -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -fsanitize=undefined $ echo $? 0 Output: G 0x7f G 0x80 0 0.00 0x0p+0 1 -7.985583 -0x1.ff13ccp+2 G 0x80 G 0x7f 2 -7.985583 -0x1.ff13ccp+2 3 0.00 0x0p+0 G 0x80 G 0x81 4 7.985583 0x1.ff13ccp+2 5 0.00 0x0p+0 G 0x81 G 0x80 6 0.00 0x0p+0 7 7.985583 0x1.ff13ccp+2 -340282346638528859811704183484516925440.00 -0x1.fep+127 340282346638528859811704183484516925440.00 0x1.fep+127 Expected output: G 0x7f G 0x80 0 0.00 0x0p+0 1 -7.985583 -0x1.ff13ccp+2 G 0x80 G 0x7f 2 -7.985583 -0x1.ff13ccp+2 3 0.00 0x0p+0 G 0x80 G 0x81 4 7.985583 0x1.ff13ccp+2 5 0.00 0x0p+0 G 0x81 G 0x80 6 0.00 0x0p+0 7 7.985583 0x1.ff13ccp+2 -7.985583 -0x1.ff13ccp+2 7.985583 0x1.ff13ccp+2 $ g++-12.2 -v Using built-in specs. COLLECT_GCC=g++-12.2 COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-slackware-linux/12.2.0/lto-wrapper Target: x86_64-slackware-linux Configured with: ../gcc-12.2.0/configure --prefix=/usr/local --program-suffix=-12.2 -enable-languages=c,c++,lto --enable-lto --disable-multilib --with-gnu-ld --enable-threads --verbose --target=x86_64-slackware-linux --build=x86_64-slackware-linux --host=x86_64-slackware-linux --enable-tls --with-fpmath=avx --enable-__cxa_atexit --enable-gnu-indirect-function --enable-bootstrap --enable-libssp Thread model: posix Supported LTO compression algorithms: zlib gcc version 12.2.0 (GCC)
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #2 from Frank J. T. Wojcik --- (In reply to Andrew Pinski from comment #1) > The heavy weight goes to potentially. The way I understand it is not the max > of what operator() has produced currently but will potentially return in the > future. And I would agree with that interpretation. My inquiry is based on the fact that I can find *no* Generator outputs which produce the currently-given max() value. The goal of the demo program was to show how I arrived at my "7.985583" value for what seems to be the actual maximum value, and that I didn't make up some arbitrary value. If it helps, imagine the min/max printf() to appear before any generation has been done, or even without any generation. I would expect the same result, modulo order of printouts.
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #4 from Frank J. T. Wojcik --- (In reply to Andrew Pinski from comment #3) > With 24bit precision, maybe it is ~8 standard deviations away from the mean. > But the generator argument can change for each call though so that does not > mean the next call to operator() could produce one with more bits ... > > Also the standard says: "as determined by the current values of d's > parameters" > > The parameters is only mean and standard deviations and not the generator. I would agree with all of this also, I think. :) But can you or someone demonstrate *any* generator which produces (e.g.) the current value of max() for std::normal_distribution {0, 1}? I can find no generator implementation which does that, and by my reading of the implementation there cannot be one.
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #6 from Frank J. T. Wojcik --- (In reply to Andrew Pinski from comment #5) > 2 things, I think your result_type in SimpleGen needs to be public. Sure, I'll change that. > Second is LLVM's libc++ outputs: > -inf -inf > inf inf I only have clang 6.0 on my system and its outputs are identical to g++'s. And I do not have access to MSVC, but I believe it also produces similar values. These data points are why I bring up the possibility that there is some detail in the specification that I'm missing. Still, the behavior I'm looking for is useful, and it does seem to be what the spec is requiring, so I would like a fix if appropriate.
[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045 --- Comment #7 from Frank J. T. Wojcik --- After playing with this a bit more, I was able to find a Generator which produces a slightly wider bound, but still nowhere near 0x1.fep+127. This also includes the requested change to SimpleGen. #include #include #include #define MAXOFFSET 3 // Return 25-bit values centered around 0x100 (the halfway-point of all // possible 25-bit outputs). Widening the Generator does not alter the // program's output. class SimpleGen { public: using result_type = uint32_t; result_type val, offset, ctr = 0; static constexpr result_type min() { return 0; } static constexpr result_type max() { return 0x1ff; } result_type operator()() { offset = (ctr & 1) ? ((ctr / 2) / (2 * MAXOFFSET)) : ((ctr / 2) % (2 * MAXOFFSET)); val = 0x100 + offset - MAXOFFSET; printf("\tG 0x%07x\n", val); ++ctr; return val; } }; int main(void) { SimpleGen gen; std::normal_distribution norm {0, 1}; printf("min: %+f %a\nmax: %+f %a\n\n", norm.min(), norm.min(), norm.max(), norm.max()); for (int i = 0; i < ((2 * MAXOFFSET) * (2 * MAXOFFSET - 1) * 2) ; i++) { float r = norm(gen); printf("%d %f %a\n", i, r, r); } } Build output: $ g++-12.2 -Wall -Wextra -o normdist2 normdist2.cpp -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations -fsanitize=undefined $ echo $? 0 Actual outputs (excerpts only!): min: -340282346638528859811704183484516925440.00 -0x1.fep+127 max: +340282346638528859811704183484516925440.00 0x1.fep+127 G 0x100 G 0x0ff 30 -8.157336 -0x1.0508e6p+3 31 0.00 0x0p+0 G 0x101 G 0x0ff 32 -8.157336 -0x1.0508e6p+3 33 0.00 0x0p+0 G 0x0ff G 0x100 40 0.00 0x0p+0 41 -8.157336 -0x1.0508e6p+3 G 0x102 G 0x100 42 0.00 0x0p+0 43 7.985583 0x1.ff13ccp+2 G 0x0ff G 0x101 48 0.00 0x0p+0 49 -8.157336 -0x1.0508e6p+3 G 0x102 G 0x101 50 0.00 0x0p+0 51 7.985583 0x1.ff13ccp+2 G 0x100 G 0x102 58 7.985583 0x1.ff13ccp+2 59 0.00 0x0p+0 Expected outputs (excerpt only!): min: -8.157336 -0x1.0508e6p+3 max: +7.985583 0x1.ff13ccp+2