[Bug libstdc++/110045] New: std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

Bug ID: 110045
   Summary: std::normal_distribution (and likely others)
give wrong min() and max() values
   Product: gcc
   Version: 12.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gccbugs at elkpod dot com
  Target Milestone: ---

>From my (non-expert) reading of the C++ spec (I'm using
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf for the
current 2023-05-10 draft), the min() and max() methods for
std::normal_distribution are returning incorrect values.

"28.5.3.6 Random number distribution requirements" says that "A class D meets
the requirements of a random number distribution if the expressions shown in
Table 97 are valid and have the indicated semantics", and that in Table 97 "x"
is a "(possibly const) value[s] of D", and that "x.min()" "Returns glb", and
that "x.max()" "Returns lub", and that "glb and lub are values of T
respectively corresponding to the greatest lower bound and the least upper
bound on the values potentially returned by d's operator(), as determined by
the current values of d's parameters".

By my reading, this means that if I have "std::normal_distribution norm
{0, 1};", then "norm.max()" should return the least upper bound on the values
potentially returned by norm's operator(). It does not seem to do that. 

For example, the highest value I was able to get norm() to generate was
0x1.ff13ccp+2 (== 7.985583). norm.max() reports 0x1.fep+127 (==
340282346638528859811704183484516925440.00). While the values returned are
technically valid bounds, they do not seem to be the least upper or greatest
lower. I could find no Generator outputs which produced a value of
0x1.fep+127 from std::normal_distribution.

It is possible that my interpretation of the spec is wrong, and that somehow
the min/max values should encompass all possible instantiations of
std::normal_distribution, or some other loophole may exist. I could not find a
better forum to find the answer to that; sorry.

I have looked at the implementation of std::normal_distribution, and written
the following code to generate what I believe to be the actual extreme values
that are "potentially returned by norm's operator()".

Reproduction code:
#include 
#include 
#include 

int32_t offsets[8] = { -1, +0, +0, -1, +0, +1, +1, +0 };
class SimpleGen {
using result_type = uint32_t;
 public:
result_type val, ctr = 0;
static constexpr result_type min() { return 0; }
static constexpr result_type max() { return 0xff; }
result_type operator()()   { val = 0x80 + offsets[(ctr++)%8];
 printf("\tG 0x%06x\n", val);
 return val; }
};

int main(void) {
SimpleGen gen;
std::normal_distribution norm {0, 1};

for (int i = 0; i < 8; i++) {
float r = norm(gen);
printf("%d %f %a\n", i, r, r);
}
printf("\n%f %a\n%f %a\n", norm.min(), norm.min(), norm.max(), norm.max());
}

Build output:
$ g++-12.2 -Wall -Wextra -o normdist2 normdist2.c -fno-strict-aliasing -fwrapv
-fno-aggressive-loop-optimizations -fsanitize=undefined
$ echo $?
0

Output:
G 0x7f
G 0x80
0 0.00 0x0p+0
1 -7.985583 -0x1.ff13ccp+2
G 0x80
G 0x7f
2 -7.985583 -0x1.ff13ccp+2
3 0.00 0x0p+0
G 0x80
G 0x81
4 7.985583 0x1.ff13ccp+2
5 0.00 0x0p+0
G 0x81
G 0x80
6 0.00 0x0p+0
7 7.985583 0x1.ff13ccp+2

-340282346638528859811704183484516925440.00 -0x1.fep+127
340282346638528859811704183484516925440.00 0x1.fep+127

Expected output:
G 0x7f
G 0x80
0 0.00 0x0p+0
1 -7.985583 -0x1.ff13ccp+2
G 0x80
G 0x7f
2 -7.985583 -0x1.ff13ccp+2
3 0.00 0x0p+0
G 0x80
G 0x81
4 7.985583 0x1.ff13ccp+2
5 0.00 0x0p+0
G 0x81
G 0x80
6 0.00 0x0p+0
7 7.985583 0x1.ff13ccp+2

-7.985583 -0x1.ff13ccp+2
7.985583 0x1.ff13ccp+2

$ g++-12.2 -v
Using built-in specs.
COLLECT_GCC=g++-12.2
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-slackware-linux/12.2.0/lto-wrapper
Target: x86_64-slackware-linux
Configured with: ../gcc-12.2.0/configure --prefix=/usr/local
--program-suffix=-12.2 -enable-languages=c,c++,lto --enable-lto
--disable-multilib --with-gnu-ld --enable-threads --verbose
--target=x86_64-slackware-linux --build=x86_64-slackware-linux
--host=x86_64-slackware-linux --enable-tls --with-fpmath=avx
--enable-__cxa_atexit --enable-gnu-indirect-function --enable-bootstrap
--enable-libssp
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 12.2.0 (GCC)

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #2 from Frank J. T. Wojcik  ---
(In reply to Andrew Pinski from comment #1)
> The heavy weight goes to potentially. The way I understand it is not the max
> of what operator() has produced currently but will potentially return in the
> future.

And I would agree with that interpretation. My inquiry is based on the fact
that I can find *no* Generator outputs which produce the currently-given max()
value.

The goal of the demo program was to show how I arrived at my "7.985583" value
for what seems to be the actual maximum value, and that I didn't make up some
arbitrary value.

If it helps, imagine the min/max printf() to appear before any generation has
been done, or even without any generation. I would expect the same result,
modulo order of printouts.

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #4 from Frank J. T. Wojcik  ---
(In reply to Andrew Pinski from comment #3)
> With 24bit precision, maybe it is ~8 standard deviations away from the mean.
> But the generator argument can change for each call though so that does not
> mean the next call to operator() could produce one with more bits ...
> 
> Also the standard says: "as determined by the current values of d's
> parameters"
> 
> The parameters is only mean and standard deviations and not the generator.

I would agree with all of this also, I think. :)

But can you or someone demonstrate *any* generator which produces (e.g.) the
current value of max() for std::normal_distribution {0, 1}? I can find
no generator implementation which does that, and by my reading of the
implementation there cannot be one.

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #6 from Frank J. T. Wojcik  ---
(In reply to Andrew Pinski from comment #5)
> 2 things, I think your result_type in SimpleGen needs to be public.

Sure, I'll change that.

> Second is LLVM's libc++ outputs:
> -inf -inf
> inf inf

I only have clang 6.0 on my system and its outputs are identical to g++'s. And
I do not have access to MSVC, but I believe it also produces similar values.
These data points are why I bring up the possibility that there is some detail
in the specification that I'm missing.

Still, the behavior I'm looking for is useful, and it does seem to be what the
spec is requiring, so I would like a fix if appropriate.

[Bug libstdc++/110045] std::normal_distribution (and likely others) give wrong min() and max() values

2023-05-30 Thread gccbugs at elkpod dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110045

--- Comment #7 from Frank J. T. Wojcik  ---
After playing with this a bit more, I was able to find a Generator which
produces a slightly wider bound, but still nowhere near 0x1.fep+127.
This also includes the requested change to SimpleGen.

#include 
#include 
#include 

#define MAXOFFSET 3

// Return 25-bit values centered around 0x100 (the halfway-point of all
// possible 25-bit outputs). Widening the Generator does not alter the
// program's output.
class SimpleGen {
 public:
using result_type = uint32_t;
result_type val, offset, ctr = 0;
static constexpr result_type min() { return 0; }
static constexpr result_type max() { return 0x1ff; }
result_type operator()()   { offset = (ctr & 1) ?
((ctr / 2) / (2 * MAXOFFSET)) :
((ctr / 2) % (2 * MAXOFFSET));
 val = 0x100 + offset - MAXOFFSET;
 printf("\tG 0x%07x\n", val);
 ++ctr; return val; }
};

int main(void) {
SimpleGen gen;
std::normal_distribution norm {0, 1};

printf("min: %+f %a\nmax: %+f %a\n\n", norm.min(), norm.min(),
norm.max(), norm.max());

for (int i = 0; i < ((2 * MAXOFFSET) * (2 * MAXOFFSET - 1) * 2) ; i++) {
float r = norm(gen);
printf("%d %f %a\n", i, r, r);
}
}

Build output:
$ g++-12.2 -Wall -Wextra -o normdist2 normdist2.cpp -fno-strict-aliasing
-fwrapv -fno-aggressive-loop-optimizations -fsanitize=undefined
$ echo $?
0

Actual outputs (excerpts only!):
min: -340282346638528859811704183484516925440.00 -0x1.fep+127
max: +340282346638528859811704183484516925440.00 0x1.fep+127

G 0x100
G 0x0ff
30 -8.157336 -0x1.0508e6p+3
31 0.00 0x0p+0

G 0x101
G 0x0ff
32 -8.157336 -0x1.0508e6p+3
33 0.00 0x0p+0

G 0x0ff
G 0x100
40 0.00 0x0p+0
41 -8.157336 -0x1.0508e6p+3

G 0x102
G 0x100
42 0.00 0x0p+0
43 7.985583 0x1.ff13ccp+2

G 0x0ff
G 0x101
48 0.00 0x0p+0
49 -8.157336 -0x1.0508e6p+3

G 0x102
G 0x101
50 0.00 0x0p+0
51 7.985583 0x1.ff13ccp+2

G 0x100
G 0x102
58 7.985583 0x1.ff13ccp+2
59 0.00 0x0p+0

Expected outputs (excerpt only!):
min: -8.157336 -0x1.0508e6p+3
max: +7.985583 0x1.ff13ccp+2