I don’t think there is any guarantee that unrepresentable numbers are parsed 
into defined patterns, because printing is done by the OS while parsing is done 
by R. The way R parses decimal numbers[1] is simply by using the obvious res = 
res * 10 + digit and it can be easily checked that for doubles the 
representation such obtained from 10000000000000000905969664 is 
0x1.08b2a2c280292p+83 (see below if you want to see it yourself) which is not 
the same as 10^25 which is 0x1.08b2a2c280291p+83. This is true on all 
platforms, it is not specific to M1. The only difference is if your were to use 
a different type you can obtain a different result - and that is not 
well-defined (e.g. long doubles have no guarantees at all as of the precision). 
 Note that the decimal string above would require 83-bits of precision which is 
not representable.

(BTW: to make it even more fun, if you were to use double res = 1; repeat(25) 
res = res * 10; in C, so the naive computation of the original 10^25 you’d get 
9999999999999998758486016 and 0x1.08b2a2c28029p+83)

Given that printing is done by the OS and parsing by R, I don’t think R 
guarantees anything. If you want representable number you’d use the binary 
representation (sprintf(“%a”) or hex-mode deparse as noted). One could argue 
that it could make sense to change it one way or another - either having R do 
it all or having the OS do it all. In the latter case one may obtain more 
consistent results (e.g. system stdtod() yields the original value even on M1), 
but it would be OS-specific. In the former R could impose its own guarantees - 
but currently it does not.

Cheers,
Simon

[1] - 
https://github.com/r-devel/r-svn/blob/97c0a73f1758d09088c200f924d27b362d55ccdc/src/main/util.c#L2094


#include <stdio.h>
#include <math.h>
#include <stdlib.h>

int main() {
  const char *str = "10000000000000000905969664", *c = str;
  double ans = 0;
  while (*c) {
    ans = 10 * ans + (*(c++) - '0');
    printf("%a\n", ans);
  }
  printf("atof: %a\n", atof(str));
  double pow1025 = pow(10.0, 25);
  printf("--\n10^25:\n%25.f\n%a\n", pow1025, pow1025);
  return 0;
}

0x1p+0
0x1.4p+3
0x1.9p+6
0x1.f4p+9
0x1.388p+13
0x1.86ap+16
0x1.e848p+19
0x1.312dp+23
0x1.7d784p+26
0x1.dcd65p+29
0x1.2a05f2p+33
0x1.74876e8p+36
0x1.d1a94a2p+39
0x1.2309ce54p+43
0x1.6bcc41e9p+46
0x1.c6bf52634p+49
0x1.1c37937e08p+53
0x1.6345785d8a001p+56
0x1.bc16d674ec801p+59
0x1.158e460913d01p+63
0x1.5af1d78b58c41p+66
0x1.b1ae4d6e2ef51p+69
0x1.0f0cf064dd593p+73
0x1.52d02c7e14af8p+76
0x1.a784379d99db6p+79
0x1.08b2a2c280292p+83
atof: 0x1.08b2a2c280291p+83
--
10^25:
10000000000000000905969664
0x1.08b2a2c280291p+83


> On 11/07/2022, at 02:00, Antoine Fabri <antoine.fa...@gmail.com> wrote:
> 
> Dear r-devel,
> 
> For some numbers, the printed value is not equivalent to the input :
> 
> options(scipen = 999)
> ## GOOD
> 1e24
> #> [1]  999999999999999983222784
> 1e24 == 999999999999999983222784
> #> [1] TRUE
> 
> ## BAD
> 1e25
> #> [1] 10000000000000000905969664
> 1e25 == 10000000000000000905969664
> #> [1] FALSE
> 
> ## STILL BAD
> 10000000000000000905969664
> #> [1] 10000000000000003053453312
> 
> ## GOOD AGAIN
> 10000000000000003053453312
> #> [1] 10000000000000003053453312
> 
> # Additionally
> 10000000000000000000000000 == 1e25
> #> [1] FALSE
> 
> Are these bugs ?
> 
>       [[alternative HTML version deleted]]
> 
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to