On 07-Jan-10 12:31:42, Ulrich Keller wrote:
> I have encountered a strange behaviour of as.integer() which
> does not seem correct to me. Sorry if this is just an indication
> of me not understanding floating point arithmetic.

I'm afraid it probably is -- but being aware of what the problem
is, is 0.875 of solving it (sticking to binary-compatible
fractions)! See below.

>> .57 * 100
> [1] 57
>> .29 * 100
> [1] 29

So it seems, but:

  57 - .57 * 100
  # [1] 7.105427e-15
  (.57 * 100 < 57)
  # [1] TRUE

So things are not what they seem. Now:

> So far, so good. But:
>
>> as.integer(.57 * 100)
> [1] 56
>> as.integer(.29 * 100)
> [1] 28

But if you look at ?as.integer you see:

  "Non-integral numeric values are truncated towards zero
   (i.e., 'as.integer(x)' equals 'trunc(x)' there)"

so since .57 * 100 is stored as the equivalent of 56.999<something>,
its fractional part is discarded, resulting in 56.

> Then again:
>
>> all.equal(.57 * 100, as.integer(57))
> [1] TRUE
>> all.equal(.29 * 100, as.integer(29))
> [1] TRUE

And now you should also read ?all.equal:

  "'all.equal(x,y)' is a utility to compare R objects 'x' and 'y'
   testing 'near equality'. [...]
   Usage: [...]
     all.equal(target, current,
               tolerance = .Machine$double.eps ^ 0.5,
               scale = NULL, check.attributes = TRUE, ...)
   [...]
   tolerance: numeric >= 0. Differences smaller than 'tolerance'
     are not considered."

Now, on my R,

  .Machine$double.eps ^ 0.5
  # [1] 1.490116e-08

whereas (see above) (57 - .57 * 100) = 7.105427e-15, which is
smaller than .Machine$double.eps ^ 0.5, so all.equal() does not
count it as a difference.

> This behaviour is the same in R 2.10.1 (Ubuntu and Windows) and 2.9.2
> (Windows), all 32 bit versions. Is this really intended?

Yes! And, as you suspect, it is all down to the binary representation
of fractional numbers input as decimal. There is no finite-length
binary fraction which is exactly equal to 0.57[decimal].

If there were, then for some power k of 2, (2^k)*0.57 would be an
exact integer. You can easily verify that this is not the case.
Just keep doubling 0.57: the series starts as

  0.57 1.14 2.28 4.56 9.12 18.24 ...

and finally, at the 23rd position, you get 2390753.28 and you are
now back at the "****.28" fractional part (as at position 3 above).
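
The doubling can also be checked without any floating point at all. What follows is only a minimal sketch, assuming we track just the two digits after the decimal point as a whole number of hundredths (0.57 = 57/100); the helper name frac_hundredths is mine, purely for illustration, and is not part of R.

```r
# Exact check of the doubling argument, using integers only.
# The fractional part of 0.57 * 2^k is ((57 * 2^k) mod 100) / 100.
# Doubling modulo 100 keeps every intermediate value below 200,
# so nothing is ever rounded.
frac_hundredths <- function(n_doublings, start = 57L) {
  out <- integer(n_doublings + 1L)
  out[1L] <- start %% 100L
  for (k in seq_len(n_doublings)) {
    out[k + 1L] <- (2L * out[k]) %% 100L
  }
  out
}

fr <- frac_hundredths(22L)  # positions 1 to 23 of the series
fr[3]                       # 28, i.e. the ".28" at position 3
fr[23]                      # 28 again at position 23
any(fr == 0L)               # FALSE: no position is an exact integer
```

Since each fractional part depends only on the previous one, seeing 28 a second time is enough to know that the same values repeat from there on.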
Hence the fractional parts will cycle through .28, .56, .12, ...
forever, so there is no exact binary representation of 0.57.

To be absolutely sure of it, you should do it by hand on paper
(lest you tickle rounding errors in R), but R will in fact give
you the sequence:

  0.57*2^(0:23)
  #  [1]       0.57       1.14     *2.28*        4.56
  #  [5]       9.12      18.24      36.48       72.96
  #  [9]     145.92     291.84     583.68     1167.36
  # [13]    2334.72    4669.44    9338.88    18677.76
  # [17]   37355.52   74711.04  149422.08   298844.16
  # [21]  597688.32 1195376.64 *2390753.28* 4781506.56

Hoping this helps!
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 07-Jan-10                                       Time: 13:32:31
------------------------------ XFMail ------------------------------
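
P.S. A practical footnote, offered as a sketch rather than a prescription: if what you want is the nearest integer and not truncation, rounding before converting is the usual remedy. The variable name x below is just for illustration.

```r
x <- 0.57 * 100

sprintf("%.20f", x)        # prints more digits than the default display
as.integer(x)              # 56: truncated towards zero
as.integer(round(x))       # 57: rounded to nearest, then converted
x == 57                    # FALSE: exact comparison sees the difference
isTRUE(all.equal(x, 57))   # TRUE: the difference is within tolerance
```

Whether round() is appropriate depends on what the numbers mean; it assumes the value is "supposed" to be a whole number apart from representation error.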