See below. >> From: r-help-boun...@r-project.org >> [mailto:r-help-boun...@r-project.org] On Behalf Of Ulrich Keller >> Sent: Thursday, January 07, 2010 4:32 AM >> To: r-help@r-project.org >> Subject: [R] Strange behaviour of as.integer() >> >> I have encountered a strange behaviour of as.integer() which does not >> seem correct to me. Sorry if this is just an indication of me not >> understanding floating point arithmetic. >> >> > .57 * 100 >> [1] 57 >> > .29 * 100 >> [1] 29 >> >> So far, so good. But: >> >> > as.integer(.57 * 100) >> [1] 56 >> > as.integer(.29 * 100) >> [1] 28 >> >> Then again: >> >> > all.equal(.57 * 100, as.integer(57)) >> [1] TRUE >> > all.equal(.29 * 100, as.integer(29)) >> [1] TRUE >> >> This behaviour is the same in R 2.10.1 (Ubuntu and Windows) and 2.9.2 >> (Windows), all 32 bit versions. Is this really intended?
I would like to add a salutory tail-piece to this correspondence. It is a simple recursive calculation which, after not many steps, R will in almost all cases get badly wrong as a result of the finite binary representation of fractions. (I have posted it before, some years ago, but it may be worth bringing it back since it shows very vividly what can go wrong if you do not pay attention to this aspect of numerical computation). Working in the interval [0,1], X[n+1] is calculated from x[n] according to: if 0 <= X[n] <= 1/2 then X[n+1] = 2*X[n] if 1/2 <= X[n] <= 1 then X[n+1] = 2*(1 - X[n]) X = 2/3 is a fixed point of this, since 2/3 -> 2*(1 - 2/3) = 2/3; X = 2/5 -> 4/5 -> 2*(1 - 4/5) = 2/5 has period 2; X = 2/7 -> 4/7 -> 2*(1 - 4/7) = 6/7 -> 2*(1 - 6/7) = 2/7 (period 3) and so on. All fractions which are multiples of 1/(2^k) for some k>0 eventually end up at 0 and stay there. Irrational numbers, mathematically, never repeat. However, none of the above periodic numbers 2/(2*m + 1) can be represented exactly in a finite binary representation, and that is where the trouble starts. So, in R: nextX <- function(x){if(x <= 0.5) (2*x) else (2*(1 - x))} Now try the fixed point x=2/3: i<-0; x<-2/3; while(x>0){i<-(i+1); x<-nextX(x) ; print(c(i,x))} of which the last few lines are: [1] 46.0000000 0.6640625 [1] 47.000000 0.671875 [1] 48.00000 0.65625 [1] 49.0000 0.6875 [1] 50.000 0.625 [1] 51.00 0.75 [1] 52.0 0.5 [1] 53 1 [1] 54 0 Similarly try any of the other periodic values, e.g. i<-0; x<-2/11; while(x>0){i<-(i+1); x<-nextX(x) ; print(c(i,x))} They will all halt at x=0 after 50-55 iterations. Similarly a non-periodic number such as 1/sqrt(2) or 1/pi will also fail. Thus the results of such calculations will eventually be grossly wrong. So do not try this kind of calculation in R! -- at any rate not without adopting special measures. For instance, it would be possible, for rational x = m/n, to program a function which kept track of M and N in the rational expression M/N of the result after each iteration, by working out what M and N would be. Then, at any iteration, the numerical value of M/N could be computed (to within the precision used by R) and returned. This would still give wrong answers for irrational starting numbers, since they would have to be stored as rational approximations and then would either end up at 0 or be periodic while the mathematical result would never repeat, so eventually they would be arbitrarily far apart (within [0,1]). Ted. -------------------------------------------------------------------- E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 08-Jan-10 Time: 18:13:41 ------------------------------ XFMail ------------------------------ ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.