Ben Bolker wrote:
Lo, Ken <ken.lo <at> roche.com> writes:

Hi all,

I am running into a snag using quantile function in stats.  Basically, I
don't understand why the loop below throws the error that it does.

test.data <- rnorm(1000, 0, 1)

for (i in seq(0.00001, 0.001, 0.00001)){
        test <- quantile(test.data, probs=seq(0,1,i));
        print(i);
}

It runs fine from 1e-05 to 0.00024, but then throws the error

Error in quantile.default(test.data, probs=seq(0,1,i)):
'probs' outside [0,1]

I tested it some more by using

test <- quantile(test.data, probs=seq(0,1,0.00024));
test <- quantile(test.data, probs=seq(0,1,0.00025));

both ran fine.  So, I'm baffled as to what the error actually is.


 This is a result of numerical imprecision
(see FAQ 7.31), but I think it's not your fault.

set.seed(1001) ## for reproducibility, although not
 ## really necessary here

test.data <- rnorm(1000,0,1)
for (i in seq(0.00001, 0.001, 0.00001)){
        test <- quantile(test.data, probs=seq(0,1,i))
        print(i)
}

fails as reported
let's see what's going on --

zz <- seq(1e-5,1e-3,1e-5)
test <- quantile(test.data, probs=seq(0,1,2.5e-4)) ## OK
test <- quantile(test.data, probs=seq(0,1,zz[25])) ## not OK

zz[25]-2.5e-4
[1] 5.421011e-20

so let's look closer ...


zzz <- range(seq(0,1,2.5e-4))
zzz
zzz-c(0,1)

zzz2 <- range(seq(0,1,zz[25]))
zzz2
zzz2-c(0,1)
[1] 0.000000e+00 2.220446e-16

so the results of the seq() call exceed 1 by a little bit.

I think this may actually count as a bug in R (congratulations!)
the documentation for ?seq says.

##     The second form generates 'from, from+by', ..., up to the sequence
##     value LESS THAN OR EQUAL TO 'TO'.  Specifying 'to - from' and 'by'
##     of opposite signs is an error.
[emphasis added]

1 + 2.220446e-16 (which is the maximum of the
sequence that 'seq' generates) IS NOT <= 1 ...

In the meantime, here's a workaround ...

for (i in seq(0.00001, 0.001, 0.00001)){
  test <- quantile(test.data, probs=pmin(seq(0,1,i),1))
  print(i)
}

  If I don't hear otherwise I will submit this as
a bug to r-devel  ...
It IS deliberate.... We discussed this when R was still in the toddler stages.

If you don't allow slight overruns like that, you end up with sequences sometimes being one element short, so seq() has a small "do what I mean" fudge factor built in. We might polish off the end value though, or fix quantile() to be less hysterical about values slightly out of range.

--
  O__  ---- Peter Dalgaard             Ă˜ster Farimagsgade 5, Entr.B
 c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])              FAX: (+45) 35327907

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to