On 02/01/2009 10:07 AM, Wacek Kusnierczyk wrote:
Stavros Macrakis wrote:
On Wed, Dec 31, 2008 at 12:44 PM, Guillaume Chapron
<carnivorescie...@gmail.com> wrote:
m[-sample(which(m[,1]<8 & m[,2]>12),2),]
Suppose I sample only one row among the ones matching my criteria. Then
consider the case where there is just one row matching these criteria. Sure,
there is no need to sample, but the instruction would still be executed.
Then if this row index is 15, my instruction becomes sample(15,1), and this
can give me any row from 1 to 15, which is not correct. I have to add a
condition for the case where only one row matches the criteria.
Yes, this is a (documented!) design flaw in 'sample' -- see the man page.

For some reason, the designers of R have chosen to document the flaw
and leave it up to individual users to work around it rather than fix
it definitively.  A related case is sample(c(),0), which gives an
error rather than giving an empty vector, though in general R deals
with empty vectors correctly (e.g. sum(c()) => 0).
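
One possible workaround, sketched here in the spirit of the 'resample()' example that ?sample refers to, is to sample positions rather than values. The name safe_sample is hypothetical, and the sketch assumes an R version that provides sample.int() and accepts a zero-length sample, which may not hold for the version discussed in this thread.

```r
# Hypothetical helper (the name is illustrative, not part of base R):
# sample positions 1..length(x) and index into x, so that a length-1
# numeric x is never expanded to 1:x.
safe_sample <- function(x, size = length(x), replace = FALSE) {
  x[sample.int(length(x), size = size, replace = replace)]
}

idx <- 15                            # a single matching row index
picked <- safe_sample(idx, 1)        # always 15, never a value from 1:15
empty <- safe_sample(integer(0), 0)  # an empty vector rather than an error
```

With a helper like this, the row selection above could be written as m[-safe_sample(which(m[,1]<8 & m[,2]>12), 1),] without a special case for a single match.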


interestingly, ?sample says:

"
     'sample' takes a sample of the specified size from the elements of
     'x' using either with or without replacement.

       x: Either a (numeric, complex, character or logical) vector of
          more than one element from which to choose, or a positive
          integer.

     If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
     'x >= 1', sampling takes place from '1:x'.  _Note_ that this
     convenience feature may lead to undesired behaviour when 'x' is of
     varying length 'sample(x)'.  See the 'resample()' example below.

"

yet the following works, even though x has length 1 and is *not* numeric:

x = "foolme"
is.numeric(x)
sample(x, 1)
sample(x)

x = NA
is.numeric(NA)
sample(x, 1)
sample(x)

is this a bug in the code, or a bug in the documentation?
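
For contrast, a small sketch of the two cases the quoted text distinguishes. The variable names are illustrative, and the results described are what I would expect from the documented rule, not a statement about which of the two (code or documentation) is at fault.

```r
x_num <- 5          # length 1, numeric, >= 1
x_chr <- "foolme"   # length 1, not numeric

perm <- sample(x_num)  # treated as 1:5, so a permutation of 1:5
same <- sample(x_chr)  # sampled from the single element itself
```

Here only the numeric case is expanded to '1:x'. The non-numeric case returns its single element, which fits the "If 'x' has length 1, is numeric ..." paragraph but sits oddly with the "more than one element" wording in the description of 'x'.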



To my mind, it is bizarre to have an important basic function which
works for some argument lengths but not others.  The convenience of
being able to write sample(5,2) for sample(1:5,2) hardly seems worth
inflicting inconsistency on all users -- but perhaps one of the
designers of R/S can enlighten us on the design rationale here.


hopefully.

This is more of an R-devel sort of question. My guess is that this is in the S blue book, but I don't have a copy here to check.

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
