Currently, sample() applied to a data.frame returns a sample of its columns:

e.g. sample(data.frame(a=1,b=2:3,c=4),2) might return data.frame(b=2:3,c=c(4,4))
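
The reason is that a data.frame is a list of columns, so sample() indexes it by column. A minimal check of that behavior follows; the seed is arbitrary and only there to make the run repeatable, and which two columns come back depends on it:

```r
set.seed(1)  # arbitrary seed, only for repeatability
df <- data.frame(a = 1, b = 2:3, c = 4)

length(df)           # 3: the length of a data.frame is its number of columns
s <- sample(df, 2)   # equivalent to df[sample.int(length(df), 2)]

ncol(s)              # 2: two columns were drawn
nrow(s)              # 2: every row is kept
all(names(s) %in% names(df))  # TRUE: the result is a subset of the columns
```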

I'd have thought it would be much more common to want a sample of the rows.

It's easy enough to define an appropriate function for this:

sample.data.frame <- function(x,size,replace=FALSE,prob=NULL)
  # no auto-dispatch; sample is not a generic function
  {
    # draw row indices; drop=FALSE keeps a one-column x a data.frame
    x[sample.int(nrow(x),size,replace,prob),,drop=FALSE]
  }
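
A quick usage sketch of the row-sampling idea. The name sample_rows is hypothetical, chosen so the example stands alone and does not mask anything in base R; it is an illustration under those assumptions, not a proposed final interface:

```r
# Hypothetical helper: draw rows rather than columns.
sample_rows <- function(x, size, replace = FALSE, prob = NULL) {
  # sample.int() draws row indices; drop = FALSE keeps the result a
  # data.frame even when x has a single column.
  x[sample.int(nrow(x), size, replace, prob), , drop = FALSE]
}

set.seed(1)  # arbitrary seed, only for repeatability
df <- data.frame(a = 1, b = 2:3, c = 4)

r1 <- sample_rows(df, 1)                  # one row, all three columns
r5 <- sample_rows(df, 5, replace = TRUE)  # more rows than df has

dim(r1)  # 1 3
dim(r5)  # 5 3
```

Since sample is not generic, a function like this has to be called explicitly by name.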

Would it be a bad idea for this to be the standard behavior for sample?

There is always, of course, the backwards-compatibility argument.  Is sample
in fact used in practice to select random columns?  I realize it is hard to
quantify that, but perhaps there is some wisdom in the community on this.

            -s


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
