Good morning Stavros,

I currently use the following definition in my own environment.

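# Return up to n randomly chosen rows of df (all rows if df has fewer than n).
sample.df <- function (df, n = 3) {
    # drop = FALSE keeps a data frame even when df has a single column
    df[sample(nrow(df), min(nrow(df), n)), , drop = FALSE]
}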

I also added the option of returning n sequential rows, which I used
when examining address files... but I haven't used it in ages :-)
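
A sequential variant might look something like the sketch below. It is
only an illustration, not the original code: the name sample.df.seq and
the choice of a uniformly random start row are assumptions.

```r
# Hypothetical helper (illustrative name): return n consecutive rows of df,
# starting at a randomly chosen position.
sample.df.seq <- function(df, n = 3) {
    rows <- nrow(df)
    n <- min(rows, n)
    if (n < 1) {
        # Nothing to return: empty data frame with the same columns
        return(df[0, , drop = FALSE])
    }
    # Pick a random start so that the block of n rows fits inside df
    start <- sample.int(rows - n + 1, 1)
    df[seq(start, length.out = n), , drop = FALSE]
}
```

For example, sample.df.seq(data.frame(id = 1:10), 4) returns four adjacent
rows, such as ids 3 to 6; the starting row varies from call to call.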

Kind regards,
Sean O'Riordain
Dublin
Ireland

On Fri, Feb 19, 2010 at 9:05 PM, Stavros Macrakis <macra...@alum.mit.edu> wrote:

> Currently, sample of a data.frame is a sample of the columns:
>
> e.g. sample(data.frame(a=1,b=2:3,c=4),2) => data.frame(b=2:3,c=c(4,4))
>
> I'd have thought it would be much more common to want a sample of the rows.
>
> It's easy enough to define an appropriate function for this:
>
> sample.data.frame <- function(x,size,replace=FALSE,prob=NULL)
>  # no auto-dispatch; sample is not a generic function
>  {
>    x[sample(nrow(x),size,replace,prob),]
>  }
>
> Would it be a bad idea for this to be the standard behavior for sample?
>
> There is always, of course, the backwards-compatibility argument.  Is sample
> in fact used in practice to select random columns?  I realize it is hard to
> quantify that, but perhaps there is some wisdom in the community about
> that.
>
>            -s
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
