Re: [R] Complex sort problem

Petr Savicky Mon, 21 May 2012 12:46:05 -0700

On Fri, May 18, 2012 at 09:20:59PM -0400, Axel Urbiz wrote:
[...]
> Petr: I kind of see your line of thought, but still cannot see how it works
> on a specific example like this one.


I did not have email access for the last few days.

The previous suggestion from

  https://stat.ethz.ch/pipermail/r-help/2012-May/313197.html

was meant for the situation where we want to keep the result of
sorting by several variables, so that a subset can later be
sorted by a single variable only. Now I see that every sort is
already by a single variable, so that suggestion does not help
here.

Try the following, which uses the example from your code.
In particular, it uses a matrix (not a data frame) and
there are no duplicates in the data.

  set.seed(1)
 
  dframe <- matrix(runif(250), 50, 5)
 
  ### store sort indexes
 
  sort_matrix <- matrix(ncol = ncol(dframe), nrow = nrow(dframe))
 
  for (i in seq_len(ncol(dframe))) {
    xtemp <- dframe[, i]
    sort_matrix[, i] <- sort.list(xtemp, method = "shell")
  }
 
  ### take a bootstrap sample
 
  nr_samples <- nrow(dframe)
  b.ind <- sample(1:nr_samples, nr_samples*0.5, replace = TRUE)
  freq <- tabulate(b.ind, nbins=nr_samples)
 
  ### create bootstrap sample sorted with respect to an arbitrary variable
 
  var1 <- 1
  ind <- sort_matrix[, var1]
  DF1 <- dframe[ind, ]    # this can be computed in advance (before b.ind)
  NDF1 <- DF1[rep(1:nrow(DF1), times=freq[ind]), ]
 
  ### compare with a straightforward method

  subDF <- dframe[b.ind, ]
  subDF1 <- subDF[order(subDF[, var1]), ]
  identical(NDF1, subDF1)

  [1] TRUE
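
Since the sort indexes do not depend on the bootstrap sample, the
same steps can be wrapped for reuse over several sort variables.
The following is only a sketch: the names "dat", "sort_idx" and
"sorted_bootstrap" are made up for illustration, and it relies, as
above, on the data having no duplicated values within a column.

```r
set.seed(1)
dat <- matrix(runif(250), 50, 5)

# Sort indexes for every column, computed once in advance
sort_idx <- apply(dat, 2, sort.list)

# Hypothetical helper: rows of the bootstrap sample b.ind,
# sorted by column v, without sorting the sample itself
sorted_bootstrap <- function(dat, sort_idx, b.ind, v) {
  freq <- tabulate(b.ind, nbins = nrow(dat))
  ind <- sort_idx[, v]
  # repeat each sorted row index as often as it was drawn
  dat[rep(ind, times = freq[ind]), , drop = FALSE]
}

b.ind <- sample(nrow(dat), nrow(dat) %/% 2, replace = TRUE)

# Check against the straightforward method for each column
all_equal <- TRUE
for (v in seq_len(ncol(dat))) {
  direct <- dat[b.ind, , drop = FALSE]
  direct <- direct[order(direct[, v]), , drop = FALSE]
  fast <- sorted_bootstrap(dat, sort_idx, b.ind, v)
  all_equal <- all_equal && identical(fast, direct)
}
all_equal
```

Only tabulate() and rep() are run per bootstrap sample here; the
calls to sort.list() are done once.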

The main step is that "ind" is used to reorder both the data
and the frequency table. So, they remain consistent, and the
reordered frequencies can be applied to the reordered data.
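
A small example with a single vector may make this easier to follow.
The values and the names "x", "b", "via_freq" and "direct" are
chosen only for illustration.

```r
x <- c(0.9, 0.1, 0.5, 0.3, 0.7)
ind <- sort.list(x)                      # positions of x in increasing order

b <- c(1, 3, 3, 5)                       # a fixed "bootstrap" sample of indexes
freq <- tabulate(b, nbins = length(x))   # how often each index was drawn

x_sorted <- x[ind]                       # data reordered by ind
freq_sorted <- freq[ind]                 # frequencies reordered the same way

# expand the sorted data by the reordered frequencies
via_freq <- rep(x_sorted, times = freq_sorted)

# straightforward method: subset first, then sort
direct <- sort(x[b])

identical(via_freq, direct)
```

Here freq_sorted[k] is the number of times the k-th smallest value
was drawn, so rep() produces the sample already in sorted order.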

Hope this helps.

Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
