On May 13, 2009, at 9:12 AM, Martial Sankar wrote:


Dear All,

I would like to use the 'split' function on the dataframe elements contained in a list L.

For example :

(df <- data.frame(cbind(c(rep('A',2), rep('B',2)), rep(1:4))))
  X1 X2
1  A  1
2  A  2
3  B  3
4  B  4
(L<-split(df, df$X1))
$A
  X1 X2
1  A  1
2  A  2

$B
  X1 X2
3  B  3
4  B  4

Now, I would like to split EACH data frame, i.e., according to column 2 (X2).

lapply(L, split, df$X2)

$A
$A$`1`
 X1 X2
1  A  1

$A$`2`
 X1 X2
2  A  2

$A$`3`
[1] X1 X2
<0 rows> (or 0-length row.names)

$A$`4`
[1] X1 X2
<0 rows> (or 0-length row.names)


$B
$B$`1`
 X1 X2
3  B  3

$B$`2`
 X1 X2
4  B  4

$B$`3`
[1] X1 X2
<0 rows> (or 0-length row.names)

$B$`4`
[1] X1 X2
<0 rows> (or 0-length row.names)


Warning messages:
1: In split.default(seq_len(nrow(x)), f, drop = drop, ...) :
 data length is not a multiple of split variable
2: In split.default(seq_len(nrow(x)), f, drop = drop, ...) :
 data length is not a multiple of split variable



It works, but it's dirty.
How could I do it properly, without the warnings and without 0-row data frames in the output? I thought that accessing the current element inside 'lapply' to retrieve its column 2 vector would work.
i.e:

lapply(L,split, L[[current]][,2])


Is there a way to do something like that in R?


Thanks in advance !

- Martial

# Split on BOTH columns and drop unused levels
L <- split(df, list(df$X1, df$X2), drop = TRUE)

> L
$A.1
  X1 X2
1  A  1

$A.2
  X1 X2
2  A  2

$B.3
  X1 X2
3  B  3

$B.4
  X1 X2
4  B  4


Is that what you want?
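
If you would rather keep the nested list (one sub-list per level of X1), a minimal sketch along the lines of what you tried is below. It assumes an anonymous function inside lapply(), so that each data frame is split on its own X2 column rather than on the full-length df$X2; that length mismatch is what triggered the recycling warnings. The name `nested` and the simplified construction of `df` are only for illustration.

```r
# Rebuild the example data from the original post (simplified construction).
df <- data.frame(X1 = c("A", "A", "B", "B"), X2 = 1:4)

# First-level split on X1, as in the original post.
L <- split(df, df$X1)

# Split each piece on its OWN X2 column; `d` is the current list element.
# drop = TRUE discards any factor levels that do not occur in that piece.
nested <- lapply(L, function(d) split(d, d$X2, drop = TRUE))

names(nested)    # top-level groups, one per value of X1
names(nested$A)  # X2 values that occur within group A
names(nested$B)  # X2 values that occur within group B
```

Each element is then reachable as, for example, nested$B[["3"]], and no empty data frames are produced.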

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
