On May 13, 2009, at 9:12 AM, Martial Sankar wrote:
Dear All,
I would like to use the 'split' function on the dataframe elements
contained in a list L.
For example:
(df <- data.frame(cbind(c(rep('A',2), rep('B',2)), rep(1:4))))
  X1 X2
1  A  1
2  A  2
3  B  3
4  B  4
(L<-split(df, df$X1))
$A
  X1 X2
1  A  1
2  A  2
$B
  X1 X2
3  B  3
4  B  4
Now I would like to split EACH data frame again, i.e., according to
column 2 (X2).
lapply(L, split, df$X2)
$A
$A$`1`
  X1 X2
1  A  1
$A$`2`
  X1 X2
2  A  2
$A$`3`
[1] X1 X2
<0 rows> (or 0-length row.names)
$A$`4`
[1] X1 X2
<0 rows> (or 0-length row.names)
$B
$B$`1`
  X1 X2
3  B  3
$B$`2`
  X1 X2
4  B  4
$B$`3`
[1] X1 X2
<0 rows> (or 0-length row.names)
$B$`4`
[1] X1 X2
<0 rows> (or 0-length row.names)
Warning messages:
1: In split.default(seq_len(nrow(x)), f, drop = drop, ...) :
data length is not a multiple of split variable
2: In split.default(seq_len(nrow(x)), f, drop = drop, ...) :
data length is not a multiple of split variable
It works, but it's dirty.
How could I do it properly, without warnings and without zero-row data
frames in the output?
I thought that accessing the current element inside 'lapply' to retrieve
its column 2 vector would work.
i.e:
lapply(L,split, L[[current]][,2])
Is there a way to do something like that in R?
Thanks in advance!
- Martial
# Split on BOTH columns and drop unused levels
L <- split(df, list(df$X1, df$X2), drop = TRUE)
> L
$A.1
  X1 X2
1  A  1
$A.2
  X1 X2
2  A  2
$B.3
  X1 X2
3  B  3
$B.4
  X1 X2
4  B  4
Is that what you want?
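If the nested list (one level per column) is what is needed, the
'L[[current]]' idea from the question can be written with an anonymous
function. The original call passes the full four-element df$X2 to every
two-row piece, which is where the warnings, the empty data frames, and
the mislabelled $B groups come from. The following is only a sketch: the
name 'nested' and the explicit stringsAsFactors = TRUE (to mimic the
factor columns shown in the question on newer R versions) are
assumptions, not part of the original post.

```r
# Rebuild the example data with the same column names as in the question.
# stringsAsFactors = TRUE is an assumption so that X1 and X2 are factors.
df <- data.frame(X1 = rep(c("A", "B"), each = 2),
                 X2 = as.character(1:4),
                 stringsAsFactors = TRUE)

# First split on X1, as in the question.
L <- split(df, df$X1)

# Split each piece on its OWN X2 column. The anonymous function receives
# the current list element as `d`, so the grouping vector always has the
# same number of rows as the data frame being split. drop = TRUE discards
# factor levels that do not occur in that piece.
nested <- lapply(L, function(d) split(d, d$X2, drop = TRUE))

names(nested$A)
names(nested$B)
```

Here nested$A should contain only groups `1` and `2`, and nested$B only
groups `3` and `4`, with no warnings.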
HTH,
Marc Schwartz
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help