Hello William, hello David,

Thanks a lot for your help, and for keeping me going on what sometimes seems to be a long road to R mastery! :)

I found the two solutions William proposed easier to understand at the moment than David's (they also have the additional advantage of producing the desired data types ('numeric'/'integer') in columns 2-5). However, I think all of the code you provided will be extremely helpful for learning some new tricks once I analyze it in detail.

For everyone concerned with similar data manipulation tasks, here's a short summary of the thread:

>>> The original data (a matrix of _lists_, of course - mea culpa - hence the modified name of the thread):

x <- list(c(1,2,4),c(1,3,5),c(0,1,0),
         c(1,3,6,5),c(3,4,4,4),c(0,1,0,1),
         c(3,7),c(1,2),c(0,1))
data <- matrix(x,byrow=TRUE,nrow=3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

> data
     First     Length    Value
Case1 Numeric,3 Numeric,3 Numeric,3
Case2 Numeric,4 Numeric,4 Numeric,4
Case3 Numeric,2 Numeric,2 Numeric,2
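
For anyone unsure what a "matrix of lists" means in practice, here is a minimal sketch that rebuilds the example data and inspects one cell. The variable names `cell` and `first_case2` are my own and only serve the illustration.

```r
# Rebuild the example matrix from this thread.
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

is.matrix(data)   # TRUE: the object has row and column dimensions ...
is.list(data)     # TRUE: ... but every cell is a list element
typeof(data)      # "list"

# Single brackets return a one-element list; [[1]] extracts the vector itself.
cell <- data["Case2", "First"]
first_case2 <- cell[[1]]
first_case2       # 1 3 6 5
```

This is why the printout shows "Numeric,3" and so on instead of values: each cell holds a whole numeric vector.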


>>> The desired output (a dataframe of a database-like 'flat' structure):

>      Case Sequence First Length Value
>   1 Case1        1     1      1     0
>   2 Case1        2     2      3     1
>   3 Case1        3     4      5     0
>   4 Case2        1     1      3     0
>   5 Case2        2     3      4     1
>   6 Case2        3     6      4     0
>   7 Case2        4     5      4     1
>   8 Case3        1     3      1     0
>   9 Case3        2     7      2     1


>>> Ways to do it:

(1)
> lengths <- sapply(data[,1], length)
> data.frame(Case=rep(rownames(data), lengths),
             Sequence=sequence(lengths),
             apply(data, 2, unlist),
             row.names=NULL)

> It assumes that sapply(data[,k],length) is the
> same for all k in 1:ncol(data).

Which, as you correctly inferred from the example dataset (I forgot to mention it explicitly), is always the case.
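
If the data comes from a less trustworthy source, the assumption can be checked before flattening. The following is only a sketch of one possible check; `cell_lengths` and `lengths_agree` are names I made up for it.

```r
# Rebuild the example matrix from this thread.
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# Number of elements in every cell, arranged in the same shape as `data`.
cell_lengths <- matrix(sapply(data, length), nrow = nrow(data),
                       dimnames = dimnames(data))

# Within each row, every column must have as many elements as the first column.
# The first column is recycled down the columns of the matrix, which lines up
# row by row because its length equals nrow(data).
lengths_agree <- all(cell_lengths == cell_lengths[, 1])

# Stop early with an error if the assumption is violated.
stopifnot(lengths_agree)
```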

(2)
> data.frame(Case=rep(rownames(data),lengths),
            Sequence=sequence(lengths),
            lapply(split(data,colnames(data)[col(data)]), unlist),
            row.names=NULL)
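
To see what the split() step in (2) does, it can be taken apart as in the sketch below. The intermediate names `col_of_cell`, `by_column` and `flat` are mine, not William's.

```r
# Rebuild the example matrix from this thread.
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# col(data) holds the column index of every cell; indexing colnames(data)
# with it yields the column name of every cell, in column-major order.
col_of_cell <- colnames(data)[col(data)]

# Group the cells by column name: one list of vectors per column.
by_column <- split(data, col_of_cell)

# Concatenate the vectors within each group into one long vector.
flat <- lapply(by_column, unlist)
flat$First    # 1 2 4 1 3 6 5 3 7
```

One thing to keep in mind: split() orders its groups by factor level, i.e. alphabetically by column name. Here that happens to match the original column order, but with other column names the columns of the resulting data frame may come out in a different order than in `data`.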

(3)
(David's code with some additions to produce nearly the same output as (1) and (2))
(however there's still one difference: columns 2-5 are 'factors')
> result <- data.frame(do.call(rbind,
        sapply(rownames(data),  function(.x) cbind(.x,
        # those were the rownames
        cbind(1:length(data[.x, "First"][[1]]),
        # and that was the incremental counter
        sapply(data[.x, ],
# and finally the values which unfortunately get turned into characters
        function(.y) return(.y ) ) ) )  )))
> colnames(result)[1:2] <- c("Case","Sequence")
> result
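
If numeric columns are needed after (3), the factor columns can be converted afterwards. The sketch below uses a small hand-made stand-in for `result` (only the rows of Case1, built with stringsAsFactors = TRUE so the columns really are factors); `num_cols` is a name I introduced.

```r
# Hypothetical stand-in for `result`: all columns are factors.
result <- data.frame(Case     = c("Case1", "Case1", "Case1"),
                     Sequence = c("1", "2", "3"),
                     First    = c("1", "2", "4"),
                     Length   = c("1", "3", "5"),
                     Value    = c("0", "1", "0"),
                     stringsAsFactors = TRUE)

num_cols <- c("Sequence", "First", "Length", "Value")

# Go through as.character() first: as.numeric() applied directly to a factor
# returns the internal level codes, not the values shown in the printout.
result[num_cols] <- lapply(result[num_cols],
                           function(col) as.numeric(as.character(col)))

result$First   # 1 2 4
```

The same conversion also works if the columns are character rather than factor, since as.character() then leaves them unchanged.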

Cheers,
Oliver

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
