[R] Reshaping matrix of lists as dataframe

Oliver Gondring Mon, 01 Feb 2010 00:00:07 -0800

Hello William, hello David,

thanks a lot for helping and keeping me going on what sometimes seems to be a long way to R mastery! :)

I found that the two solutions William proposed are in fact easier for me to understand at the moment than David's (and have the additional advantage of producing the desired data types ('numeric'/'integer') in columns 2-5). However, I think all of the code you provided will be extremely helpful for learning some new tricks by analyzing it in detail.

For everyone concerned with similar data manipulation tasks, here's a short summary of the thread:

>>> The original data (a matrix of _lists_, of course; mea culpa, hence the modified name of the thread):


x <- list(c(1,2,4),c(1,3,5),c(0,1,0),
         c(1,3,6,5),c(3,4,4,4),c(0,1,0,1),
         c(3,7),c(1,2),c(0,1))
data <- matrix(x,byrow=TRUE,nrow=3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

> data
     First     Length    Value
Case1 Numeric,3 Numeric,3 Numeric,3
Case2 Numeric,4 Numeric,4 Numeric,4
Case3 Numeric,2 Numeric,2 Numeric,2
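
As the printout suggests, each cell of this matrix is itself a numeric vector. A minimal sketch of how a single cell or a whole row can be inspected (the name `cell` is only illustrative and not part of the thread):

```r
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# '[' on a matrix of lists returns a list of length 1;
# '[[1]]' then extracts the numeric vector stored in that cell
cell <- data["Case2", "First"][[1]]
print(cell)

# A whole row is a named list with one vector per column
str(data["Case2", ])
```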


>>> The desired output (a dataframe of a database-like 'flat' structure):

>      Case Sequence First Length Value
>   1 Case1        1     1      1     0
>   2 Case1        2     2      3     1
>   3 Case1        3     4      5     0
>   4 Case2        1     1      3     0
>   5 Case2        2     3      4     1
>   6 Case2        3     6      4     0
>   7 Case2        4     5      4     1
>   8 Case3        1     3      1     0
>   9 Case3        2     7      2     1


>>> Ways to do it:

(1)
>  lengths<-sapply(data[,1],length)
>  data.frame(Case=rep(rownames(data),lengths),
>             Sequence=sequence(lengths),apply(data,2,unlist),
>             row.names=NULL)

> It assumes that sapply(data[,k],length) is the
> same for all k in 1:ncol(data).

Which, as you inferred correctly from the given example dataset (because I forgot to mention it explicitly), is always the case.
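
For data where this is less certain, the assumption could be checked before reshaping. A minimal sketch on the example data; the names `len_mat`, `n` and `flat` are my own and not from William's code:

```r
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# Length of every cell, arranged like the original matrix
len_mat <- matrix(sapply(data, length), nrow = nrow(data),
                  dimnames = dimnames(data))

# Within each row, every column must hold a vector of the same length
stopifnot(all(len_mat == len_mat[, 1]))

# Solution (1), using the checked lengths
n <- len_mat[, 1]
flat <- data.frame(Case = rep(rownames(data), n),
                   Sequence = sequence(n),
                   apply(data, 2, unlist),
                   row.names = NULL)
str(flat)
```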


(2)
> data.frame(Case=rep(rownames(data),lengths),
            Sequence=sequence(lengths),
            lapply(split(data,colnames(data)[col(data)]), unlist),
            row.names=NULL)
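
To see what the split() step in (2) does on its own, here is a small sketch on the example data. The name `cols` is only illustrative. Because split() groups by a factor whose levels are sorted, the columns come back in alphabetical order; that happens to match the original order here, and the last line shows one way to restore the original order in general:

```r
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# col(data) gives the column index of every cell, so this labels
# each cell with the name of the column it belongs to
labels <- colnames(data)[col(data)]

# One list of cells per column name, each flattened into a single vector
cols <- lapply(split(data, labels), unlist)

# Put the columns back into the order of the original matrix
cols <- cols[colnames(data)]
str(cols)
```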

(3)

(David's code with some additions to produce nearly the same output as (1) and (2))

(however, there's still one difference: columns 2-5 are 'factors')
> result <- data.frame(do.call(rbind,
        sapply(rownames(data),  function(.x) cbind(.x,
        # those were the rownames
        cbind(1:length(data[.x, "First"][[1]]),
        # and that was the incremental counter
        sapply(data[.x, ],
        # and finally the values, which unfortunately get turned into characters
        function(.y) return(.y ) ) ) )  )))
> colnames(result)[1:2] <- c("Case","Sequence")
> result
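
The remaining type difference in (3) could be removed by converting the value columns afterwards. The following is only a sketch of a variant of this approach, not David's original code: it uses lapply() and do.call(cbind, ...) instead of the nested sapply() calls, and the names `pieces` and `vals` are my own. Going through as.character() first makes the conversion work whether the columns arrive as factors or (as with the default in R >= 4.0.0) as character vectors:

```r
x <- list(c(1,2,4), c(1,3,5), c(0,1,0),
          c(1,3,6,5), c(3,4,4,4), c(0,1,0,1),
          c(3,7), c(1,2), c(0,1))
data <- matrix(x, byrow = TRUE, nrow = 3)
colnames(data) <- c("First", "Length", "Value")
rownames(data) <- c("Case1", "Case2", "Case3")

# One character matrix per case: row name, counter, then the values.
# Binding a character column to numbers turns everything into characters.
pieces <- lapply(rownames(data), function(.x) {
  vals <- do.call(cbind, data[.x, ])
  cbind(Case = .x, Sequence = seq_len(nrow(vals)), vals)
})
result <- data.frame(do.call(rbind, pieces), stringsAsFactors = FALSE)

# Convert every column except 'Case' back to numeric
result[-1] <- lapply(result[-1], function(col) as.numeric(as.character(col)))
str(result)
```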

Cheers,
Oliver

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.