Michael Pearmain wrote on 12/20/2011 05:21:42 AM:

> Hi All,
> 
> I want to convert a ragged list of values into a structured matrix for
> further analysis later on. I have a solution to this problem (below), but
> I'm dealing with datasets up to 1GB in size (I have 24GB of memory, so I
> can load them), and the code takes a LONG time to run on a large dataset.
> I was wondering if anyone had any tips or tricks that might make this run
> faster?
> 
> Below is some sample code showing what I've been doing (in the full
> version I use snowfall to spread the work via sfSapply).
> 
> bhvs <- c(1,2,3,4,5,6)
> ragged.list <- list('23' = c(13,4,5,6,3,65,67,2),
>                     '34' = c(1,2,3,4,56,7,8),
>                     '45' = c(5,6,89,87,56))
> 
> # Define the matrix to store results
> cluster.data <- as.data.frame(matrix(0, nrow = length(ragged.list),
>                                      ncol = length(bhvs)))
> # Keep the names of the bhvs,
> names(cluster.data) <- bhvs
> cluster.data <- t(sapply(rep(1:length(ragged.list)), function (i) {
>   cluster.data[i,] <- as.numeric(names(cluster.data) %in%
>                                  (ragged.list[[i]][]))
>   return(cluster.data[i,])
> }))
> cluster.data <- matrix(unlist(cluster.data),
>                        ncol = ncol(cluster.data),
>                        dimnames = list(NULL, colnames(cluster.data)))
> > cluster.data
>      1 2 3 4 5 6
> [1,] 0 1 1 1 1 1
> [2,] 1 1 1 1 0 0
> [3,] 0 0 0 0 1 1
> >
> 
> The returned matrix is as I want it, with the bhvs as the colnames and a
> binary value in each row indicating whether each bhv was present in that
> list.
> 
> Many thanks in advance
> 
> Mike


Try this:

cluster.data <- 1*t(sapply(ragged.list, function(x) bhvs %in% x))
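
If you also want the bhvs as column names, as in your printed output, a variant along these lines might work. It is only a sketch: it swaps sapply() for vapply(), which checks that every element returns an integer vector of length(bhvs), and it reuses the sample data from your message. I have not timed it on anything large, so treat any speed benefit as an assumption to verify.

```r
# Sample data copied from the original message
bhvs <- c(1, 2, 3, 4, 5, 6)
ragged.list <- list('23' = c(13, 4, 5, 6, 3, 65, 67, 2),
                    '34' = c(1, 2, 3, 4, 56, 7, 8),
                    '45' = c(5, 6, 89, 87, 56))

# For each list element, flag which bhvs occur in it (1 = present, 0 = absent).
# vapply() returns one column per list element, so transpose to get one row each.
cluster.data <- t(vapply(ragged.list,
                         function(x) as.integer(bhvs %in% x),
                         integer(length(bhvs))))

# Label the columns with the bhvs
colnames(cluster.data) <- bhvs

cluster.data
```

Unlike your original result, the rows here keep the names of ragged.list ('23', '34', '45'); use rownames(cluster.data) <- NULL if you would rather drop them.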

Jean

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
