Dear Chuck, John, Vikas, and useRs, thank you very much for your great suggestions.
I received three replies providing different ways to reshape my original data.frame (original question at the bottom). There are, however, some discrepancies in their results (most likely because I didn't explain my question clearly enough), so I think it is better to discuss them a little bit. I have prepared a slightly simplified version of the data set to facilitate comparisons. Please see below.

Cheers!

Ahimsa

#=============================================================

# data in its original shape:

indiv <- rep(c("A","B"),c(3,3))
level.1 <- c(7, 5, 1, 2, 5, 3)   # comes from <- rpois(6,lambda=3)
covar.1 <- c(26.4, 48.9, 62.7, 135.3, 40.1, 17.4)   # comes from <- rlnorm(6,3,1)
level.2 <- c(5, 2, 1, 3, 6, 0)   # comes from <- rpois(6,lambda=3)
covar.2 <- c(4.5, 58.6, 6.4, 47.2, 16.9, 59.4)   # comes from <- rlnorm(6,3,1)
my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)

# level.1 and level.2 are levels from a common factor and their values represent the number
# of replicates for that combination of factor level and value of the covariate (covar.1 for
# level.1 and covar.2 for level.2 cases).
# I want each replicate to be represented as a row,
# and therefore the number of rows in the new data frame should be:

sum(level.1) + sum(level.2)
# [1] 40

# and the number of rows from individual A is:

sum(my.dat[my.dat$indiv=="A",c(2,4)])
# [1] 21

# solution 1: ========================================

long <- reshape(my.dat,
                varying = list(c("level.1","level.2"),
                               c("covar.1","covar.2")),
                timevar="level", idvar="case.id",
                v.names=c("ncases","covar"),
                direction="long")

newdf <- with(long, data.frame(indiv   = rep(indiv,   ncases),
                               level   = rep(level,   ncases),
                               covar   = rep(covar,   ncases),
                               case.id = rep(case.id, ncases)))

summary(newdf)

# we have 40 cases (rows), of which 21 belong to indiv A;
# this provides exactly what I was looking for

# solution 2: ========================================

fact1 <- rep("level.1", length(my.dat[,1]))
fact2 <- rep("level.2", length(my.dat[,1]))
lels <- c(fact1,fact2)
nams <- c("indiv", "case.id", "covar")
set1 <- my.dat[, c(1,2,3)] ; names(set1) <- nams
set2 <- my.dat[, c(1,4,5)] ; names(set2) <- nams
newdata <- cbind(lels, rbind(set1,set2))
mydata <- rbind(newdata[, c(2,1,4,3)], newdata[, c(2,1,4,3)])
names(mydata) <- c("indiv", "factor", "covar", "caseid")
mydata[order(mydata$indiv, mydata$caseid, mydata$factor),]

summary(mydata)
head(mydata)

# this is not exactly what I meant:
# it provides 24 rows, half of them from indiv A
# caseid has inherited the values from level.1 and level.2
# up to newdata the process is correct, but the next step
# duplicates newdata, giving 24 rows in which each row appears twice
# what we actually wanted was to create as many replicates of each case (row)
# as the value of that case's level.x (now renamed as case.id).
# we should do as in the second step of solution 1:
# with(newdata, data.frame(lels=rep(lels,case.id),....
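A minimal sketch of how the last step of solution 2 could be completed, assuming the goal is one row per replicate. It rebuilds the small example so that it runs on its own, stacks the two level/covar pairs the way solution 2 does up to newdata, and then repeats each stacked row by its count. The column names ncases and replicate, and the objects idx and expanded, are illustrative names and not part of the original reply.

```r
# Rebuild the small example so the block runs on its own.
indiv   <- rep(c("A", "B"), c(3, 3))
level.1 <- c(7, 5, 1, 2, 5, 3)
covar.1 <- c(26.4, 48.9, 62.7, 135.3, 40.1, 17.4)
level.2 <- c(5, 2, 1, 3, 6, 0)
covar.2 <- c(4.5, 58.6, 6.4, 47.2, 16.9, 59.4)
my.dat  <- data.frame(indiv, level.1, covar.1, level.2, covar.2)

# Stack the two level/covar pairs, as solution 2 does up to newdata.
# "ncases" (assumed name) holds the replicate counts from level.1 / level.2.
nams <- c("indiv", "ncases", "covar")
set1 <- my.dat[, c(1, 2, 3)]; names(set1) <- nams
set2 <- my.dat[, c(1, 4, 5)]; names(set2) <- nams
newdata <- cbind(factor = rep(c("level.1", "level.2"), each = nrow(my.dat)),
                 rbind(set1, set2))

# Repeat each row index ncases times; rows with ncases == 0 drop out.
idx <- rep(seq_len(nrow(newdata)), newdata$ncases)
expanded <- newdata[idx, c("indiv", "factor", "covar")]

# Number the replicates 1..n within each stacked row (assumed column name).
expanded$replicate <- sequence(newdata$ncases)
rownames(expanded) <- NULL

nrow(expanded)          # compare with sum(level.1) + sum(level.2)
table(expanded$indiv)   # compare with the per-individual totals
```

Because each covariate travels with its own count in the stacked rows, the pairing between level.x and covar.x is preserved. The two counts printed at the end can be checked against the totals computed from level.1 and level.2 for the example data.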
# solution 3: ========================================

library(reshape)
melt(my.dat,id=c("indiv","covar.1","covar.2"))->my.dat.1
names(my.dat.1)[4:5]<-c("level","case.id")
melt(my.dat.1,id=c("indiv","level","case.id"))->my.dat.2

summary(my.dat.2)

# this is by far the most elegant solution
# unfortunately it provides similar results to solution 2 and
# it has another issue: it alters the relationship
# between factor (level.1 and level.2) and covar (covar.1 and covar.2)

# so solution 1 is the adequate one!

# Thanks a lot for the three stimulating solutions. And apologies for not
# explaining the case more clearly.

# Cheers!

##

Dear all,

I'm having a few problems trying to reshape a data frame. I tried with reshape{stats} and melt{reshape} but I was missing something. Any help is very welcome. Please find details below:

#################################
# data in its original shape:

indiv <- rep(c("A","B"),c(10,10))
level.1 <- rpois(20, lambda=3)
covar.1 <- rlnorm(20, 3, 1)
level.2 <- rpois(20, lambda=3)
covar.2 <- rlnorm(20, 3, 1)
my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)

# the values of level.1 and level.2 represent the number of cases for the particular
# combination of indiv*level*covar value

# I would like to do two things:
# 1. reshape to long, reducing my.dat[,2:5] into two columns: "factor" (levels = level.1 & level.2)
#    and the covariate
# 2. create one new row for each case in level.1 and level.2

# the new reshaped data.frame should look like this:

# indiv  factor    covar      case.id
# A      level.1    4.614105  1
# A      level.1    4.614105  2
# A      level.2   31.064405  1
# A      level.2   31.064405  2
# A      level.2   31.064405  3
# A      level.2   31.064405  4
# A      level.1   19.185784  1
# A      level.2   48.455929  1
# A      level.2   48.455929  2
# A      level.2   48.455929  3
# etc...
############################

--
ahimsa campos-arceiz
www.camposarceiz.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help