Re: [R] reshaping data frame

ahimsa campos-arceiz Fri, 22 Feb 2008 05:17:06 -0800

Dear Chuck, John, Vikas, and useRs,

Thank you very much for your great suggestions.

I received three replies offering different ways to reshape my original
data.frame (original question at the bottom). Their results differ,
however (most likely because I didn't explain my question clearly enough),
so I think it is worth discussing them briefly. I have prepared a slightly
simplified version of the data set to make the comparison easier. Please
see below.

Cheers!

Ahimsa

#=============================================================
# data in its original shape:
indiv <- rep(c("A","B"),c(3,3))
level.1 <- c(7, 5, 1, 2, 5, 3)  # comes from <- rpois(6,lambda=3)
covar.1 <- c(26.4, 48.9, 62.7, 135.3, 40.1, 17.4) # comes from <- rlnorm(6,3,1)
level.2 <- c(5, 2, 1, 3, 6, 0)  # comes from <- rpois(6,lambda=3)
covar.2 <- c(4.5, 58.6, 6.4, 47.2, 16.9, 59.4) # comes from <- rlnorm(6,3,1)
my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)

# level.1 and level.2 are levels of a common factor, and their values
# represent the number of replicates for that combination of factor level
# and covariate value (covar.1 for level.1 cases, covar.2 for level.2
# cases). I want each replicate to be represented as a row, and therefore
# the number of rows in the new data frame should be:

sum(level.1) + sum(level.2)
# [1] 40

# and the number of rows from individual A is:
sum(my.dat[my.dat$indiv=="A",c(2,4)])
# [1] 21
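
# The same check can be done for every individual at once. This is only a
# minimal sketch on the simplified data set; the names reps.per.row and
# expected.rows are made up for the illustration.

```r
# the simplified data set from above
my.dat <- data.frame(
  indiv   = rep(c("A", "B"), c(3, 3)),
  level.1 = c(7, 5, 1, 2, 5, 3),
  covar.1 = c(26.4, 48.9, 62.7, 135.3, 40.1, 17.4),
  level.2 = c(5, 2, 1, 3, 6, 0),
  covar.2 = c(4.5, 58.6, 6.4, 47.2, 16.9, 59.4)
)

# replicates contributed by each original row (level.1 + level.2)
reps.per.row <- my.dat$level.1 + my.dat$level.2

# expected number of rows per individual in the reshaped data frame
expected.rows <- tapply(reps.per.row, my.dat$indiv, sum)
expected.rows
#  A  B
# 21 19
```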

# solution 1:  ========================================
long <- reshape(my.dat, varying = list(c("level.1","level.2"),
                                       c("covar.1","covar.2")),
                        timevar="level", idvar="case.id",
                        v.names=c("ncases","covar"),
                        direction="long")

newdf <- with(long, data.frame(indiv = rep(  indiv, ncases),
                               level = rep(  level, ncases),
                               covar = rep(  covar, ncases),
                             case.id = rep(case.id, ncases)))
summary(newdf)
# we have 40 cases (rows), of which 21 belong to indiv A;
# this provides exactly what I was looking for
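
# One possible refinement, sketched under assumptions: in solution 1,
# case.id holds the original row number, whereas the layout in my original
# question shows a counter running 1, 2, ... within each combination.
# If such a counter is wanted, base R's sequence() can build it. The
# column names rep.id and factor are hypothetical choices of mine.

```r
my.dat <- data.frame(
  indiv   = rep(c("A", "B"), c(3, 3)),
  level.1 = c(7, 5, 1, 2, 5, 3),
  covar.1 = c(26.4, 48.9, 62.7, 135.3, 40.1, 17.4),
  level.2 = c(5, 2, 1, 3, 6, 0),
  covar.2 = c(4.5, 58.6, 6.4, 47.2, 16.9, 59.4)
)

# step 1 of solution 1: wide to long, keeping level.x paired with covar.x
long <- reshape(my.dat,
                varying = list(c("level.1", "level.2"),
                               c("covar.1", "covar.2")),
                v.names = c("ncases", "covar"),
                timevar = "level", idvar = "case.id",
                direction = "long")

# step 2 of solution 1: repeat each row 'ncases' times
newdf <- with(long, data.frame(indiv   = rep(indiv, ncases),
                               level   = rep(level, ncases),
                               covar   = rep(covar, ncases),
                               case.id = rep(case.id, ncases)))

# counter 1..ncases within each indiv*level*covar combination;
# rows with ncases == 0 contribute nothing
newdf$rep.id <- sequence(long$ncases)

# readable factor labels instead of the 1/2 codes created by reshape()
newdf$factor <- paste0("level.", newdf$level)

head(newdf, 10)
```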


# solution 2:  ========================================
fact1 <- rep("level.1", length(my.dat[,1]))
fact2 <- rep("level.2", length(my.dat[,1]))
lels <- c(fact1,fact2)
nams <- c("indiv", "case.id", "covar")
set1 <-  my.dat[, c(1,2,3)] ; names(set1) <- nams
set2 <-  my.dat[,c(1, 4,5)] ; names(set2) <- nams

newdata <- cbind(lels, rbind(set1,set2))
mydata <- rbind(newdata[, c(2,1,4,3)], newdata[,
      c(2,1,4,3)])
names(mydata) <- c("indiv", "factor", "covar",
      "caseid")
mydata[order(mydata$indiv, mydata$caseid,
      mydata$factor),]

summary(mydata)
head(mydata)
# this is not exactly what I meant
# it provides 24 rows, half of them from indiv A
# caseid has inherited the values from level.1 and level.2

# up to newdata the process is correct, but the next step rbinds newdata
# to itself, giving 24 rows in which each of the 12 rows appears twice

# what we actually wanted was to create as many replicates of each case
# (row) as the value of that case's level.x (now renamed case.id).
# we should do as in the second step of solution 1:
with(newdata, data.frame(lels=rep(lels,case.id),....
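
# A possible complete version of that idea, as a sketch only: it keeps
# solution 2 up to newdata and then repeats each row case.id times. The
# columns after lels are my guess at how the truncated line would
# continue, and the name expanded is made up for the illustration.

```r
my.dat <- data.frame(
  indiv   = rep(c("A", "B"), c(3, 3)),
  level.1 = c(7, 5, 1, 2, 5, 3),
  covar.1 = c(26.4, 48.9, 62.7, 135.3, 40.1, 17.4),
  level.2 = c(5, 2, 1, 3, 6, 0),
  covar.2 = c(4.5, 58.6, 6.4, 47.2, 16.9, 59.4)
)

# solution 2 up to newdata: stack the (level.x, covar.x) pairs
nams <- c("indiv", "case.id", "covar")
set1 <- my.dat[, c(1, 2, 3)]; names(set1) <- nams
set2 <- my.dat[, c(1, 4, 5)]; names(set2) <- nams
lels <- rep(c("level.1", "level.2"), each = nrow(my.dat))
newdata <- cbind(lels, rbind(set1, set2))

# instead of rbind-ing newdata to itself, repeat each row 'case.id' times
expanded <- with(newdata, data.frame(indiv  = rep(indiv, case.id),
                                     factor = rep(lels,  case.id),
                                     covar  = rep(covar, case.id)))

nrow(expanded)                  # total number of replicates
table(expanded$indiv)           # replicates per individual
```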


# solution 3:  ========================================
library(reshape)
melt(my.dat,id=c("indiv","covar.1","covar.2"))->my.dat.1
names(my.dat.1)[4:5]<-c("level","case.id")
melt(my.dat.1,id=c("indiv","level","case.id"))->my.dat.2

summary(my.dat.2)
# this is by far the most elegant solution
# unfortunately it provides similar results to solution 2 and
# it has another issue: it alters the relationship
# between factor (level.1 and level.2) and covar (covar.1 and covar.2)

# so solution 1 is the right one!
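
# A quick way to confirm that a long data frame keeps each level.x with
# its own covar.x, sketched here for the reshape() step of solution 1.
# The names lvl1, lvl2 and pairing.ok are made up for the illustration.

```r
my.dat <- data.frame(
  indiv   = rep(c("A", "B"), c(3, 3)),
  level.1 = c(7, 5, 1, 2, 5, 3),
  covar.1 = c(26.4, 48.9, 62.7, 135.3, 40.1, 17.4),
  level.2 = c(5, 2, 1, 3, 6, 0),
  covar.2 = c(4.5, 58.6, 6.4, 47.2, 16.9, 59.4)
)

long <- reshape(my.dat,
                varying = list(c("level.1", "level.2"),
                               c("covar.1", "covar.2")),
                v.names = c("ncases", "covar"),
                timevar = "level", idvar = "case.id",
                direction = "long")

# split by level and look the values up again in the original rows
lvl1 <- long[long$level == 1, ]
lvl2 <- long[long$level == 2, ]

pairing.ok <-
  all(lvl1$ncases == my.dat$level.1[lvl1$case.id]) &&
  all(lvl1$covar  == my.dat$covar.1[lvl1$case.id]) &&
  all(lvl2$ncases == my.dat$level.2[lvl2$case.id]) &&
  all(lvl2$covar  == my.dat$covar.2[lvl2$case.id])

pairing.ok
```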

# Thanks a lot for the three stimulating solutions. And apologies for not
# explaining the case more clearly.

# Cheers!

##



Dear all,

I'm having a few problems trying to reshape a data frame. I tried with
reshape{stats} and melt{reshape} but I was missing something. Any help is
very welcome. Please find details below:

#################################
# data in its original shape:

indiv <- rep(c("A","B"),c(10,10))
level.1 <- rpois(20, lambda=3)
covar.1 <- rlnorm(20, 3, 1)
level.2 <- rpois(20, lambda=3)
covar.2 <- rlnorm(20, 3, 1)
my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)

# the values of level.1 and level.2 represent the number of cases for the
# particular combination of indiv*level*covar value

# I would like to do two things:
# 1. reshape to long, reducing my.dat[,2:5] to two columns: "factor"
#    (levels = level.1 & level.2) and the covariate
# 2. create one new row for each case in level.1 and level.2

# the new reshaped data.frame should look like this:

# indiv  factor    covar   case.id
#   A   level.1   4.614105    1
#   A   level.1   4.614105    2
#   A   level.2  31.064405    1
#   A   level.2  31.064405    2
#   A   level.2  31.064405    3
#   A   level.2  31.064405    4
#   A   level.1  19.185784    1
#   A   level.2  48.455929    1
#   A   level.2  48.455929    2
#   A   level.2  48.455929    3
# etc...

############################


-- 
ahimsa campos-arceiz
www.camposarceiz.com
