On Aug 26, 2010, at 3:40 PM, Marc Schwartz wrote:

> On Aug 26, 2010, at 3:33 PM, Bos, Roger wrote:
>
>> I created a small example to show something that I do a lot of. "scale"
>> data by month and return a data.frame with the output. "id" represents
>> repeated observations over "time" and I want to scale the "slope"
>> variable. The "out" variable shows the output I want. My for..loop
>> does the job but is probably very slow versus other methods. ddply
>> seems ideal, but despite playing with the baseball examples quite a bit
>> I can't figure out how to get it to work with my sample dataset.
>>
>> TIA for any help, Roger
>>
>> Here is the sample code:
>>
>> dat <- data.frame(id=rep(letters[1:5],3),
>> time=c(rep(1,5),rep(2,5),rep(3,5)), slope=1:15)
>> dat
>>
>> for (i in 1:3) {
>>   mat <- dat[dat$time==i, ]
>>   outi <- data.frame(mat$time, mat$id, slope=scale(mat$slope))
>>   if (i==1) {
>>     out <- outi
>>   } else {
>>     out <- rbind(out, outi)
>>   }
>> }
>> out
>>
>> Here is the sample output:
>>
>>> dat <- data.frame(id=rep(letters[1:5],3),
>> time=c(rep(1,5),rep(2,5),rep(3,5)), slope=1:15)
>>
>>> dat
>>    id time slope
>> 1   a    1     1
>> 2   b    1     2
>> 3   c    1     3
>> 4   d    1     4
>> 5   e    1     5
>> 6   a    2     6
>> 7   b    2     7
>> 8   c    2     8
>> 9   d    2     9
>> 10  e    2    10
>> 11  a    3    11
>> 12  b    3    12
>> 13  c    3    13
>> 14  d    3    14
>> 15  e    3    15
>>
>>> for (i in 1:3) {
>> +   mat <- dat[dat$time==i, ]
>> +   outi <- data.frame(mat$time, mat$id, slope=scale(mat$slope))
>> +   if (i==1) {
>> +     out ....
[TRUNCATED]
>>
>>> out
>>    mat.time mat.id      slope
>> 1         1      a -1.2649111
>> 2         1      b -0.6324555
>> 3         1      c  0.0000000
>> 4         1      d  0.6324555
>> 5         1      e  1.2649111
>> 6         2      a -1.2649111
>> 7         2      b -0.6324555
>> 8         2      c  0.0000000
>> 9         2      d  0.6324555
>> 10        2      e  1.2649111
>> 11        3      a -1.2649111
>> 12        3      b -0.6324555
>> 13        3      c  0.0000000
>> 14        3      d  0.6324555
>> 15        3      e  1.2649111
>>>
>> ***************************************************************
>
>
> Roger, seems like you might want:
>
> See ?ave
>
>> cbind(dat, slope = ave(dat$slope, list(dat$time), FUN = scale))
>    id time slope      slope
> 1   a    1     1 -1.2649111
> 2   b    1     2 -0.6324555
> 3   c    1     3  0.0000000
> 4   d    1     4  0.6324555
> 5   e    1     5  1.2649111
> 6   a    2     6 -1.2649111
> 7   b    2     7 -0.6324555
> 8   c    2     8  0.0000000
> 9   d    2     9  0.6324555
> 10  e    2    10  1.2649111
> 11  a    3    11 -1.2649111
> 12  b    3    12 -0.6324555
> 13  c    3    13  0.0000000
> 14  d    3    14  0.6324555
> 15  e    3    15  1.2649111
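
For readers wondering what ave() is doing in the call quoted above, here is a minimal sketch of the same split/apply/recombine steps written out by hand. The helper name zscore and the intermediate variable names are made up for illustration; they are not from the thread. The sketch assumes scale()'s defaults (centre on the mean, divide by the sample standard deviation).

```r
# Sample data from the original post
dat <- data.frame(id = rep(letters[1:5], 3),
                  time = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                  slope = 1:15)

# Hypothetical helper: what scale() does to a single vector by default
zscore <- function(x) (x - mean(x)) / sd(x)

# Split slope by time, scale each piece, then put the pieces back
# in the original row order
pieces  <- split(dat$slope, dat$time)
scaled  <- lapply(pieces, zscore)
by_hand <- unsplit(scaled, dat$time)

# ave() wraps the same three steps in one call
via_ave <- ave(dat$slope, dat$time, FUN = zscore)

stopifnot(isTRUE(all.equal(by_hand, via_ave)))
round(by_hand[1:5], 7)
```

Because the result comes back in the original row order, it can be bound straight onto dat without any sorting or merging.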
Quick fine tune, as I forgot to remove the original 'slope' column above.

> cbind(dat[, -3], slope = ave(dat$slope, list(dat$time), FUN = scale))
   id time      slope
1   a    1 -1.2649111
2   b    1 -0.6324555
3   c    1  0.0000000
4   d    1  0.6324555
5   e    1  1.2649111
6   a    2 -1.2649111
7   b    2 -0.6324555
8   c    2  0.0000000
9   d    2  0.6324555
10  e    2  1.2649111
11  a    3 -1.2649111
12  b    3 -0.6324555
13  c    3  0.0000000
14  d    3  0.6324555
15  e    3  1.2649111

Marc
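
Since the original question asked about ddply, here is a hedged sketch of two equivalent spellings: a base R version using transform() with ave(), and a plyr version that runs only if the plyr package happens to be installed. The names zs, out_base, and out_plyr are illustrative and not from the thread. The as.numeric() wrapper is there because scale() returns a one-column matrix rather than a plain vector.

```r
# Sample data from the original post
dat <- data.frame(id = rep(letters[1:5], 3),
                  time = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                  slope = 1:15)

# Illustrative helper: scale() returns a matrix, so drop it to a vector
zs <- function(x) as.numeric(scale(x))

# Base R: overwrite slope in place, grouped by time
out_base <- transform(dat, slope = ave(slope, time, FUN = zs))

# plyr (assumed installed; skipped otherwise): split by time,
# apply transform() to each piece, and row-bind the results
if (requireNamespace("plyr", quietly = TRUE)) {
  out_plyr <- plyr::ddply(dat, "time", transform,
                          slope = as.numeric(scale(slope)))
  stopifnot(isTRUE(all.equal(out_base$slope, out_plyr$slope)))
}

out_base
```

One difference to be aware of: ddply returns rows ordered by the grouping variable, whereas ave() preserves the input order. The two agree here only because dat is already sorted by time.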