This one should be easy, but it's giving me a hard time, mostly because tapply puts the results in a list. I want to calculate the cumulative sum of a variable in a data frame, but with the accumulation done only within each level of a factor. For a very simple example, take:
> df <- data.frame(x=c(rep(1,5),rep(2,5),rep(3,5)),fac=gl(3,5,labels=letters[1:3]))
> df
   x fac
1  1   a
2  1   a
3  1   a
4  1   a
5  1   a
6  2   b
7  2   b
8  2   b
9  2   b
10 2   b
11 3   c
12 3   c
13 3   c
14 3   c
15 3   c

I'd like to create another column in the dataframe so it looks like this, and make sure that the cumulative sums still match the right levels of the factor. I've included a "willdo" column that's just a cumulative sum, and an "ideal" column that's the cumulative sum minus the current value - the column headings are self explanatory.

> answer
   x fac willdo ideal
1  1   a      1     0
2  1   a      2     1
3  1   a      3     2
4  1   a      4     3
5  1   a      5     4
6  2   b      2     0
7  2   b      4     2
8  2   b      6     4
9  2   b      8     6
10 2   b     10     8
11 3   c      3     0
12 3   c      6     3
13 3   c      9     6
14 3   c     12     9
15 3   c     15    12
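One possible way to get both columns without going through tapply's list output is sketched below. It assumes base R's ave() is acceptable here: ave() applies a function within each group and returns a vector in the original row order, so the result can be assigned straight back to the data frame. This is only a sketch of one approach, not the only way to do it.

```r
# Rebuild the example data frame from the question.
df <- data.frame(x = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                 fac = gl(3, 5, labels = letters[1:3]))

# Cumulative sum of x, restarted within each level of fac.
# ave() returns a vector aligned with the rows of df, so no
# unlisting or re-sorting is needed. FUN must be passed by name.
df$willdo <- ave(df$x, df$fac, FUN = cumsum)

# "ideal" is the cumulative sum minus the current value, i.e. the
# sum of all earlier values in the same level.
df$ideal <- df$willdo - df$x

print(df)
```

Because ave() keeps the original row order, this should also work when the rows of each level are not contiguous, although that case is not part of the example above.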