Sandy Small wrote:
> Hi
> I am a newbie to R but have tried a number of ways in R to do this and
> can't find a good solution. (I could do it out of R in perl or awk but
> would like to know how to do this in R).
>
> I have a large data frame (49 variables and 7000 observations); however, for
> simplicity I can express it in the following data frame
>
> Base, Image, LVEF, ES_Time
> A, 1, 4.32, 0.89
> A, 2, 4.98, 0.67
> A, 3, 3.7, 0.5
> A, 3, 4.1, 0.8
> B, 1, 7.4, 0.7
> B, 3, 7.2, 0.8
> B, 4, 7.8, 0.6
> C, 1, 5.6, 1.1
> C, 4, 5.2, 1.3
> C, 5, 5.9, 1.2
> C, 6, 6.1, 1.2
> C, 7, 3.2, 1.1
>
> For each value of LVEF and ES_Time I would like to normalise the value
> to the maximum for that factor grouped by Base or Image number, adding
> an extra column to the data frame with the normalised value in it.
>
> So for the Base = B group in the data frame (the data frame should have
> the same length, I'm just showing the B part) I would get a modified data
> frame as follows.
>
> Base, Image, LVEF, ES_Time, Norm_LVEF, Norm_ES_Time
> ...
> B, 1, 7.4, 0.7, 7.4/7.8, 0.7/0.8
> B, 3, 7.2, 0.8, 7.2/7.8, 0.8/0.8
> B, 4, 7.8, 0.6, 7.8/7.8, 0.6/0.8
> ...
>
> Where the results of the division would replace the division shown here.
> I hope this makes sense.
> If anyone can help I would be very grateful.
>

You want to look at the by(), tapply() or sparseby() functions (the latter is in the reshape package; the others are in base R).
For example, I think this untested code does what you want:

    newdf <- sparseby(olddf, c("Base", "Image"), function(subset)
        within(subset, {
            Norm_LVEF <- LVEF/max(LVEF)
            Norm_ES_Time <- ES_Time/max(ES_Time)
        }))

where olddf is the old data frame and newdf is the newly created one.

Duncan Murdoch
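
For comparison, here is a minimal base-R sketch of the same normalisation using ave(), which is not mentioned in the reply above but ships with R in the stats package. It assumes the grouping is by Base alone, as in the B example from the question, and that the two entries written "3." and "7." in the posted data were meant to be followed by commas. The names olddf and newdf simply mirror the ones used above.

```r
# Example data from the question (typos in rows 4 and 12 read as commas).
olddf <- data.frame(
  Base    = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"),
  Image   = c(1, 2, 3, 3, 1, 3, 4, 1, 4, 5, 6, 7),
  LVEF    = c(4.32, 4.98, 3.7, 4.1, 7.4, 7.2, 7.8, 5.6, 5.2, 5.9, 6.1, 3.2),
  ES_Time = c(0.89, 0.67, 0.5, 0.8, 0.7, 0.8, 0.6, 1.1, 1.3, 1.2, 1.2, 1.1)
)

# ave() returns a vector as long as its input in which each element is
# FUN applied to that element's group, here the per-Base maximum.
# Dividing by it normalises every row to its group maximum.
newdf <- transform(
  olddf,
  Norm_LVEF    = LVEF    / ave(LVEF,    Base, FUN = max),
  Norm_ES_Time = ES_Time / ave(ES_Time, Base, FUN = max)
)

# Show the rows for the B group only; newdf keeps all 12 rows.
newdf[newdf$Base == "B", ]
```

To normalise within Image instead, replace Base with Image in the two ave() calls. Passing both (ave(LVEF, Base, Image, FUN = max)) would group by each Base/Image combination, which for this data leaves most groups with a single row.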