Dear all,

Assume that I have the following data structure:

d <- expand.grid(subj=1:5, time=1:3, treatment=LETTERS[1:3])
d$value <- 10 ^ (as.numeric(d$treatment) + 1) + 10 * d$subj + d$time
d$value2 <- 100000 + d$value

where d$treatment == "C" stands for my control group. What I want to achieve 
now is to subtract the values corresponding to d$treatment == "C" from all 
values in order to get the difference between the treatments. If I do that by 
hand, it will look like:

# rep() is not strictly needed, because R would recycle the vector anyway
va <- rep(d$value[d$treatment == "C"], 3)
d$value - va
va2 <- rep(d$value2[d$treatment == "C"], 3)
d$value2 - va2

This works only because the data frame is sorted in the right way and all cases 
are present. Furthermore, it becomes laborious if you want to do that for more 
than a couple of columns, and it is error-prone and not scalable (what if 
somebody changes the order of the data frame beforehand, or somebody assumes 
that the data frame is in a certain order afterwards? If I want to add some 
columns later, I have to add new lines. What if some cases are missing?). Thus, 
this approach is clearly not a good one, especially since I don't like 
solutions which depend on a certain order.
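
To make the order dependence concrete, here is a small sketch that applies the same positional subtraction to the same data after sorting by subject. The names diff_sorted, diff_by_subj and s are only illustrative. For control rows the difference should be 0, which holds for the original ordering but not for the reordered one.

```r
d <- expand.grid(subj = 1:5, time = 1:3, treatment = LETTERS[1:3])
d$value <- 10 ^ (as.numeric(d$treatment) + 1) + 10 * d$subj + d$time

# Positional subtraction on the original ordering (treatment varies slowest)
diff_sorted <- d$value - rep(d$value[d$treatment == "C"], 3)

# Same data, now ordered by subject; the control values no longer line up
s <- d[order(d$subj), ]
diff_by_subj <- s$value - rep(s$value[s$treatment == "C"], 3)

# Control minus control should always be 0
all(diff_sorted[d$treatment == "C"] == 0)    # TRUE
all(diff_by_subj[s$treatment == "C"] == 0)   # FALSE
```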

So my questions:
1. Is there a ready-made solution for that?
2. If not (which I assume), what would be an elegant way of solving this? Is 
sorting the data the only way? Not that I have any problem with sorting, but I 
would appreciate a solution which works without sorting, because I don't want 
to risk issues downstream with people who assume a certain order in the data 
(which is of course a no-go anyway, but I assume that finding a solution 
without altering the order takes less time than educating these guys [not in 
the long run though, but this battle has to be fought later] ;)
3. The solution should be easily extendable to an arbitrary set of columns and 
should work with missing cases for the treatments, e.g. d <- d[-c(2, 21), ]
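
As a rough illustration of these requirements, the following is a minimal sketch of one possible order-independent approach, not a definitive solution. It looks up the control row for each observation by key with match() instead of relying on row position. The helper name subtract_control, its arguments, and the "_diff" column suffix are hypothetical. It assumes that subj and time together identify at most one control row.

```r
d <- expand.grid(subj = 1:5, time = 1:3, treatment = LETTERS[1:3])
d$value <- 10 ^ (as.numeric(d$treatment) + 1) + 10 * d$subj + d$time
d$value2 <- 100000 + d$value
d <- d[-c(2, 21), ]   # drop two rows to mimic missing cases

# Hypothetical helper: subtract the control value that shares the same keys.
# Assumes the key columns identify at most one control row.
subtract_control <- function(data, cols, keys = c("subj", "time"),
                             group = "treatment", control = "C") {
  ctrl <- data[data[[group]] == control, , drop = FALSE]
  # Build one string key per row, e.g. "1 2" for subj 1 at time 2
  key_all  <- do.call(paste, unname(as.list(data[keys])))
  key_ctrl <- do.call(paste, unname(as.list(ctrl[keys])))
  idx <- match(key_all, key_ctrl)   # NA where no matching control row exists
  out <- data
  for (col in cols) {
    out[[paste0(col, "_diff")]] <- data[[col]] - ctrl[[col]][idx]
  }
  out                               # row order of the input is left untouched
}

res <- subtract_control(d, cols = c("value", "value2"))
head(res)
```

In this sketch, rows whose control counterpart is missing would get NA in the "_diff" columns, and further columns can be handled by extending the cols argument.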

Thanks for your input, I am looking forward to your suggestions.


Kind Regards,

Thorn Thaler
Mathematician

Applied Mathematics 
Nestec Ltd,
Nestlé Research Center
PO Box 44 
CH-1000 Lausanne 26
Phone: +41 21 785 8220
Fax: +41 21 785 9486
