On 2020-07-23 18:54 -0400, Duncan Murdoch wrote:
> On 23/07/2020 6:15 p.m., Sorkin, John wrote:
> > Colleagues,
> > The by function in the R program below is not giving me the sums
> > I expect to see, viz.,
> > 382+170=552
> > 4730+170=4900
> > 5+6=11
> > 199+25=224
> > ###################################################
> > #full R program:
> > mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
> > sex=(rep(c(1,1,0,0),2)),
> > status=rep(c(1,0),2),
> > values=c(382,4730,5,199,170,497,6,25))
> > mydata
> > by(mydata,list(mydata$sex,mydata$status),sum)
> > by(mydata,list(mydata$sex,mydata$status),print)
> > ###################################################
>
> The problem is that you are summing the mydata values, not the mydata$values
> values. That will include covid, sex and status in the sums. I think
> you'll get what you should (though it doesn't match what you say you
> expected, which looks wrong to me) with this code:
>
> by(mydata$values,list(mydata$sex,mydata$status),sum)
>
> for 0,0, the sum is 224 = 199+25
> for 0,1, the sum is 11 = 5+6
> for 1,0, the sum is 5227 = 4730 + 497 (not 4730 + 170)
> for 1,1, the sum is 552 = 382 + 170

Dear John,

aggregate() also does this, but it returns sex and status as columns
of a data.frame rather than as attributes of the numeric result.

	aggregate(x=list("values"=mydata$values), by=list("sex"=mydata$sex, "status"=mydata$status), FUN=sum)

yields

	  sex status values
	1   0      0    224
	2   1      0   5227
	3   0      1     11
	4   1      1    552

Best,
Rasmus
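
For anyone who wants to check the points above side by side, here is a minimal sketch using the data from the original post. The object names (whole, only_values, sums, agg) are made up for illustration, and the tapply() and formula-style aggregate() calls are alternatives not mentioned in the thread, included only for comparison.

```r
# Data frame from the original post; status has length 4 and is
# recycled by data.frame() to match the 8 rows of the other columns.
mydata <- data.frame(covid  = c(0, 0, 0, 0, 1, 1, 1, 1),
                     sex    = rep(c(1, 1, 0, 0), 2),
                     status = rep(c(1, 0), 2),
                     values = c(382, 4730, 5, 199, 170, 497, 6, 25))

groups <- list(sex = mydata$sex, status = mydata$status)

# by() on the whole data frame hands every column of each subset to
# sum(), so covid, sex and status are added into each total.
whole <- by(mydata, groups, sum)

# by() on the values column alone sums only that column.
only_values <- by(mydata$values, groups, sum)

# tapply() computes the same group sums as a plain sex-by-status matrix.
sums <- tapply(mydata$values, groups, sum)

# Formula interface to aggregate(); the result is a data.frame with
# sex and status as ordinary columns.
agg <- aggregate(values ~ sex + status, data = mydata, FUN = sum)

print(whole)
print(only_values)
print(sums)
print(agg)
```

Comparing the printed output of whole and only_values should show the inflated totals next to the intended ones.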

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.