On 2020-07-23 18:54 -0400, Duncan Murdoch wrote:
> On 23/07/2020 6:15 p.m., Sorkin, John wrote:
> > Colleagues,
> > The by function in the R program below is not giving me the sums
> > I expect to see, viz.,
> > 382+170=552
> > 4730+170=4900
> > 5+6=11
> > 199+25=224
> > ###################################################
> > #full R program:
> > mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
> > sex=(rep(c(1,1,0,0),2)),
> > status=rep(c(1,0),2),
> > values=c(382,4730,5,199,170,497,6,25))
> > mydata
> > by(mydata,list(mydata$sex,mydata$status),sum)
> > by(mydata,list(mydata$sex,mydata$status),print)
> > ###################################################
> 
> The problem is that you are summing the mydata values, not the mydata$values
> values.  That will include covid, sex and status in the sums.  I think
> you'll get what you should (though it doesn't match what you say you
> expected, which looks wrong to me) with this code:
> 
> by(mydata$values,list(mydata$sex,mydata$status),sum)
> 
> for 0,0, the sum is 224 = 199+25
> for 0,1, the sum is  11 = 5+6
> for 1,0, the sum is 5227 = 4730 + 497 (not 4730 + 170)
> for 1,1, the sum is 552 = 382 + 170

Dear John,

aggregate() computes the same sums, but it returns
sex and status as columns of a data.frame rather
than as attributes (dimnames) of the numeric result,
which is what by() does.

        aggregate(x=list("values"=mydata$values),
                  by=list("sex"=mydata$sex,
                          "status"=mydata$status),
                  FUN=sum)

yields

          sex status values
        1   0      0    224
        2   1      0   5227
        3   0      1     11
        4   1      1    552
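
To make the difference in the return values concrete, here is a minimal sketch that rebuilds mydata from the original post and looks up the same group sum (sex 1, status 0) in both results. The names b, m and a are only illustrative, and unclass() is used merely so that plain matrix indexing applies.

```r
# Rebuild the data from the original post
mydata <- data.frame(covid  = c(0, 0, 0, 0, 1, 1, 1, 1),
                     sex    = rep(c(1, 1, 0, 0), 2),
                     status = rep(c(1, 0), 2),
                     values = c(382, 4730, 5, 199, 170, 497, 6, 25))

# by() returns a numeric array of class "by"; the grouping levels
# are kept in its dimnames attribute as character labels
b <- by(mydata$values,
        list(sex = mydata$sex, status = mydata$status),
        sum)
m <- unclass(b)      # drop the "by" class, keep dim and dimnames
dimnames(m)          # list(sex = c("0", "1"), status = c("0", "1"))
m["1", "0"]          # sex 1, status 0: 4730 + 497 = 5227

# aggregate() returns a data.frame; the groups are ordinary columns
a <- aggregate(x  = list(values = mydata$values),
               by = list(sex = mydata$sex, status = mydata$status),
               FUN = sum)
a$values[a$sex == 1 & a$status == 0]   # also 5227
```

With by() a group is looked up by its label in the dimnames, while with aggregate() the result can be filtered, merged or sorted like any other data.frame.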

Best,
Rasmus

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.