[R] by function with sum does not give what is expected from by function with print
Rasmus Liland
jr@| @end|ng |rom po@teo@no
Fri Jul 24 01:48:30 CEST 2020
On 2020-07-23 18:54 -0400, Duncan Murdoch wrote:
> On 23/07/2020 6:15 p.m., Sorkin, John wrote:
> > Colleagues,
> > The by function in the R program below is not giving me the sums
> > I expect to see, viz.,
> > 382+170=552
> > 4730+170=4900
> > 5+6=11
> > 199+25=224
> > ###################################################
> > #full R program:
> > mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
> > sex=(rep(c(1,1,0,0),2)),
> > status=rep(c(1,0),2),
> > values=c(382,4730,5,199,170,497,6,25))
> > mydata
> > by(mydata,list(mydata$sex,mydata$status),sum)
> > by(mydata,list(mydata$sex,mydata$status),print)
> > ###################################################
>
> The problem is that you are summing the mydata values, not the mydata$values
> values. That will include covid, sex and status in the sums. I think
> you'll get what you should (though it doesn't match what you say you
> expected, which looks wrong to me) with this code:
>
> by(mydata$values,list(mydata$sex,mydata$status),sum)
>
> for 0,0, the sum is 224 = 199+25
> for 0,1, the sum is 11 = 5+6
> for 1,0, the sum is 5227 = 4730 + 497 (not 4730 + 170)
> for 1,1, the sum is 552 = 382 + 170
Dear John,
aggregate() also does this, but it returns sex and
status as columns of a data.frame, whereas by() stores
them as attributes (dimnames) of the numeric result.
aggregate(x=list("values"=mydata$values),
          by=list("sex"=mydata$sex,
                  "status"=mydata$status),
          FUN=sum)
yields
  sex status values
1   0      0    224
2   1      0   5227
3   0      1     11
4   1      1    552
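A minimal sketch of why the original by() call overshoots, followed by two other ways to get the same per-group sums. The names sub, res and tab are only illustrative, and status is written here as rep(c(1,0), 4), which should equal the recycled rep(c(1,0), 2) from the original post.

```r
# Data from the original post (status spelled out to full length)
mydata <- data.frame(covid  = c(0, 0, 0, 0, 1, 1, 1, 1),
                     sex    = rep(c(1, 1, 0, 0), 2),
                     status = rep(c(1, 0), 4),
                     values = c(382, 4730, 5, 199, 170, 497, 6, 25))

# by(mydata, ..., sum) hands each group to sum() as a whole data.frame,
# so covid, sex and status are added in along with values.
sub <- mydata[mydata$sex == 1 & mydata$status == 1, ]
sum(sub)         # 557 = (382 + 170) + covid (0+1) + sex (1+1) + status (1+1)
sum(sub$values)  # 552 = 382 + 170

# Formula interface to aggregate(): sums only the values column
res <- aggregate(values ~ sex + status, data = mydata, FUN = sum)
res

# tapply() returns a sex-by-status matrix of the same sums
tab <- tapply(mydata$values,
              list(sex = mydata$sex, status = mydata$status),
              sum)
tab
```

The formula call should give the same four rows as above. Which form to prefer mostly depends on whether a data.frame or a matrix is more convenient for the next step.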
Best,
Rasmus