[R] by function with sum does not give what is expected from by function with print

Sorkin, John
Fri Jul 24 00:15:10 CEST 2020


Colleagues,
 
The by function in the R program below is not giving me the sums
I expect to see, viz., 
382+170=552
4730+497=5227
5+6=11
199+25=224
###################################################
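#full R program:
# Eight rows: covid, sex and status are 0/1 codes; values is numeric
mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
                     sex=rep(c(1,1,0,0),2),
                     status=rep(c(1,0),2),  # length 4, recycled to 8 rows
                     values=c(382,4730,5,199,170,497,6,25))
mydata
# Apply sum, then print, to each sex-by-status subset of mydata
by(mydata,list(mydata$sex,mydata$status),sum)
by(mydata,list(mydata$sex,mydata$status),print)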
###################################################

More complete explanation of my question
 
I have created a simple data frame with three grouping variables
(covid, sex and status, each coded 0/1) and one numeric column (values):
 mydata <- data.frame(covid=c(0,0,0,0,1,1,1,1),
                      sex=rep(c(1,1,0,0),2),
                      status=rep(c(1,0),2),
                      values=c(382,4730,5,199,170,497,6,25))
 
 > mydata
  covid sex status values
1     0   1      1    382
2     0   1      0   4730
3     0   0      1      5
4     0   0      0    199
5     1   1      1    170
6     1   1      0    497
7     1   0      1      6
8     1   0      0     25
 
When I use the by function with sum as its function argument, I don’t
get the sums that I would expect based either on the listing of the
data frame above or on the output of by with print as its argument:
 
> by(mydata,list(mydata$sex,mydata$status),sum)
: 0
: 0
[1] 225
------------------------------------------------------------------------------- 
: 1
: 0
[1] 5230
------------------------------------------------------------------------------- 
: 0
: 1
[1] 14
------------------------------------------------------------------------------- 
: 1
: 1
[1] 557
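
A quick arithmetic check may help in reading these numbers. The sketch below assumes that by() hands each group's rows to sum as a whole data frame, so that every column (covid, sex and status as well as values) is added in. That is my reading of the output, not something stated in the post, and the name sub00 is hypothetical.

```r
mydata <- data.frame(covid  = c(0, 0, 0, 0, 1, 1, 1, 1),
                     sex    = rep(c(1, 1, 0, 0), 2),
                     status = rep(c(1, 0), 4),
                     values = c(382, 4730, 5, 199, 170, 497, 6, 25))

# Rows with sex == 0 and status == 0 (rows 4 and 8 of the listing)
sub00 <- mydata[mydata$sex == 0 & mydata$status == 0, ]

# Adds every cell of the subset: covid + sex + status + values
sum(sub00)

# Adds only the values column: 199 + 25
sum(sub00$values)
```

If that reading is right, the first call gives 225 (the figure reported for the 0/0 group) and the second gives 224.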
 
I expected to see the following sums: 
382+170=552
4730+497=5227
5+6=11
199+25=224
As can be seen from the output above, these are not the sums I am getting. 
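
For comparison, here is a minimal sketch of two ways the per-group sums of the values column alone might be obtained. It is one possible approach, not necessarily the intended one. The names res and tab, and the labels sex and status given to the grouping list, are my own additions.

```r
mydata <- data.frame(covid  = c(0, 0, 0, 0, 1, 1, 1, 1),
                     sex    = rep(c(1, 1, 0, 0), 2),
                     status = rep(c(1, 0), 4),
                     values = c(382, 4730, 5, 199, 170, 497, 6, 25))

# Keep the original by() call, but sum only the values column of each subset
res <- by(mydata, list(sex = mydata$sex, status = mydata$status),
          function(d) sum(d$values))
res

# The same sums as a 2 x 2 table (rows: sex, columns: status)
tab <- tapply(mydata$values,
              list(sex = mydata$sex, status = mydata$status),
              sum)
tab
```

For sex = 1 and status = 0 the rows involved are 2 and 6, so the sum of values is 4730 + 497 = 5227.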
 
Using print as the function argument to by, I get the values
grouped as I would expect, but for some reason each group is
printed twice!
 
> by(mydata,list(mydata$sex,mydata$status),print)
  covid sex status values
4     0   0      0    199
8     1   0      0     25
  covid sex status values
2     0   1      0   4730
6     1   1      0    497
  covid sex status values
3     0   0      1      5
7     1   0      1      6
  covid sex status values
1     0   1      1    382
5     1   1      1    170
: 0
: 0
  covid sex status values
4     0   0      0    199
8     1   0      0     25
------------------------------------------------------------------------------- 
: 1
: 0
  covid sex status values
2     0   1      0   4730
6     1   1      0    497
------------------------------------------------------------------------------- 
: 0
: 1
  covid sex status values
3     0   0      1      5
7     1   0      1      6
------------------------------------------------------------------------------- 
: 1
: 1
  covid sex status values
1     0   1      1    382
5     1   1      1    170
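
One possible explanation for the doubled output, offered as a sketch rather than a confirmed diagnosis: print() displays its argument and also returns it invisibly, so by() collects the four subsets and the console then auto-prints the returned "by" object. Assuming that is what happens, assigning the result (to the hypothetical name res) should leave only the first set of listings.

```r
mydata <- data.frame(covid  = c(0, 0, 0, 0, 1, 1, 1, 1),
                     sex    = rep(c(1, 1, 0, 0), 2),
                     status = rep(c(1, 0), 4),
                     values = c(382, 4730, 5, 199, 170, 497, 6, 25))

# print() shows each subset once as a side effect; assigning the result
# stops the console from auto-printing the returned "by" object as well
res <- by(mydata, list(mydata$sex, mydata$status), print)

# The returned object still holds one subset per sex/status combination
length(res)
class(res)
```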
 
What am I doing wrong, or what don’t I understand
about the by function?
 
Thank you
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)




