[R] aggregate() function and na.rm = TRUE

David Afshartous dafshartous at med.miami.edu
Tue Jul 8 22:56:55 CEST 2008



All,

I've been using aggregate() to compute means and standard deviations at
time/treatment combinations for a longitudinal dataset, using na.rm = TRUE
for missing data. 

This worked fine before, but it fails now when I re-run some old code.
I've retraced my steps and can't find what changed.  In any event, below is
a reproducible example of the current problem: computing the standard
deviation via aggregate() fails with an error, both with and without
na.rm = TRUE.

Thanks,
David






dat <- data.frame(
  Hour = c(0, 0, 0, 0, 1, 1, 1, 1),
  Drug = factor(c("P", "D", "P", "D", "P", "D", "P", "D")),
  Y1   = rnorm(8, 0),
  Y2   = c(NA, NA, NA, NA, 1, 2, 3, 4)
)

> aggregate(dat[c(3,4)], dat[c(1,2)], mean)
  Hour Drug          Y1 Y2
1    0    D -0.75534554 NA
2    1    D  0.27529835  3
3    0    P -0.03949923 NA
4    1    P  0.02627489  2
> aggregate(dat[c(3,4)], dat[c(1,2)], sd)
Error in var(x, na.rm = na.rm) : missing observations in cov/cor
> aggregate(dat[c(3,4)], dat[c(1,2)], sd, na.rm = TRUE)
Error in var(x, na.rm = na.rm) : no complete element pairs
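
A possible workaround, offered as a minimal sketch rather than a confirmed fix: in the example above, both Hour = 0 groups have no non-missing Y2 values, so once the NAs are removed there is nothing left for sd() to work with, which is likely what the second error reflects. The helper name safe_sd is hypothetical (it is not part of base R), and set.seed(1) is added only so the sketch is repeatable.

```r
# Hypothetical helper (not part of base R): standard deviation that
# returns NA when fewer than two non-missing values remain in a group.
safe_sd <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]          # drop missing values first
  if (length(x) < 2) return(NA_real_)   # sd is undefined for < 2 values
  sd(x)
}

set.seed(1)  # assumption: seed added only for repeatability
dat <- data.frame(
  Hour = c(0, 0, 0, 0, 1, 1, 1, 1),
  Drug = factor(c("P", "D", "P", "D", "P", "D", "P", "D")),
  Y1   = rnorm(8, 0),
  Y2   = c(NA, NA, NA, NA, 1, 2, 3, 4)
)

# Same grouping as in the transcript above, with the guarded function
res <- aggregate(dat[c("Y1", "Y2")], dat[c("Hour", "Drug")], safe_sd)
print(res)
```

With this guard, the Hour = 0 rows should come back with NA for Y2 instead of stopping the whole call, while the Hour = 1 rows are computed as usual.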


> sessionInfo()
R version 2.7.1 (2008-06-23)
i386-apple-darwin8.10.1

locale:
en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] grid_2.7.1     lattice_0.17-8 nlme_3.1-89


