[R] "with" and "by" and NA:
Duncan Murdoch
murdoch at stats.uwo.ca
Thu Mar 26 00:56:04 CET 2009
On 25/03/2009 7:36 PM, Aldi Kraja wrote:
> Hi,
>
> I have a data.frame with many variables for which I am computing the
> mean by subgroup, for a pair of variables at a time, where one variable
> in each pair defines the subgroup. The subgroups in x$cm1 are 0, 1
> and 2.
> x
> ph1 cm1
> 0.2345 2
> 1.2222 1
> 2.0033 0
> 0.0000 2
> 1.0033 1
> 0.2345 0
> 1.2222 2
> 2.0033 0
> 0.0000 1
> 1.0033 2
>
> > meanbygroup <- as.vector(with(x, by(x$ph1, x$cm1, mean)))
You don't need with() here, as you are explicitly extracting the vectors
from x.
> > meanbygroup
> If ph1 has no missing values, the above statements work fine:
> [1] 1.4137000 0.7418333 0.6150000
>
> As soon as I introduce a missing value (NA) into ph1:
> x
> ph1 cm1
> 0.2345 2
> NA 1
> 1.2222 1
> .............
>
> the above transforms into
> [1] 1.4137000 NA 0.6150000
>
> Question: is there a way I can protect these calculations from the NA
> values in ph1 (some kind of: na.rm=T)?
You could use with(), and extract the vectors from a subset of x:
with(x[!is.na(x$ph1),], by(ph1, cm1, mean))
This is untested. If you had provided sample data in a usable format I
would have tried it, but you didn't, and I'm too lazy to create my own.
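
For readers who want to try the suggestion, here is a minimal sketch. The data frame is retyped from the original post, and the exact position of the NA row is an assumption, since the post elides the rest of the data. Two alternatives are shown as well: by() passes extra arguments on to FUN, so na.rm = TRUE can be handed straight to mean(), and tapply() does the same while returning a plain named vector. The names m_subset, m_narm and m_tapply are made up for illustration.

```r
# Sample data retyped from the post; where the NA row sits is an assumption.
x <- data.frame(
  ph1 = c(0.2345, NA, 1.2222, 2.0033, 0.0000, 1.0033,
          0.2345, 1.2222, 2.0033, 0.0000, 1.0033),
  cm1 = c(2, 1, 1, 0, 2, 1, 0, 2, 0, 1, 2)
)

# Approach from the reply: drop rows where ph1 is NA, then take group means.
m_subset <- as.vector(with(x[!is.na(x$ph1), ], by(ph1, cm1, mean)))

# Alternative: extra arguments to by() are passed on to FUN,
# so na.rm = TRUE reaches mean().
m_narm <- as.vector(by(x$ph1, x$cm1, mean, na.rm = TRUE))

# Alternative: tapply() returns a named vector, so as.vector() is optional.
m_tapply <- tapply(x$ph1, x$cm1, mean, na.rm = TRUE)

print(m_subset)
print(m_narm)
print(m_tapply)
```

With this data all three give the same group means for cm1 = 0, 1 and 2. The subsetting approach removes whole rows before grouping, which matters if several columns are summarised at once; na.rm = TRUE only skips the missing values inside each group.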
Duncan Murdoch