[R] length, mean, na.rm, na.omit...
Duncan Murdoch
murdoch at stats.uwo.ca
Fri May 18 17:10:00 CEST 2007
On 5/18/2007 10:32 AM, Muenchen, Robert A (Bob) wrote:
> Hi All,
>
> Can anyone tell me why the length function does not use na.rm? I know
> how to work around it, I'm just curious to know why such a useful option
> was left out.
length() is called very frequently by other functions, so it is
implemented as a primitive for speed. Adding an optional argument to it
would slow it down.
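For reference, the usual workarounds look something like the sketch below. The vector x and the helper n_valid() are made up for illustration; n_valid() is not part of base R.

```r
# Hypothetical example vector with two missing values
x <- c(2, 4, NA, 8, NA)

length(x)           # 5: counts every element, including the NAs
sum(!is.na(x))      # 3: counts only the non-missing values
length(na.omit(x))  # 3: same count, taken after dropping the NAs

# Hypothetical convenience wrapper (an assumption, not a base R function)
n_valid <- function(v) sum(!is.na(v))
n_valid(x)          # 3
```

sum(!is.na(x)) avoids building the intermediate copy that na.omit() creates, which may matter for long vectors.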
> I'm also interested in the logic of setting na.rm=TRUE as the default on
> mean, sd, etc. This is the opposite of the many other stat packages I
> have used, so I assume it provides some programming benefit that is not
> obvious to me.
That's also the opposite of what R does. Did you mean to ask why
na.rm=FALSE is the default? I think it follows from thinking of NA as
meaning "not known", rather than "missing at random". If you don't know
why values are missing, you may get biased results by calculating the
mean of the others, and R would rather not give you biased results.
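A small sketch of that default in action (the vector x is invented for illustration):

```r
# Hypothetical data with one unknown value
x <- c(2, 4, NA, 8)

mean(x)                # NA: the result is unknown because one input is unknown
mean(x, na.rm = TRUE)  # 4.666667: mean of the three known values (14/3)

sd(x)                  # NA, for the same reason
sd(x, na.rm = TRUE)    # computed from the three known values only
```

You have to ask for na.rm = TRUE explicitly, so dropping the unknown values is always a deliberate choice.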
Duncan Murdoch