[R] Role of na.rm inside mean()

Tue Jul 12 18:38:12 CEST 2011

On 12/07/2011 12:26 PM, Doran, Harold wrote:
> This is posed purely out of curiosity (not as a criticism per se): what is the functional role of the argument na.rm inside the mean() function? If there are missing values, mean() will by default return NA, as in the example below. But is there ever a purpose in computing a mean only to receive NA as a result?

The general idea in R is that NA stands for "unknown". If some of the
values in a vector are unknown, then the mean of the vector is also
unknown. NA is sometimes used in other ways as well; in those cases it
makes sense to remove it and compute the mean of the remaining values.
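
A minimal sketch of the two readings of NA, in case it helps. The vector
below is made up for illustration and is not taken from the original
question; counting the NAs before dropping them is just one possible
check, not a recommendation from the R documentation.

```r
# Hypothetical data: the second value is unknown.
x <- c(2.5, NA, 4.0, 3.5)

# Default behaviour: an unknown input makes the mean unknown.
mean(x)                # [1] NA

# If the NAs are known to be safe to ignore, drop them explicitly.
m <- mean(x, na.rm = TRUE)
m                      # [1] 3.333333

# Record how many values were dropped, so the removal is not silent.
n_missing <- sum(is.na(x))
n_missing              # [1] 1
```

Keeping na.rm = FALSE as the default means the decision to discard
unknown values is always made explicitly by the caller.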

Duncan Murdoch

> In 10 years of using R, I have always used mean() in order to get a result, which is the opposite of its default behavior (when there are NAs). Can anyone suggest a reason why it is in fact desired to get NA as a result of computing mean()?
>
> > x <- rnorm(100)
> > x[1] <- NA
>
> >  mean(x)
> [1] NA
>
> >  mean(x, na.rm=TRUE)
> [1] 0.08136736
>
> If the reason is to alert the user that the vector has missing values, I suppose I could buy that. But I think other checks are better.
>
> Harold