[R] Role of na.rm inside mean()

Tue Jul 12 18:43:27 CEST 2011

Hi Harold,

Many (most?) of the statistics functions have a similar argument.  I
suspect it is there partly to warn the user: you have to be explicit
about removing missing values, rather than the program silently
dropping or ignoring values that would not work in the function
called.  I can also think of one example where I want a missing value
returned.  In psychology we often create scores on some construct
(say, optimism) by averaging individuals' responses to several
questions.  In certain cases, if a subject does not respond to one
question, their overall score should be missing.  This is easily
accomplished by leaving na.rm = FALSE, the default.
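
As a rough sketch of that scoring case (the item names q1 to q3 and
the responses are made up for illustration, and rowMeans() stands in
for calling mean() on each respondent's answers), the two settings
give different scale scores for anyone who skipped a question:

```r
# Hypothetical data: three optimism items answered by four respondents.
# Respondent 2 skipped item q2.
items <- data.frame(
  q1 = c(4, 3, 5, 2),
  q2 = c(5, NA, 4, 2),
  q3 = c(3, 4, 5, 1)
)

# Default (na.rm = FALSE): one missing item makes the whole score NA.
strict <- rowMeans(items)

# na.rm = TRUE: average over whichever items were answered.
lenient <- rowMeans(items, na.rm = TRUE)

print(strict)
print(lenient)
```

With the default, respondent 2 gets NA; with na.rm = TRUE the score is
the mean of the two answered items.  Which one is appropriate depends
on the scoring rules for the scale.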

Cheers,

Josh

On Tue, Jul 12, 2011 at 9:26 AM, Doran, Harold <HDoran at air.org> wrote:
> This is just posed out of curiosity, (not as a criticism per se). But what is the functional role of the argument na.rm inside the mean() function? If there are missing values, mean() will always return an NA as in the example below. But, is there ever a purpose in computing a mean only to receive NA as a result?
>
> In 10 years of using R, I have always used mean() in order to get a result, which is the opposite of its default behavior (when there are NAs). Can anyone suggest a reason why it is in fact desired to get NA as a result of computing mean()?
>
>> x <- rnorm(100)
>> x[1] <- NA
>
>> mean(x)
> [1] NA
>
>> mean(x, na.rm=TRUE)
> [1] 0.08136736
>
> If the reason is to alert the user that the vector has missing values, I suppose I could buy that. But I think other checks are better.
>
> Harold

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
https://joshuawiley.com/