[R] handling NA by mean replacement

Berton Gunter gunter.berton at gene.com
Mon Jan 30 18:20:15 CET 2006


Lots of other folks will give you the simple answer (hint: ?'['  ?is.na)
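
For reference, the hint amounts to something like the following. This is only
a minimal sketch on a made-up data frame (the names df, a, b and g are mine,
not yours), and it says nothing yet about whether doing this is wise:

```r
# Toy data; the column names are invented for illustration.
df <- data.frame(a = c(1, NA, 3), b = c(NA, 10, 20), g = c("x", NA, "z"))

# Pick out the numeric columns; mean replacement makes no sense for the rest.
num <- vapply(df, is.numeric, logical(1))

# For each numeric column, overwrite its NAs with the mean of the
# non-missing values (the mean is computed before anything is replaced).
df[num] <- lapply(df[num], function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})

df
```

The non-numeric column g is left alone, NA included.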

Yours is one of those "iceberg" questions  -- 2/3 hidden underwater.

Two points:

Point 1: Generally you **don't have to do such replacement** as most of R's
functions have a na.rm or na.action argument (unfortunately, for historical
reasons, the argument names and meanings aren't consistent) that does
basically what you want anyway.
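
A small illustration of what I mean, with invented toy data. Note the three
different spellings (na.rm, na.action, use) for what is roughly the same idea:

```r
x <- c(2, 4, NA, 6)
mean(x)                # NA: the missing value propagates
mean(x, na.rm = TRUE)  # 4: computed from the three observed values

# Modelling functions take na.action instead. na.omit (the factory-fresh
# default, see getOption("na.action")) drops incomplete rows before fitting.
d <- data.frame(y = c(1, 3, NA, 4, 5), x = c(1, 2, 3, NA, 5))
fit <- lm(y ~ x, data = d, na.action = na.omit)
nobs(fit)              # 3: only the complete rows were used

# cor() has yet another argument name for the same job.
cor(d$x, d$y, use = "complete.obs")
```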

Point 2: Doing what you ask is probably a bad idea. It creates mythical
degrees of freedom (the filled-in values get counted as if they had been
observed) and biases results, so it gives wrong statistical answers.
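
To make that concrete, here is a rough simulated sketch (the seed, sample size
and number of missing values are arbitrary choices of mine). Filling in the
mean leaves the mean alone but shrinks the spread and inflates the apparent
sample size, so standard errors come out too small:

```r
set.seed(1)                    # arbitrary; for reproducibility only
x <- rnorm(100)
x[sample(100, 40)] <- NA       # knock out 40 of the 100 values at random

# Naive mean imputation
x_imp <- x
x_imp[is.na(x_imp)] <- mean(x, na.rm = TRUE)

# The mean is unchanged, but the standard deviation shrinks because
# 40 values now sit exactly at the mean.
c(observed = mean(x, na.rm = TRUE), imputed = mean(x_imp))
c(observed = sd(x, na.rm = TRUE),   imputed = sd(x_imp))

# Standard error of the mean: the imputed version pretends n = 100.
se_obs <- sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))
se_imp <- sd(x_imp) / sqrt(length(x_imp))
c(observed = se_obs, imputed = se_imp)
```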

As a general matter, handling missing values "correctly" is a difficult
statistical issue that you may want to avoid if you can (R has plenty of
packages that can deal with it, but it requires background expertise).
Honestly, I'm not sure "if you can" makes any sense here (how do you know?),
but let's just say that I think your potential for mischief is reduced if
you use R's built-in arguments for ignoring missing values rather than
imputing them naively.

Having said that, I believe that clustering procedures, for example, may not
permit this (but they have built-in missing-value handling of their own, do
they not?), so you may have to impute. In this case, try to do so wisely
(e.g. via multiple imputation?).
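
On the clustering aside: as far as base R goes, dist() at least tolerates NAs.
Per ?dist, it drops the affected columns pairwise and scales the sum up in
proportion. That is not imputation, and other clustering code may behave
differently, so take this tiny check (toy matrix of my own invention) as a
sketch only:

```r
# Three made-up points; q is missing its third coordinate.
m <- rbind(p = c(0, 0, 0),
           q = c(3, 4, NA),
           r = c(6, 8, 1))

d <- dist(m)       # Euclidean by default; no NA in the result here
d

# For p vs q only two of three columns are usable, so the squared
# distance 3^2 + 4^2 = 25 is scaled by 3/2 before the square root.
sqrt(25 * 3 / 2)

# With a complete distance object, hclust() runs as usual.
hc <- hclust(d)
```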

Perhaps this will stimulate real experts to offer you some advice. Good
luck.

Cheers,
Bert
 
Bert Gunter
Genentech

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Julie Bernauer
> Sent: Monday, January 30, 2006 8:50 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] handling NA by mean replacement
> 
> Hello
> 
> I am sorry for such a stupid question. Suppose I have a table of data
> with a lot of NAs, and I want to replace those NAs by the mean of the
> column, computed before NA replacement. How is it possible to do that
> efficiently?
> 
> Thanks in advance,
> 
> Julie
> 
> -- 
> Julie Bernauer
> Yeast Structural Genomics
> http://www.genomics.eu.org
> 



