[R] Normalization and missing values

Chris Bergstresser chris at subtlety.com
Thu Apr 14 04:05:46 CEST 2005


    I'd just like to thank everyone who wrote in response to my 
questions -- it's been a great help, and much appreciated.

Jonathan Baron wrote:
> On 04/13/05 11:36, Chris Bergstresser wrote:
>      First, I didn't see a function in R which does normalization -- did
>  I miss it?  What's the best way to do it?
> 
> Look at scale().  Might be what you mean.

    Yeah; I should have remembered that.  I did search the help files 
for "normalization" and "normalize", but neither term turns up there, 
which seems somewhat odd, since that's exactly what scale() does.
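
For the archive, a minimal sketch of what scale() does to each column of a numeric matrix: subtract the column mean, then divide by the column standard deviation. The matrix `m` and its values are made up for illustration. With the default arguments, a missing value stays missing, and the mean and standard deviation for that column are computed from the observed values only.

```r
# Made-up example data: 5 cases, 2 variables, one missing value.
m <- cbind(height = c(150, 160, 170, 180, 190),
           weight = c(50, 60, NA, 80, 90))

# scale() centers each column on its mean and divides by its standard
# deviation, giving z-scores.
z <- scale(m)

# The same calculation by hand for the complete column, for comparison.
z_height <- (m[, "height"] - mean(m[, "height"])) / sd(m[, "height"])

# The same calculation by hand for the column with a missing value,
# using only the observed values for the mean and sd.
w <- m[, "weight"]
z_weight <- (w - mean(w, na.rm = TRUE)) / sd(w, na.rm = TRUE)

print(z)
print(all.equal(as.vector(z[, "height"]), z_height))
print(all.equal(as.vector(z[, "weight"]), z_weight))
```

The centering and scaling values that were used are kept as the "scaled:center" and "scaled:scale" attributes of the result, which helps if new data need to be put on the same scale later.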

>  But, in general, the "right" way
> to deal with missing data depends on the assumptions you make.
> As a novice, I found the following article to be helpful:
> 
> Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of 
> the state of the art. Psychological Methods, 7, 147-177.

    This article is great; thanks for providing it.  The authors 
recommend using either "ML Estimation" or "Multiple Imputation" to 
handle the missing data.  They don't say much about which is better in 
which situations, however.
    I don't think my data are particularly sensitive to the method I use 
-- I've got about 1,100 cases, with 85 variables, and there are only 
about 1,000 missing values overall, spread pretty evenly across the data 
file.
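
To put those figures in perspective: 1,100 cases by 85 variables is 93,500 cells, so about 1,000 missing values is roughly 1% of the data. The sketch below shows one way to check the overall rate and how the missing values are spread. The data frame `d` is a simulated stand-in (an assumption for illustration), not the actual data set.

```r
# Back-of-the-envelope figure from the approximate numbers quoted above.
n_cases <- 1100
n_vars <- 85
n_missing <- 1000
frac_missing <- n_missing / (n_cases * n_vars)
print(round(100 * frac_missing, 2))  # percent of cells missing

# Simulated stand-in for the real data (hypothetical): an 1100 x 85
# matrix of random values with 1,000 cells knocked out at random.
set.seed(1)
x <- matrix(rnorm(n_cases * n_vars), nrow = n_cases)
x[sample(length(x), n_missing)] <- NA
d <- as.data.frame(x)

# Total number of missing cells.
print(sum(is.na(d)))

# Distribution of missing values per variable.
print(summary(colSums(is.na(d))))

# Number of cases with no missing values at all, i.e. what listwise
# deletion would keep.
print(sum(complete.cases(d)))
```

Even at about 1% of cells, missing values spread evenly over 85 variables can leave a large share of cases with at least one NA, so the count of complete cases is worth checking before ruling any method in or out.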
    Are there any recommendations for specific packages or functions?  
"transcan()" and "aregImpute()" look promising; based on the 
documentation (and what I can understand of it), am I right in assuming 
they both provide Multiple Imputation?

-- Chris



