[R] Imputing missing values

Jan Smit janpsmit at yahoo.co.uk
Wed Sep 1 10:43:46 CEST 2004


Dear all, 

Apologies for this beginner's question. I have a
variable Price, which is associated with factors
Season and Crop, each of which has several levels.
The Price variable contains missing values (NA), which
I want to substitute by the mean of the remaining
(non-NA) Price values of the same Season-Crop
combination of levels. 

Price     Crop     Season
10        Rice     Summer
12        Rice     Summer
NA        Rice     Summer
8         Rice     Winter
9         Wheat    Summer

Price[is.na(Price)] gives me the missing values, and
by(Price, list(Crop, Season), mean, na.rm = TRUE) the
values I want to impute. What I've not been able to
figure out, by looking at by and the various
incarnations of apply, is how to do the actual
substitution. 
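
For illustration, here is one way the substitution described above might be
sketched in base R. It uses ave() rather than by(), because ave() returns a
vector of the same length as Price, so the group means line up row by row
with the values to be replaced. The data frame name dat and the helper names
cell_mean and miss are made up for this example; the data are the five rows
shown above.

```r
# Example data from the table above (the name `dat` is hypothetical)
dat <- data.frame(
  Price  = c(10, 12, NA, 8, 9),
  Crop   = c("Rice", "Rice", "Rice", "Rice", "Wheat"),
  Season = c("Summer", "Summer", "Summer", "Winter", "Summer")
)

# For every row, the mean of the non-NA prices in the same
# Crop-Season combination; same length as dat$Price
cell_mean <- ave(dat$Price, dat$Crop, dat$Season,
                 FUN = function(x) mean(x, na.rm = TRUE))

# Overwrite only the missing prices with their cell mean
miss <- is.na(dat$Price)
dat$Price[miss] <- cell_mean[miss]

print(dat)
```

With these five rows, the missing Rice/Summer price becomes 11, the mean of
10 and 12. If a Crop-Season combination has no observed prices at all, its
cell mean is NaN, so such rows would still need separate handling.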

Any help would be much appreciated. 

Jan Smit