[R] missing values imputation

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Wed May 12 23:05:08 CEST 2004


On 12-May-04 Rolf Turner wrote:
> The EM algorithm requires an ``E'' step and an ``M'' step.  Harding
> and Rossini appear to be seriously suggesting that an R function
> could be written which would
> 
>       (a) Perform the E step in arbitrary contexts, and
>       (b) For that given expected value, work out a procedure
>           to effect its maximization.
> 
> Or maybe they're not serious.
> 
> For the M step (b) general numerical optimization would theoretically
> do the trick.  (But would be fraught with peril.)  For the E step
> (a), forget it.
> 
> The point is, the EM ``algorithm'' is NOT an algorithm which could be
> effected by an R function.
> [...]
> The original questioner wanted an R function to effect the EM
> algorithm.  My point was that this is a silly request because such a
> function would be impossible to write.

Well, I think there's been enough hair-splitting on the "algorithm"
issue!

Let me return to the original query from Anne Piotet.
She said she would prefer to use maximum likelihood methods, and asked
if the EM algorithm was available, in the context of imputing missing
data.

I don't think she was asking whether R was blessed with a "universal
EM algorithm" into which any incomplete-data problem could be plugged
(and I agree that the generality of the problem, especially expressing
the conditioning corresponding to arbitrary incompleteness, would make
such a thing very elusive).

What I believe she *was* asking was whether, using R, she could do
imputation by maximum-likelihood methods using the EM algorithm.
Plenty of imputation methods dodge likelihood altogether and thereby
lose efficiency, so the question is a pertinent one. The EM algorithm
is of course the natural approach, since information is never more
manifestly incomplete than when there are holes in the data.
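
To make the point concrete for one specific model, here is a minimal
sketch of EM for a multivariate normal sample with holes in it, using
base R only. It is an illustration under stated assumptions (normality,
values missing at random), not a general-purpose routine and not the
code of any package; the name em_mvnorm and the simulated data are
made up for the example.

```r
# Hypothetical helper: EM for the mean and covariance of a multivariate
# normal sample with NAs. Assumes normality and missingness at random.
em_mvnorm <- function(X, max_iter = 200, tol = 1e-8) {
  X <- as.matrix(X)
  n <- nrow(X); p <- ncol(X)
  miss <- is.na(X)
  # Start from available-case means and a diagonal covariance.
  mu <- colMeans(X, na.rm = TRUE)
  Sigma <- diag(apply(X, 2, var, na.rm = TRUE), p)
  for (iter in seq_len(max_iter)) {
    Xhat <- X
    C <- matrix(0, p, p)  # summed conditional covariances of missing parts
    # E step: conditional mean and covariance of the missing entries,
    # given the observed entries in the same row.
    for (i in seq_len(n)) {
      m <- miss[i, ]
      if (!any(m)) next
      o <- !m
      if (!any(o)) {        # whole row missing: nothing to condition on
        Xhat[i, ] <- mu
        C <- C + Sigma
        next
      }
      B <- Sigma[m, o, drop = FALSE] %*% solve(Sigma[o, o, drop = FALSE])
      Xhat[i, m] <- mu[m] + B %*% (X[i, o] - mu[o])
      C[m, m] <- C[m, m] + Sigma[m, m, drop = FALSE] -
        B %*% Sigma[o, m, drop = FALSE]
    }
    # M step: closed-form update from the expected sufficient statistics.
    mu_new <- colMeans(Xhat)
    R <- sweep(Xhat, 2, mu_new)
    Sigma_new <- (crossprod(R) + C) / n
    delta <- max(abs(mu_new - mu), abs(Sigma_new - Sigma))
    mu <- mu_new
    Sigma <- Sigma_new
    if (delta < tol) break
  }
  list(mu = mu, Sigma = Sigma, imputed = Xhat, iterations = iter)
}

# Usage on simulated data (the numbers here are arbitrary).
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
X  <- cbind(x1, x2)
X[sample(n, 60), 2] <- NA   # punch holes in the second column
fit <- em_mvnorm(X)
fit$mu
fit$Sigma
```

Note that the E step and the M step are both available in closed form
only because the model has been fixed in advance. The "imputed" values
returned are conditional means, which are fine for inspecting the fit
but understate variability if treated as real data.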

Schafer's methods (and thanks, Chuck, for the pointer to "pan") all
implement the EM algorithm to obtain maximum likelihood estimates in
the first instance. As far as replying to Anne was concerned, I think
all that was needed was to give this information.

To receive a response which asserted (in effect) that it was
unimplementable must have come as a bit of a surprise, in the context!

Anyway, 'nuff said, probably ...

Best wishes to all,
Ted.





