[R] EM algorithm for Missing Data.

(Ted Harding) ted.harding at nessie.mcc.ac.uk
Mon Jul 9 10:19:00 CEST 2007


On 09-Jul-07 02:20:47, Marcus Vinicius wrote:
>  Dear all,
> I need to use the EM algorithm where data are missing.
> Example:
> x<- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)
> 
> May anyone help me?
> 
> Thanks.
> 
> Marcus Vinicius

The Dempster, Laird & Rubin reference given by Simon Blomberg
is the classical account of the EM Algorithm for incomplete
information, though there has been a lot more published since.

However, more to the point in the present case: If the above
is typical of your data, you had better state what you want to
do with the data.

Do you want to fit a distribution by estimating parameters?
Are they observations of a "response" variable with covariates
and you want to fit a linear model estimating the coefficients?
Are they data from a time-series and you need to interpolate
at the missing values?
Or something else again?

Depending on what you want to do, the way you apply the general
EM Algorithm procedure may be very different; and a lot of
applications are not covered by Dempster, Laird & Rubin (1977).
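
To illustrate how much the goal changes the computation, here is a
minimal sketch of the time-series case only. It assumes, purely for
illustration, that the ten values are equally spaced observations in
time order (the original post does not say so). It uses plain linear
interpolation via approx(), which is a simple fill-in and not an EM
procedure. The names idx, obs and filled are made up for the example.

```r
x <- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)

idx <- seq_along(x)   # ASSUMPTION: equally spaced time index 1..10
obs <- !is.na(x)      # TRUE where a value was actually observed

# approx() interpolates linearly between neighbouring observed points;
# xout asks for interpolated values at the missing positions only.
filled <- x
filled[!obs] <- approx(idx[obs], x[obs], xout = idx[!obs])$y

print(filled)
```

Fitting a distribution or a regression to the same vector would call
for quite different code, which is why the question of purpose has to
be settled first.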

And there may be no point anyway: If all you want to do is estimate
the mean of the distribution of the data, and the values are missing
for reasons unrelated to their size, then the best procedure may
simply be to ignore the missing data and average the observed values.
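
A minimal sketch of why that can be so, under two assumptions that are
mine and not stated in the original post: the data are a sample from
a single normal distribution, and the values are missing completely
at random. In that setting the E-step replaces each missing value's
contribution to the sufficient statistics by its expectation under
the current estimates, and the M-step recomputes the mean and
variance. The variable names (mu, sigma2, s1, s2) are illustrative.

```r
x <- c(60.87, NA, 61.53, 72.20, 68.96, NA, 68.35, 68.11, NA, 71.38)

obs   <- !is.na(x)
n     <- length(x)
n_mis <- sum(!obs)

# Deliberately poor starting values, to show they do not matter here
mu     <- 0
sigma2 <- 1

for (iter in 1:200) {
  mu_old     <- mu
  sigma2_old <- sigma2

  # E-step: expected sum and sum of squares over all n values,
  # using E[X] = mu and E[X^2] = mu^2 + sigma2 for each missing one
  s1 <- sum(x[obs])   + n_mis * mu_old
  s2 <- sum(x[obs]^2) + n_mis * (mu_old^2 + sigma2_old)

  # M-step: maximum-likelihood estimates from the expected statistics
  mu     <- s1 / n
  sigma2 <- s2 / n - mu^2

  # Stop once both estimates have settled
  if (max(abs(mu - mu_old), abs(sigma2 - sigma2_old)) < 1e-10) break
}

# Compare with simply dropping the NAs
print(c(em_mean = mu, observed_mean = mean(x, na.rm = TRUE)))
print(isTRUE(all.equal(mu, mean(x, na.rm = TRUE), tolerance = 1e-8)))
```

Under these assumptions the iteration settles on the mean of the seven
observed values (and on their maximum-likelihood variance, with
divisor 7), so EM adds nothing over mean(x, na.rm = TRUE). With
covariates, or with missingness that depends on the values, that
conclusion need not hold.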

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 09-Jul-07                                       Time: 09:18:56
------------------------------ XFMail ------------------------------


