[R] multiple imputation with fit.mult.impute in Hmisc

Mon Jul 28 14:18:09 CEST 2003

Thanks for the quick reply!  One more question, below.

On 07/27/03 22:20, Frank E Harrell Jr wrote:
>On Sun, 27 Jul 2003 14:47:30 -0400
>Jonathan Baron <baron at psych.upenn.edu> wrote:
>
>> I have always avoided missing data by keeping my distance from
>> the real world.  But I have a student who is doing a study of
>> real patients.  We're trying to test regression models using
>> multiple imputation.  We did the following (roughly):
>> 
>> f <- aregImpute(~ [list of 32 variables, separated by + signs],
>>  n.impute=20, defaultLinear=T, data=t1)
>> # I read that 20 is better than the default of 5.
>> # defaultLinear makes sense for our data.
>> 
>> fmp <- fit.mult.impute(Y ~ X1 + X2 ... [for the model of interest],
>>  xtrans=f, fitter=lm, data=t1)
>> 
>> and all goes well (usually) except that we get the following
>> message at the end of the last step:
>> 
>>  Warning message: Not using a Design fitting function;
>>  summary(fit) will use standard errors, t, P from last imputation
>>  only.  Use Varcov(fit) to get the correct covariance matrix,
>>  sqrt(diag(Varcov(fit))) to get s.e.
>> 
>> I did try using sqrt(diag(Varcov(fmp))), as it suggested, and it
>> didn't seem to change anything from when I did summary(fmp).
>> 
>> But this Warning message sounds scary.  It sounds like the whole
>> process of multiple imputation is being ignored, if only the last
>> one is being used.
>
>The warning message may be ignored.  But the advice to use Varcov(fmp) is faulty for 
>lm fits - I will fix that in the next release of Hmisc.  You may get the 
>imputation-corrected covariance matrix for now using fmp$var

Then it seems to me that summary(fmp) is also giving incorrect
std. errors, t, and p.  Right?  It seems to use Varcov(fmp) and not
fmp$var.

>> So I discovered I could get rid of this warning by loading the
>> Design library and then using ols instead of lm as the fitter in
>> fit.mult.impute.  It seems that ols provides a variance/covariance
>> matrix (or something) that fit.mult.impute can use.
>
>That works too.

That gives me what I get if I use lm and then recalculate the t
values "by hand" from fmp$var.  Thus, ols seems like the way to
go for now, if only to avoid additional calculations.
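
For the record, the "by hand" calculation is roughly the sketch
below.  coef_table is just my own hypothetical helper name, not
something in Hmisc or Design.  It assumes the fit behaves like an lm
object (has coef() and df.residual) and that the covariance matrix
you pass in is the one you want, e.g. fmp$var.  Using the residual
df of the fit for the p-values is my assumption, not necessarily
what multiple-imputation theory would call for.

```r
# Hypothetical helper: rebuild a coefficient table from an lm-type
# fit and a covariance matrix supplied by the caller (e.g. fmp$var).
coef_table <- function(fit, V = fit$var) {
  b  <- coef(fit)              # coefficient estimates
  se <- sqrt(diag(V))          # standard errors from the supplied matrix
  t  <- b / se                 # t statistics
  df <- fit$df.residual        # assumption: residual df of the fit
  p  <- 2 * pt(-abs(t), df)    # two-sided p-values
  cbind(Estimate = b, "Std. Error" = se, "t value" = t, "Pr(>|t|)" = p)
}

# Sanity check on an ordinary lm fit with no imputation: with the
# usual covariance matrix this should match summary(fit)$coefficients.
fit <- lm(dist ~ speed, data = cars)
tab <- coef_table(fit, vcov(fit))
print(tab)
```

With the imputed fit above, the call would be coef_table(fmp),
which picks up fmp$var by default.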

Jon