[R] logistic regression in an incomplete dataset

Desmond D Campbell d.campbell at ucl.ac.uk
Tue Apr 6 00:06:38 CEST 2010

Dear JoAnn,

Thank you very much for your reply.
If that is the case, I am surprised.
I would have thought ML could incorporate study cases with some missingness
in them.
Furthermore, I believe ML estimates should generally be more robust than
complete-case estimates.
For unbiased estimates, I think:
  ML requires that the data be MAR (missing at random),
  complete-case analysis requires that the data be MCAR (missing completely
  at random).
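
As a rough sketch of what ML with incomplete covariates would involve (the
notation is an assumption made for illustration, not something from this
thread): split the covariates of case i into an observed part x_i^obs and a
missing part x_i^mis, let beta be the logistic regression coefficients, and
let alpha be the parameters of a hypothetical model for the covariates. If
the data are MAR and the missingness mechanism shares no parameters with
(beta, alpha), the observed-data likelihood can be written as:

```latex
L(\beta, \alpha)
  = \prod_{i=1}^{n} \int
      p\left(y_i \mid x_i^{\mathrm{obs}}, x_i^{\mathrm{mis}}; \beta\right)\,
      p\left(x_i^{\mathrm{obs}}, x_i^{\mathrm{mis}}; \alpha\right)\,
      dx_i^{\mathrm{mis}},
\qquad
p(y_i \mid x_i; \beta) = \pi_i^{y_i} (1 - \pi_i)^{1 - y_i},
\quad
\operatorname{logit}(\pi_i) = x_i^{\top} \beta .
```

For a complete case there is nothing to integrate over, and the only factor
involving beta is the usual logistic term. For an incomplete case the
integral needs a model for the covariates themselves, which a plain
logistic regression does not specify.
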
Maybe it is more difficult to compute the ML estimate on incomplete data than
I imagine. My knowledge is patchy.
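
For reference, the complete-case behaviour of glm() described in the
original question can be reproduced with a small simulation. This is only a
sketch: the data, the variable names (y, x1, x2) and the missingness rule
are all made up for illustration, and it only shows how many cases are
dropped, not how to avoid dropping them.

```r
# Simulated data; every name and value here is an illustrative assumption.
set.seed(1)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + 1.0 * x1 + 0.5 * x2))
d  <- data.frame(y, x1, x2)

# Make x2 missing with a probability that depends only on the fully
# observed x1, i.e. missing at random (MAR) but not MCAR.
d$x2[runif(n) < plogis(-1 + x1)] <- NA

# With na.action = na.omit (the usual default), glm() fits on the
# complete cases only and records which rows were removed.
fit_cc <- glm(y ~ x1 + x2, family = binomial(link = "logit"),
              data = d, na.action = na.omit)
n_used    <- nobs(fit_cc)               # cases actually used in the fit
n_dropped <- length(fit_cc$na.action)   # rows removed because of NAs

# na.action = na.fail turns the silent drop into an explicit error.
res <- try(glm(y ~ x1 + x2, family = binomial(link = "logit"),
               data = d, na.action = na.fail),
           silent = TRUE)

cat("cases used:", n_used, " cases dropped:", n_dropped, "\n")
```

Comparing n_used with nrow(d) shows how much of the sample the
complete-case fit discards.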

Thanks again.


> Hello Desmond,
> The only way to not drop cases with incomplete data would be some sort
> of imputation for the missing covariates.
> JoAnn
> Desmond Campbell wrote:
>> Dear all,
>> I want to do a logistic regression.
>> So far I've only found out how to do that in R, in a dataset of complete
>> cases.
>> I'd like to do logistic regression via max likelihood, using all the
>> study cases (complete and incomplete). Can you help?
>> I'm using glm() with family=binomial(logit).
>> If any covariate in a study case is missing then the study case is
>> dropped, i.e. it is doing a complete cases analysis.
>> As a lot of study cases are being dropped, I'd rather it did maximum
>> likelihood using all the study cases.
>> I tried setting glm()'s na.action to NULL, but then it complained about
>> NAs being present in the study cases.
>> I've about 1000 unmatched study cases and fewer than 10 covariates so
>> could use unconditional ML estimation (as opposed to conditional ML
>> estimation).
>> regards
>> Desmond
> --
> JoAnn Álvarez
> Biostatistician
> Department of Biostatistics
> D-2220 Medical Center North
> Vanderbilt University School of Medicine
> 1161 21st Ave. South
> Nashville, TN 37232-2158
> http://biostat.mc.vanderbilt.edu/JoAnnAlvarez
