[R] missing values in logistic regression
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Fri Oct 29 21:14:17 CEST 2004
(Ted Harding) wrote:
> On 29-Oct-04 Avril Coghlan wrote:
>
>>Dear R help list,
>>
>> I am trying to do a logistic regression
>>where I have a categorical response variable Y
>>and two numerical predictors X1 and X2. There
>>are quite a lot of missing values for predictor X2.
>>e.g.,
>>
>>Y      X1    X2
>>red    0.6   0.2   *
>>red    0.5   0.2   *
>>red    0.5   NA
>>red    0.5   NA
>>green  0.2   0.1   *
>>green  0.1   NA
>>green  0.1   NA
>>green  0.05  0.05  *
>>
>>I am wondering can I combine X1 and X2 in
>>a logistic regression to predict Y, using
>>all the data for X1, even though there are NAs in
>>the X2 data?
>>
>>Or do I have to take only the cases for which
>>there is data for both X1 and X2? (marked
>>with *s above)
>
>
> I don't know of any R routine directly aimed at logistic regression
> with missing values as you describe.
>
The aregImpute function in the Hmisc package can handle this, using
predictive mean matching with weighted multinomial sampling of donor
observations' binary covariate values.
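
As a rough sketch of how this might look in practice (not a definitive recipe), the code below simulates data shaped like the example in the question, fits the complete-case model, and then fits the same model after multiple imputation of X2. The data are simulated and the object names d, cc, imp and fit are made up for illustration. glm is used as the fitter only to keep the example short; any other logistic regression fitter accepted by fit.mult.impute could be substituted.

```r
# Illustrative sketch only: simulated data, hypothetical object names.
library(Hmisc)   # provides aregImpute() and fit.mult.impute()

set.seed(1)
n  <- 200
X1 <- runif(n)
X2 <- 0.5 * X1 + rnorm(n, sd = 0.1)
# Factor levels sort as "green", "red", so glm() models P(Y == "red")
Y  <- factor(ifelse(runif(n) < plogis(-2 + 3 * X1 + 2 * X2), "red", "green"))
X2[sample(n, 80)] <- NA            # make 80 of the 200 X2 values missing
d  <- data.frame(Y, X1, X2)

# Complete-case analysis: with the default na.action (na.omit),
# glm() drops every row that has an NA in any model variable.
cc <- glm(Y ~ X1 + X2, family = binomial, data = d)

# Multiple imputation: impute X2 from Y and X1 five times, then fit the
# logistic model on each completed data set and combine the results.
imp <- aregImpute(~ Y + X1 + X2, data = d, n.impute = 5, pr = FALSE)
fit <- fit.mult.impute(Y ~ X1 + X2, glm, imp, data = d, family = binomial)

nobs(cc)     # rows actually used by the complete-case fit
coef(fit)    # coefficients averaged over the imputed data sets
```

The complete-case fit uses only the rows marked with * in the question, while the imputation-based fit uses all rows of X1. Whether the imputed analysis is trustworthy depends on why X2 is missing, which the code cannot check.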
. . ..
> Ted.
>
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University