[R-sig-ME] ignoring rather than omitting NA covariates

Thu Jan 22 21:37:15 CET 2009

On 23/01/2009, at 1:40 AM, Daniel Ezra Johnson wrote:

> Dear all,
> This is not primarily a mixed models question, so I'll ask it in the
> framework of glm(). But I have the same question w/r/t glmer().
>
> In my field, sociolinguistics, researchers have used a software tool for
> some thirty years that performs logistic regression assuming categorical
> predictors. This software is usually called VARBRUL (the current version of
> it is called GoldVarb).
>
> Assume a data file like this:
>
> response pred1 pred2
> 0 a x
> 1 a y
> 1 a x
> 0 a y
> 0 b x
> 1 b y
> 0 b x
> 0 b y
> 1 a /
> 0 b /
>
> My question is about the behavior of the slash (/) used in the last two
> lines. Assume sum contrasts. The software estimates the values of a and b
> (which sum to zero) and of x and y (which sum to zero). The interesting part
> is that for the last two data points, the predicted values are calculated on
> the basis of pred1 only, and pred2 is ignored.
>
> Looking at the various options of na.action, I do not see anything that
> would correspond to this. Basically we have NA in a certain predictor column
> and we want this predictor ignored for the row in question - we don't want
> the whole row omitted.
>
> Any way to accomplish this in R?

One way to deal with missing data is multiple imputation; the mi
package is one option.

GoldVarb may do something else; possibly it builds a model for the
missing data. This could be done in R; it just has to be programmed. :-)
Alternatively, the variables may not really be covariates but part of a
multivariate response, in which case missing data is much easier to
deal with.
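
As a rough illustration of how the "ignore pred2 for that row" behaviour
might be programmed, here is a minimal sketch. It assumes that "ignored"
means the predictor contributes zero to the linear predictor for that row,
which under sum contrasts amounts to coding the contrast column by hand and
putting 0 where the value is missing. Whether GoldVarb does exactly this is
not something the thread establishes. The column names s1 and s2 are made
up for the example.

```r
# Example data from the question; "/" is read as NA.
d <- read.table(header = TRUE, na.strings = "/", stringsAsFactors = TRUE,
                text = "
response pred1 pred2
0 a x
1 a y
1 a x
0 a y
0 b x
1 b y
0 b x
0 b y
1 a /
0 b /
")

# Default behaviour: rows with NA in pred2 are dropped (na.omit).
fit_omit <- glm(response ~ pred1 + pred2, family = binomial, data = d,
                contrasts = list(pred1 = "contr.sum", pred2 = "contr.sum"))

# Hand-coded sum contrasts (+1 / -1) for the two-level factors.
# s1 and s2 are hypothetical helper columns, not part of the original data.
d$s1 <- ifelse(d$pred1 == "a", 1, -1)
d$s2 <- ifelse(d$pred2 == "x", 1, -1)   # NA where pred2 is missing
d$s2[is.na(d$s2)] <- 0                  # assumption: missing = no contribution

# All ten rows are now used; rows 9 and 10 inform only the intercept and s1.
fit_zero <- glm(response ~ s1 + s2, family = binomial, data = d)

nobs(fit_omit)   # rows used by the default fit
nobs(fit_zero)   # rows used by the zero-coded fit

# Fitted values for the last two rows depend on pred1 only.
b <- coef(fit_zero)
plogis(b[["(Intercept)"]] + b[["s1"]] * d$s1[9:10])
```

For a factor with more than two levels, the same idea would mean building
all of its sum-contrast columns and setting each of them to 0 in the rows
where the factor is missing. The hand-coded columns can be used in a
glmer() formula in the same way, although that has not been checked here.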

Ken