[R] na.action within glmmADMB package?
Ben Bolker
bbolker at gmail.com
Tue Oct 8 15:17:55 CEST 2013
Marta Lomas <lomasvega <at> hotmail.com> writes:
>
> Hello everybody,
>
> I would like to know if within the glmmADMB package into R interface
> there is a way to deal with the NAs
> different than applying "dataformodeling= na.omit(dataframe)".
> This way as you may know removes all
> the rows of the data set with at least 1 NA.
> I would rather prefer to run my models with more observations. Thus,
> I am trying to find the way that the model takes into account the
> rest of information in the affected rows with at least 1 NA that,
> otherwise, with "na.omit", is eliminated.

I don't think the NA-handling machinery in R really does what you
think it does. In general, other than na.omit and na.fail (the latter
obviously won't do you any good), the typical choices are na.pass
(which just passes NA values through as is, and will generally lead
to an error or to all of the answers being NA) and na.exclude. The
last is useful, but it is just a convenience function; it still strips
the NA values out before fitting the model, but re-introduces them
when predicting or returning residuals.
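
To make the na.omit/na.exclude distinction concrete, here is a minimal
sketch. It uses lm() from base R and the built-in airquality data set
rather than glmmADMB (my substitution, so that it runs without extra
packages); it only illustrates the general na.action behaviour, not
anything specific to glmmADMB.

```r
# Base-R sketch: lm() on the built-in airquality data (Ozone contains NAs).
# lm() stands in for a glmmADMB fit here; that substitution is an assumption.
fit_omit <- lm(Ozone ~ Wind, data = airquality, na.action = na.omit)
fit_excl <- lm(Ozone ~ Wind, data = airquality, na.action = na.exclude)

# Both fits are based on exactly the same complete rows,
# so the estimated coefficients agree.
all.equal(coef(fit_omit), coef(fit_excl))

# na.omit: residuals only for the rows actually used in the fit.
length(residuals(fit_omit))

# na.exclude: one residual per original row, padded with NA
# wherever a row was dropped before fitting.
length(residuals(fit_excl))
sum(is.na(residuals(fit_excl)))
```

Either way the rows with NAs contribute nothing to the fit; na.exclude
only makes it easier to line the results up with the original data frame.
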
The basic problem is that you generally *can't* fit statistical
models with NA values in the predictor variables; the mathematics
just wouldn't make sense in general. You either have to do
imputation of some kind to fill in the missing values, or
possibly use some kind of 'random forest' technique to average
over the predictions of different models with different sets
of predictors.
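
A toy illustration of why the mathematics breaks down, assuming an
ordinary least-squares setting (the data values below are made up
purely for the example): a single NA in a predictor contaminates the
cross-product matrix that the normal equations rely on.

```r
# Toy design matrix: intercept plus one predictor with a missing value.
# The numbers are invented for illustration only.
X <- cbind(intercept = 1, x = c(1, 2, NA, 4))
y <- c(2.1, 3.9, 6.2, 8.1)

# Any sum that touches the NA is itself NA, so X'X contains NA entries
# and the normal equations cannot be solved.
XtX <- crossprod(X)
anyNA(XtX)

# Restricting to complete rows (which is what na.omit amounts to)
# makes the algebra well defined again.
ok <- complete.cases(X, y)
beta <- solve(crossprod(X[ok, ]), crossprod(X[ok, ], y[ok]))
beta
```
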
Imputation is non-trivial; Frank Harrell's _Regression Modeling
Strategies_ book and library("sos"); findFn("imputation") will
get you started if you want to go that direction.
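
As a small hint at why it is non-trivial, here is a sketch of the most
naive approach, filling each NA with the observed mean. I am using the
built-in airquality data as a stand-in for your data; this is a
cautionary example, not a recommendation.

```r
# Naive single imputation: replace each NA with the mean of the observed values.
# airquality$Ozone is only a stand-in for a variable with missing values.
oz <- airquality$Ozone
oz_imp <- ifelse(is.na(oz), mean(oz, na.rm = TRUE), oz)

# The mean is preserved by construction ...
mean(oz, na.rm = TRUE)
mean(oz_imp)

# ... but the variance shrinks, because every filled-in value sits
# exactly at the mean and adds no spread.
var(oz, na.rm = TRUE)
var(oz_imp)
```

The filled-in variable looks less variable than the observed data, so
anything fitted to it will tend to be overconfident; the more careful
methods covered in the references above are designed to avoid that.
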