[R-sig-ME] Packages of variables?

Tue Apr 5 15:31:50 CEST 2011

On Tue, Apr 5, 2011 at 7:01 AM, Chris Mcowen <chrismcowen at gmail.com> wrote:
> Dear List,
>
> I am relatively new to mixed models, and to models in general, so I apologise in advance.
>
> I have a binary response (0/1) and two random effects, A and B. I then have a range of predictors which can be put into groups.
>
> Biological - C, D, E, F, G, H, I, J, K
> Human - L, M, N, O, P
> Environment - Q, R, S, T, U
> Spatial - V, W, X
>
> I am interested in asking two questions. First, which predictors in each group are the most important?
>
> Biological_Model <- lmer(yesno ~ 1 + (1|A/B) + C + D + E + F + G + H + I + J + K, family = binomial)
> I have constructed a series of predefined models and compared their AIC values rather than using stepwise regression. I have done this for all the groups.
>
> Second, and this is where I am struggling, I would like to investigate which of the four groups, and which combinations of them, are most important in determining the response, and how much each contributes.
>
> I have tried this, based on the results from above
>
> Biological <-cbind(C,F,G,J)
> Human <-cbind(M,N)
> Environment <- cbind(R,S,T)
> Spatial <-cbind(X)
>
>
> I have then run through the combinations of the groups of predictors, i.e.
>
> Human_Environment_model <- lrm(yesno~Human+Environment)
> Spatial_Biological_model <- lrm(yesno~Spatial+Biological)
>
> I have compared the R-squared values and, not surprisingly, the greater the number of packages the better the fit. To counter this I also calculated the AIC value of the various models, and this pretty much agrees.

For nested models (meaning that the predictors in the smaller model
are a subset of the predictors in the larger model) the larger model
should explain more of the variability, so your conclusion about the
R-squared values would hold - except that I don't know how one would
define an R-squared value for a model with a binary response.  Or, for
that matter, for a model with random effects.  You could perform some
kind of calculation related to the correlation of the fitted and
observed responses but the interpretation of the R-squared is, as far
as I know, specific to linear models assuming a Gaussian response
distribution.
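
A minimal sketch of that kind of calculation, using simulated data and a
plain glm() fit so that it runs with base R alone. The names x1, x2, y,
fit and r2_like are made up for illustration and do not come from the
original data. The result is only a descriptive summary and does not
carry the linear-model interpretation of R-squared.

```r
# Simulated binary data; every name here is hypothetical.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x1))

# Logistic regression without random effects, to keep the sketch self-contained.
fit <- glm(y ~ x1 + x2, family = binomial)

# Squared correlation between the fitted probabilities and the observed 0/1 responses.
r2_like <- cor(fitted(fit), y)^2
r2_like
```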

Information criteria such as AIC, BIC and DIC are designed to compare
more general models and I think I would go with them or with
likelihood ratio tests for the comparisons.
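
As a rough sketch of such a comparison for two nested models, again on
simulated data with plain glm() fits: the variables bio and hum are
hypothetical stand-ins for one "Biological" and one "Human" predictor,
and the random effects are left out only to keep the example runnable
in base R. For models fitted with glmer() from lme4, AIC() and anova()
are called on the fitted models in the same way.

```r
# Simulated data; bio and hum are hypothetical stand-ins for real predictors.
set.seed(2)
n     <- 300
bio   <- rnorm(n)
hum   <- rnorm(n)
yesno <- rbinom(n, size = 1, prob = plogis(-0.3 + 0.8 * bio + 0.6 * hum))

# Nested pair: the predictors of m_small are a subset of those of m_large.
m_small <- glm(yesno ~ bio,       family = binomial)
m_large <- glm(yesno ~ bio + hum, family = binomial)

# Information criterion for each model (smaller is better).
aic_tab <- AIC(m_small, m_large)
aic_tab

# Likelihood ratio test of the smaller model against the larger one.
lrt <- anova(m_small, m_large, test = "Chisq")
lrt
```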

> The idea was to look at how much influence these packages have. Can you look at the difference in R-squared values? For example, if I have a Biological model with an R-squared value of 0.2, and adding human impact raises it to 0.35, can I say that the addition of human impact added 0.15, or is this not statistically sensible? Or would it be more sensible to use the AIC values, which indicate that Human & Environment describe the relationship best?
>
> Thanks
>
> Chris
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>