[R-sig-ME] glmm (binomial, logit) with transformed/scaled predictors
Ben Bolker
bbolker at gmail.com
Wed Sep 11 14:47:18 CEST 2013
Johannes Radinger <johannesradinger at ...> writes:
>
> Hi,
> First I'd like to apologize for probably very novice questions: I am
> new to this list as well as rather new to the field of mixed
> models. Jobwise, I am an aquatic/river ecologist (actually a PhD student)
> and not a statistician/mathematician, so most of my questions are
> probably related to the topic of rivers/fish.
Actually, this is another not-really-about-mixed-models question
(see below).
> I want to use a GLMM with presence/absence data (logit model) as response. The
> model contains the binary response (presence/absence of a species at a
> site) and 3 continuous predictors (fixed effects: habitat quality,
> dispersal metric, metric of influence of barriers) as well as two random
> effects (in my case species and sites). For that purpose I am using the
> R-function (g)lme from the package lme4.
That's (g)lmer ...
> Now several questions appeared:
>
> 1) Some of my predictors are highly skewed:
> > summary(Pred1)
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 0.0000 0.0000 0.1143 9.0720 3.8400 616.4000
> > summary(Pred2)
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> -7.44400 -0.00031 0.00000 0.41560 0.00019 255.70000
> > summary(Pred3)
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 0.0000 0.1221 0.3716 0.3734 0.5914 0.9626
>
> In a standard regression model they would definitely need transformation
> (e.g. log).
> Is that the first step in a logistic mixed model too?
This is a common misconception. The distribution of the
predictors _is not relevant_ to the correctness of a linear or
generalized linear model (or LMM or GLMM), nor to the additivity
of predictor effects; you might want to transform
the predictors in order to improve the _linearity_ of the model
(on the linear predictor scale, i.e. the logit scale in this case),
but there is _a priori_ nothing wrong with a skewed distribution
of predictors.
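If you do decide to transform, something along these lines would work
(a rough sketch; 'dat', 'presence', 'species', and 'site' are made-up
names standing in for your data):

  library(lme4)
  ## log1p() = log(1+x), which copes with the zeros in Pred1;
  ## a predictor with negative values (like your Pred2) would need
  ## a different transformation, if any
  dat$logPred1 <- log1p(dat$Pred1)
  fit <- glmer(presence ~ logPred1 + (1 | species) + (1 | site),
               data = dat, family = binomial)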
> 2) As I am interested in comparing the relative effect sizes of the
> three predictors, I had been referred to the simple approach of Schielzeth
> 2010 (thank you for that tip Mr. Bolker!). In this article it is
> recommended to scale the predictors for comparison of their importance
> and/or to scale and center in case one also wants to model interactions
> between continuous predictor variables.
> So my question is: how would that interfere with any transformation of the
> variables as described in my first question? And what is the correct order:
> first transformation, then scaling?
If you find it useful to transform your predictors, it will
probably be easier to transform first and then center/scale.
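i.e. (continuing the sketch above):

  ## transform first, then center and scale; as.numeric() drops
  ## the matrix attributes that scale() attaches
  dat$logPred1_sc <- as.numeric(scale(log1p(dat$Pred1)))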
> 3) After reading some list posts like this one:
> https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015591.html I
> learnt that comparing two lmer models (likelihood ratio test with anova())
> is the way to find out if a predictor or interaction is significant in a
> model. Thus this method (comparing complex with less complex models) can
> lead to the most parsimonious model, comprising only the most important
> predictors and interactions. After reading Schielzeth 2010, I think this
> comparison test should be made with centered and scaled variables (and if
> needed transformed before)?
In principle, centering and scaling variables (only) should not
affect the overall fit/log-likelihood of a model; it just eases
interpretation. So the answer is "it doesn't matter".
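You can check this yourself (sketch, same hypothetical objects as above):

  ll1 <- logLik(glmer(presence ~ logPred1 + (1 | species) + (1 | site),
                      data = dat, family = binomial))
  ll2 <- logLik(glmer(presence ~ scale(logPred1) + (1 | species) + (1 | site),
                      data = dat, family = binomial))
  ll1; ll2   ## should agree up to numerical/convergence tolerance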
> 4) This brings me then to my final interpretation of the results. In
> relation to that I came across the post mentioned before and a tutorial:
> http://www.ats.ucla.edu/stat/mult_pkg/faq/general/odds_ratio.htm I think
> that tutorial really helped me to understand the meaning of odds etc. But
> how can odds be interpreted if the predictors were transformed before?
I don't think transforming the _predictors_ changes the interpretation
of the _response variable_ ...
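The coefficients are still log-odds ratios per one-unit change of
whatever is on the right-hand side, transformed or not; e.g. for the
hypothetical fit above, a one-unit increase in log1p(Pred1) multiplies
the odds of presence by exp(coefficient):

  exp(fixef(fit))                      ## odds ratios per unit of the (transformed) predictors
  exp(confint(fit, method = "Wald"))   ## rough Wald intervals on the odds-ratio scale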
> First I'd calculate the final parsimonious model once with transformed and
> scaled predictors to get an idea of the relative impact of each
> independent predictor. Second, the same model with transformed but
> unscaled predictors can provide absolute parameter estimates for the odds.
Something like that.
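i.e., roughly (a sketch with the made-up names from above; however you
end up transforming Pred2 and Pred3 goes in the same place):

  ## relative effect sizes: transformed, then centered/scaled predictors
  fit_std <- glmer(presence ~ scale(logPred1) + scale(Pred2) + scale(Pred3) +
                     (1 | species) + (1 | site),
                   data = dat, family = binomial)
  ## odds ratios on the transformed-but-unscaled scale
  fit_unsc <- glmer(presence ~ logPred1 + Pred2 + Pred3 +
                      (1 | species) + (1 | site),
                    data = dat, family = binomial)
  exp(fixef(fit_unsc))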