[R-sig-ME] Predictor standardized transformation in GLMM

Fri Oct 22 03:44:20 CEST 2021

   The first way is more standard and makes more sense to me.

   Note that standardizing variables doesn't make any difference to the 
*statistical* results; it may improve the computational stability of the 
model, and it definitely changes the interpretation of the parameters.
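
   As a rough illustration of that point, here is a minimal sketch using 
a plain logistic regression (glm) rather than a mixed model, to keep it 
dependency-free. The data are simulated and entirely hypothetical (the 
names n, x1_std, fit_raw and fit_scaled are mine, not from the original 
post). The fitted values and log-likelihood should agree between the two 
fits, while the slope is multiplied by sd(x1).

```r
# Hypothetical simulated data; not taken from the original post
set.seed(101)
n  <- 200
x1 <- rnorm(n, mean = 50, sd = 10)   # predictor on an arbitrary raw scale
y  <- rbinom(n, size = 1, prob = plogis(-5 + 0.1 * x1))

# Fit on the raw scale and on the standardized scale
fit_raw    <- glm(y ~ x1, family = binomial)
x1_std     <- as.numeric(scale(x1))  # (x1 - mean(x1)) / sd(x1)
fit_scaled <- glm(y ~ x1_std, family = binomial)

# Same fitted probabilities and log-likelihood: the fit itself is unchanged
all.equal(fitted(fit_raw), fitted(fit_scaled), tolerance = 1e-6)
all.equal(as.numeric(logLik(fit_raw)), as.numeric(logLik(fit_scaled)))

# Only the parameterization differs: scaled slope = raw slope * sd(x1)
all.equal(unname(coef(fit_scaled)["x1_std"]),
          unname(coef(fit_raw)["x1"]) * sd(x1),
          tolerance = 1e-6)

# The Wald z-statistic for the slope is the same in both fits
z_raw    <- coef(summary(fit_raw))["x1", "z value"]
z_scaled <- coef(summary(fit_scaled))["x1_std", "z value"]
all.equal(z_raw, z_scaled, tolerance = 1e-6)
```

   The slope in fit_scaled is then read as the change in log-odds per 
1-SD change in x1, instead of per raw unit of x1.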

   I understand the meaning of the parameters in the first case: "what 
is the expected change in log-odds of the outcome for a 1-SD change in 
predictor x1, holding everything else fixed"?  I'm not so sure how I 
would interpret "1 SD of the unique values of x1", but if you can (and 
can explain it!), and that version makes more sense, then you should go 
ahead and use it.
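
   To make the two options concrete, here is a small sketch using the x1 
values from the original post. One practical detail: scale(unique(x1)) 
returns only as many values as there are distinct values, so it cannot 
be dropped into the model directly; the mean and SD of the unique values 
have to be applied to the full vector. The names x1_all and x1_uniq are 
just illustrative.

```r
x1 <- c(6, 6, 6, 5, 5, 4, 3)   # from the original post, one value per observation

# First way: center and scale using all 7 observations
x1_all <- as.numeric(scale(x1))

# Second way: center and scale using the 4 distinct values only.
# scale(unique(x1)) has length 4, not 7, so apply its mean and SD
# to the full-length vector by hand.
x1_uniq <- (x1 - mean(unique(x1))) / sd(unique(x1))

sd(x1)           # about 1.155
sd(unique(x1))   # about 1.291

# Side-by-side comparison of the raw and the two standardized versions
cbind(x1, x1_all, x1_uniq)
```

   Note that unique() works on values, not on groups: if two groups 
happened to share the same x1, it would keep only one of them, which is 
probably not what "one value per group" is meant to capture.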

   The structure of your example seems a bit odd -- is this a nested 
design, i.e. the predictors only vary across levels of the 
random-effects grouping factor, not within them?  In that case (if your 
real data follow the same structure), you would probably be better off 
collapsing the values to one row per group (group mean of the response, 
weighted by group size) rather than dealing with the complexities of a 
mixed model - in other words,

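   ## one row per group: group mean of y, group-level predictor values
   y <- c(mean(1:3), mean(4:5), 6, 7)
   x1 <- c(6, 5, 4, 3)
   x2 <- c(11, 5, 6, 8)

   ## weight each group mean by its number of observations
   lm(y ~ x1 + x2, weights = c(3, 2, 1, 1))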

  (see Murtaugh, "Simplicity and complexity in ecological data analysis")
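
   For a real data set you would not type the group means in by hand. A 
minimal sketch of the same collapsing step done programmatically, 
assuming the example data are stored in a data frame (called dt here, as 
in the original post; collapsed, fit_collapsed and fit_raw are 
illustrative names):

```r
# Example data from the original post, assembled into a data frame
dt <- data.frame(
  group = factor(c("A", "A", "A", "B", "B", "C", "D")),
  y     = 1:7,
  x1    = c(6, 6, 6, 5, 5, 4, 3),
  x2    = c(11, 11, 11, 5, 5, 6, 8)
)

# One row per group: mean response, and the group-level predictor values
# (the predictors are constant within group, so their mean is that value)
collapsed <- aggregate(cbind(y, x1, x2) ~ group, data = dt, FUN = mean)

# Number of observations per group, in the same (factor-level) order
collapsed$n <- as.vector(table(dt$group))

# Weighted regression on the group means
fit_collapsed <- lm(y ~ x1 + x2, data = collapsed, weights = n)

# Sanity check: with predictors constant within groups, the coefficients
# match ordinary least squares on the uncollapsed data
fit_raw <- lm(y ~ x1 + x2, data = dt)
all.equal(coef(fit_collapsed), coef(fit_raw))
```

   The coefficient estimates agree, but the standard errors do not: the 
uncollapsed fit treats all 7 rows as independent, which is exactly the 
pseudo-replication the collapsed analysis avoids.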

On 10/21/21 9:32 PM, Di Zeng wrote:
> Dear all,
> 
> My colleagues and I have a question when we use the generalized linear
> mixed models to analyze our data:
> 
> # Creating an example dataset
> 
> group <- factor(c('A','A','A','B','B','C','D'))  # random-effects grouping factor
> y <- 1:7
> x1 <- c(6,6,6,5,5,4,3)
> x2 <- c(11,11,11,5,5,6,8)
> dt <- data.frame(group, y, x1, x2)  # the data frame referred to as dt below
> 
> 
> Because the predictors (x1, x2) have different units, we need to
> standardize them before running our models. There are two ways to carry
> out this standardization.
> 
> First, standardizing x1, x2 directly, like:
> 
> scale(dt$x1)
> scale(dt$x2)
> 
> Second, standardizing x1, x2 based on their unique (per-group) values, like:
> 
> scale(unique(dt$x1))
> scale(unique(dt$x2))
> 
> 
> We wonder which way is more reasonable. In my view, we should use the
> second one, because data points in the same group are non-independent
> replicates in the real dataset.
> 
> Would you mind giving us some suggestions or ideas on this problem?
> 
> Thanks very much,
> 
> Di
> 

-- 
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
Graduate chair, Mathematics & Statistics