[R-sig-ME] Representation model problem with standardized variables

Thierry Onkelinx thierry.onkelinx using inbo.be
Tue Apr 5 10:05:43 CEST 2022


Dear Alexandre,

I prefer to manually scale the variables. Instead of centering to the mean,
I prefer to center to some relevant value within or close to the data. And
instead of dividing by the standard deviation, I divide by a sensible power of
10. The resulting variable and its coefficient are much easier to interpret.
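
A minimal sketch of that manual scaling in R. The reference values (10 for
temp, 15 for storage), the divisor of 10, and the example values themselves
are assumptions for illustration, not values taken from the dataset in this
thread:

```r
# Hypothetical example data; the column names mirror the thread's dataset,
# but the values are made up for illustration.
d <- data.frame(
  temp    = c(0, 5, 10, 15, 20),
  storage = c(5, 10, 15, 20, 25)
)

# Assumed reference values and divisor: centre on a round number inside
# the data range and divide by a power of 10.
temp_ref    <- 10
storage_ref <- 15
unit        <- 10

d$temp_c    <- (d$temp - temp_ref) / unit
d$storage_c <- (d$storage - storage_ref) / unit

# Back-transformation needs no stored attributes, only the two constants.
temp_back <- d$temp_c * unit + temp_ref
stopifnot(isTRUE(all.equal(temp_back, d$temp)))
```

With this coding, a coefficient for temp_c is the change per 10 units of
temp, and the intercept refers to temp = 10 and storage = 15 rather than to
the sample means.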

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////



On Mon 4 Apr 2022 at 13:57, Alexandre Santos via R-sig-mixed-models <
r-sig-mixed-models using r-project.org> wrote:

> Dear Daniel,
>
> Thank you so much; my problem is solved.
>
> The hard part for me was retrieving the center and scale from the
> standardization and then calculating back, but with your answer it looks easy!
>
> Best wishes,
>
> Alexandre
>
>
>  --Alexandre dos Santos
> Geotechnologies and Spatial Statistics applied to Forest Entomology
> Instituto Federal de Mato Grosso (IFMT) - Campus Caceres
> Caixa Postal 244 (PO Box)
> Avenida dos Ramires, s/n - Vila Real, Caceres - MT - CEP 78201-380 (ZIP code)
> Phone: (+55) 65 99686-6970 / (+55) 65 3221-2674
> Lattes CV: http://lattes.cnpq.br/1360403201088680
> OrcID: orcid.org/0000-0001-8232-6722
> ResearchGate: http://www.researchgate.net/profile/Alexandre_Santos10
> Publons: https://publons.com/researcher/3085587/alexandre-dos-santos/--
>
> On Monday, 4 April 2022 06:26:04 AMT, Daniel Lüdecke <
> d.luedecke using uke.de> wrote:
>
> Dear Alexandre,
> this might solve your issue. I use the plot() method from ggeffects,
> but you can of course use ggplot2 to build your own plot from
> scratch. The basic idea is to map values on the standardized scale back
> to the corresponding values of the unstandardized variable, and then use
> "scale_x_...()" from ggplot2 to change the axis labels. At the bottom are
> the two plots you get when you fit the model to the unstandardized data;
> there you can see that the axis labels (range) are on the original scale.
>
> Best
> Daniel
>
> library(datawizard)
> library(lme4)
> library(ggeffects)
> library(ggplot2)
>
> myds <- read.csv(
>   "https://raw.githubusercontent.com/Leprechault/trash/main/ds.desenvol.csv"
> )
> d.scale <- standardize(myds, select = c("temp", "storage"))
> m_6 <- glmer.nb(
>   development ~ poly(temp, 2) + poly(storage, 2) + (1 | storage),
>   data = d.scale
> )
>
>
> # for temp
> mydf <- ggpredict(m_6, terms = "temp [all]")
>
> # retrieve center and scale from standardization
> center_temp <- attributes(d.scale)$center["temp"]
> scale_temp <- attributes(d.scale)$scale["temp"]
>
> # scaled range, calculate back to range of unstandardized
> scaled_range <- c(-1, 0, 1, 2)
> new_range <- round(scaled_range * scale_temp + center_temp)
>
> # scaled range
> plot(mydf, add.data = TRUE)
>
> # original range
> plot(mydf, add.data = TRUE) +
>   scale_x_continuous(
>     breaks = scaled_range,
>     labels = new_range
>   )
>
>
> # for storage
> mydf <- ggpredict(m_6, terms = "storage [all]")
>
> # retrieve center and scale from standardization
> center_storage <- attributes(d.scale)$center["storage"]
> scale_storage <- attributes(d.scale)$scale["storage"]
>
> # scaled range, calculate back to range of unstandardized
> scaled_range <- c(-1, 0, 1)
> new_range <- round(scaled_range * scale_storage + center_storage)
>
> # scaled range
> plot(mydf, add.data = TRUE)
>
> # original range
> plot(mydf, add.data = TRUE) +
>   scale_x_continuous(
>     breaks = scaled_range,
>     labels = new_range
>   )
>
>
> # compare to plots w/o standardization
> m_7 <- glmer.nb(
>   development ~ poly(temp, 2) + poly(storage, 2) + (1 | storage),
>   data = myds
> )
> ggpredict(m_7, terms = "temp [all]") |> plot(add.data = TRUE)
> ggpredict(m_7, terms = "storage [all]") |> plot(add.data = TRUE)
>
> -----Original Message-----
> From: R-sig-mixed-models <r-sig-mixed-models-bounces using r-project.org> On
> Behalf Of Alexandre Santos via R-sig-mixed-models
> Sent: Wednesday, 30 March 2022 13:34
> To: r-sig-mixed-models using r-project.org
> Subject: [R-sig-ME] Representation model problem with standardized
> variables
>
> Hi Everyone!!
>
> I standardized my input variables (ds.scale) before fitting the GLMMs, but
> in the final plot I have a problem relating the real-world scale of my
> variables to the values predicted by the model (m_6). I'd like the original
> scale of my temp and storage variables shown for my best model
> (m_6). What is the correct approach for this? Should I not standardise my
> input variables, despite a lot of warnings? Or apply some data
> transformation at the end? This is what I did:
>
> #Packages
> library(lme4)
> library(ggplot2)
> library(ggeffects)
> library(tidyverse)
> library(bbmle)
> library(broom)
>
> #Open my dataset
> myds <- read.csv(
>   "https://raw.githubusercontent.com/Leprechault/trash/main/ds.desenvol.csv"
> )
> str(myds)
> # 'data.frame': 400 obs. of  4 variables:
> #  $ temp      : num  0 0 0 0 0 0 0 0 0 0 ...
> #  $ storage    : int  5 5 5 5 5 5 5 5 5 5 ...
> #  $ rep        : chr  "r1" "r2" "r3" "r4" ...
> #  $ development: int  0 23 22 27 24 25 24 22 0 22 ...
>
> # Storage (days) is temporally correlated with temperature, hence a mixed model
> ds.scale<- myds %>%
>   mutate(across(c(temp, storage), ~ drop(scale(.))))
>
> # Fit the Poisson and negative binomial models
> m_1 <- glmer(development ~ temp + storage +
>               (1 | storage ), data = ds.scale,
>                 family = "poisson")
> m_2 <- glmer(development ~ poly(temp,2) + storage +
>               (1 | storage ), data = ds.scale,
>                 family = "poisson")
> m_3 <- glmer(development ~ poly(temp,2) + poly(storage,2) +
>               (1 | storage ), data = ds.scale,
>                 family = "poisson")
> m_4 <- glmer.nb(development ~ temp + storage +
>               (1 | storage ), data = ds.scale)
> m_5 <- glmer.nb(development ~ poly(temp,2) + storage +
>               (1 | storage ), data = ds.scale)
> m_6 <- glmer.nb(development ~ poly(temp,2) + poly(storage,2) +
>               (1 | storage ), data = ds.scale)
> modList <- tibble::lst(m_1,m_2,m_3,m_4,m_5,m_6)
> bbmle::AICtab(modList)
>
> #    dAIC df
> # m_6  0.0 7
> # m_3  1.0 6
> # m_5  3.3 6
> # m_2  5.0 5
> # m_4 17.9 5
> # m_1 21.0 4
>
> # Plot the results for my best model (m_6)
> mydf <- ggpredict(m_6, terms = c("temp [all]", "storage[all]"))
>
> # For temp
> ggplot(mydf, aes(x, predicted)) +
>   geom_point(data=myds, aes(temp, development), alpha = 0.5) +
>   geom_line() +
>   labs(x = "temp", y = "development")
>
> # For storage
> ggplot(mydf, aes(x, predicted)) +
>   geom_point(data=myds, aes(storage, development), alpha = 0.5) +
>   geom_line() +
>   labs(x = "storage", y = "development")
>
> Please, any help with it?
> --
> Alexandre dos Santos
> Geotechnologies and Spatial Statistics applied to Forest Entomology
>
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
