[R-sig-ME] strange model fit- help

Thu Mar 26 16:08:15 CET 2020

Thank a lot for the hints.
Rescaling into minutes helped a little. I would like to retain both the 
raw polynomial terms, both the non-centered time (for ease of 
readability). The model was written explicitly (not poly(time, 2, 
raw=TRUE) for the same reasons.

The random part is quite complex but i will retain it as it is, because 
of the almost perfect fit.

Thanks,

Marko

On 15. 03. 2020. 13:14, Thierry Onkelinx wrote:
> Dear Marko,
> 
> Keep in mind that squating time in seconds lead to large numbers (489 ^ 
> 2 = 239121). This forces the parameters estimates to be small. You can 
> solve this either by using orthogonal polynomial (poly(t, 2)) or by 
> rescaling t (e.g. in minutes rather than seconds: 489 s = 8.15 min, 8.15 
> ^ 2 = 66.4225) If you go for rescaling, then create 2 variables: t_min 
> and t_min2 (= t_min ^ 2). That will make your formula more reable.
> 
> It looks like you coded stim as an ordered factor. That not required 
> since you have only two levels. Use a default factor with before as 
> reference.
> 
> The problem with the strong correlations between t and t^2 random 
> effects is that they are highly correlated themselves. cor(0:489, 
> (0:489) ^ 2) = 0.986 Note that is isn't solved by rescaling. Only 
> centering works .eg centering at 4 minutes yields cor(0:489 - 4 * 60 / 
> 60, (0:489 - 4 * 60) ^ 2) = 0.071 Orthogonal polynomials have the 
> benefit that they are uncorrelated by definition. cor(poly(0:489, 2))
> 
> Bottomline: always scale and center polynomials. I prefer to scale and 
> center to revelant values, e.g. scale to a different unit and center to 
> an important point near the middle of the domain. That keeps your 
> parameters interpretable without the need to recalculate them (as you 
> would when scaling by the standard deviation and center to the mean).
> 
> Best regards,
> 
> Thierry
> 
> 
> ir. Thierry Onkelinx
> Statisticus / Statistician
> 
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE 
> AND FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx using inbo.be <mailto:thierry.onkelinx using inbo.be>
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be <http://www.inbo.be>
> 
> ///////////////////////////////////////////////////////////////////////////////////////////
> To call in the statistician after the experiment is done may be no more 
> than asking him to perform a post-mortem examination: he may be able to 
> say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not 
> ensure that a reasonable answer can be extracted from a given body of 
> data. ~ John Tukey
> ///////////////////////////////////////////////////////////////////////////////////////////
> 
> <https://www.inbo.be>
> 
> 
> Op za 14 mrt. 2020 om 17:10 schreef marKo via R-sig-mixed-models 
> <r-sig-mixed-models using r-project.org 
> <mailto:r-sig-mixed-models using r-project.org>>:
> 
>     Hi.
> 
>     I have fitted a relatively complicated model to electrodermal data (a
>     simple resting and stimulus situation). The data summary follows.
> 
>       > summary(data)
>              id             sc              t              stim
>        g1_1   :  49   Min.   :26798   Min.   :  1.0   before  :3201
>        g1_12  :  49   1st Qu.:32299   1st Qu.:123.0   after   :1543
>        g1_13  :  49   Median :32486   Median :245.0
>        g1_14  :  49   Mean   :32253   Mean   :244.9
>        g1_15  :  49   3rd Qu.:32587   3rd Qu.:367.0
>        g1_2   :  49   Max.   :32761   Max.   :489.0
>        (Other):4450
> 
>     id (person), and stim are factors, t is time (in s) and sc is skin
>     conductance level. Sc distribution is quite negatively asymmetrical at
>     the dataset level, although not that bad at the id level. As the
>     stimulus occur at a specified time, those two variables are correlated
>     (0.81).
> 
>     The model follows.
> 
>     m1<-lmer(sc~1+t+I(t^2)+stim+stim:t+stim:I(t^2)+(1+t+I(t^2)+stim+stim:t+stim:I(t^2)|id),
> 
>     data=data)
> 
>     Here goes the summary.
> 
>       > summary(m1)
>     Linear mixed model fit by maximum likelihood  ['lmerMod']
>     Formula: sc ~ 1 + t + I(t^2) + stim + stim:t + stim:I(t^2) + (1 + t +
>     I(t^2) + stim + stim:t + stim:I(t^2) | id)
>          Data: data
> 
>            AIC      BIC   logLik deviance df.resid
>        62325.9  62506.9 -31134.9  62269.9     4716
> 
>     Scaled residuals:
>            Min       1Q   Median       3Q      Max
>     -24.3783  -0.1551  -0.0074   0.1392  12.5288
> 
>     Random effects:
>        Groups   Name          Variance  Std.Dev.  Corr
>        id       (Intercept)   4.681e+04 2.164e+02
>                 t             7.925e+00 2.815e+00  1.00
>                 I(t^2)        2.559e-05 5.059e-03 -0.87 -0.87
>                 stim.L        1.591e+05 3.989e+02 -0.12 -0.12  0.17
>                 t:stim.L      1.105e+00 1.051e+00 -0.58 -0.58  0.78  0.33
>                 I(t^2):stim.L 2.367e-05 4.865e-03  0.06  0.06 -0.21
>     -0.76 -0.45
>        Residual               2.049e+04 1.432e+02
>     Number of obs: 4744, groups:  id, 97
> 
>     Fixed effects:
>                       Estimate Std. Error t value
>     (Intercept)    2.960e+04  1.637e+02  180.85
>     t              1.291e+01  8.493e-01   15.20
>     I(t^2)        -1.579e-02  1.110e-03  -14.21
>     stim.L        -3.956e+03  2.329e+02  -16.98
>     t:stim.L       2.047e+01  1.136e+00   18.01
>     I(t^2):stim.L -2.477e-02  1.478e-03  -16.76
> 
>     Correlation of Fixed Effects:
>                   (Intr) t      I(t^2) stim.L t:st.L
>     t           -0.886
>     I(t^2)       0.811 -0.966
>     stim.L       0.972 -0.930  0.868
>     t:stim.L    -0.989  0.911 -0.826 -0.973
>     I(t^2):st.L  0.917 -0.858  0.761  0.870 -0.947
> 
>     The fit of the model is quite good (pseudo r2 is 0.96), but have some
>     problems:
>     1: quite “extreme” residuals (-24.3783,  12.5288)
>     2: quite high correlations among random effects
>     3: lousy qqplot (apart from the perfect fit on the  from -2 to +2 std
>     normal quantiles)
> 
>     Help please? What is wrong with the model (something is, I’m sure).
> 
> 
> 
> 
>     -- 
>     Marko Tončić, PhD
>     Postdoctoral research assistant
>     University of Rijeka
>     Faculty of Humanities and Social Sciences
>     Department of Psychology
>     Sveucilisna avenija 4, 51000 Rijeka, CROATIA
>     e-mail: mtoncic using ffri.hr <mailto:mtoncic using ffri.hr>
> 
>     _______________________________________________
>     R-sig-mixed-models using r-project.org
>     <mailto:R-sig-mixed-models using r-project.org> mailing list
>     https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 

-- 
Marko Tončić, PhD
Postdoctoral research assistant
University of Rijeka
Faculty of Humanities and Social Sciences
Department of Psychology
Sveucilisna avenija 4, 51000 Rijeka, CROATIA
e-mail: mtoncic using ffri.hr