[R-sig-ME] strange model fit- help

Sun Mar 15 13:14:02 CET 2020

Dear Marko,

Keep in mind that squating time in seconds lead to large numbers (489 ^ 2 =
239121). This forces the parameters estimates to be small. You can solve
this either by using orthogonal polynomial (poly(t, 2)) or by rescaling t
(e.g. in minutes rather than seconds: 489 s = 8.15 min, 8.15 ^ 2 = 66.4225)
If you go for rescaling, then create 2 variables: t_min and t_min2 (= t_min
^ 2). That will make your formula more reable.

It looks like you coded stim as an ordered factor. That not required since
you have only two levels. Use a default factor with before as reference.

The problem with the strong correlations between t and t^2 random effects
is that they are highly correlated themselves. cor(0:489, (0:489) ^ 2) =
0.986 Note that is isn't solved by rescaling. Only centering works .eg
centering at 4 minutes yields cor(0:489 - 4 * 60 / 60, (0:489 - 4 * 60) ^
2) = 0.071 Orthogonal polynomials have the benefit that they are
uncorrelated by definition. cor(poly(0:489, 2))

Bottomline: always scale and center polynomials. I prefer to scale and
center to revelant values, e.g. scale to a different unit and center to an
important point near the middle of the domain. That keeps your parameters
interpretable without the need to recalculate them (as you would when
scaling by the standard deviation and center to the mean).

Best regards,

Thierry

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

<https://www.inbo.be>

Op za 14 mrt. 2020 om 17:10 schreef marKo via R-sig-mixed-models <
r-sig-mixed-models using r-project.org>:

> Hi.
>
> I have fitted a relatively complicated model to electrodermal data (a
> simple resting and stimulus situation). The data summary follows.
>
>  > summary(data)
>         id             sc              t              stim
>   g1_1   :  49   Min.   :26798   Min.   :  1.0   before  :3201
>   g1_12  :  49   1st Qu.:32299   1st Qu.:123.0   after   :1543
>   g1_13  :  49   Median :32486   Median :245.0
>   g1_14  :  49   Mean   :32253   Mean   :244.9
>   g1_15  :  49   3rd Qu.:32587   3rd Qu.:367.0
>   g1_2   :  49   Max.   :32761   Max.   :489.0
>   (Other):4450
>
> id (person), and stim are factors, t is time (in s) and sc is skin
> conductance level. Sc distribution is quite negatively asymmetrical at
> the dataset level, although not that bad at the id level. As the
> stimulus occur at a specified time, those two variables are correlated
> (0.81).
>
> The model follows.
>
> m1<-lmer(sc~1+t+I(t^2)+stim+stim:t+stim:I(t^2)+(1+t+I(t^2)+stim+stim:t+stim:I(t^2)|id),
>
> data=data)
>
> Here goes the summary.
>
>  > summary(m1)
> Linear mixed model fit by maximum likelihood  ['lmerMod']
> Formula: sc ~ 1 + t + I(t^2) + stim + stim:t + stim:I(t^2) + (1 + t +
> I(t^2) + stim + stim:t + stim:I(t^2) | id)
>     Data: data
>
>       AIC      BIC   logLik deviance df.resid
>   62325.9  62506.9 -31134.9  62269.9     4716
>
> Scaled residuals:
>       Min       1Q   Median       3Q      Max
> -24.3783  -0.1551  -0.0074   0.1392  12.5288
>
> Random effects:
>   Groups   Name          Variance  Std.Dev.  Corr
>   id       (Intercept)   4.681e+04 2.164e+02
>            t             7.925e+00 2.815e+00  1.00
>            I(t^2)        2.559e-05 5.059e-03 -0.87 -0.87
>            stim.L        1.591e+05 3.989e+02 -0.12 -0.12  0.17
>            t:stim.L      1.105e+00 1.051e+00 -0.58 -0.58  0.78  0.33
>            I(t^2):stim.L 2.367e-05 4.865e-03  0.06  0.06 -0.21 -0.76 -0.45
>   Residual               2.049e+04 1.432e+02
> Number of obs: 4744, groups:  id, 97
>
> Fixed effects:
>                  Estimate Std. Error t value
> (Intercept)    2.960e+04  1.637e+02  180.85
> t              1.291e+01  8.493e-01   15.20
> I(t^2)        -1.579e-02  1.110e-03  -14.21
> stim.L        -3.956e+03  2.329e+02  -16.98
> t:stim.L       2.047e+01  1.136e+00   18.01
> I(t^2):stim.L -2.477e-02  1.478e-03  -16.76
>
> Correlation of Fixed Effects:
>              (Intr) t      I(t^2) stim.L t:st.L
> t           -0.886
> I(t^2)       0.811 -0.966
> stim.L       0.972 -0.930  0.868
> t:stim.L    -0.989  0.911 -0.826 -0.973
> I(t^2):st.L  0.917 -0.858  0.761  0.870 -0.947
>
> The fit of the model is quite good (pseudo r2 is 0.96), but have some
> problems:
> 1: quite “extreme” residuals (-24.3783,  12.5288)
> 2: quite high correlations among random effects
> 3: lousy qqplot (apart from the perfect fit on the  from -2 to +2 std
> normal quantiles)
>
> Help please? What is wrong with the model (something is, I’m sure).
>
>
>
>
> --
> Marko Tončić, PhD
> Postdoctoral research assistant
> University of Rijeka
> Faculty of Humanities and Social Sciences
> Department of Psychology
> Sveucilisna avenija 4, 51000 Rijeka, CROATIA
> e-mail: mtoncic using ffri.hr
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]