[R-sig-ME] Comparing Model Performance Across Data Sets: report p values?

Phillip Alday phillip.alday at mpi.nl
Wed Aug 9 11:05:05 CEST 2017


Hi Karista,

it is not surprising that the combined model has a poorer overall fit
than the separate models, for two reasons:

1. It has to model more data.
2. In some sense, it has fewer "independent" (I'm not using this word in
a rigorous sense!) parameters than two distinct models: because the two
phases share a common set of parameters, the estimates for the two
phases bias each other (see the sketch after this list).
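
To make the second point concrete, here is a minimal sketch using the
variable names from your models below (with only Length as a covariate,
for brevity; the object names are just illustrative):

library(lme4)

## separate models: every parameter, including the random-intercept
## variance and the residual variance, is estimated independently
## within each phase
m_pre  <- lmer(logHg ~ Length + (1 | WA), data = FSV2pre)
m_post <- lmer(logHg ~ Length + (1 | WA), data = FSV2post)

## combined model: even with a full set of Phase interactions, the
## random-intercept variance and the residual variance are shared
## across phases, so the two phases constrain each other
m_all <- lmer(logHg ~ Length * Phase + (1 | WA), data = FSV2)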

The latter point is an example of the bias-variance tradeoff (which
really came to light with the Stein paradox for OLS) or, equivalently,
the overfitting-underfitting continuum. There are lots of good
resources on this, but I particularly like McElreath's discussion in
his book Statistical Rethinking.

The tl;dr version is that the combination model will often have a poorer
fit on the current data (bias away from observed means, etc.) but
generalize better to new data (less variance). Expressed in terms of
"fitting": the two within-phase models tend to overfit a little bit to
particular details of the data within each phase and thus will
generalize less well than the combination model, which will tend to
underfit the data within each phase but generalize better to new data
because it doesn't capture as much noisy detail.
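
If you want to probe that generalization claim empirically, here is a
rough holdout sketch (the train/test split is hypothetical; predict()
on a merMod needs the grouping levels in the test data to have been
seen during fitting, which should hold here with only 5 WA groups):

set.seed(1)
test_idx <- sample(nrow(FSV2), 100)  ## hypothetical 100-row test set
train <- FSV2[-test_idx, ]
test  <- FSV2[test_idx, ]

m_comb <- lmer(logHg ~ Length * Phase + (1 | WA), data = train)

## out-of-sample root mean squared error; lower is better
sqrt(mean((test$logHg - predict(m_comb, newdata = test))^2))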

I would modify your combination model in one way, though: I would
include a main effect for phase.
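
Something like this (a sketch; the object name is just illustrative):

FSV2lmer1b <- lmer(logHg ~ Length + Res_Sea_Ice_Dur + Spring_MST +
                       Summer_Rain + Phase + (1 | WA),
                   data = FSV2)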

Also, when comparing models with different fixed-effects structures, it
is important to use ML, i.e. set REML=FALSE, because the REML criterion
is dependent on the fixed-effects parameterisation.
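
For example (a sketch building on the model above; anova() then reports
AIC, BIC and a likelihood-ratio test on comparable ML fits):

m0 <- lmer(logHg ~ Length + Res_Sea_Ice_Dur + Spring_MST + Summer_Rain +
               Phase + (1 | WA),
           data = FSV2, REML = FALSE)
m1 <- update(m0, . ~ . + Length:Phase + Res_Sea_Ice_Dur:Phase +
                 Spring_MST:Phase + Summer_Rain:Phase)
anova(m0, m1)  ## both fitted with ML, so the criteria are comparable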

This doesn't answer your questions directly, but hopefully gives you
more food for thought. :)

Best,
Phillip

On 08/08/2017 06:38 PM, Karista Hudelson wrote:
> Hello again List, 
> 
> Thanks for the clarifying question, Thierry.  I want to compare the
> predictive ability of the model terms between the two phases.  For
> instance, is Sea Ice more important in phase 1?  This comparison is
> somewhat confounded by the unequal sample sizes in the phases, I
> think, but I am not sure.  Maybe that is part of my question: should I
> focus less on the p values (as Phillip recommends in his first point,
> I think) and instead look at the overall model fit for each phase?
> 
> Phillip, thank you for your second suggestion!  I followed your advice
> and included Phase in the model and also tried running it with
> interactions between the fixed effects and phase.  
> 
> Here is the model without phase:
> 
> FSVlmer1a<-lmer(logHg~Length+Res_Sea_Ice_Dur+Spring_MST+Summer_Rain+(1|WA),data=FSV2)
> 
> REML criterion at convergence: -389.3
> 
> Scaled residuals: 
>     Min      1Q  Median      3Q     Max 
> -6.1650 -0.6235 -0.0447  0.6380  3.0889 
> 
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  WA       (Intercept) 0.11493  0.3390  
>  Residual             0.03244  0.1801  
> Number of obs: 790, groups:  WA, 5
> 
> Fixed effects:
>                   Estimate Std. Error         df t value Pr(>|t|)    
> (Intercept)     -1.064e+00  1.971e-01  1.130e+01  -5.399 0.000195 ***
> Length           2.204e-02  1.105e-03  7.817e+02  19.952  < 2e-16 ***
> Res_Sea_Ice_Dur  7.917e-04  2.977e-04  7.813e+02   2.660 0.007978 ** 
> Spring_MST       1.892e-02  4.514e-03  7.812e+02   4.190 3.11e-05 ***
> Summer_Rain     -2.194e-03  3.650e-04  7.811e+02  -6.011 2.82e-09 ***
> ---
>> sem.model.fits(FSVlmer1a)
>            Class   Family     Link   n Marginal Conditional
> 1 merModLmerTest gaussian identity 790 0.127793   0.8080115
> 
>> AIC(FSVlmer1a)
> [1] -375.2507
> 
> Same model with Phase interactions:
> 
> FSV2lmer1bi<-lmer(logHg~Length*Phase+Res_Sea_Ice_Dur*Phase+Spring_MST*Phase+Summer_Rain*Phase+(1|WA),data=FSV2)
> 
> REML criterion at convergence: -360.9
> 
> Scaled residuals: 
>     Min      1Q  Median      3Q     Max 
> -6.2490 -0.6285 -0.0176  0.6076  3.1211 
> 
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  WA       (Intercept) 0.11988  0.3462  
>  Residual             0.03195  0.1788  
> Number of obs: 790, groups:  WA, 5
> 
> Fixed effects:
>                            Estimate Std. Error         df t value Pr(>|t|)
> (Intercept)              -1.179e+00  2.122e-01  1.400e+01  -5.556 7.10e-05 ***
> Length                    2.146e-02  1.204e-03  7.767e+02  17.827  < 2e-16 ***
> Phasepre                  8.858e-01  3.945e-01  7.765e+02   2.246 0.025014 *
> Res_Sea_Ice_Dur           1.389e-03  3.963e-04  7.763e+02   3.504 0.000484 ***
> Spring_MST                1.680e-02  4.924e-03  7.761e+02   3.411 0.000681 ***
> Summer_Rain              -2.254e-03  3.980e-04  7.760e+02  -5.664 2.08e-08 ***
> Length:Phasepre           2.582e-03  2.917e-03  7.762e+02   0.885 0.376294
> Phasepre:Res_Sea_Ice_Dur -4.806e-03  1.607e-03  7.764e+02  -2.990 0.002876 **
> Phasepre:Spring_MST      -8.681e-03  2.147e-02  7.760e+02  -0.404 0.686088
> Phasepre:Summer_Rain     -4.634e-03  2.072e-03  7.764e+02  -2.236 0.025636 *
> 
>> AIC(FSV2lmer1bi)
> [1] -336.8567
> 
>> sem.model.fits(FSV2lmer1bi,aicc=T)
>            Class   Family     Link   n Marginal Conditional
> 1 merModLmerTest gaussian identity 790 0.126233   0.8161111
> 
> So the overall fit metrics for these two models are not so different,
> and the simpler one is a bit better.  
> 
> And in case it would be helpful/interesting, here are the fits of the
> models for phase 1 and phase 2 (which were described in my first question):
> 
> FSV2lmer1apre<-lmer(logHg~Length+Res_Sea_Ice_Dur+Spring_MST+Summer_Rain+(1|WA),data=FSV2pre)
> # AIC 10.06269, R2s:0.1508716   0.7681201
> 
> FSV2lmer1apost<-lmer(logHg~Length+Res_Sea_Ice_Dur+Spring_MST+Summer_Rain+(1|WA),data=FSV2post)
> # AIC -335.1748, R2s: 0.1233518   0.8228584
> 
> Thank you Phillip and Thierry for your kind and encouraging attention to
> this question.  I hope I can trouble you and the rest of the list for a
> bit more instruction on these questions, as this issue is the crux of
> the interpretation of these data.
> 
> Looking forward to your thoughts and suggestions,
> Karista
> 
> 
> On Thu, Aug 3, 2017 at 4:40 AM, Phillip Alday <phillip.alday at mpi.nl> wrote:
> 
>     Dear Karista,
> 
>     as Thierry said, knowing more about the inferences you want to make will
>     get you better advice here. That said, I do have two suggestions in the
>     meantime:
> 
>     1. Don't focus on significance, especially of individual predictors, as
>     much as estimates and overall model fit / predictive ability. (cf. The
>     New Statistics, The Difference between Significant and Insignificant is
>     not itself Significant, Choosing prediction over explanation in
>     psychology, etc.)
> 
>     2. Put all your data into one model and include time period as a fixed
>     effect. Such pooling will generally help all your estimates; moreover,
>     it gives you a more principled way to compare time periods (both in the
>     main effect of time period and in its interactions with individual
>     variables).
> 
>     Best,
>     Phillip
> 
>     On 08/03/2017 10:20 AM, Thierry Onkelinx wrote:
>     > Dear Karista,
>     >
>     > Much depends on what you want to compare between the models. The
>     > parameter estimates? The predicted values? The goodness of fit?
>     > You'll need to make that clear.
>     >
>     > Best regards,
>     >
>     >
>     > ir. Thierry Onkelinx
>     > Instituut voor natuur- en bosonderzoek / Research Institute for
>     > Nature and Forest
>     > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>     > Kliniekstraat 25
>     > 1070 Anderlecht
>     > Belgium
>     >
>     > To call in the statistician after the experiment is done may be
>     > no more than asking him to perform a post-mortem examination: he
>     > may be able to say what the experiment died of. ~ Sir Ronald
>     > Aylmer Fisher
>     > The plural of anecdote is not data. ~ Roger Brinner
>     > The combination of some data and an aching desire for an answer
>     > does not ensure that a reasonable answer can be extracted from a
>     > given body of data. ~ John Tukey
>     >
>     > 2017-08-02 19:54 GMT+02:00 Karista Hudelson <karistaeh at gmail.com>:
>     >
>     >> Hello All,
>     >>
>     >> I am comparing the fit of a mixed model on different time
>     >> periods of a data set.  For the first time period I have 113
>     >> observations and only one of the fixed effects is significant.
>     >> For the second time period I have 322 observations and all of
>     >> the fixed effects are significant.  Because n is important in
>     >> the calculation of p, I'm not sure how or even whether to
>     >> interpret the differences in p values for the model terms in the
>     >> two time periods.  Does anyone have advice on how to compare the
>     >> fit of the variables in the mixed model for the two data sets in
>     >> a way that is less impacted by the difference in the number of
>     >> observations?  Or is a difference of 209 observations enough to
>     >> drive these differences in p values?
>     >>
>     >> Time period 1 output:
>     >> Fixed effects:
>     >>                   Estimate Std. Error         df t value Pr(>|t|)
>     >> (Intercept)      -0.354795   0.811871  82.140000  -0.437    0.663
>     >> Length            0.024371   0.003536 106.650000   6.892 4.01e-10 ***
>     >> Res_Sea_Ice_Dur  -0.002408   0.002623 107.970000  -0.918    0.361
>     >> Spring_MST        0.014259   0.024197 106.310000   0.589    0.557
>     >> Summer_Rain      -0.005015   0.003536 107.970000  -1.418    0.159
>     >>
>     >>
>     >> Time period 2 output:
>     >> Fixed effects:
>     >>                   Estimate Std. Error         df t value Pr(>|t|)
>     >> (Intercept)     -1.183e+00  3.103e-01  6.650e+00  -3.812 0.007281 **
>     >> Length           1.804e-02  1.623e-03  3.151e+02  11.120  < 2e-16 ***
>     >> Res_Sea_Ice_Dur  2.206e-03  5.929e-04  3.153e+02   3.721 0.000235 ***
>     >> Spring_MST       1.022e-02  7.277e-03  3.150e+02   1.404 0.161319
>     >> Summer_Rain     -1.853e-03  5.544e-04  3.150e+02  -3.343 0.000929 ***
>     >>
>     >>
>     >>
>     >>
>     >> Thanks in advance for your time and consideration of this question.
>     >> Karista
>     >>
> 
> 
> 
> 
> -- 
> Karista


