[R-sig-ME] Comparing Model Performance Across Data Sets: report p values?
Karista Hudelson
karistaeh at gmail.com
Thu Aug 10 16:52:33 CEST 2017
Hello Phillip (and List),
Thank you again for your careful consideration of my question. Setting
REML to FALSE was really helpful! In the end I decided to go ahead and
break the data into two phases and then apply the model. I tested the
power of the model on the two subsets to determine whether my n values
were sufficient for interpreting the p values. I think this approach,
rather than leaving the data all in one set and including phase as a
variable, more directly addresses the hypothesis I was testing. You
gave me lots of "food for thought" which led me in the right direction!
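In case it is useful to anyone searching the archives later, the power
check looked roughly like this (simr package; the tested term and nsim
here are illustrative only):

library(lme4)
library(simr)

# phase-1 model, as posted earlier in this thread
FSV2lmer1apre <- lmer(logHg ~ Length + Res_Sea_Ice_Dur + Spring_MST +
                        Summer_Rain + (1 | WA), data = FSV2pre)

# simulation-based power for one fixed effect at the observed n;
# simr uses the fitted effect size by default
powerSim(FSV2lmer1apre, test = fixed("Res_Sea_Ice_Dur"), nsim = 200)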
Happy researching all!
Karista
On Wed, Aug 9, 2017 at 5:05 AM, Phillip Alday <phillip.alday at mpi.nl> wrote:
> Hi Karista,
>
> it is not surprising that the combined model has a poorer overall fit
> than the separate models, for two reasons:
>
> 1. It has to model more data.
> 2. In some sense, it has fewer "independent" (I'm not using this word in
> a rigorous sense!) parameters than two distinct models because the two
> phases share a common set of parameters and thus the two distinct phases
> bias each other.
>
> The latter point is an example of the bias-variance tradeoff (which
> really came to light with the Stein paradox for OLS) or, equivalently, the
> overfitting-underfitting continuum. There are lots of good resources on
> this, but I particularly like McElreath's discussion in his book
> Statistical Rethinking.
>
> The tl;dr version is that the combination model will often have a poorer
> fit on the current data (bias away from observed means, etc.) but
> generalize better to new data (less variance). Or, expressed in terms of
> "fitting": the two within-phase models tend to overfit a little bit to
> particular details of the data within each phase and thus will
> generalize less well than the combination model, which will tend to
> underfit the data within each phase but generalize better to new data
> because it doesn't capture as much noisy detail.
>
> I would modify your combination model in one way, though: I would
> include a main effect for phase.
>
> Also, when comparing models with different fixed-effects structures, it
> is important to use ML, i.e. set REML=FALSE, because the REML criterion
> is dependent on the fixed-effects parameterisation.
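>
> A minimal sketch of both suggestions (the name m_phase is hypothetical;
> the data and variable names are taken from your message below):
>
> library(lme4)
>
> # combined model with a main effect for Phase; fit with ML
> # (REML = FALSE) so that fits with different fixed effects
> # can be compared
> m_phase <- lmer(logHg ~ Length + Res_Sea_Ice_Dur + Spring_MST +
>                   Summer_Rain + Phase + (1 | WA),
>                 data = FSV2, REML = FALSE)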
>
> This doesn't answer your questions directly, but hopefully gives you
> more food for thought. :)
>
> Best,
> Phillip
>
> On 08/08/2017 06:38 PM, Karista Hudelson wrote:
> > Hello again List,
> >
> > Thanks for the clarifying question, Thierry. I want to compare the
> > predictive ability of the model terms between the two phases. For
> > instance, is Sea Ice more important in phase 1? I think this comparison
> > is somewhat confounded by the unequal sample sizes in the phases, but I
> > am not sure. Maybe that is part of my question: should I focus less on
> > the p values (as Phillip recommends in his first point, I think) and
> > instead look at the overall model fit for each phase?
> >
> > Phillip, thank you for your second suggestion! I followed your advice
> > and included Phase in the model and also tried running it with
> > interactions between the fixed effects and phase.
> >
> > Here is the model without phase:
> >
> > FSVlmer1a <- lmer(logHg ~ Length + Res_Sea_Ice_Dur + Spring_MST +
> >                     Summer_Rain + (1 | WA), data = FSV2)
> >
> > REML criterion at convergence: -389.3
> >
> > Scaled residuals:
> > Min 1Q Median 3Q Max
> > -6.1650 -0.6235 -0.0447 0.6380 3.0889
> >
> > Random effects:
> > Groups Name Variance Std.Dev.
> > WA (Intercept) 0.11493 0.3390
> > Residual 0.03244 0.1801
> > Number of obs: 790, groups: WA, 5
> >
> > Fixed effects:
> > Estimate Std. Error df t value Pr(>|t|)
> > (Intercept) -1.064e+00 1.971e-01 1.130e+01 -5.399 0.000195 ***
> > Length 2.204e-02 1.105e-03 7.817e+02 19.952 < 2e-16 ***
> > Res_Sea_Ice_Dur 7.917e-04 2.977e-04 7.813e+02 2.660 0.007978 **
> > Spring_MST 1.892e-02 4.514e-03 7.812e+02 4.190 3.11e-05 ***
> > Summer_Rain -2.194e-03 3.650e-04 7.811e+02 -6.011 2.82e-09 ***
> > ---
> >> sem.model.fits(FSVlmer1a)
> > Class Family Link n Marginal Conditional
> > 1 merModLmerTest gaussian identity 790 0.127793 0.8080115
> >
> >> AIC(FSVlmer1a)
> > [1] -375.2507
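> >
> > Aside: sem.model.fits() here is from the piecewiseSEM package and
> > reports the Nakagawa & Schielzeth marginal/conditional R2. A quick
> > cross-check, assuming MuMIn is installed:
> >
> > library(MuMIn)
> > r.squaredGLMM(FSVlmer1a)  # returns R2m (marginal) and R2c (conditional)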
> >
> > Same model with Phase interactions:
> >
> > FSV2lmer1bi <- lmer(logHg ~ Length*Phase + Res_Sea_Ice_Dur*Phase +
> >                       Spring_MST*Phase + Summer_Rain*Phase + (1 | WA),
> >                     data = FSV2)
> >
> > REML criterion at convergence: -360.9
> >
> > Scaled residuals:
> > Min 1Q Median 3Q Max
> > -6.2490 -0.6285 -0.0176 0.6076 3.1211
> >
> > Random effects:
> > Groups Name Variance Std.Dev.
> > WA (Intercept) 0.11988 0.3462
> > Residual 0.03195 0.1788
> > Number of obs: 790, groups: WA, 5
> >
> > Fixed effects:
> >                            Estimate Std. Error        df t value Pr(>|t|)
> > (Intercept)              -1.179e+00  2.122e-01 1.400e+01  -5.556 7.10e-05 ***
> > Length                    2.146e-02  1.204e-03 7.767e+02  17.827  < 2e-16 ***
> > Phasepre                  8.858e-01  3.945e-01 7.765e+02   2.246 0.025014 *
> > Res_Sea_Ice_Dur           1.389e-03  3.963e-04 7.763e+02   3.504 0.000484 ***
> > Spring_MST                1.680e-02  4.924e-03 7.761e+02   3.411 0.000681 ***
> > Summer_Rain              -2.254e-03  3.980e-04 7.760e+02  -5.664 2.08e-08 ***
> > Length:Phasepre           2.582e-03  2.917e-03 7.762e+02   0.885 0.376294
> > Phasepre:Res_Sea_Ice_Dur -4.806e-03  1.607e-03 7.764e+02  -2.990 0.002876 **
> > Phasepre:Spring_MST      -8.681e-03  2.147e-02 7.760e+02  -0.404 0.686088
> > Phasepre:Summer_Rain     -4.634e-03  2.072e-03 7.764e+02  -2.236 0.025636 *
> >
> >> AIC(FSV2lmer1bi)
> > [1] -336.8567
> >
> >> sem.model.fits(FSV2lmer1bi,aicc=T)
> > Class Family Link n Marginal Conditional
> > 1 merModLmerTest gaussian identity 790 0.126233 0.8161111
> >
> > So the overall fit metrics for these two models are not so different,
> > and the simpler one is a bit better by AIC.
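> >
> > If it helps, I understand the model without Phase is nested in the
> > model with the Phase interactions, so a likelihood-ratio test may be
> > the cleaner comparison; anova() on merMod objects refits with ML by
> > default before comparing:
> >
> > anova(FSVlmer1a, FSV2lmer1bi)  # ML refit, then likelihood-ratio test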
> >
> > And in case it would be helpful/interesting, here are the fits of the
> > models for phase 1 and phase 2 (which were described in my first
> > question):
> >
> > FSV2lmer1apre <- lmer(logHg ~ Length + Res_Sea_Ice_Dur + Spring_MST +
> >                         Summer_Rain + (1 | WA), data = FSV2pre)
> > # AIC 10.06269, R2s: 0.1508716 0.7681201
> >
> > FSV2lmer1apost <- lmer(logHg ~ Length + Res_Sea_Ice_Dur + Spring_MST +
> >                          Summer_Rain + (1 | WA), data = FSV2post)
> > # AIC -335.1748, R2s: 0.1233518 0.8228584
> >
> > Thank you, Phillip and Thierry, for your kind and encouraging attention
> > to this question. I hope I can trouble you and the rest of the list for
> > a bit more instruction on this/these questions, as this issue is the
> > crux of interpreting these data.
> >
> > Looking forward to your thoughts and suggestions,
> > Karista
> >
> >
> > On Thu, Aug 3, 2017 at 4:40 AM, Phillip Alday <phillip.alday at mpi.nl> wrote:
> >
> > Dear Karista,
> >
> > as Thierry said, knowing more about the inferences you want to make
> > will get you better advice here. That said, I do have two suggestions
> > in the meantime:
> >
> > 1. Don't focus on significance, especially of individual predictors, as
> > much as estimates and overall model fit / predictive ability. (cf. The
> > New Statistics; The Difference between Significant and Insignificant is
> > not itself Significant; Choosing prediction over explanation in
> > psychology; etc.)
> >
> > 2. Put all your data into one model and include time period as a fixed
> > effect. Such pooling will generally help all your estimates; moreover,
> > it gives you a more principled way to compare time periods, both in the
> > main effect of time period and in its interactions with individual
> > variables (a rough sketch of both points follows below).
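> >
> > A rough sketch of both points (all names here are hypothetical; y, x,
> > Phase, group, and dat stand in for your variables and data):
> >
> > library(lme4)
> > library(emmeans)
> >
> > # pooled fit: all data, with time period (Phase) as a fixed effect
> > m_pooled <- lmer(y ~ x * Phase + (1 | group), data = dat, REML = FALSE)
> >
> > # point 1: report estimates with intervals, not just stars
> > confint(m_pooled, method = "Wald")
> >
> > # point 2: the slope of x within each time period, compared directly
> > emtrends(m_pooled, ~ Phase, var = "x")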
> >
> > Best,
> > Phillip
> >
> > On 08/03/2017 10:20 AM, Thierry Onkelinx wrote:
> > > Dear Karista,
> > >
> > > Much depends on what you want to compare between the models. The
> > > parameter estimates? The predicted values? The goodness of fit?
> > > You'll need to make that clear.
> > >
> > > Best regards,
> > >
> > >
> > > ir. Thierry Onkelinx
> > > Instituut voor natuur- en bosonderzoek / Research Institute for
> > > Nature and Forest
> > > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> > > Kliniekstraat 25
> > > 1070 Anderlecht
> > > Belgium
> > >
> > > To call in the statistician after the experiment is done may be no
> > > more than asking him to perform a post-mortem examination: he may be
> > > able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> > > The plural of anecdote is not data. ~ Roger Brinner
> > > The combination of some data and an aching desire for an answer does
> > > not ensure that a reasonable answer can be extracted from a given
> > > body of data. ~ John Tukey
> > >
> > > 2017-08-02 19:54 GMT+02:00 Karista Hudelson <karistaeh at gmail.com>:
> > >
> > >> Hello All,
> > >>
> > >> I am comparing the fit of a mixed model on different time periods
> > >> of a data set. For the first time period I have 113 observations
> > >> and only one of the fixed effects is significant. For the second
> > >> time period I have 322 observations and all of the fixed effects
> > >> are significant. Because n is important in the calculation of p,
> > >> I'm not sure how, or even whether, to interpret the differences in
> > >> p values for the model terms in the two time periods. Does anyone
> > >> have advice on how to compare the fit of the variables in the mixed
> > >> model for the two data sets in a way that is less impacted by the
> > >> difference in the number of observations? Or is a difference of 209
> > >> observations enough to drive these differences in p values?
> > >>
> > >> Time period 1 output:
> > >> Fixed effects:
> > >>                  Estimate Std. Error         df t value Pr(>|t|)
> > >> (Intercept)     -0.354795   0.811871  82.140000  -0.437    0.663
> > >> Length           0.024371   0.003536 106.650000   6.892 4.01e-10 ***
> > >> Res_Sea_Ice_Dur -0.002408   0.002623 107.970000  -0.918    0.361
> > >> Sp_MST           0.014259   0.024197 106.310000   0.589    0.557
> > >> Summer_Rain     -0.005015   0.003536 107.970000  -1.418    0.159
> > >>
> > >>
> > >> Time period 2 output:
> > >> Fixed effects:
> > >>                  Estimate Std. Error        df t value Pr(>|t|)
> > >> (Intercept)    -1.183e+00  3.103e-01 6.650e+00  -3.812 0.007281 **
> > >> Length          1.804e-02  1.623e-03 3.151e+02  11.120  < 2e-16 ***
> > >> Res_Sea_Ice_Dur 2.206e-03  5.929e-04 3.153e+02   3.721 0.000235 ***
> > >> Spring_MST      1.022e-02  7.277e-03 3.150e+02   1.404 0.161319
> > >> Summer_Rain    -1.853e-03  5.544e-04 3.150e+02  -3.343 0.000929 ***
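> > >>
> > >> For concreteness, here is the kind of direct comparison I am unsure
> > >> about: an approximate Wald z test for the difference in one
> > >> coefficient across the two periods (numbers copied from the
> > >> Res_Sea_Ice_Dur rows above; the two periods are disjoint subsets, so
> > >> the two estimates should be independent):
> > >>
> > >> b1 <- -0.002408; se1 <- 0.002623    # time period 1
> > >> b2 <-  0.002206; se2 <- 0.0005929   # time period 2
> > >> z  <- (b1 - b2) / sqrt(se1^2 + se2^2)
> > >> 2 * pnorm(-abs(z))                  # two-sided p for the difference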
> > >>
> > >> Thanks in advance for your time and consideration of this question.
> > >> Karista
> > >>
> >
> > --
> > Karista
>
--
Karista