[R-sig-ME] Using nlme package in an information-theoretic framework, AIC

Douglas Bates bates at stat.wisc.edu
Fri Jan 14 20:12:55 CET 2011


On Wed, Jan 12, 2011 at 10:32 AM, Gustaf Granath
<Gustaf.Granath at ebc.uu.se> wrote:
> Hi,
> I just wanted to point out the paper by Greven and Kneib (2010),
> Biometrika 97(4): 773-789, and their cAIC package
> (http://www.bepress.com/jhubiostat/paper202/).
>
> Btw, the AIC reported by lmer is based on the marginal likelihood, right
> (i.e. mAIC is reported)?

I find such questions difficult to answer because to me there is only
one likelihood.  Once you define a probability model there is only one
way to define the likelihood, which is the value of the probability
density (or mass) function evaluated at the observed responses and
regarded as a function of the parameters.  In this case the density
function is the marginal density of the responses so I guess you could
call it a marginal likelihood, although I can't see any other way of
defining a likelihood.  Even the REML criterion, which is supposed to
determine "restricted maximum likelihood" or "residual maximum
likelihood" estimates, is not a likelihood.
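
To make that concrete, here is a minimal sketch of where the reported
numbers come from, using the sleepstudy data shipped with lme4 (the
example is purely illustrative):

  library(lme4)

  fit_ml   <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
  fit_reml <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = TRUE)

  logLik(fit_ml)    # the (marginal) log-likelihood of the responses
  AIC(fit_ml)       # -2*logLik + 2*(number of estimated parameters)
  logLik(fit_reml)  # the REML criterion, which is not a likelihood

The first two quantities are what AIC comparisons are based on; the last
is a different criterion altogether and should not be mixed into them.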

>
> Cheers,
>
> Gustaf Granath
>
>>
>> On 11-01-04 01:40 PM, Corey VanStratt wrote:
>>> >  Hi Ben,
>>> >
>>> >  Thanks for the response; good to know that there are not any
>>> >  immediate procedural issues.
>>> >
>>> >  I've looked back through some of the threads and personal
>>> >  correspondences, and I think the main issue raised was that it can be
>>> >  challenging to count the 'degrees of freedom'. For example, from a
>>> >  personal communication: "A second point is that you should proceed a
>>> >  bit cautiously using the AIC that's being outputted in R.  The reason
>>> >  for this is that it's the fixed-effects version of AIC, and what
>>> >  you're wanting is a random-effects version...  The fixed-effects
>>> >  version only adds a penalty of 1 parameter for each type of random
>>> >  effect, which is an underestimate of real complexity." However, from
>>> >  your response, I may not have to worry about this given that I'm
>>> >  comparing models with the same random effects?
>>    Yes, although you still need to count degrees of freedom ascribed to
>> random effects in order to determine your 'residual degrees of freedom'
>> for AICc; this leads to an even murkier corner of the literature,
>> because with blocked and/or correlated data it's also not really clear
>> (as I mentioned obliquely) how to define your 'total number of data
>> points' either.  (AICc was defined for linear models in the first place,
>> and has been stretched way beyond its original derivation, although it's
>> not clear to me which is the lesser of two evils -- pretending
>> you have an unlimited amount of data or using a poorly supported
>> finite-size correction ...)
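>>    (To see the size of the ambiguity numerically, a hand-rolled AICc helps;
>> this is only a sketch, where 'fit' is any of the ML fits and the choice of n
>> -- 959 observations or 65 birds -- is exactly the murky part:
>>
>>    aicc <- function(fit, n) {
>>      k <- attr(logLik(fit), "df")  # parameters as counted by the software
>>      AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
>>    }
>>
>> With k somewhere around 20 for your global model, the correction is small
>> for n = 959 but large for n = 65, and it can reorder models that differ
>> in k.)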
>>    If you were comparing models with different numbers of random effects,
>> I think Vaida and Blanchard's (2005) paper on conditional AIC actually
>> gives a pretty strong justification that if you're interested in the
>> population level (and not predictions for the individual random-effects
>> units) that counting variance parameters rather than something more
>> complicated is the right way to go.
>>     If this were really important and you were willing to go to a lot of
>> trouble to make sure the answer was as correct as modern statistical
>> technology could make it, I would probably try some sort of
>> cross-validation approach.
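>>    (If it helps, a leave-one-bird-out version of that idea might look
>> roughly like the sketch below, where 'form', 'data1' and 'exp2' stand in
>> for the actual fixed-effects formula, data and correlation structure, and
>> level = 0 gives population-level predictions:
>>
>>    library(nlme)
>>    ids <- unique(as.character(data1$ID))
>>    loo_rmse <- sapply(ids, function(id) {
>>      fit  <- lme(form, data = subset(data1, ID != id),
>>                  random = ~ Date + DateSq | ID,
>>                  correlation = exp2, method = "ML")
>>      test <- subset(data1, ID == id)
>>      sqrt(mean((test$MINh - predict(fit, newdata = test, level = 0))^2))
>>    })
>>    mean(loo_rmse)  # compare across candidate fixed-effect structures
>>
>> although that answers a prediction question rather than handing you an
>> AIC.)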
>>     If it makes you feel any better, I think the problem with AIC[c] in
>> this context is not that nlme does something wrong, but that nobody
>> really knows what the right answer is.  (That might make you feel worse
>> rather than better.)
>>
>>> >  And to answer your other question: yes, we are using multi-model
>>> >  inference.  We determined parameter likelihood values by summing the
>>> >  weights of all models that a particular parameter of interest
>>> >  occurred in.  We calculated model averaged parameter estimates and
>>> >  unconditional standard errors for all predictors and we used these
>>> >  values as our prediction values in the models.
>>    Multi-model inference is definitely better than model selection, but
>> there are a number of things I don't like about this approach (but again
>> be aware that I am probably in the minority).
>>
>>     1. "Parameter likelihood values" are not in themselves ridiculous,
>> but they tend to get overinterpreted.  In my opinion (I'd be happy to be
>> corrected) they don't have a coherent statistical meaning; you could
>> consider them a quasi-pseudo-Bayesian estimate of the probability that a
>> parameter 'belongs in the true model', but I don't even want to list how
>> many problems there are with that definition ...  How are you
>> interpreting them?
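>>    (For what it's worth, computationally those values are just summed
>> Akaike weights; in this sketch 'aicc_vals' and 'has_date' are hypothetical
>> vectors over the 16 candidate models:
>>
>>    delta <- aicc_vals - min(aicc_vals)
>>    w     <- exp(-delta / 2) / sum(exp(-delta / 2))  # Akaike weights
>>    sum(w[has_date])  # summed weight ('importance') of the DATE term
>>
>> which also makes it clear that the number depends as much on how the
>> candidate set was constructed as on the data.)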
>>
>>    2. It's tricky to combine parameters from different models whenever
>> the parameters potentially have different meanings in the different
>> models (e.g. when there are interactions in some of the models and not
>> in others, or when some of the predictors are collinear).  It's best to
>> compute model averages for things that have real biological meaning and
>> interest, such as (in your case) foraging rates.  If you want to compute
>> model-averaged effects of different environmental conditions on foraging
>> rates, that may indeed mean that you want to model-average parameters,
>> but you should be careful that the model parameters are really
>> interpretable in similar ways across the models you're considering.
>>    Just don't take your model-averaged parameters and attribute
>> "significance" to whether the model-averaged confidence intervals
>> overlap zero, because then you are truly doing hypothesis testing by the
>> back door ...
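>>    (For reference, the usual Burnham & Anderson recipe for a single
>> coefficient is just a weighted average; in this sketch 'est' and 'se' are
>> hypothetical vectors of that coefficient's estimate and standard error in
>> each model containing it, and 'w' the Akaike weights renormalized over
>> those models:
>>
>>    w        <- w / sum(w)
>>    beta_avg <- sum(w * est)                              # model-averaged estimate
>>    se_unc   <- sum(w * sqrt(se^2 + (est - beta_avg)^2))  # 'unconditional' SE
>>
>> The warning above is precisely about whether 'est' means the same thing in
>> each of those models.)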
>>
>>> >  Ultimately, we'll be
>>> >  comparing rates of foraging at this site to other sites along the
>>> >  wintering latitudinal cline (these ducks winter from Alaska down to
>>> >  Baja), but we're proceeding with the understanding that as far as
>>> >  overall population inference goes, we're limited in our inference due
>>> >  to limited sites (n=5).
>>    Is there geographic structure to your sites?  You might squeeze out a
>> tiny bit more information by incorporating this in the model (rather
>> than 'just' the exponential spatial correlation you're currently using).
>>
>>> >  Thanks again for commenting on the overall approach that I've used
>>> >  thus far.
>>> >
>>> >  Cheers, Corey
>>> >
>>> >
>>> >  On 11-01-03 06:51 PM, Corey VanStratt wrote:
>>>> >>  Dear Listers,
>>>> >>
>>>> >>  I am studying the foraging effort of surf scoters (a sea duck)
>>>> >>  during the winter period @ a site in southeast Alaska. This
>>>> >>  particular species only dives for its food, and therefore the
>>>> >>  response of interest is both minutes underwater per hour and
>>>> >>  minutes underwater per day.  These 2 metrics provide a reliable
>>>> >>  estimate of how often these birds are feeding during the winter,
>>>> >>  and they can be compared to estimates collected in the same manner
>>>> >>  at different winter sites. To collect these data, we attached VHF
>>>> >>  transmitters to 110 individuals; we monitored individuals during
>>>> >>  1-hour foraging observations; we knew a bird was diving when a
>>>> >>  signal was lost, as the VHF radio signal does not transmit through
>>>> >>  water.  For each observation, we calculated total minutes
>>>> >>  underwater/hour and extrapolated to minutes underwater/day (these
>>>> >>  ducks are diurnal foragers, so we based daily rates on the number of
>>>> >>  daylight hours).  Of the 110 individuals marked, we were able to
>>>> >>  get at least 2 observations on 65 individuals. Over 2 seasons,
>>>> >>  there were a total of 959 1-hour observations on these 65
>>>> >>  individuals.
>>>> >>
>>>> >>  The data set was unbalanced and we observed the same individuals
>>>> >>  over time: the number of observation periods differed among
>>>> >>  individuals, as did the time between observations within and among
>>>> >>  individuals. Therefore, we required a mixed modeling procedure to
>>>> >>  account for these repeated measures of individuals.
>>>> >>  We used the nlme package in R to specify individual as a random
>>>> >>  effect.
>>>> >>
>>>> >>  Foraging effort variation at all sites was evaluated using an
>>>> >>  information-theoretic approach.  Therefore, rather than 'model
>>>> >>  building', as is described in much of the nlme documentation and
>>>> >>  resources, I am testing several pre-determined, biologically
>>>> >>  plausible models. However, I have read in prior posts that the AIC
>>>> >>  values from this package may not be reliable, and I have been using
>>>> >>  these values to compare the models. So, I have three main
>>>> >>  questions:
>>>> >>
>>>> >>  1.) Does the process I describe below to analyze these data make
>>>> >>  sense?
>>>> >>
>>>> >>  2.) In the case of this data set and the candidate model set, is
>>>> >>  it appropriate to use the given AIC values from the nlme output
>>>> >>  for determining which model is most parsimonious?
>>>> >>
>>>> >>  AND
>>>> >>
>>>> >>  3.) If it is not appropriate to use these AIC values, what is the
>>>> >>  best alternative (e.g. how could I obtain AIC values that are
>>>> >>  reliable)?  So, moving on to the process:
>>>> >>
>>>> >>  We evaluated the variation in response of both minutes underwater
>>>> >>  per hr and per day in relation to several covariates.  We
>>>> >>  considered 16 biologically plausible a priori candidate models (see
>>>> >>  below).  We explored date as an explanatory variable, and we
>>>> >>  included it as both a date and a date-squared term (referred to jointly
>>>> >>  as DATE), to allow foraging effort to vary non-linearly
>>>> >>  over time; we adjusted all date values in both years to correspond
>>>> >>  to 2 December = day 1.  We always considered sex (male or female)
>>>> >>  and age [hatch year (HY) or after hatch year (AHY)] together as
>>>> >>  COHORT, as these four groups often behave differently.  We included
>>>> >>  a YEAR term (2008/09 or 2009/10) to account for variation between
>>>> >>  years.  We considered the following environmental factors (ENVIRON)
>>>> >>  in each model except for the true null: tide height (m),
>>>> >>  temperature (°C), time of day of observation, and wind speed
>>>> >>  (categorical with three levels: low, mid, high).  We did this both
>>>> >>  to limit the number of models in our set and because we merely
>>>> >>  wanted to account for environmental effects. Finally, we included
>>>> >>  the interaction effect terms COHORT*DATE and COHORT*YEAR, because
>>>> >>  we believed it quite plausible that the different cohorts would
>>>> >>  respond to environmental pressures differently both across time
>>>> >>  within a winter season and between the winter seasons.
>>>> >>
>>>> >>  MODEL   EXPLANATORY VARIABLES
>>> >
>>> >  fmN     Null
>>> >  fm1     Environ
>>> >  fm2     Date + environ
>>> >  fm3     Cohort + environ
>>> >  fm4     Year + environ
>>> >  fm5     Date + cohort + environ
>>> >  fm6     Date + year + environ
>>> >  fm7     Cohort + year + environ
>>> >  fm8     Date + cohort + year + environ
>>> >  fm9     Date + environ + cohort x date
>>> >  fm10    Year + environ + cohort x date
>>> >  fm11    Date + cohort + environ + cohort x date
>>> >  fm12    Date + cohort + year + environ + cohort x date
>>> >  fm13    Cohort + environ + cohort x year
>>> >  fm14    Date + cohort + environ + cohort x year
>>> >  fm15    Date + cohort + year + environ + cohort x date + cohort x year
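>>> >
>>> >  (A sketch of how a candidate set like this can be fitted and tabulated
>>> >  in one pass; 'forms' would be a named list holding the 16 fixed-effect
>>> >  formulas and 'exp2' the chosen correlation structure:
>>> >
>>> >  library(nlme)
>>> >  fits <- lapply(forms, function(f)
>>> >    lme(f, data = data1, random = ~ Date + DateSq | ID,
>>> >        correlation = exp2, method = "ML"))
>>> >  data.frame(model = names(forms), AIC = sapply(fits, AIC))
>>> >
>>> >  with the AICc correction and weights applied to that table afterwards,
>>> >  as described below.)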
>>> >
>>>> >>  For each response variable, we first ran the most global model
>>>> >>  (fm15) in our set using a model with no random effects, a model
>>>> >>  with a random intercept (allowed intercept to vary among
>>>> >>  individuals), and a model with a random intercept and slope
>>>> >>  (allowed both intercept to vary among individuals and slope to vary
>>>> >>  within each individual over DATE).  'ID' in the random effects
>>>> >>  refers to each of the 65 uniquely identified ducks.  Using an AIC
>>>> >>  approach, we found that the random intercept and slope model was
>>>> >>  the most parsimonious.  Proceeding with this global model and
>>>> >>  continuing with an AIC information-theoretic approach, we ran the
>>>> >>  global model (inc. the random intercept/slope) against several
>>>> >>  different feasible covariance structures, and found that for both
>>>> >>  response variables (minutes underwater hr-1 and day-1), an
>>>> >>  exponential covariance structure was the best-fitting based on AIC
>>>> >>  values.  Therefore, we applied this covariance structure to all of
>>>> >>  the models in the candidate model set and proceeded with the
>>>> >>  overall model selection process. The final, most highly
>>>> >>  parameterized model, as specified in R, is as follows:
>>>> >>
>>>> >>  fm15 <- lme(fixed = MINh ~ (Date + DateSq)*Cohort + Cohort*Year +
>>>> >>                Tide + Temp + factor(Sea) + Time.start,
>>>> >>              data = data1, random = ~ Date + DateSq | ID,
>>>> >>              correlation = exp2, method = "ML")
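>>>> >>
>>>> >>  (The two preliminary comparisons described above looked roughly like
>>>> >>  the following sketch, with 'global_form' standing in for the fm15
>>>> >>  fixed-effects formula and corAR1 shown only as one example of an
>>>> >>  alternative correlation structure:
>>>> >>
>>>> >>  g0 <- gls(global_form, data = data1, method = "ML")    # no random effects
>>>> >>  g1 <- lme(global_form, data = data1,
>>>> >>            random = ~ 1 | ID, method = "ML")             # random intercept
>>>> >>  g2 <- lme(global_form, data = data1,
>>>> >>            random = ~ Date + DateSq | ID, method = "ML") # intercept + slope
>>>> >>  AIC(g0, g1, g2)
>>>> >>
>>>> >>  g2exp <- update(g2, correlation = corExp(form = ~ Date | ID))
>>>> >>  g2ar1 <- update(g2, correlation = corAR1(form = ~ 1 | ID))
>>>> >>  AIC(g2, g2exp, g2ar1)
>>>> >>
>>>> >>  with the exponential structure coming out best on AIC for both
>>>> >>  responses, as noted above.)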
>>>> >>
>>>> >>  We calculated Akaike's Information Criterion adjusted for
>>>> >>  small sample size (AICc) for all of the models in the candidate model
>>>> >>  set from the AIC values given in the lme output.  We then examined
>>>> >>  relative support for each model using ΔAICc and AICc weights
>>>> >>  (wi).
>>>> >>
>>>> >>  So, again, being relatively new to both R and mixed modeling, I
>>>> >>  would much appreciate it if those with a greater understanding of
>>>> >>  the process could tell me whether the process I have used is indeed
>>>> >>  appropriate/makes sense (as I have not used the model-building
>>>> >>  exercise that is often described when using this package), and whether
>>>> >>  the AIC values can in fact be used with relative confidence.
>>>> >>
>>>> >>  Thanks so much for any insight, Corey VanStratt
>>>> >>
>>>> >>
>>> >  Example of output from fm15:
>>> >
>>> >  Linear mixed-effects model fit by maximum likelihood
>>> >    Data: data1
>>> >         AIC      BIC    logLik
>>> >    5723.309 5835.224 -2838.654
>>> >
>>> >  Random effects:
>>> >   Formula: ~DateB + DateSq | ID
>>> >   Structure: General positive-definite, Log-Cholesky parametrization
>>> >              StdDev      Corr
>>> >  (Intercept) 3.500360472 (Intr) DateB
>>> >  DateB       0.175541462 -0.680
>>> >  DateSq      0.001982993  0.576 -0.967
>>> >  Residual    4.185115132
>>> >
>>> >  Correlation Structure: Exponential spatial correlation
>>> >   Formula: ~DateB | ID
>>> >   Parameter estimate(s):
>>> >      range
>>> >   0.674106
>>> >
>>> >  Fixed effects: MINh ~ DateB + DateSq + Cohort + (Cohort):(Season) + Tide +
>>> >                 Temp + factor(Sea) + Time.start
>>> >                          Value Std.Error  DF   t-value p-value
>>> >  (Intercept)          9.972296  2.624219 887  3.800100  0.0002
>>> >  DateB                0.025724  0.031821 887  0.808423  0.4191
>>> >  DateSq               0.000128  0.000336 887  0.379635  0.7043
>>> >  CohortAHY M         -7.337342  3.060351  57 -2.397549  0.0198
>>> >  CohortHY F           4.634536  4.220465  57  1.098110  0.2768
>>> >  CohortHY M          -7.870449  3.346530  57 -2.351824  0.0222
>>> >  Tide                 0.157180  0.088498 887  1.776074  0.0761
>>> >  Temp                -0.009083  0.039280 887 -0.231239  0.8172
>>> >  factor(Sea)6         0.111870  2.194239 887  0.050983  0.9594
>>> >  factor(Sea)7-8       3.294262  1.961388 887  1.679556  0.0934
>>> >  Time.start           0.087318  0.068847 887  1.268298  0.2050
>>> >  CohortAHY F:Season  -1.925349  1.412226  57 -1.363344  0.1781
>>> >  CohortAHY M:Season   1.804896  1.235439  57  1.460935  0.1495
>>> >  CohortHY F:Season   -4.522203  1.914241  57 -2.362400  0.0216
>>> >  CohortHY M:Season    2.323518  1.469479  57  1.581185  0.1194
>>> >   Correlation: (not shown)
>>> >
>>> >  Standardized Within-Group Residuals:
>>> >         Min         Q1        Med         Q3        Max
>>> >  -2.5466867 -0.6254431 -0.1261256  0.5150277  3.4233514
>>> >
>>> >  ===================================
>>> >
>>> >  As far as I can tell, this all looks pretty reasonable.
>>> >
>>> >  General questions/comments:
>>> >
>>> >
>>> >  - What are your concerns about unreliable AIC values?  Can you quote
>>> >    a specific post to this list?  There are various concerns with AICs
>>> >    in the mixed model context, but I don't know that any of them seems
>>> >    particularly awful in the context you describe:
>>> >    - there is a general issue when testing against null hypotheses on
>>> >      the edge of their feasible space (e.g. testing whether variances
>>> >      are equal to zero or not), which also affects AICs (because the
>>> >      approximations on which AICs are based rely on expanding *around*
>>> >      the model);
>>> >    - it can be hard to count 'degrees of freedom' for random effects,
>>> >      both for comparing models with different random effects (here
>>> >      Vaida and Blanchard's 'conditional AIC' is useful, but I think
>>> >      you're interested in population-level predictions, so the relevant
>>> >      number of parameters would be approx. 1) and for calculating
>>> >      residual degrees of freedom for finite-size corrections such as
>>> >      AICc.  Your spatial correlation term also affects this
>>> >      consideration, because your data points are not entirely
>>> >      independent (and hence you have fewer 'effective' residual
>>> >      degrees of freedom).
>>> >  - What kind of inference are you making *after* you go through the
>>> >    model selection procedure?  Are you planning to use multi-model
>>> >    inference?  The biggest mistake that you might be making (can't
>>> >    tell from your description) would be to take the selected model and
>>> >    calculate confidence intervals for it, disregarding the model
>>> >    selection uncertainty.  Or are you using the AICc values to
>>> >    (effectively) test hypotheses about which variables are
>>> >    'important'?  I have a bit of a problem with this approach (I think
>>> >    if you're going to test hypotheses you should do it explicitly, not
>>> >    use an approach that is based on maximizing expected predictive
>>> >    accuracy), but I'm quite in the minority in that opinion.
>>> >
>>> >  Bottom line: I would strongly encourage you to supplement what
>>> >  you've done by exploring the data, and the predictions/residuals from
>>> >  the model graphically, and I am curious what kind of inferences you
>>> >  plan to make from these data, but overall your approach seems to make
>>> >  sense -- no big red flags going up for me.
>>> >
>>> >  Others may see something I missed, or have different opinions.
>>> >
>>> >  Ben Bolker
>>> >
>>> >
>>> >
>



