[R-sig-ME] Using nlme package in an information-theoretic framework, AIC
Douglas Bates
bates at stat.wisc.edu
Fri Jan 14 20:12:55 CET 2011
On Wed, Jan 12, 2011 at 10:32 AM, Gustaf Granath
<Gustaf.Granath at ebc.uu.se> wrote:
> Hi,
> I just wanted to point out the paper by Greven and Kneib 2010,
> /Biometrika/, 97(4): 773-789, and their cAIC package
> (http://www.bepress.com/jhubiostat/paper202/).
>
> Btw, the AIC reported by lmer is based on the marginal likelihood, right
> (i.e. mAIC is reported)?
I find such questions difficult to answer because to me there is only
one likelihood. Once you define a probability model there is only one
way to define the likelihood, which is the value of the probability
density (or mass) function evaluated at the observed responses and
regarded as a function of the parameters. In this case the density
function is the marginal density of the responses so I guess you could
call it a marginal likelihood, although I can't see any other way of
defining a likelihood. Even the REML criterion, which is supposed to
determine "restricted maximum likelihood" or "residual maximum
likelihood" estimates, is not a likelihood.
>
> Cheers,
>
> Gustaf Granath
>
>> On 11-01-04 01:40 PM, Corey VanStratt wrote:
>>> > Hi Ben,
>>> >
>>> > Thanks for the response; good to know that there are not any
>>> > immediate procedural issues.
>>> >
>>> > I've looked back through some of the threads and personal
>>> > correspondences, and I think the main issue raised was that it can be
>>> > challenging to count the 'degrees of freedom'. For example, from a
>>> > personal communication: "A second point is that you should proceed a
>>> > bit cautiously using the AIC that's being outputted in R. The reason
>>> > for this is that it's the fixed-effects version of AIC, and what
>>> > you're wanting is a random-effects version... The fixed-effects
>>> > version only adds a penalty of 1 parameter for each type of random
>>> > effect, which is an underestimate of real complexity." However, from
>>> > your response, I may not have to worry about this given that I'm
>>> > comparing models with the same random effects?
>> Yes, although you still need to count degrees of freedom ascribed to
>> random effects in order to determine your 'residual degrees of freedom'
>> for AICc; this leads to an even murkier corner of the literature,
>> because with blocked and/or correlated data it's also not really clear
>> (as I mentioned obliquely) how to define your 'total number of data
>> points' either. (AICc was defined for linear models in the first place,
>> and has been stretched way beyond its original derivation, although it's
>> not clear to me which is the lesser of two evils -- pretending
>> you have an unlimited amount of data or using a poorly supported
>> finite-size correction ...)
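>>
>> (As a concrete, hedged sketch of that finite-size correction: the
>> AICc formula itself is standard, but n is deliberately left as an
>> argument because its definition is exactly the murky part.)
>>
>> aicc <- function(fit, n) {
>>   ## k counts each random-effects variance/correlation parameter as 1
>>   k <- attr(logLik(fit), "df")
>>   AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
>> }
>> ## e.g. aicc(fm15, n = 959) vs. aicc(fm15, n = 65) bracket the extremes
>> ## of 'total number of data points' for 959 observations of 65 ducks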
>> If you were comparing models with different numbers of random effects,
>> I think Vaida and Blanchard's (2005?) paper on conditional AIC actually
>> gives a pretty strong justification that, if you're interested in the
>> population level (and not predictions for the individual random-effects
>> units), counting variance parameters rather than something more
>> complicated is the right way to go.
>> If this were really important and you were willing to go to a lot of
>> trouble to make sure the answer was as correct as modern statistical
>> technology could make it, I would probably try some sort of
>> cross-validation approach.
>> If it makes you feel any better, I think the problem with AIC[c] in
>> this context is not that nlme does something wrong, but that nobody
>> really knows what the right answer is. (That might make you feel worse
>> rather than better.)
>>
>>> > And to answer your other question: yes, we are using multi-model
>>> > inference. We determined parameter likelihood values by summing the
>>> > weights of all models that a particular parameter of interest
>>> > occurred in. We calculated model averaged parameter estimates and
>>> > unconditional standard errors for all predictors and we used these
>>> > values as our prediction values in the models.
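>>> >
>>> > (A minimal sketch of that summed-weights computation, assuming 'w'
>>> > is the vector of Akaike weights and 'terms.list' gives each model's
>>> > terms; the names here are illustrative only:)
>>> >
>>> > has.date <- sapply(terms.list, function(tt) "Date" %in% tt)
>>> > sum(w[has.date])  # 'parameter likelihood' (importance weight) for Date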
>> Multi-model inference is definitely better than model selection, but
>> there are a number of things I don't like about this approach (but again
>> be aware that I am probably in the minority).
>>
>> 1. "Parameter likelihood values" are not in themselves ridiculous,
>> but they tend to get overinterpreted. In my opinion (I'd be happy to be
>> corrected) they don't have a coherent statistical meaning; you could
>> consider them a quasi-pseudo-Bayesian estimate of the probability that a
>> parameter 'belongs in the true model', but I don't even want to list how
>> many problems there are with that definition ... How are you
>> interpreting them?
>>
>> 2. It's tricky to combine parameters from different models whenever
>> the parameters potentially have different meanings in the different
>> models (e.g. when there are interactions in some of the models and not
>> in others, or when some of the predictors are collinear). It's best to
>> compute model averages for things that have real biological meaning and
>> interest, such as (in your case) foraging rates. If you want to compute
>> model-averaged effects of different environmental conditions on foraging
>> rates, that may indeed mean that you want to model-average parameters,
>> but you should be careful that the model parameters are really
>> interpretable in similar ways across the models you're considering.
>> Just don't take your model-averaged parameters and attribute
>> "significance" to whether the model-averaged confidence intervals
>> overlap zero, because then you are truly doing hypothesis testing by the
>> back door ...
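>>
>> (For what it's worth, a minimal sketch of averaging predictions rather
>> than parameters; 'fits' and 'newdat' stand in for your candidate model
>> list and a data frame of conditions to predict for.)
>>
>> fits <- list(fmN, fm1, fm2)                # assumed: fitted lme models
>> aic  <- sapply(fits, AIC)
>> w    <- exp(-(aic - min(aic)) / 2)
>> w    <- w / sum(w)                         # Akaike weights
>> pred <- sapply(fits, predict, newdata = newdat, level = 0)  # population level
>> avg  <- drop(pred %*% w)                   # model-averaged foraging rates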
>>
>>> > Ultimately, we'll be
>>> > comparing rates of foraging at this site to other sites along the
>>> > wintering latitudinal cline (these ducks winter from Alaska down to
>>> > Baja), but we're proceeding with the understanding that as far as
>>> > overall population inference goes, we're limited in our inference due
>>> > to limited sites (n=5).
>> Is there geographic structure to your sites? You might squeeze out a
>> tiny bit more information by incorporating this in the model (rather
>> than 'just' the exponential spatial correlation you're currently using).
>>
>>> > Thanks again for commenting on the overall approach that I've used
>>> > thus far.
>>> >
>>> > Cheers, Corey
>>> >
>>> >
>>> > Message: 3
>>> > Date: Mon, 03 Jan 2011 20:34:26 -0500
>>> > From: Ben Bolker <bbolker at gmail.com>
>>> > To: r-sig-mixed-models at r-project.org
>>> > Subject: Re: [R-sig-ME] Using nlme package in an information-theoretic framework, AIC
>>> > Message-ID: <4D227922.6030206 at gmail.com>
>>> >
>>> > On 11-01-03 06:51 PM, Corey VanStratt wrote:
>>>> >> Dear Listers,
>>>> >>
>>>> >> I am studying the foraging effort of surf scoters (a sea duck)
>>>> >> during the winter period at a site in southeast Alaska. This
>>>> >> particular species only dives for its food, and therefore the
>>>> >> response of interest is both minutes underwater per hour and
>>>> >> minutes underwater per day. These 2 metrics provide a reliable
>>>> >> estimate of how often these birds are feeding during the winter,
>>>> >> and they can be compared to estimates collected in the same manner
>>>> >> at different winter sites. To collect these data, we attached VHF
>>>> >> transmitters to 110 individuals; we monitored individuals during
>>>> >> 1-hour foraging observations; we knew a bird was diving when a
>>>> >> signal was lost, as the VHF radio signal does not transmit through
>>>> >> water. For each observation, we calculated total minutes
>>>> >> underwater/hour and extrapolated to minutes underwater/day (these
>>>> >> ducks are diurnal foragers, so we based daily rates on the amount of
>>>> >> daylight hours). Of the 110 individuals marked, we were able to
>>>> >> get at least 2 observations on 65 individuals. Over 2 seasons,
>>>> >> there were a total of 959 1-hour observations on these 65
>>>> >> individuals.
>>>> >>
>>>> >> The data set was unbalanced and we observed the same individuals
>>>> >> over time: the number of observation periods was not equal among
>>>> >> individuals, and the time between observations within and among
>>>> >> individuals was not equal. These repeated measures of individuals
>>>> >> required a mixed modeling procedure, so we used the nlme package
>>>> >> in R to specify individual as a random effect.
>>>> >>
>>>> >> Foraging effort variation at all sites was evaluated using an
>>>> >> information-theoretic approach. Therefore, rather than 'model
>>>> >> building', as is described in much of the nlme documentation and
>>>> >> resources, I am testing several pre-determined, biologically
>>>> >> plausible models. However, I have read in prior posts that AIC
>>>> >> values from this package may not be reliable, and I have been using
>>>> >> these values to compare the models. So, I have three main
>>>> >> questions:
>>>> >>
>>>> >> 1.) Does the process I describe below to analyze these data make
>>>> >> sense?
>>>> >>
>>>> >> 2.) In the case of this data set and the candidate model set, is
>>>> >> it appropriate to use the given AIC values from the nlme output
>>>> >> for determining which model is most parsimonious?
>>>> >>
>>>> >> AND
>>>> >>
>>>> >> 3.) If it is not appropriate to use these AIC values, what is the
>>>> >> best alternative (e.g. how could I obtain AIC values that are
>>>> >> reliable)? So, moving onto the process:
>>>> >>
>>>> >> We evaluated the variation in response of both minutes underwater
>>>> >> per hr and per day in relation to several covariates. We
>>>> >> considered 16 biologically plausible a priori candidate models (see
>>>> >> below). We explored date as an explanatory variable, including
>>>> >> both a date and a date-squared term (together referred to as
>>>> >> DATE) to allow foraging effort to vary non-linearly
>>>> >> over time; we adjusted all date values in both years to correspond
>>>> >> to 2 December = day 1. We always considered sex (male or female)
>>>> >> and age [hatch year (HY) or after hatch year (AHY)] together as
>>>> >> COHORT, as these four groups often behave differently. We included
>>>> >> a YEAR term (2008/09 or 2009/10) to account for variation between
>>>> >> years. We considered the following environmental factors (ENVIRON)
>>>> >> in each model except for the true null: tide height (m),
>>>> >> temperature (°C), time of day of observation, and wind speed
>>>> >> (categorical with three levels: low, mid, high). We did this both to
>>>> >> limit the number of models in our set and because we merely
>>>> >> wanted to account for environmental effects. Finally, we included
>>>> >> the interaction effect terms COHORT*DATE and COHORT*YEAR, because
>>>> >> we believed it quite feasible that the different cohorts would
>>>> >> respond to environmental pressures differently both across time
>>>> >> within a winter season and between the winter seasons.
>>>> >>
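>>>> >> (A hedged sketch of how these derived covariates might be coded;
>>>> >> the raw variable name ObsDate and the single-season offset are
>>>> >> assumptions for illustration, not our actual code.)
>>>> >>
>>>> >> data1$Date   <- as.numeric(data1$ObsDate - as.Date("2008-12-01"))  # 2 Dec = day 1
>>>> >> data1$DateSq <- data1$Date^2                          # quadratic DATE term
>>>> >> data1$Cohort <- factor(paste(data1$Age, data1$Sex))   # "AHY F", ..., "HY M"
>>>> >>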
>>>> >> MODEL  EXPLANATORY VARIABLES
>>>> >> fmN    Null
>>>> >> fm1    Environ
>>>> >> fm2    Date + environ
>>>> >> fm3    Cohort + environ
>>>> >> fm4    Year + environ
>>>> >> fm5    Date + cohort + environ
>>>> >> fm6    Date + year + environ
>>>> >> fm7    Cohort + year + environ
>>>> >> fm8    Date + cohort + year + environ
>>>> >> fm9    Date + environ + cohort x date
>>>> >> fm10   Year + environ + cohort x date
>>>> >> fm11   Date + cohort + environ + cohort x date
>>>> >> fm12   Date + cohort + year + environ + cohort x date
>>>> >> fm13   Cohort + environ + cohort x year
>>>> >> fm14   Date + cohort + environ + cohort x year
>>>> >> fm15   Date + cohort + year + environ + cohort x date + cohort x year
>>> >
>>>> >> For each response variable, we first ran the most global model
>>>> >> (fm15) in our set three ways: with no random effects, with a random
>>>> >> intercept (allowing the intercept to vary among individuals), and
>>>> >> with a random intercept and slope (allowing both the intercept and
>>>> >> the slope over DATE to vary among individuals). 'ID' in the random effects
>>>> >> refers to each of the 65 uniquely identified ducks. Using an AIC
>>>> >> approach, we found that the random intercept and slope model was
>>>> >> the most parsimonious. Proceeding with this global model and
>>>> >> continuing with an AIC information-theoretic approach, we ran the
>>>> >> global model (inc. the random intercept/slope) against several
>>>> >> different feasible covariance structures, and found that for both
>>>> >> response variables (minutes underwater hr-1 and day-1), an
>>>> >> exponential covariance structure was the best-fitting based on AIC
>>>> >> values. Therefore, we applied this covariance structure to all of
>>>> >> the models in the candidate model set and proceeded with the
>>>> >> overall best-fitting model selection process. The final, most heavily
>>>> >> parameterized model, as specified in R, is as follows:
>>>> >>
>>>> >> fm15 <- lme(fixed = MINh ~ (Date + DateSq) * Cohort + Cohort * Year +
>>>> >>                     Tide + Temp + factor(Sea) + Time.start,
>>>> >>             data = data1,
>>>> >>             random = ~ Date + DateSq | ID, corr = exp2, method = "ML")
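>>>> >>
>>>> >> ('exp2' is not defined in this post; a hedged sketch of how it and
>>>> >> the two selection steps described above might look:)
>>>> >>
>>>> >> exp2 <- corExp(form = ~ Date | ID)           # assumed definition
>>>> >> ## step 1: random-effects structure, global fixed effects throughout
>>>> >> fm15.int <- update(fm15, random = ~ 1 | ID)  # intercept-only alternative
>>>> >> AIC(fm15, fm15.int)
>>>> >> ## step 2: residual correlation structure
>>>> >> fm15.gaus <- update(fm15, corr = corGaus(form = ~ Date | ID))
>>>> >> AIC(fm15, fm15.gaus)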
>>>> >>
>>>> >> We calculated Akaike's Information Criterion adjusted for
>>>> >> small sample size (AICc) for all of the models in the candidate model
>>>> >> set from the AIC values given in the lme output. We then examined
>>>> >> relative support for each model using ΔAICc and AICc weights
>>>> >> (wi).
>>>> >>
>>>> >> So, again, being relatively new to both R and mixed modeling, I
>>>> >> would be much appreciative if those with a greater understanding of
>>>> >> the process could inform me if the process I have used indeed is
>>>> >> appropriate/makes sense (as I have not used a model building
>>>> >> exercise that is often described when using this package), and if
>>>> >> the AIC values can in fact be used with relative confidence.
>>>> >>
>>>> >> Thanks so much for any insight, Corey VanStratt
>>>> >>
>>>> >>
>>> > Example of output from fm15:
>>> >
>>> > Linear mixed-effects model fit by maximum likelihood
>>> >  Data: data1
>>> >       AIC      BIC    logLik
>>> >  5723.309 5835.224 -2838.654
>>> >
>>> > Random effects:
>>> >  Formula: ~DateB + DateSq | ID
>>> >  Structure: General positive-definite, Log-Cholesky parametrization
>>> >             StdDev      Corr
>>> > (Intercept) 3.500360472 (Intr) DateB
>>> > DateB       0.175541462 -0.680
>>> > DateSq      0.001982993  0.576 -0.967
>>> > Residual    4.185115132
>>> >
>>> > Correlation Structure: Exponential spatial correlation
>>> >  Formula: ~DateB | ID
>>> >  Parameter estimate(s):
>>> >    range
>>> > 0.674106
>>> >
>>> > Fixed effects: MINh ~ DateB + DateSq + Cohort + (Cohort):(Season) +
>>> >                Tide + Temp + factor(Sea) + Time.start
>>> >                        Value Std.Error  DF   t-value p-value
>>> > (Intercept)         9.972296  2.624219 887  3.800100  0.0002
>>> > DateB               0.025724  0.031821 887  0.808423  0.4191
>>> > DateSq              0.000128  0.000336 887  0.379635  0.7043
>>> > CohortAHY M        -7.337342  3.060351  57 -2.397549  0.0198
>>> > CohortHY F          4.634536  4.220465  57  1.098110  0.2768
>>> > CohortHY M         -7.870449  3.346530  57 -2.351824  0.0222
>>> > Tide                0.157180  0.088498 887  1.776074  0.0761
>>> > Temp               -0.009083  0.039280 887 -0.231239  0.8172
>>> > factor(Sea)6        0.111870  2.194239 887  0.050983  0.9594
>>> > factor(Sea)7-8      3.294262  1.961388 887  1.679556  0.0934
>>> > Time.start          0.087318  0.068847 887  1.268298  0.2050
>>> > CohortAHY F:Season -1.925349  1.412226  57 -1.363344  0.1781
>>> > CohortAHY M:Season  1.804896  1.235439  57  1.460935  0.1495
>>> > CohortHY F:Season  -4.522203  1.914241  57 -2.362400  0.0216
>>> > CohortHY M:Season   2.323518  1.469479  57  1.581185  0.1194
>>> > Correlation: (matrix omitted)
>>> >
>>> > Standardized Within-Group Residuals:
>>> >        Min         Q1        Med         Q3        Max
>>> > -2.5466867 -0.6254431 -0.1261256  0.5150277  3.4233514
>>> >
>>> > ===================================
>>> >
>>> > As far as I can tell, this all looks pretty reasonable.
>>> >
>>> > General questions/comments:
>>> >
>>> >
>>> > - What are your concerns about unreliable AIC values? Can you quote
>>> >   a specific post to this list? There are various concerns with AICs
>>> >   in the mixed model context, but I don't know that any of them seems
>>> >   particularly awful in the context you describe:
>>> >   - there is a general issue when testing against null hypotheses on
>>> >     the edge of their feasible space (e.g. testing whether variances
>>> >     are equal to zero or not), which also affects AICs (because the
>>> >     approximations on which AICs are based rely on expanding *around*
>>> >     the model);
>>> >   - it can be hard to count 'degrees of freedom' for random effects,
>>> >     both for comparing models with different random effects (here
>>> >     Vaida and Blanchard's 'conditional AIC' is useful, but I think
>>> >     you're interested in population-level predictions, so the relevant
>>> >     number of parameters would be approx. 1) and for calculating
>>> >     residual degrees of freedom for finite-size corrections such as
>>> >     AICc. Your spatial correlation term also affects this
>>> >     consideration, because your data points are not entirely
>>> >     independent (and hence you have fewer 'effective' residual
>>> >     degrees of freedom).
>>> > - What kind of inference are you making *after* you go through the
>>> >   model selection procedure? Are you planning to use multi-model
>>> >   inference? The biggest mistake that you might be making (can't tell
>>> >   from your description) would be to take the selected model and
>>> >   calculate confidence intervals for it, disregarding the model
>>> >   selection uncertainty. Or are you using the AICc values to
>>> >   (effectively) test hypotheses about which variables are 'important'?
>>> >   I have a bit of a problem with this approach (I think if you're
>>> >   going to test hypotheses you should do it explicitly, not use an
>>> >   approach that is based on maximizing expected predictive accuracy),
>>> >   but I'm quite in the minority in that opinion.
>>> >
>>> > Bottom line: I would strongly encourage you to supplement what
>>> > you've done by exploring the data, and the predictions/residuals from
>>> > the model graphically, and I am curious what kind of inferences you
>>> > plan to make from these data, but overall your approach seems to make
>>> > sense -- no big red flags going up for me.
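>>> >
>>> > (A minimal sketch of the graphical checks I mean, using nlme's
>>> > built-in plot methods; the conditioning variables are just examples:)
>>> >
>>> > plot(fm15, resid(., type = "normalized") ~ fitted(.) | Cohort, abline = 0)
>>> > qqnorm(fm15, ~ resid(., type = "normalized"))
>>> > plot(fm15, MINh ~ fitted(.) | Season, abline = c(0, 1))  # observed vs fitted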
>>> >
>>> > Others may see something I missed, or have different opinions.
>>> >
>>> > Ben Bolker
>>> >
>>> >
>>> >
>