[R-sig-ME] Define prior in MCMCglmm

Maria Paola Bissiri Maria_Paola.Bissiri at tu-dresden.de
Thu Nov 21 20:19:35 CET 2013


Hello Szymek,
thank you very much for your answer.

> if you have enough information in your data it's probably good to  
> explicitly model covariances not fixed at 0.

What would count as enough information to choose "us" instead of "idh"?
I have to fit two models: one for 360 observations from 20 subjects,
and one for 1548 observations from 86 subjects.
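
For concreteness, this is how I understand the two alternatives (a minimal
sketch with my variable names; the fixed-effects formula is abbreviated,
and prior_us/prior_idh stand for priors of the matching dimension):

library(MCMCglmm)

## "us": estimates the two subject variances and their covariance
model_us  <- MCMCglmm(resp_X ~ lang * fin_B,
                      random = ~us(fin_B):subj_ID,
                      family = "categorical", data = fallmid,
                      prior = prior_us)

## "idh": estimates the two subject variances only; the covariance
## is fixed at 0 and not estimated
model_idh <- MCMCglmm(resp_X ~ lang * fin_B,
                      random = ~idh(fin_B):subj_ID,
                      family = "categorical", data = fallmid,
                      prior = prior_idh)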

> Be careful about the prior - for the idh structure your prior
> (nu=1.002) would probably be more informative than you want
> (please Jarrod, correct me if I'm wrong).

In the case of idh, what should the "nu" value be in the prior for R and for G?
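
For reference, this is the parameter-expanded prior I am using for the 2x2
G structure, following Jarrod's earlier suggestion (a sketch: the residual
here is fixed as Szymek suggested, while in my actual model R is the
0.5 * (I+J) structure); whether nu=2 is also sensible under idh is exactly
my question:

a <- 1e+06  # "something large", as Jarrod suggested
prior <- list(
  R = list(V = 1, fix = 1),  # residual variance fixed (categorical family)
  G = list(G1 = list(V = diag(2), nu = 2,
                     alpha.mu = c(0, 0), alpha.V = diag(2) * a)))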

> E.g. for your estimated variances you have effective sample sizes of
> around 600 and 400, which is a bit too low - try running for longer,
> maybe increasing thin a bit.

So, if effectiveSize(model$VCV) is > 1000, it is okay, right?
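
That is, I am now checking convergence like this ("model" standing for the
fitted MCMCglmm object) - please correct me if this is not the right check:

library(coda)                     # effectiveSize() and autocorr() are in coda
effectiveSize(model$VCV)          # aiming for at least ~1000 per component
diag(autocorr(model$VCV)[2, , ])  # autocorrelations at lag = thin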

> Fixed effects look ok to me - you have some interactions that  
> probably are not
> significant, so I'd try simplifying the model.

I read on the list that it is not recommended to do backward
elimination of non-significant terms in the model. E.g.:
http://article.gmane.org/gmane.comp.lang.r.lme4.devel/5713/
My objective is to test, by means of a GLMM, the influence of the
predictors in the model on the response variable. I am not sure
whether I should simplify the model or not.

Kind regards,
Maria Paola


Quoting Szymek Drobniak <geralttee at gmail.com>:

> Hello Maria,
>
> us and idh fit different kinds of covariance structures: in us all relevant
> parameters are estimated (both variances and covariances) - if you have
> enough information in your data it's probably good to explicitly model
> covariances not fixed at 0. In the case of idh the covariance matrix is
> similar but the covariances are fixed at zero, so only the variances
> (i.e. the diagonal elements) are estimated. Be careful about the prior -
> for the idh structure your prior (nu=1.002) would probably be more
> informative than you want (please Jarrod, correct me if I'm wrong).
>
> As for the autocorrelations - I'd generally go as low as possible; usually
> I try to get values less than 0.05. You're mostly interested in
> autocorrelations for lag = thin - and mostly in the diagonal values, so
> you'll have much clearer output by running
>
> diag(autocorr(model$VCV)[2,,])
>
> or just look at effectiveSize(model$VCV), which should ideally be at least
> 1000. As far as I know there's no strict rule as to which values of
> nitt/thin/burnin to use - it depends on your data, how much information it
> contains and how many parameters you're trying to estimate. What you're
> trying to achieve is a well-mixing chain with low enough autocorrelations
> and large enough effective sample sizes. E.g. for your estimated variances
> you have effective sample sizes of around 600 and 400, which is a bit too
> low - try running for longer, maybe increasing thin a bit. Burnin seems
> enough, at least as far as I would set it. There's no limit on the number
> of iterations (besides, of course, your time and available computer memory ;))
>
> And about the labels in the output: if you fit a us() covariance structure,
> your output will contain m.m.subj_ID and f.f.subj_ID labeling the estimated
> variances (the diagonal elements of the covariance matrix) and
> m.f.subj_ID/f.m.subj_ID labeling the covariances (the matrix is symmetric
> so they're the same). If you fit idh() you'll have only m.m.subj_ID and
> f.f.subj_ID, as the covariance will be fixed at 0 and not estimated. Fixed
> effects look ok to me - you have some interactions that probably are not
> significant, so I'd try simplifying the model.
>
> Cheers
> sz.
>
>
>
> On 20 November 2013 20:50, Maria Paola Bissiri <
> Maria_Paola.Bissiri at tu-dresden.de> wrote:
>
>> Dear Jarrod, dear Szymek,
>> thank you for your answers. I made some progress, but I am not sure if I
>> am on the right track.
>>
>> Jarrod, you mean that the random effects (1 + fin_B|subj_ID) in glmer,
>> corresponding to us(1+fin_B):subj_ID in MCMCglmm, are not appropriate
>> because fin_B is binomial, right?
>> So, would you suggest random = ~us(fin_B):subj_ID in MCMCglmm?
>> Is (fin_B|subj_ID) the corresponding glmer notation for it?
>>
>> Is "us" or "idh" more appropriate for my data in MCMCglmm? fin_B is
>> dichotomous, a factor variable with two outcomes "f" or "m". subj_ID has 86
>> outcomes (the experiment participants).
>>
>>
>>> list(V=diag(2), nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*a)
>>> where a is something large.
>>
>> Can you give me a hint as to how large "a" should be?
>> I tried the following:
>>
>> a <- 1e+06
>>
>> k <- length(levels(fallmid$resp_X))
>> I <- diag(k-1)
>> J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
>> prior <- list(R = list(fix = 1, V = 0.5 * (I+J), n = 2),
>>               G = list(G1 = list(V = diag(2), nu = 2, alpha.mu = c(0,0),
>>                                  alpha.V = diag(2)*a)))
>>
>>
>> fallmid.MCMCglmm <- MCMCglmm(resp_X ~ lang * ini_pch + lang * manner +
>>                                lang * fin_B,
>>                              random = ~us(fin_B):subj_ID,
>>                              family = "categorical", data = fallmid,
>>                              prior = prior)
>>
>>
>> Then I checked
>> autocorr(fallmid.MCMCglmm$Sol)
>> and I saw that the model with the default parameters sometimes produced
>> autocorrelations of 0.4 between successive stored iterations.
>> How should the parameters nitt, thin and burnin be set to improve this?
>> Should I just increase all the numbers in the same ratio as the defaults?
>> Are there limits to increasing them?
>>
>> I reran MCMCglmm with
>> nitt=150000, thin=100, burnin=37500
>> and got autocorrelations < 0.1 for all fixed effects, except 0.1043
>> for langen:ini_pchr.
>>
>> I copy below the autocorrelation output for VCV, in which I cannot
>> understand headings such as "f:f.subj_ID" and "m:f.subj_ID". "f" and "m"
>> are the outcomes of the dichotomous variable fin_B.
>> I also copy the summary of the model below.
>> How does it look to you? Am I on the right track?
>>
>> Kind regards,
>> Maria Paola
>>
>> > autocorr(fallmid.MCMCglmm.tweak$VCV)
>>
>> , , f:f.subj_ID
>>
>>          f:f.subj_ID m:f.subj_ID f:m.subj_ID  m:m.subj_ID units
>> Lag 0     1.00000000 -0.18716407 -0.18716407 -0.002976274   NaN
>> Lag 100   0.28414235 -0.04901656 -0.04901656 -0.021149419   NaN
>> Lag 500   0.02436112 -0.01027021 -0.01027021 -0.027991234   NaN
>> Lag 1000 -0.01759353 -0.01638179 -0.01638179 -0.011795094   NaN
>> Lag 5000 -0.04787859  0.00763913  0.00763913  0.024799289   NaN
>>
>> , , m:f.subj_ID
>>
>>          f:f.subj_ID m:f.subj_ID f:m.subj_ID m:m.subj_ID units
>> Lag 0    -0.18716407 1.000000000 1.000000000 -0.17260855   NaN
>> Lag 100  -0.04912875 0.023138658 0.023138658 -0.05718198   NaN
>> Lag 500  -0.03577679 0.003964840 0.003964840 -0.05278701   NaN
>> Lag 1000  0.02771401 0.004615445 0.004615445  0.01383229   NaN
>> Lag 5000  0.01528668 0.001639329 0.001639329 -0.02532356   NaN
>>
>> , , f:m.subj_ID
>>
>>          f:f.subj_ID m:f.subj_ID f:m.subj_ID m:m.subj_ID units
>> Lag 0    -0.18716407 1.000000000 1.000000000 -0.17260855   NaN
>> Lag 100  -0.04912875 0.023138658 0.023138658 -0.05718198   NaN
>> Lag 500  -0.03577679 0.003964840 0.003964840 -0.05278701   NaN
>> Lag 1000  0.02771401 0.004615445 0.004615445  0.01383229   NaN
>> Lag 5000  0.01528668 0.001639329 0.001639329 -0.02532356   NaN
>>
>> , , m:m.subj_ID
>>
>>           f:f.subj_ID m:f.subj_ID f:m.subj_ID m:m.subj_ID units
>> Lag 0    -0.002976274 -0.17260855 -0.17260855  1.00000000   NaN
>> Lag 100  -0.062222834 -0.04848553 -0.04848553  0.22828366   NaN
>> Lag 500   0.017753127 -0.01216755 -0.01216755  0.07778476   NaN
>> Lag 1000  0.020301781  0.03771063  0.03771063 -0.02781175   NaN
>> Lag 5000 -0.015825205 -0.05418101 -0.05418101  0.07003646   NaN
>>
>> , , units
>>
>>          f:f.subj_ID m:f.subj_ID f:m.subj_ID m:m.subj_ID units
>> Lag 0            NaN         NaN         NaN         NaN   NaN
>> Lag 100          NaN         NaN         NaN         NaN   NaN
>> Lag 500          NaN         NaN         NaN         NaN   NaN
>> Lag 1000         NaN         NaN         NaN         NaN   NaN
>> Lag 5000         NaN         NaN         NaN         NaN   NaN
>>
>>
>>
>>
>> > summary(fallmid.MCMCglmm.tweak)
>>
>>  Iterations = 37501:149901
>>  Thinning interval  = 100
>>  Sample size  = 1125
>>
>>  DIC: 1586.166
>>
>>  G-structure:  ~us(fin_B):subj_ID
>>
>>             post.mean l-95% CI u-95% CI eff.samp
>> f:f.subj_ID    2.8329    1.351   4.4789    626.6
>> m:f.subj_ID   -0.4024   -1.301   0.5999   1125.0
>> f:m.subj_ID   -0.4024   -1.301   0.5999   1125.0
>> m:m.subj_ID    3.3495    1.665   4.9086    481.2
>>
>>  R-structure:  ~units
>>
>>       post.mean l-95% CI u-95% CI eff.samp
>> units         1        1        1        0
>>
>>  Location effects: resp_X ~ lang * ini_pch + lang * manner + lang * fin_B
>>
>>                 post.mean l-95% CI u-95% CI eff.samp   pMCMC
>> (Intercept)      -2.12231 -2.99651 -1.19650   1125.0 < 9e-04 ***
>> langde            0.86672 -0.33061  2.13741   1125.0 0.15467
>> langen            0.17379 -1.32158  1.96542   1021.4 0.83556
>> langsw            0.60668 -1.11771  2.00774   1125.0 0.44622
>> ini_pchm          0.31321 -0.36251  0.94373   1125.0 0.33956
>> ini_pchr         -0.01135 -0.61257  0.65751   1005.5 0.97600
>> mannerla          0.02540 -0.64462  0.60490   1125.0 0.93689
>> mannerna         -0.29521 -0.87690  0.37917   1125.0 0.38222
>> fin_Bm            2.57100  1.54529  3.81118   1019.3 < 9e-04 ***
>> langde:ini_pchm  -0.62639 -1.44799  0.26281   1026.4 0.14044
>> langen:ini_pchm   0.13596 -0.98973  1.23208    975.7 0.81244
>> langsw:ini_pchm   0.27390 -0.87134  1.31791   1125.0 0.63644
>> langde:ini_pchr  -0.89823 -1.79517 -0.03409   1256.7 0.05511 .
>> langen:ini_pchr  -1.30494 -2.58387 -0.09301    911.7 0.05333 .
>> langsw:ini_pchr  -0.08782 -1.13757  1.02182   1010.0 0.89422
>> langde:mannerla  -0.24312 -1.13861  0.63645   1125.0 0.59378
>> langen:mannerla  -0.58499 -1.78324  0.61955   1125.0 0.32000
>> langsw:mannerla   0.58872 -0.52892  1.64295   1125.0 0.27733
>> langde:mannerna   0.47920 -0.33966  1.37083   1125.0 0.30756
>> langen:mannerna   0.79813 -0.35619  1.89286   1125.0 0.19911
>> langsw:mannerna   0.72797 -0.29891  1.89281   1125.0 0.19911
>> langde:fin_Bm    -1.90047 -3.36978 -0.31952    997.2 0.00889 **
>> langen:fin_Bm    -1.44445 -3.59097  0.42764    964.4 0.15467
>> langsw:fin_Bm    -2.80490 -5.03965 -1.07777   1125.0 0.00533 **
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>>
>>
>>
>>
>> Quoting Jarrod Hadfield <j.hadfield at ed.ac.uk>:
>>
>>
>>> Hi Szymek,
>>>
>>> Whoops - I forgot it was dichotomous. idh(X):Y is probably more
>>> appropriate. (1+X) fits two effects: a) Y effects for the first
>>> level of X and b) the difference in Y effects between the two
>>> levels. If X were continuous it would make more sense because it
>>> would be intercept and slope.
>>>
>>> Cheers,
>>>
>>> Jarrod
>>>
>>>
>>> Quoting Szymek Drobniak <geralttee at gmail.com> on Fri, 15 Nov 2013
>>> 21:47:35 +0100:
>>>
>>>> Hello,
>>>>
>>>> thank you Jarrod for clarifying things; I obviously made a mistake with
>>>> the nu value. I was also wondering - what is the interpretation of
>>>> idh(1+X):Y in the case where X is a dichotomous variable?
>>>>
>>>> Cheers,
>>>> sz.
>>>>
>>>>
>>>> On 14 November 2013 18:23, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>>>>
>>>>> Hi Maria,
>>>>>
>>>>> random = ~us(1):subj_ID + us(fin_B):subj_ID is the same as
>>>>> idh(1+fin_B):subj_ID and is not the same as (1 + fin_B|subj_ID) in lmer,
>>>>> because the idh fixes the covariance between intercepts and slopes to
>>>>> zero. us(1+fin_B):subj_ID is equivalent to the lmer code.
>>>>>
>>>>> Also, a prior of nu=2 (or nu=2.002) could have quite an influence on
>>>>> your
>>>>> posterior. V=diag(2) and nu=1.002 for the 2x2 case gives a marginal
>>>>> prior
>>>>> on the variances that is equivalent to an inverse gamma with shape and
>>>>> scale equal to 0.001. This prior used to be used a lot, but has some bad
>>>>> properties when values close to zero have support. I often use the
>>>>> parameter expanded prior:
>>>>>
>>>>> list(V=diag(2), nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*a)
>>>>>
>>>>> where a is something large.
>>>>>
>>>>> There is no difference between nu and n.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Jarrod
>>>>>
>>>>>
>>>>>> I meant (1 + fin_B|subj_ID), as indicated in glmer() (lme4 package).
>>>>>> And this should be indicated in MCMCglmm() as random = ~us(1):subj_ID +
>>>>>> us(fin_B):subj_ID.
>>>>>
>>>>>
>>>>> Quoting Maria Paola Bissiri <Maria_Paola.Bissiri at tu-dresden.de> on Thu,
>>>>> 14 Nov 2013 15:06:26 +0000:
>>>>>
>>>>>> Dear Szymek,
>>>>>> thank you very much for your answer.
>>>>>> thank you very much for your answer.
>>>>>>
>>>>>> Yes, the random effects were indicated wrongly in MCMCglmm! My
>>>>>> intention
>>>>>> is of course to look at variance associated with subjects (subj_ID).
>>>>>> I meant (1 + fin_B|subj_ID), as indicated in glmer() (lme4 package).
>>>>>> And this should be indicated in MCMCglmm() as random = ~us(1):subj_ID +
>>>>>> us(fin_B):subj_ID.
>>>>>> Please, correct me if I am wrong.
>>>>>>
>>>>>> So the model runs with:
>>>>>> k <- length(levels(fallmid$resp_X))
>>>>>> I <- diag(k-1)
>>>>>> J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
>>>>>> prior <- list(R = list(fix = 1, V = 0.5 * (I+J), n = 2),
>>>>>>               G = list(G1 = list(V = diag(1), n = 2),
>>>>>>                        G2 = list(V = diag(2), n = 2)))
>>>>>>
>>>>>> fallmid.MCMCglmm <- MCMCglmm(resp_X ~ lang * ini_pch + lang * manner +
>>>>>> lang * fin_B,
>>>>>>                            random = ~us(1):subj_ID + us(fin_B):subj_ID,
>>>>>>                            family="categorical", data=fallmid,
>>>>>>                            prior=prior
>>>>>>                            )
>>>>>>
>>>>>> In your suggestion you indicate nu=2.002. What does "nu" mean? What is
>>>>>> the difference between nu and n? In the MCMCglmm manual and in
>>>>>> the tutorial
>>>>>> they are both defined as "degrees of belief". What does this mean?
>>>>>>
>>>>>> Kind regards,
>>>>>> Maria Paola
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Quoting Szymek Drobniak <geralttee at gmail.com>:
>>>>>>
>>>>>>> Dear Maria,
>>>>>>>
>>>>>>> I'm not sure what exactly you're trying to test with your model, but
>>>>>>> to
>>>>>>> start with - your prior specification assumes 2 random effects, and
>>>>>>> your
>>>>>>> model has only one (a structured covariance matrix with fin_B defined
>>>>>>> as
>>>>>>> a
>>>>>>> random effect). This specification you've provided is similar to a
>>>>>>> random
>>>>>>> intercept/slope model - but I can't see why you would like to fit it
>>>>>>> (most
>>>>>>> importantly, you assumed that fin_B is both a fixed and random
>>>>>>> effect).
>>>>>>> If
>>>>>>> your intention was to look at variance associated with subjects
>>>>>>> (subj_ID),
>>>>>>> and you'd like to see if this variance is heterogeneous for different
>>>>>>> levels of fin_B - you could fit:
>>>>>>>
>>>>>>> MCMCglmm(your_fixed_formula_here, random=~us(fin_B):subj_ID, ...)
>>>>>>>
>>>>>>> and the prior would be (assuming fin_B has 2 levels as you've said)
>>>>>>>
>>>>>>> list(R=list(V=1, fix=1), G=list(G1=list(V=diag(2),nu=2.002)))
>>>>>>>
>>>>>>> that's for a start; then have a look at the MCMC series plots to see
>>>>>>> if it mixes well, and tweak your model further if necessary.
>>>>>>>
>>>>>>> Cheers
>>>>>>> szymek
>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> Dr. Maria Paola Bissiri
>>>>>>
>>>>>> TU Dresden
>>>>>> Fakultät Elektrotechnik und Informationstechnik
>>>>>> Institut für Akustik und Sprachkommunikation
>>>>>> 01062 Dresden
>>>>>>
>>>>>> Barkhausen-Bau, Raum S54
>>>>>> Helmholtzstraße 18
>>>>>>
>>>>>> Tel: +49 (0)351 463-34283
>>>>>> Fax: +49 (0)351 463-37781
>>>>>> E-Mail: Maria_Paola.Bissiri at tu-dresden.de
>>>>>> http://wwwpub.zih.tu-dresden.de/~bissiri/index.htm
>>>>>>
>>>>>> _______________________________________________
>>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> The University of Edinburgh is a charitable body, registered in
>>>>> Scotland, with registration number SC005336.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Szymon Drobniak, PhD. || Population Ecology Group
>>>> Institute of Environmental Sciences, Jagiellonian University
>>>> ul. Gronostajowa 7, 30-387 Kraków, POLAND
>>>> tel.: +48 12 664 51 79 fax: +48 12 664 69 12
>>>> szymek.drobniak at uj.edu.pl
>>>> www.eko.uj.edu.pl/drobniak
>>>>
>>>>
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>
>>
>
>
> --
> Szymon Drobniak, PhD. || Population Ecology Group
> Institute of Environmental Sciences, Jagiellonian University
> ul. Gronostajowa 7, 30-387 Kraków, POLAND
> tel.: +48 12 664 51 79 fax: +48 12 664 69 12
> szymek.drobniak at uj.edu.pl
> www.eko.uj.edu.pl/drobniak


