[R-sig-ME] Define prior in MCMCglmm

Wed Nov 20 20:50:30 CET 2013

Dear Jarrod, dear Szymek,
thank you for your answers, I made some progress, but I am not sure if  
I am on the right way.

Jarrod, you mean that the random effects (1 + fin_B|subj_ID) in glmer,  
corresponding to us(1+fin_B):subj_ID in MCMCglmm, are not appropriate  
because fin_B is binomial, right?
So, would you suggest random = ~us(fin_B):subj_ID in MCMCglmm?
Is (fin_B|subj_ID) the corresponding glmer notation for it?

Is "us" or "idh" more appropriate for my data in MCMCglmm? fin_B is  
dichotomous, a factor variable with two outcomes "f" or "m". subj_ID  
has 86 outcomes (the experiment participants).

>>> list(V=diag(2), nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*a)
>>> where a is something large.

Can you give me some hint how large "a" should be?
I tried the following:

a=1e+06
k <- length(levels(fallmid$resp_X))
I <- diag(k-1)
J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
prior = list(R = list(fix = 1, V = 0.5 * (I+J), n = 2),
            G = list(G1=list(V=diag(2), nu=2, alpha.mu=c(0,0),
alpha.V=diag(2)*a)))

fallmid.MCMCglmm <- MCMCglmm(resp_X ~ lang * ini_pch + lang * manner +  
lang * fin_B,
                               random = ~us(fin_B):subj_ID,
                               family="categorical", data=fallmid,
                               prior=prior
                               )

Then I checked
autocorr(fallmid.MCMCglmm$Sol)
and I saw that the model with the default parameters produced  
sometimes autocorrelation of 0.4 between successive stored iterations.
How should the parameters nitt, thin and burnin be set to improve  
this? Should I just increase all numbers in the same ratio as the  
default ones? Are there limits by increasing?

I rerun MCMCglmm indicating
nitt=150000, thin=100, burnin=37500
and could get autocorrelation < 0.1 for all fixed effects, except  
0.1043 for langen:ini_pchr

I copy below the autocorrelation output for VCV, in which I cannot  
understand titles as "f:f.subj_ID", and "m:f.subj_ID". "f" and "m" are  
the outcomes of the dichotomous variable fin_B.
And I copy also the summary of the model.
How does it look for you? Am I on the right way?

Kind regards,
Maria Paola

> autocorr(fallmid.MCMCglmm.tweak$VCV)
, , f:f.subj_ID

          f:f.subj_ID m:f.subj_ID f:m.subj_ID  m:m.subj_ID units
Lag 0     1.00000000 -0.18716407 -0.18716407 -0.002976274   NaN
Lag 100   0.28414235 -0.04901656 -0.04901656 -0.021149419   NaN
Lag 500   0.02436112 -0.01027021 -0.01027021 -0.027991234   NaN
Lag 1000 -0.01759353 -0.01638179 -0.01638179 -0.011795094   NaN
Lag 5000 -0.04787859  0.00763913  0.00763913  0.024799289   NaN

, , m:f.subj_ID

          f:f.subj_ID m:f.subj_ID f:m.subj_ID m:m.subj_ID units
Lag 0    -0.18716407 1.000000000 1.000000000 -0.17260855   NaN
Lag 100  -0.04912875 0.023138658 0.023138658 -0.05718198   NaN
Lag 500  -0.03577679 0.003964840 0.003964840 -0.05278701   NaN
Lag 1000  0.02771401 0.004615445 0.004615445  0.01383229   NaN
Lag 5000  0.01528668 0.001639329 0.001639329 -0.02532356   NaN

, , f:m.subj_ID

          f:f.subj_ID m:f.subj_ID f:m.subj_ID m:m.subj_ID units
Lag 0    -0.18716407 1.000000000 1.000000000 -0.17260855   NaN
Lag 100  -0.04912875 0.023138658 0.023138658 -0.05718198   NaN
Lag 500  -0.03577679 0.003964840 0.003964840 -0.05278701   NaN
Lag 1000  0.02771401 0.004615445 0.004615445  0.01383229   NaN
Lag 5000  0.01528668 0.001639329 0.001639329 -0.02532356   NaN

, , m:m.subj_ID

           f:f.subj_ID m:f.subj_ID f:m.subj_ID m:m.subj_ID units
Lag 0    -0.002976274 -0.17260855 -0.17260855  1.00000000   NaN
Lag 100  -0.062222834 -0.04848553 -0.04848553  0.22828366   NaN
Lag 500   0.017753127 -0.01216755 -0.01216755  0.07778476   NaN
Lag 1000  0.020301781  0.03771063  0.03771063 -0.02781175   NaN
Lag 5000 -0.015825205 -0.05418101 -0.05418101  0.07003646   NaN

, , units

          f:f.subj_ID m:f.subj_ID f:m.subj_ID m:m.subj_ID units
Lag 0            NaN         NaN         NaN         NaN   NaN
Lag 100          NaN         NaN         NaN         NaN   NaN
Lag 500          NaN         NaN         NaN         NaN   NaN
Lag 1000         NaN         NaN         NaN         NaN   NaN
Lag 5000         NaN         NaN         NaN         NaN   NaN

> summary(fallmid.MCMCglmm.tweak)

  Iterations = 37501:149901
  Thinning interval  = 100
  Sample size  = 1125

  DIC: 1586.166

  G-structure:  ~us(fin_B):subj_ID

             post.mean l-95% CI u-95% CI eff.samp
f:f.subj_ID    2.8329    1.351   4.4789    626.6
m:f.subj_ID   -0.4024   -1.301   0.5999   1125.0
f:m.subj_ID   -0.4024   -1.301   0.5999   1125.0
m:m.subj_ID    3.3495    1.665   4.9086    481.2

  R-structure:  ~units

       post.mean l-95% CI u-95% CI eff.samp
units         1        1        1        0

  Location effects: resp_X ~ lang * ini_pch + lang * manner + lang * fin_B

                 post.mean l-95% CI u-95% CI eff.samp   pMCMC
(Intercept)      -2.12231 -2.99651 -1.19650   1125.0 < 9e-04 ***
langde            0.86672 -0.33061  2.13741   1125.0 0.15467
langen            0.17379 -1.32158  1.96542   1021.4 0.83556
langsw            0.60668 -1.11771  2.00774   1125.0 0.44622
ini_pchm          0.31321 -0.36251  0.94373   1125.0 0.33956
ini_pchr         -0.01135 -0.61257  0.65751   1005.5 0.97600
mannerla          0.02540 -0.64462  0.60490   1125.0 0.93689
mannerna         -0.29521 -0.87690  0.37917   1125.0 0.38222
fin_Bm            2.57100  1.54529  3.81118   1019.3 < 9e-04 ***
langde:ini_pchm  -0.62639 -1.44799  0.26281   1026.4 0.14044
langen:ini_pchm   0.13596 -0.98973  1.23208    975.7 0.81244
langsw:ini_pchm   0.27390 -0.87134  1.31791   1125.0 0.63644
langde:ini_pchr  -0.89823 -1.79517 -0.03409   1256.7 0.05511 .
langen:ini_pchr  -1.30494 -2.58387 -0.09301    911.7 0.05333 .
langsw:ini_pchr  -0.08782 -1.13757  1.02182   1010.0 0.89422
langde:mannerla  -0.24312 -1.13861  0.63645   1125.0 0.59378
langen:mannerla  -0.58499 -1.78324  0.61955   1125.0 0.32000
langsw:mannerla   0.58872 -0.52892  1.64295   1125.0 0.27733
langde:mannerna   0.47920 -0.33966  1.37083   1125.0 0.30756
langen:mannerna   0.79813 -0.35619  1.89286   1125.0 0.19911
langsw:mannerna   0.72797 -0.29891  1.89281   1125.0 0.19911
langde:fin_Bm    -1.90047 -3.36978 -0.31952    997.2 0.00889 **
langen:fin_Bm    -1.44445 -3.59097  0.42764    964.4 0.15467
langsw:fin_Bm    -2.80490 -5.03965 -1.07777   1125.0 0.00533 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Zitat von Jarrod Hadfield <j.hadfield at ed.ac.uk>:

> Hi Szymek,
>
> Whoops  - I forgot it was dichotomous. idh(X):Y is probably more
> appropriate. (1+x) fits two effects; a) Y effects for the first
> level of X and b) the difference between Y effects between the two
> levels.  If X was continous it would make more sense because it
> would be intercept and slope.
>
> Cheers,
>
> Jarrod
>
>
> Quoting Szymek Drobniak <geralttee at gmail.com> on Fri, 15 Nov 2013
> 21:47:35 +0100:
>
>> Hello,
>>
>> thank you Jarrod for clarifying things, I obviously made a mistake with the
>> nu value. I was also wondering - what is the interpretation of idh(1+X):Y
>> in case X is a dichotomous variable?
>>
>> Cheers,
>> sz.
>>
>>
>> On 14 November 2013 18:23, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>>
>>> Hi Maria,
>>>
>>> random = ~us(1):subj_ID + us(fin_B):subj_ID  is the same as
>>> idh(1+finB):subj_ID and is not the same as (1 + fin_B|subj_ID) in lmer
>>> because the idh fixes the covariance between and intercepts and slopes to
>>> zero.  us(1+finB):subj_ID is equivalent to the lmer code.
>>>
>>> Also, a prior of nu=2 (or nu=2.002) could have quite an influence on your
>>> posterior. V=diag(2) and nu=1.002 for the 2x2 case gives a marginal prior
>>> on the variances that is equivalent to an inverse gamma with shape and
>>> scale equal to 0.001. This prior used to be used a lot, but has some bad
>>> properties when values close to zero have support. I often use the
>>> parameter expanded prior:
>>>
>>> list(V=diag(2), nu=2, alpha.mu=c(0,0), alpha.V=diag(2)*a)
>>>
>>> where a is something large.
>>>
>>> There is no difference between nu and n.
>>>
>>> Cheers,
>>>
>>> Jarrod
>>>
>>>
>>> I meant (1 + fin_B|subj_ID), as indicated in glmer() (lme4 package).
>>>
>>>> And this should be indicated in MCMCglmm() as random = ~us(1):subj_ID +
>>>> us(fin_B):subj_ID.
>>>>
>>>
>>>
>>> Quoting Maria Paola Bissiri <Maria_Paola.Bissiri at tu-dresden.de> on Thu,
>>> 14 Nov 2013 15:06:26 +0000:
>>>
>>> Dear Szymek,
>>>> thank you very much for your answer.
>>>>
>>>> Yes, the random effects were indicated wrongly in MCMCglmm! My intention
>>>> is of course to look at variance associated with subjects (subj_ID).
>>>> I meant (1 + fin_B|subj_ID), as indicated in glmer() (lme4 package).
>>>> And this should be indicated in MCMCglmm() as random = ~us(1):subj_ID +
>>>> us(fin_B):subj_ID.
>>>> Please, correct me if I am wrong.
>>>>
>>>> So the model runs with:
>>>> k <- length(levels(fallmid$resp_X))
>>>> I <- diag(k-1)
>>>> J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
>>>> prior <- list(R = list(fix = 1, V = 0.5 * (I+J), n = 2),
>>>>                G = list(G1 = list(V = diag(1), n = 2), G2 = list(V =
>>>> diag(2), n = 2)))
>>>>
>>>> fallmid.MCMCglmm <- MCMCglmm(resp_X ~ lang * ini_pch + lang * manner +
>>>> lang * fin_B,
>>>>                            random = ~us(1):subj_ID + us(fin_B):subj_ID,
>>>>                            family="categorical", data=fallmid,
>>>>                            prior=prior
>>>>                            )
>>>>
>>>> In your suggestion you indicate nu=2.002. What does "nu" mean? What is
>>>> the difference between nu and n? In the MCMCglmm manual and in
>>>> the tutorial
>>>> they are both defined as "degrees of belief". What does this mean?
>>>>
>>>> Kind regards,
>>>> Maria Paola
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Zitat von Szymek Drobniak <geralttee at gmail.com>:
>>>>
>>>> Dear Maria,
>>>>>
>>>>> I'm not sure what exactly you're trying to test with your model, but to
>>>>> start with - your prior specification assumes 2 random effects, and your
>>>>> model has only one (a structured covariance matrix with fin_B defined as
>>>>> a
>>>>> random effect). This specification you've provided is similar to a random
>>>>> intercept/slope model - but I can't see why you would like to fit it
>>>>> (most
>>>>> importantly, you assumed that fin_B is both a fixed and random effect).
>>>>> If
>>>>> your intention was to look at variance associated with subjects
>>>>> (subj_ID),
>>>>> and you'd like to see if this variance is heterogeneous for different
>>>>> levels of fin_B - you could fit:
>>>>>
>>>>> MCMCglmm(your_fixed_formula_here, random=~us(fin_B):subj_ID, ...)
>>>>>
>>>>> and the prior would be (assuming fin_B has 2 levels as you've said)
>>>>>
>>>>> list(R=list(V=1, fix=1), G=list(G1=list(V=diag(2),nu=2.002)))
>>>>>
>>>>> that's for start, then have a look at mcmc-series plots to see if it
>>>>> mixes
>>>>> well and tweak your model further if necessary.
>>>>>
>>>>> Cheer
>>>>> szymek
>>>>>
>>>>
>>>> --
>>>> Dr. Maria Paola Bissiri
>>>>
>>>> TU Dresden
>>>> Fakultät Elektrotechnik und Informationstechnik
>>>> Institut für Akustik und Sprachkommunikation
>>>> 01062 Dresden
>>>>
>>>> Barkhausen-Bau, Raum S54
>>>> Helmholtzstraße 18
>>>>
>>>> Tel: +49 (0)351 463-34283
>>>> Fax: +49 (0)351 463-37781
>>>> E-Mail: Maria_Paola.Bissiri at tu-dresden.de
>>>> http://wwwpub.zih.tu-dresden.de/~bissiri/index.htm
>>>>
>>>> _______________________________________________
>>>> R-sig-mixed-models at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>
>>>>
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>>
>>
>>
>> --
>> Szymon Drobniak, PhD. || Population Ecology Group
>> Institute of Environmental Sciences, Jagiellonian University
>> ul. Gronostajowa 7, 30-387 Kraków, POLAND
>> tel.: +48 12 664 51 79 fax: +48 12 664 69 12
>> szymek.drobniak at uj.edu.pl
>> www.eko.uj.edu.pl/drobniak
>>
>
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.