[R-sig-ME] MCMCglmm with multinomial models

Mon Jul 18 08:32:42 CEST 2022

Hi Walid,

Thank you for your reply; I greatly appreciate it. I have a few more
questions, and it would be great if you could help.

I tested for correlation between activities across the 14 Sections, and the
correlation comes out as low. I have therefore changed my code to use idh()
instead of us(), as suggested:

test1c.5b <- MCMCglmm(activity ~ -1 + at.level(culling,1):trait +
                        at.level(culling, 2):trait,
                      random=~idh(trait):Section, rcov=~idh(trait):units,
                      data = caracal, family = "categorical", prior = prior,
                      burnin=5000, nitt=80000)

1) Is this correct?

2) Increasing the number of iterations increases the effective sample
size. Is there a general rule of thumb for how large the effective sample
size should be?

3) I understand how to use and interpret the results of HPDinterval (i.e.
if an interval does not overlap 0, the relationship is strong), but how can
I test the relationship between all four activities and the fixed effects,
rather than only comparing three categories (diurnal, dusk, nocturnal) with
the base category (dawn)? For example, I am also interested in whether
caracal activity at dusk differs significantly/strongly from diurnal
activity, both under culling (Lethal) and under no culling (none).
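
On question 2, a minimal sketch of how an effective sample size can be
estimated from a chain's autocorrelation, using only base R. The helper
ess_ar() and the two simulated chains are hypothetical stand-ins, not output
from the caracal model; with a fitted model, coda::effectiveSize(test1c.5b$Sol)
reports the same kind of quantity. The sketch fits an autoregressive model to
the chain and scales the number of draws by the ratio of the marginal variance
to the estimated spectral density at frequency zero.

```r
# Hypothetical helper: AR-based effective sample size for one chain.
ess_ar <- function(x) {
  fit <- ar(x, aic = TRUE)                          # AR order chosen by AIC
  spec0 <- fit$var.pred / (1 - sum(fit$ar))^2       # spectral density at 0
  length(x) * var(x) / spec0
}

set.seed(42)
n <- 5500                                           # saved draws, as in the summary
iid_chain    <- rnorm(n)                            # no autocorrelation
sticky_chain <- as.numeric(arima.sim(model = list(ar = 0.9), n = n))

ess_iid    <- ess_ar(iid_chain)     # close to the number of saved draws
ess_sticky <- ess_ar(sticky_chain)  # far below the number of saved draws
print(c(iid = ess_iid, sticky = ess_sticky))
```

A strongly autocorrelated chain yields far fewer effective draws than saved
iterations, which is why longer runs tend to raise eff.samp.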

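On question 3, one possible approach (a sketch, not a confirmed answer from
the list) is to subtract the posterior draws of two non-baseline coefficients,
which gives the posterior of the diurnal-vs-dusk contrast on the latent scale.
The matrix sol below is a hypothetical stand-in simulated independently in base
R: the means are taken from the model summary, the standard deviations are
invented for illustration. With a fitted model, the columns of test1c.5b$Sol
would be used instead, and HPDinterval(mcmc(contrast)) from coda could replace
the quantile interval.

```r
set.seed(1)
n_draws <- 5500

# Hypothetical stand-in for two columns of model$Sol (lethal culling level).
# Real posterior draws are usually correlated; these are simulated independently.
sol <- cbind(
  lethal_diurnal = rnorm(n_draws, mean = 1.2306, sd = 0.70),
  lethal_dusk    = rnorm(n_draws, mean = 0.6605, sd = 0.68)
)

# Posterior of the contrast: diurnal vs dusk, both relative to dawn, so the
# baseline cancels in the difference.
contrast <- sol[, "lethal_diurnal"] - sol[, "lethal_dusk"]

ci    <- quantile(contrast, c(0.025, 0.975))   # equal-tailed 95% interval
p_gt0 <- mean(contrast > 0)                    # posterior P(contrast > 0)
pmcmc <- 2 * min(p_gt0, 1 - p_gt0)             # two-sided tail probability

print(ci)
print(c(p_gt0 = p_gt0, pMCMC = pmcmc))
```

The same subtraction works for any pair of activities within a culling level,
or for the same activity across the two culling levels.
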
Below is an example of our dataset:
Camera  Section     CameraID         Animal   predator  culling  activity
1a      Bucklands   Bucklands1a      Caracal  low       Lethal   diurnal
1a      Bucklands   Bucklands1a      Caracal  low       Lethal   dawn
2a      Bucklands   Bucklands2a      Caracal  low       Lethal   dusk
2a      Bucklands   Bucklands2a      Caracal  low       Lethal   diurnal
3a      Bucklands   Bucklands3a      Caracal  low       Lethal   dawn
Cam 1   Colchester  ColchesterCam 1  Caracal  high      none     diurnal
Cam 1   Colchester  ColchesterCam 1  Caracal  high      none     diurnal
Cam 1   Colchester  ColchesterCam 1  Caracal  high      none     diurnal
Cam 1   Colchester  ColchesterCam 1  Caracal  high      none     diurnal
Cam 2   Colchester  ColchesterCam 2  Caracal  high      none     diurnal
Cam 2   Colchester  ColchesterCam 2  Caracal  high      none     diurnal
Cam 3   Colchester  ColchesterCam 3  Caracal  high      none     diurnal
Cam 3   Colchester  ColchesterCam 3  Caracal  high      none     diurnal
Cam 3   Colchester  ColchesterCam 3  Caracal  high      none     diurnal
Cam 4   Colchester  ColchesterCam 4  Caracal  high      none     diurnal
Cam 4   Colchester  ColchesterCam 4  Caracal  high      none     diurnal
Cam 4   Colchester  ColchesterCam 4  Caracal  high      none     nocturnal
1a      Connaught   Connaught1a      Caracal  low       Lethal   nocturnal
1a      Connaught   Connaught1a      Caracal  low       Lethal   nocturnal
1d      Connaught   Connaught1d      Caracal  low       Lethal   diurnal
3B      Connaught   Connaught3B      Caracal  low       Lethal   diurnal
3B      Connaught   Connaught3B      Caracal  low       Lethal   diurnal
4a      Connaught   Connaught4a      Caracal  low       Lethal   nocturnal
4a      Connaught   Connaught4a      Caracal  low       Lethal   nocturnal
4b      Connaught   Connaught4b      Caracal  low       Lethal   diurnal
6a      Connaught   Connaught6a      Caracal  low       Lethal   nocturnal
6b      Connaught   Connaught6b      Caracal  low       Lethal   diurnal
7a      Connaught   Connaught7a      Caracal  low       Lethal   nocturnal
9a      Connaught   Connaught9a      Caracal  low       Lethal   nocturnal
9d      Connaught   Connaught9d      Caracal  low       Lethal   nocturnal
9d      Connaught   Connaught9d      Caracal  low       Lethal   dusk
7d      Diepdam     Diepdam7d        Caracal  absent    Lethal   dusk
8d      Diepdam     Diepdam8d        Caracal  absent    Lethal   diurnal
9c      Diepdam     Diepdam9c        Caracal  absent    Lethal   nocturnal

All the best,
Jess

On Fri, Jul 15, 2022 at 11:37 PM Walid Mawass <walidmawass10 using gmail.com>
wrote:

> Hello,
>
> I don't think I can specifically help you with some of your inquiries.
> However, I do want to comment on a few things that might need some
> attention.
>
> First, MCMCglmm is a Bayesian implementation and does not compute
> classical p-values to compare. What you need to compare are the posterior
> distributions of your effect sizes. This can be done visually using the
> base plot function in R, or by comparing the HPD intervals and the mode (or
> mean) of the posterior distributions.
>
> Second, I have no idea what your data structure looks like (which makes it
> hard to interpret model results), but the effective sample size (from the
> sample of 5500 saved iterations) for your random variable Section is very
> low (the same applies to your fixed effects). You should consider this issue
> and look again at your assumption of correlation between activities for the
> 14 sections you have in your dataset. If you do not expect among-activity
> correlations, then you can use the idh() function instead of us().
>
> Hopefully this helps, and I hope that people on this list with more
> knowledge of these models will help out.
>
> Best,
> --
> Walid Mawass
> Ph.D. candidate in Evolutionary Biology - UQTR
> *Currently* Postdoctoral Research Associate
> Masel Lab - University of Arizona
>
>
> On Fri, Jul 15, 2022 at 8:49 AM jessica comley <jessiecomley44 using gmail.com>
> wrote:
>
>> Dear all,
>>
>> I am hoping that someone will be able to help me with conducting MCMCglmm
>> multinomial models.
>>
>> The data I am working with are for black-backed jackal (bbj) and caracal.
>> For each species we have a multinomial response variable called activity,
>> which has four categories (dawn, diurnal, dusk, nocturnal). We have two
>> categorical fixed effects: 1) culling (none, lethal) and 2) predator
>> presence (absent, high, low). We also have a categorical variable called
>> Section (made up of 14 different reserves/farms where the activity of
>> caracal and bbj was recorded). There are 273 observations for caracal
>> and 4399 for bbj. We want to test the effects of culling and
>> predators on caracal and bbj activity separately.
>>
>> I have been working through Jarrod Hadfield's course notes, particularly
>> Chapter 5.2. The chi-square analyses reveal that the
>> frequencies of culling and predators differ, as do activities.
>>
>> I have managed to work out the probabilities of each activity (dawn,
>> diurnal, dusk, nocturnal) under culling none vs culling lethal for
>> caracal, but I'm confused about how to obtain p-values to determine
>> which activities are affected by culling none vs culling lethal.
>>
>> My code and outcomes are pasted below with questions stated in bold.
>>
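>> library(MCMCglmm)  # also loads coda, which provides mcmc() and HPDinterval()
>> library(MASS)      # provides ginv()
>>
>> caracal2 <- read.csv("caracal_new.csv", header=TRUE)
>> caracal <- as.data.frame(unclass(caracal2), stringsAsFactors = TRUE)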
>>
>> #Chi-squared tests
>> Ctable1 <- table(caracal$activity, caracal$culling)
>> chisq.test(rowSums(Ctable1)) #strongly suggests activities differ
>> chisq.test(Ctable1)#strongly suggests culling category differs
>>
>> Ctable2 <- table(caracal$activity, caracal$predator)
>> chisq.test(rowSums(Ctable2))#strongly suggests activities differ
>> chisq.test(Ctable2)#strongly suggests predator category differs
>>
>> # k, I and J must exist before the prior that uses them
>> k <- length(levels(caracal$activity))
>> I <- diag(k-1)
>> J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
>> IJ <- (1/k) * (I + J)
>>
>> prior <- list(R = list(fix=1, V=IJ),
>>               G = list(G1=list(V=diag(k-1), nu=1)))
>> test1c.5 <- MCMCglmm(activity ~ -1 + at.level(culling,1):trait +
>>                        at.level(culling, 2):trait,
>>                      random=~us(trait):Section, rcov=~us(trait):units,
>>                      data = caracal, family = "categorical", prior = prior,
>>                      burnin=5000, nitt=60000)
>> *##I'm not sure how to add the three predator levels to this model or if it
>> would be appropriate?*
>>
>> contrasts(caracal$activity)
>>
>> #culling lethal
>> Delta <- cbind(c(0,1,0,0), c(0,0,1,0), c(0,0,0,1))
>> c2 <- (16 * sqrt(3)/(15 * pi))^2
>> D <- ginv(Delta %*% t(Delta)) %*% Delta
>> Int <- t(apply(test1c.5$Sol[,1:3], 1,
>>                function(x) D %*% (x/sqrt(1 + c2 * diag(IJ)))))
>> summary(mcmc(exp(Int)/rowSums(exp(Int))))
>>
>> prop.table(Ctable1[,1])
>>
>> #culling none
>> Delta <- cbind(c(0,1,0,0), c(0,0,1,0), c(0,0,0,1))
>> c2 <- (16 * sqrt(3)/(15 * pi))^2
>> D <- ginv(Delta %*% t(Delta)) %*% Delta
>> Int <- t(apply(test1c.5$Sol[,4:6], 1,
>>                function(x) D %*% (x/sqrt(1 + c2 * diag(IJ)))))
>> summary(mcmc(exp(Int)/rowSums(exp(Int))))
>>
>> prop.table((Ctable1[,2]))
>>
>> HPDinterval(test1c.5$Sol)
>>
>> #model summary
>> > summary(test1c.5)
>>
>>  Iterations = 5001:59991
>>  Thinning interval  = 10
>>  Sample size  = 5500
>>
>>  DIC: 699.7014
>>
>>  G-structure:  ~us(trait):Section
>>
>>                                                         post.mean l-95% CI u-95% CI eff.samp
>> traitactivity.diurnal:traitactivity.diurnal.Section        1.8124  0.09784    5.665    77.01
>> traitactivity.dusk:traitactivity.diurnal.Section           0.8450 -0.83585    3.856    64.17
>> traitactivity.nocturnal:traitactivity.diurnal.Section      1.3621 -1.19129    6.157    58.48
>> traitactivity.diurnal:traitactivity.dusk.Section           0.8450 -0.83585    3.856    64.17
>> traitactivity.dusk:traitactivity.dusk.Section              1.2034  0.07090    3.681   102.16
>> traitactivity.nocturnal:traitactivity.dusk.Section         0.7505 -1.77113    4.524    43.53
>> traitactivity.diurnal:traitactivity.nocturnal.Section      1.3621 -1.19129    6.157    58.48
>> traitactivity.dusk:traitactivity.nocturnal.Section         0.7505 -1.77113    4.524    43.53
>> traitactivity.nocturnal:traitactivity.nocturnal.Section    2.7148  0.09401    8.397    76.59
>>
>>  R-structure:  ~us(trait):units
>>
>>                                                       post.mean l-95% CI u-95% CI eff.samp
>> traitactivity.diurnal:traitactivity.diurnal.units          0.50     0.50     0.50        0
>> traitactivity.dusk:traitactivity.diurnal.units             0.25     0.25     0.25        0
>> traitactivity.nocturnal:traitactivity.diurnal.units        0.25     0.25     0.25        0
>> traitactivity.diurnal:traitactivity.dusk.units             0.25     0.25     0.25        0
>> traitactivity.dusk:traitactivity.dusk.units                0.50     0.50     0.50        0
>> traitactivity.nocturnal:traitactivity.dusk.units           0.25     0.25     0.25        0
>> traitactivity.diurnal:traitactivity.nocturnal.units        0.25     0.25     0.25        0
>> traitactivity.dusk:traitactivity.nocturnal.units           0.25     0.25     0.25        0
>> traitactivity.nocturnal:traitactivity.nocturnal.units      0.50     0.50     0.50        0
>>
>>  Location effects: activity ~ -1 + at.level(culling, 1):trait + at.level(culling, 2):trait
>>
>>                                              post.mean l-95% CI u-95% CI eff.samp  pMCMC
>> at.level(culling, 1):traitactivity.diurnal      1.2306  -0.0533   2.6793   145.29 0.0418 *
>> at.level(culling, 1):traitactivity.dusk         0.6605  -0.6006   2.0761    92.91 0.2840
>> at.level(culling, 1):traitactivity.nocturnal    1.6090   0.0914   3.1356   151.02 0.0265 *
>> traitactivity.diurnal:at.level(culling, 2)      1.2664  -0.1552   2.7750   226.40 0.0604 .
>> traitactivity.dusk:at.level(culling, 2)         0.3533  -0.9898   1.5218   148.44 0.5447
>> traitactivity.nocturnal:at.level(culling, 2)    1.0447  -0.6405   2.8354   346.40 0.1618
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> *##So from the model summary I get that lethal culling at activity diurnal
>> is significantly different from lethal culling at dawn (it's the base
>> reference), but I'm also interested in whether lethal culling at activity
>> diurnal is different from lethal culling at dusk, for example. Is this
>> possible?*
>>
>> #outcomes culling lethal
>> > summary(mcmc(exp(Int)/rowSums(exp(Int))))
>>
>> Iterations = 1:5500
>> Thinning interval = 1
>> Number of chains = 1
>> Sample size per chain = 5500
>>
>> 1. Empirical mean and standard deviation for each variable,
>>    plus standard error of the mean:
>>
>>        Mean      SD  Naive SE Time-series SE
>> [1,] 0.1253 0.05565 0.0007504       0.002484
>> [2,] 0.3748 0.10497 0.0014155       0.003204
>> [3,] 0.1757 0.06640 0.0008954       0.002515
>> [4,] 0.3242 0.11939 0.0016099       0.003514
>>
>> 2. Quantiles for each variable:
>>
>>         2.5%     25%    50%    75%  97.5%
>> var1 0.03641 0.08695 0.1198 0.1554 0.2553
>> var2 0.17298 0.30580 0.3704 0.4431 0.5896
>> var3 0.06166 0.12913 0.1705 0.2161 0.3215
>> var4 0.12610 0.23999 0.3090 0.3901 0.6045
>>
>> > prop.table(Ctable1[,1])
>>      dawn   diurnal      dusk nocturnal
>> 0.1250000 0.2812500 0.1770833 0.4166667
>>
>>
>> #outcomes culling none
>> > summary(mcmc(exp(Int)/rowSums(exp(Int))))
>>
>> Iterations = 1:5500
>> Thinning interval = 1
>> Number of chains = 1
>> Sample size per chain = 5500
>>
>> 1. Empirical mean and standard deviation for each variable,
>>    plus standard error of the mean:
>>
>>        Mean      SD  Naive SE Time-series SE
>> [1,] 0.1288 0.06141 0.0008280       0.002787
>> [2,] 0.3804 0.10406 0.0014032       0.002662
>> [3,] 0.1710 0.06844 0.0009228       0.002592
>> [4,] 0.3198 0.11812 0.0015928       0.002956
>>
>> 2. Quantiles for each variable:
>>
>>         2.5%     25%    50%    75%  97.5%
>> var1 0.02891 0.08896 0.1220 0.1594 0.2685
>> var2 0.18007 0.31094 0.3783 0.4474 0.5965
>> var3 0.05840 0.12425 0.1634 0.2083 0.3250
>> var4 0.12430 0.23921 0.3077 0.3862 0.5964
>>
>> > prop.table((Ctable1[,2]))
>>      dawn   diurnal      dusk nocturnal
>> 0.1306818 0.4375000 0.1875000 0.2443182
>>
>> Any help or guidance will be greatly appreciated.
>>
>> All the best,
>> Jess
>>
>> --
>> Jessica Comley (PhD)
>> Research Scientist
>>
>>
>> _______________________________________________
>> R-sig-mixed-models using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>

-- 
Jessica Comley (PhD)
Research Scientist
