[R-sig-ME] MCMCglmm with multinomial models

Fri Jul 15 23:37:30 CEST 2022

Hello,

I don't think I can specifically help you with some of your inquiries.
However, I do want to comment on a few things that might need some
attention.

First, MCMCglmm is based on a Bayesian implementation and does not compute
p-values to compare. What you need to compare are the posterior
distributions of your effect sizes. This can be done visually using the
base plot function in R. Or by comparing the HPD intervals and the mode (or
mean) of the posterior distributions.

Second, I have no idea what your data structure looks like (which makes it
hard to interpret model results), but the effective sample size (from the
5500 saved iterations sample) for your random variable Section is very low
(the same applies for your fixed effects). You should consider this issue
and look again at your assumption of correlation between activities for the
14 sections you have in your dataset. If you do not expect among activity
correlations then you can use the idh() function instead of us().

Hopefully this helps and in hope that people on this list with more
knowledge of these models will help out.

Best,
-- 
Walid Mawass
Ph.D. candidate in Evolutionary Biology - UQTR
*Currently* Postdoctoral Research Associate
Masel Lab - University of Arizona

On Fri, Jul 15, 2022 at 8:49 AM jessica comley <jessiecomley44 using gmail.com>
wrote:

> Dear all,
>
> I am hoping that someone will be able to help me with conducting MCMCglmm
> multinomial models.
>
> The data I am working with is for black-backed jackal (bbj) and carcal. For
> each species we have a multinomial response variable called activity which
> has four categories (dawn, diurnal, dusk, nocturnal). We have two
> categorical fixed effects which are 1) culling (none, lethal) and 2)
> predator presence (absent, high, low). We also have a categorical variable
> called Section (made up of 14 different reserves/ farms where the activity
> of caracal and bbj were recorded). There are 273 observations for caracal
> and 4399 for bbj. We are wanting to test the effects of culling and
> predators on caracal and bbj activity separately.
>
> I have been working through Jarrod Hadfields course notes, particularly
> with regards to Chapter 5.2. The chi-square analyses reveal that the
> frequencies of culling and predators differ as do activities.
>
> I have managed to work out the specific probabilities for the culling none
> vs culling lethal for each activity (dawn, diurnal, dusk, nocturnal) for
> caracal, but I'm confused as to how to determine p-values to determine
> which activities culling none vs culling lethal are affecting?
>
> Myy code and outcomes are pasted below with questions stated in bold.
>
> caracal2 <- read.csv("caracal_new.csv", header=T)
> caracal <- as.data.frame(unclass(caracal2), stringsAsFactors = TRUE)
>
> #Chi-squared tests
> Ctable1 <- table(caracal$activity, caracal$culling)
> chisq.test(rowSums(Ctable1)) #strongly suggests activities differ
> chisq.test(Ctable1)#strongly suggests culling category differs
>
> Ctable2 <- table(caracal$activity, caracal$predator)
> chisq.test(rowSums(Ctable2))#strongly suggests activities differ
> chisq.test(Ctable2)#strongly suggests predator category differs
>
> prior = list(R = list(fix=1, V=(1/k) * (I + J)), G = list(G1=list(V =
> diag(k-1), nu=1)))
> test1c.5 <- MCMCglmm(activity ~ -1 + at.level(culling,1):trait +
> at.level(culling, 2):trait, random=~us(trait):Section, rcov =
> ~us(trait):units, data = caracal, family = "categorical", prior = prior,
> burnin=5000, nitt=60000)
> *##I'm not sure how to add the three predator levels to this model or if it
> would be appropriate?*
>
>
> k <- length(levels(caracal$activity))
> I <- diag(k-1)
> J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
> IJ <- (1/k) *(diag(k-1) + matrix(1,k-1, k-1))
>
> contrasts(caracal$activity)
>
> #culling lethal
> Delta <- cbind(c(0,1,0,0), c(0,0,1,0), c(0,0,0,1))
> c2 <- (16 * sqrt(3)/(15 * pi))^2
> D <- ginv(Delta %*% t(Delta)) %*% Delta
> Int <- t(apply(test1c.5$Sol[,1:3],1, function(x) + D %*% (x/sqrt(1 + c2 *
> diag(IJ)))))
> summary(mcmc(exp(Int)/rowSums(exp(Int))))
>
> prop.table(Ctable1[,1])
>
> #culling none
> Delta <- cbind(c(0,1,0,0), c(0,0,1,0), c(0,0,0,1))
> c2 <- (16 * sqrt(3)/(15 * pi))^2
> D <- ginv(Delta %*% t(Delta)) %*% Delta
> Int <- t(apply(test1c.5$Sol[,4:6],1, function(x) + D %*% (x/sqrt(1 + c2 *
> diag(IJ)))))
> summary(mcmc(exp(Int)/rowSums(exp(Int))))
>
> prop.table((Ctable1[,2]))
>
> HPDinterval(test1c.5$Sol)
>
> #model summary
> > summary(test1c.5)
>
>  Iterations = 5001:59991
>  Thinning interval  = 10
>  Sample size  = 5500
>
>  DIC: 699.7014
>
>  G-structure:  ~us(trait):Section
>
>                                                         post.mean l-95% CI
> u-95% CI eff.samp
> traitactivity.diurnal:traitactivity.diurnal.Section        1.8124  0.09784
>    5.665    77.01
> traitactivity.dusk:traitactivity.diurnal.Section           0.8450 -0.83585
>    3.856    64.17
> traitactivity.nocturnal:traitactivity.diurnal.Section      1.3621 -1.19129
>    6.157    58.48
> traitactivity.diurnal:traitactivity.dusk.Section           0.8450 -0.83585
>    3.856    64.17
> traitactivity.dusk:traitactivity.dusk.Section              1.2034  0.07090
>    3.681   102.16
> traitactivity.nocturnal:traitactivity.dusk.Section         0.7505 -1.77113
>    4.524    43.53
> traitactivity.diurnal:traitactivity.nocturnal.Section      1.3621 -1.19129
>    6.157    58.48
> traitactivity.dusk:traitactivity.nocturnal.Section         0.7505 -1.77113
>    4.524    43.53
> traitactivity.nocturnal:traitactivity.nocturnal.Section    2.7148  0.09401
>    8.397    76.59
>
>  R-structure:  ~us(trait):units
>
>                                                       post.mean l-95% CI
> u-95% CI eff.samp
> traitactivity.diurnal:traitactivity.diurnal.units          0.50     0.50
>   0.50        0
> traitactivity.dusk:traitactivity.diurnal.units             0.25     0.25
>   0.25        0
> traitactivity.nocturnal:traitactivity.diurnal.units        0.25     0.25
>   0.25        0
> traitactivity.diurnal:traitactivity.dusk.units             0.25     0.25
>   0.25        0
> traitactivity.dusk:traitactivity.dusk.units                0.50     0.50
>   0.50        0
> traitactivity.nocturnal:traitactivity.dusk.units           0.25     0.25
>   0.25        0
> traitactivity.diurnal:traitactivity.nocturnal.units        0.25     0.25
>   0.25        0
> traitactivity.dusk:traitactivity.nocturnal.units           0.25     0.25
>   0.25        0
> traitactivity.nocturnal:traitactivity.nocturnal.units      0.50     0.50
>   0.50        0
>
>  Location effects: activity ~ -1 + at.level(culling, 1):trait +
> at.level(culling, 2):trait
>
>                                              post.mean l-95% CI u-95% CI
> eff.samp  pMCMC
> at.level(culling, 1):traitactivity.diurnal      1.2306  -0.0533   2.6793
> 145.29 0.0418 *
> at.level(culling, 1):traitactivity.dusk         0.6605  -0.6006   2.0761
>  92.91 0.2840
> at.level(culling, 1):traitactivity.nocturnal    1.6090   0.0914   3.1356
> 151.02 0.0265 *
> traitactivity.diurnal:at.level(culling, 2)      1.2664  -0.1552   2.7750
> 226.40 0.0604 .
> traitactivity.dusk:at.level(culling, 2)         0.3533  -0.9898   1.5218
> 148.44 0.5447
> traitactivity.nocturnal:at.level(culling, 2)    1.0447  -0.6405   2.8354
> 346.40 0.1618
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> *##So for the model summary I get that lethal culling at activity diurnal
> is significantly different from lethal culling at dawn (its the base
> reference), but I'm also interested in whether lethal culling at activity
> diurnal is different from lethal culling at dusk for example. Is this
> possible? *
>
> #outcomes culling lethal
> > summary(mcmc(exp(Int)/rowSums(exp(Int))))
>
> Iterations = 1:5500
> Thinning interval = 1
> Number of chains = 1
> Sample size per chain = 5500
>
> 1. Empirical mean and standard deviation for each variable,
>    plus standard error of the mean:
>
>        Mean      SD  Naive SE Time-series SE
> [1,] 0.1253 0.05565 0.0007504       0.002484
> [2,] 0.3748 0.10497 0.0014155       0.003204
> [3,] 0.1757 0.06640 0.0008954       0.002515
> [4,] 0.3242 0.11939 0.0016099       0.003514
>
> 2. Quantiles for each variable:
>
>         2.5%     25%    50%    75%  97.5%
> var1 0.03641 0.08695 0.1198 0.1554 0.2553
> var2 0.17298 0.30580 0.3704 0.4431 0.5896
> var3 0.06166 0.12913 0.1705 0.2161 0.3215
> var4 0.12610 0.23999 0.3090 0.3901 0.6045
>
> > prop.table(Ctable1[,1])
>      dawn   diurnal      dusk nocturnal
> 0.1250000 0.2812500 0.1770833 0.4166667
>
>
> #outcomes culling none
> > summary(mcmc(exp(Int)/rowSums(exp(Int))))
>
> Iterations = 1:5500
> Thinning interval = 1
> Number of chains = 1
> Sample size per chain = 5500
>
> 1. Empirical mean and standard deviation for each variable,
>    plus standard error of the mean:
>
>        Mean      SD  Naive SE Time-series SE
> [1,] 0.1288 0.06141 0.0008280       0.002787
> [2,] 0.3804 0.10406 0.0014032       0.002662
> [3,] 0.1710 0.06844 0.0009228       0.002592
> [4,] 0.3198 0.11812 0.0015928       0.002956
>
> 2. Quantiles for each variable:
>
>         2.5%     25%    50%    75%  97.5%
> var1 0.02891 0.08896 0.1220 0.1594 0.2685
> var2 0.18007 0.31094 0.3783 0.4474 0.5965
> var3 0.05840 0.12425 0.1634 0.2083 0.3250
> var4 0.12430 0.23921 0.3077 0.3862 0.5964
>
> > prop.table((Ctable1[,2]))
>      dawn   diurnal      dusk nocturnal
> 0.1306818 0.4375000 0.1875000 0.2443182
>
> Any help or guidance will be greatly appreciated.
>
> All the best,
> Jess
>
> --
> Jessica Comley (PhD)
> Research Scientist
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

	[[alternative HTML version deleted]]