[R-sig-ME] Unclear output from MCMCglmm with categorical predictors

HADFIELD Jarrod j@h@dfield @ending from ed@@c@uk
Mon Nov 12 17:58:17 CET 2018


Hi,

The habitat coefficients are the estimates when (in the example) diet is 
Faun. The two diet coefficients are the differences between Herb and 
Faun and Herb and Omni. As the previous poster said, removing the -1 
will give you an intercept and habitat coefficients that are differences 
between the absorbed (base-line) habitat and the remaining habitats.

Cheers,

Jarrod



On 12/11/2018 12:26, roimaor using post.tau.ac.il wrote:
>
> Thanks for your response Walid.
> However, it doesn't really answer my questions, so I'll try to clarify 
> those.
>
> An intercept estimate in my model makes no biological sense because 
> all species have both diet and habitat. This is why I chose to 
> suppress the intercept.
>
> In the context of categorical predictors, suppressing the intercept  
> means that an arbitrarily-chosen category is taken as baseline, and 
> the model estimates the difference (in the response variable) between 
> this baseline and all other categories.
> When dealing with two categorical traits as predictors, each data 
> point is a combination of the effects of both traits, i.e. it is a 
> combination of one category from each trait.
>
> If this is indeed the case, the baseline value must include one 
> category of each predictor in the model, otherwise their effects would 
> be confounded.
>
> For example, in my model I would expect one diet category AND one 
> habitat category, say, 'herbivorous' and 'aquatic', to be 
> baseline/intercept (un-estimated).
> Instead, it seems that only 'herbivorous' is absorbed in the 
> intercept, while all habitat classes have posterior estimates.
>
> My questions are:
> 1- why does no habitat category get absorbed in the intercept? and
> 2- what can I do to fix that?
>
> If it's at all useful, here is an example:
>
> ## data format
> unique(Tdata$Habitat)
> [1] TE AR AQ ST SA
> Levels: AQ AR SA ST TE
>
> unique(Tdata$Diet)
> [1] Herb Omni Faun
> Levels: Faun Herb Omni
>
> ModTHRE <- MCMCglmm(Response ~ -1 + Habitat + Diet,
>                     prior = list(R = list(V = 1, nu = 0.002)),
>                     ginverse = list(Binomial=INphylo$Ainv),
>                     burnin = 25000, nitt = 225000, thin = 200,
>                     family = "threshold",
>                     data = Tdata,
>                     verbose = FALSE)
>
> ## model output
> summary(ModTHRE)
>
>  Iterations = 25001:224801
>  Thinning interval  = 200
>  Sample size  = 1000
>
>  DIC: 2321.861
>
>  R-structure:  ~units
>
>       post.mean l-95% CI u-95% CI eff.samp
> units         1   0.9973    1.003     1000
>
>  Location effects: Response ~ -1 + ForagingHab + Troph_Lev
>
>             post.mean l-95% CI u-95% CI eff.samp  pMCMC
> HabitatST   -0.07612 -0.42629  0.33746   1000.0  0.694
> HabitatTE   -0.36682 -0.52533 -0.21118   1000.0 <0.001 ***
> HabitatSA   -0.04691 -0.25769  0.19636   1000.0  0.724
> HabitatAR   -0.07302 -0.29900  0.16681    883.3  0.548
> HabitatAQ   -0.47948 -0.82598 -0.15793   1000.0  0.002 **
> DietHerb     0.30708  0.11255  0.48545   1000.0  0.002 **
> DietOmni     0.05263 -0.12176  0.26105   1000.0  0.604
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>  Cutpoints:
>                          post.mean l-95% CI u-95% CI eff.samp
> cutpoint.traitActivity.1    0.4911    0.436   0.5495    877.6
>
>
>
> Any thoughts, ideas of what's going on here, or corrections of my 
> mis-perceptions, are all very welcome.
>
> Many thanks,
> Roi
>
>
> Quoting Walid Mawass <walidmawass10 using gmail.com>:
>
>> Hello,
>>
>> by adding -1 into your model's formula, you are explicitly calling 
>> for no
>> intercept term to be included in the estimates. If you do want an 
>> intercept
>> term which would include a baseline level for each of your categorical
>> variables then the syntax should be:
>>
>> MCMCglmm(Activity ~1 + Habitat + Diet + log(Mass) + Max.Temp... 
>> (explicit
>> call for intercept term)
>> or
>> MCMCglmm(Activity ~ Habitat + Diet + log(Mass) + Max.Temp... 
>> (implicit call
>> for intercept term)
>>
>> Hope this helps
>> -- 
>> Walid Mawass
>> Ph.D. candidate in Cellular and Molecular Biology
>> Population Genetics Laboratory
>> University of Québec at Trois-Rivières
>> 3351, boul. des Forges, C.P. 500
>> Trois-Rivières (Québec) G9A 5H7
>> Telephone: 819-376-5011 poste 3384
>>
>>
>> On Wed, Nov 7, 2018 at 12:09 PM <roimaor using post.tau.ac.il> wrote:
>>
>>> Dear list members,
>>>
>>> I have a model with categorical response and categorical + continuous
>>> predictors.
>>>
>>> My model has two categorical predictors: "diet" (3 levels) and
>>> "habitat" (6 levels):
>>>
>>> THRE1 <- MCMCglmm(Activity ~ -1 + Habitat + Diet + log(Mass) + 
>>> Max.Temp,
>>>                       prior = list(R = list(V = 1, fix = 1)),
>>>                       ginverse = list(Binomial=INphylo$Ainv),
>>>                       family = "threshold",
>>>                       data = Tdata)
>>>
>>> If I understand correctly, in this configuration the algorithm
>>> shouldn't return estimated values for the effect of each level of a
>>> categorical predictor, instead, it returns a contrast between that
>>> level and another level which was arbitrarily chosen as the base
>>> level. Each species (data point) has a value for each of these traits,
>>> so I would expect them to be estimated independently, meaning that one
>>> level of each predictor should be the 'baseline' and absorbed into the
>>> global intercept. In that case I expect 2 contrasts to be returned for
>>> diet categories and 5 contrasts for habitat.
>>>
>>> However, I get 2 estimates (presumably contrasts) for diet categories,
>>> and 6 for habitat categories, i.e., no habitat category was designated
>>> as baseline, which makes me question whether the estimates are
>>> contrasts or actual effect sizes.
>>>
>>> My questions:
>>> - Is the algorithm pooling all the predictor categories as if they
>>> were a single trait with 8 levels?
>>> - If the habitat effect estimates are contrasts - what are they 
>>> compared
>>> to?
>>> - If they are effect sizes - what did I do to not get the contrasts as
>>> I expected?
>>>
>>> Any help would be much appreciated!
>>> Thanks,
>>> -- 
>>> Roi Maor
>>> PhD candidate
>>> School of Zoology, Tel Aviv University
>>> Centre for Biodiversity and Environment Research, UCL
>>>
>>> _______________________________________________
>>> R-sig-mixed-models using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



More information about the R-sig-mixed-models mailing list