[R-sig-ME] Modelling with uncertain (but not missing) categorical random effect values

Wed Jul 14 16:35:54 CEST 2021

Dear Michael,

I'd set the reference of mum_sp to a species with plenty of data and a raw
proportion away from the extremes. And then set a more informative prior
for the mum_sp effects.

I tend to specify the random effect priors in INLA manually as penalised
complexity priors. I've noticed in the past that the default random effect
priors can underestimate the random effect variance.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx using inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

<https://www.inbo.be>

Op wo 14 jul. 2021 om 16:10 schreef Michael Lawson <mrml500 using york.ac.uk>:

> Thanks Thierry,
>
> Yes I refer to the fitted values and getting the marginal averages of
> these fitted values for each group within mum_sp.
>
> con = 1 for all offspring of 8 out of 9 (mum_id) individuals in one of the
> groups within mum_sp, so this may be creating the complete separation.
> Could I specify a prior on the mum_sp fixed effect in INLA to deal with the
> quasi-complete separation?
>
> Overall the responses don't vary much within mum_id levels. Males vary a
> lot in number of offspring and proportion of their offspring with
> particular females and species, so the high variance is expected. I had
> previously collapsed all offspring per mum_id to individual rows and run as
> a simple binomial GLM with a cbind(no._conspecific_offspring /
> no._heterospecific offspring) response, but this didn't seem ideal as it
> doesn't account for variation among parents. To get an idea of the data
> structure (and whether what I'm trying to do here makes any sense)...
> overall there are 27 individuals in mum_id, 18 individuals in dad_id and 3
> groups for mum_sp. There are about 200 rows of offspring (averaging ~ 7-8
> offspring / mum).
>
> Thanks, I'll check out that forum.
>
> All the best,
> Mike
>
> On Wed, 14 Jul 2021 at 13:10, Thierry Onkelinx <thierry.onkelinx using inbo.be>
> wrote:
>
>> Dear Michael,
>>
>> The model formulation seems reasonable to me. I assume you refer to the
>> fitted values? You get them when fitting the model after adding the
>> argument: control.predictor = list(compute = TRUE)
>> Another option is to specify linear combinations. See
>> https://www.r-inla.org/faq#h.dwc64vjwo03.
>> You might want to do some reading on fitting models with INLA. I
>> recommend
>> http://www.highstat.com/index.php/beginner-s-guide-to-regression-models-with-spatial-and-temporal-correlation
>>
>> Looking at the model output, I noticed a few things.
>> 1) The effects for mum_sp are extreme. Do you have (quasi) complete
>> separation?
>> 2) The precision for mum_id is large (small random effects). Does
>> that make sense?
>> 3) The precision fordad_id is small (very large random effects). Does
>> that make sense?
>>
>> You probably want to specify the priors of the random effects yourself
>> instead of using the default priors.
>>
>> Note that INLA has its dedicated forum:
>> https://groups.google.com/g/r-inla-discussion-group?pli=1
>>
>> Best regards,
>>
>> ir. Thierry Onkelinx
>> Statisticus / Statistician
>>
>> Vlaamse Overheid / Government of Flanders
>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
>> AND FOREST
>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>> thierry.onkelinx using inbo.be
>> Havenlaan 88 bus 73, 1000 Brussel
>> www.inbo.be
>>
>>
>> ///////////////////////////////////////////////////////////////////////////////////////////
>> To call in the statistician after the experiment is done may be no more
>> than asking him to perform a post-mortem examination: he may be able to say
>> what the experiment died of. ~ Sir Ronald Aylmer Fisher
>> The plural of anecdote is not data. ~ Roger Brinner
>> The combination of some data and an aching desire for an answer does not
>> ensure that a reasonable answer can be extracted from a given body of data.
>> ~ John Tukey
>>
>> ///////////////////////////////////////////////////////////////////////////////////////////
>>
>> <https://www.inbo.be>
>>
>>
>> Op wo 14 jul. 2021 om 11:17 schreef Michael Lawson <mrml500 using york.ac.uk>:
>>
>>> Dear Thierry,
>>>
>>> Many thanks for your email - that looks like what I am after. I have
>>> never used INLA before, so thus far I have just made a basic model
>>> without specifying any further arguments to the call. Am I on the
>>> right lines? How would I go about extracting the predicted probability
>>> of conspecific mating for each group within mum_sp?
>>>
>>> values <- as.factor(unique(c(levels(dat$dad_1), levels(dat$dad_2),
>>> levels(dat$dad_3), levels(dat$dad_4))))
>>>
>>> formula <- con ~ mum_sp + f(mum_id, model = "iid") + f(dad_1, w_1,
>>> values = values, model = "iid") + f(dad_2, w_2, values = values, copy
>>> = "dad_1") + f(dad_3, w_3, values = values, copy = "dad_1") + f(dad_4,
>>> w_4, values = values, copy = "dad_1")
>>>
>>> model <- inla(formula, family="binomial", data=dat,
>>> control.family=list(link='logit'))
>>> summary(model)
>>>
>>> Call:
>>>    "inla(formula = formula, family = \"binomial\", data = dat,
>>> control.family = list(link = \"logit\"))"
>>> Time used:
>>>     Pre = 0.462, Running = 3.3, Post = 0.115, Total = 3.88
>>> Fixed effects:
>>>                mean     sd 0.025quant 0.5quant 0.975quant   mode   kld
>>> (Intercept)  12.696 10.298      0.834   10.000     40.536  6.699 0.087
>>> mum_spL      18.725 11.824      3.426   16.051     49.365 11.804 0.023
>>> mum_spN     -11.697 10.257    -38.926   -9.318      1.392 -6.208 0.031
>>>
>>> Random effects:
>>>   Name   Model
>>>     mum_id IID model
>>>    dad_1 IID model
>>>    dad_2 Copy
>>>    dad_3 Copy
>>>    dad_4 Copy
>>>
>>> Model hyperparameters:
>>>                          mean       sd 0.025quant 0.5quant 0.975quant
>>>  mode
>>> Precision for mum_id 2.03e+04 1.97e+04    977.697 1.43e+04   7.21e+04
>>> 2331.739
>>> Precision for dad_1  9.20e-02 5.10e-02      0.025 8.20e-02   2.17e-01
>>> 0.061
>>>
>>> Expected number of effective parameters(stdev): 25.62(0.441)
>>> Number of equivalent replicates : 7.46
>>>
>>> Marginal log-Likelihood:  -81.32
>>>
>>> Many thanks,
>>> Mike
>>>
>>

	[[alternative HTML version deleted]]