[R-sig-ME] Cannot grok the newdata argument in simulate.merMod(), package lme4
Ben Bolker
bbo|ker @end|ng |rom gm@||@com
Fri Sep 2 02:55:18 CEST 2022
With apologies for delayed response.
The answer to the "why doesn't roll-your-own match simulate()" is on
Stack Overflow now: https://stackoverflow.com/a/73576691/190277. The
short answer is that the differences are harmless and caused by
differences in the procedures used to pick multivariate Normal deviates
(which give the same *distributions*, just not the same specific values
for any given realization/random number seed).
The answer to "why doesn't simulate() + newdata do what I want" is
simple **IF YOU KNOW THE ANSWER ALREADY**, which is that for binomial
responses, the number of trials per observation needs to be specified
somehow. Your experience reflects arguably a bug in the code, the
documentation, or the user interface design (or some combination), but
the solution is to specify something like
weights = rep(50, nrow(X))
if (for example) you want all of the binomial samples to be out of 50.
Or whatever makes sense for your application. (And good for you for
paying attention to the "ominous warning"!)
Hope that helps.
cheers
Ben Bolker
On 2022-08-05 2:10 a.m., Rolf Turner wrote:
>
>
> I wish to assess, via simulation, the impact of certain aspects
> of the experimental design on the precision of the estimates of
> a rather intricate parameter, the details of which I won't go into.
>
> To this end I need to fit a model (generalised linear mixed model,
> binomial family) to a data set, and then simulate other data sets,
> reflecting a number of different experimental designs, from the given
> fit. I thought that the "newdata" argument of the simulate.merMod()
> method would provide me with the means to accomplish my goal, but I
> cannot get it to work. I cannot properly comprehend the documentation
> of the "newdata" argument in the help file.
>
> To start with, to make sure (???) that I understood what was going on
> I just simulated a data set from a model fitted to the cbpp data,
> using a "roll-your-own" procedure, and then re-did the simulation
> using simulate.merMod(). I set seeds appropriately so that I would get
> the same results, given that my roll-your-own procedure was correct.
> After many false starts and excursions down blind alleys, I got the two
> procedures to agree. The details of what I did are to be found in the
> attached sourceable script "demo.txt".
>
> I then attempted to effect simulations using a different data set "X"
> (just including predictors, no responses) constructed according to
> a different experimental design.
>
> I tried two different approaches. The first approach:
>
> s.mer1 <- simulate(fit,newdata=X,allow.new.levels=TRUE)
>
> This produced an ominous warning:
>> In wts - Y :
>> longer object length is not a multiple of shorter object length
>
> The second approach:
>
> newpar <- list(theta=getME(fit,"theta"),beta=getME(fit,"beta"))
> set.seed(101)
> s.mer2 <- simulate(~0+period +(1|herd),newdata=X,newparams=newpar)
>
> This produced warnings:
>
>> beta parameter vector not named: assuming same order as internal
>> vector
>> Warning message:
>> In setParams(object, newparams) :
>> some parameters not specified in setParams()
>
> The first bit ("beta parameter vector not named ...") also happens if I
> try to reproduce exactly an example given in the help, so perhaps I
> should not be too worried by it. The second bit is ominous (and to
> me mysterious).
>
> The result of the first approach is incomprehensible to me, and bears
> no relationship that I can discern to the results of a roll-your-own
> approach that I also tried. The result of the second approach is all
> NAs, so something is clearly wrong.
>
> The details of what I did are given in the attached file
> "try.newdata.txt".
>
> I would be ever-so-humbly grateful if someone could explain to me what
> I am doing wrong in my attempts to use the "newdata" argument of
> simulate.merMod().
>
> cheers,
>
> Rolf Turner
>
>
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of
working hours.
More information about the R-sig-mixed-models
mailing list