[R-sig-ME] Running

Sat Nov 10 14:56:58 CET 2018

Ben,

I am sorry.  I did misunderstand your first email last night.  I am using
the glmm models for predicting water quality and my random effects are at
the site and basin level and they do explain a lot of the variance in the
models especially for "noisy" indicators like turbidity and fecal
coliform.  In the project, I am predicting for current conditions as well
as potential management scenarios throughout a region.  Initially, I just
calculated the mean difference between these two values (current vs.
management scenario) for the region, but I would like to get an idea of the
uncertainty in this mean reduction. Though the random effects are
significant, we are making an assumption that when trying to restore a
particular site, the random effect at that site will not change over the
course of the restoration. This implies that the uncertainty of improvement
for a given site is mostly affected by the uncertainty in the fixed effects
which are being adjusted for the management scenarios (e.g., increase of
canopy cover, nutrient loadings from wastewater treatment plants). I
tried to use the predictInterval function, but it seemed to give me
prediction intervals that include the random effects as well. In essence,
they were much larger than the ones I am getting using:

## param only
b3 <- bootMer(fm1,FUN=function(x) predict(x,newdata=test,re.form=~0),
              ## re.form=~0 is equivalent to use.u=FALSE
              nsim=100,seed=101)

I also used Cholesky decomposition on the covariance matrix of the fixed
effects to "simulate" the uncertainty of the fixed effects giving similar
results.  I think bootstrapping is a bit easier to explain in my manuscript
though, and thought it might also be easier for coding purposes using
bootMer.
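
As a rough illustration of the Cholesky approach (a sketch, not the code
used in the project): the inputs below are made-up stand-ins for fixef(fit)
and as.matrix(vcov(fit)) from an lme4 fit, the two design matrices are
hypothetical, and the predictions stay on the link scale.

```r
set.seed(101)

# Hypothetical stand-ins for fixef(fit) and as.matrix(vcov(fit))
beta_hat <- c(2.0, -0.5)                  # intercept, canopy-cover slope
V <- matrix(c( 0.04, -0.01,
              -0.01,  0.01), nrow = 2)    # fixed-effect covariance matrix

# Hypothetical fixed-effect design matrices for three sites
X_cur <- cbind(1, c(0.2, 0.4, 0.6))       # current conditions
X_mgt <- cbind(1, c(0.5, 0.7, 0.9))       # management scenario

nsim <- 2000
L <- t(chol(V))                           # lower-triangular factor, L %*% t(L) == V
Z <- matrix(rnorm(length(beta_hat) * nsim), nrow = length(beta_hat))
beta_sim <- beta_hat + L %*% Z            # p x nsim; beta_hat is added to every column

# Fixed-effects-only predictions on the link scale (sites x nsim)
eta_cur <- X_cur %*% beta_sim
eta_mgt <- X_mgt %*% beta_sim

# Regional mean change per draw, then a percentile interval
mean_diff <- colMeans(eta_mgt - eta_cur)
ci <- quantile(mean_diff, c(0.025, 0.975))
ci
```

For a GLMM, response-scale values would need the inverse link applied to
eta_cur and eta_mgt before differencing.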

It does seem to be working well, but my question was more about why using
parallel="snow" isn't speeding things up, though maybe your concerns about
whether I need to do PB at all are right as well.
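
One thing that may be worth checking, going by the bootMer help page: ncpus
defaults to getOption("boot.ncpus", 1L), so parallel="snow" without ncpus
greater than 1 would presumably still run serially. Separately, FUN can
return one long vector, so both scenarios and all sites can share a single
set of refits instead of one bootMer call per site. A minimal sketch using
lme4's sleepstudy data; cur, alt, and pred_both are hypothetical names, and
the model is only a stand-in for the water-quality GLMMs:

```r
library(lme4)

# Stand-in model; the real fits would be the water-quality GLMMs
fm <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Hypothetical prediction sets: "current" and "management scenario"
cur <- data.frame(Days = 0:9)
alt <- data.frame(Days = (0:9) / 2)

# One FUN returns fixed-effects-only predictions for both sets,
# so each bootstrap refit is reused for every site and scenario
pred_both <- function(fit) {
  c(predict(fit, newdata = cur, re.form = ~0),
    predict(fit, newdata = alt, re.form = ~0))
}

# Serial run for illustration. For a parallel run one would add, e.g.,
# parallel = "snow", ncpus = 2; objects used inside FUN (cur, alt) may
# then also need exporting to the workers (an assumption, untested here).
b <- bootMer(fm, FUN = pred_both, nsim = 50, seed = 101)

n <- nrow(cur)
# Per-replicate mean difference across sites (current minus scenario)
mean_diff <- rowMeans(b$t[, 1:n, drop = FALSE] -
                      b$t[, n + 1:n, drop = FALSE])
ci <- quantile(mean_diff, c(0.025, 0.975))
ci
```

Each row of b$t holds one bootstrap replicate, with the first n columns for
cur and the next n for alt, so the regional mean change and its interval
come from a single bootMer call.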

Thank you,

Jonathan

On Fri, Nov 9, 2018 at 6:44 PM Ben Bolker <bbolker using gmail.com> wrote:

>
>   [please keep r-sig-mixed-models in the Cc: if possible - although I
> see it's a judgment call in this case because the e-mail contains both
> generally pertinent info (uncertainty of FE small) and a personal-ish
> message ...]
>
>   Just to be clear, (1) I was suggesting that the uncertainty of the
> fixed effects might be *large* with respect to the uncertainty of the
> random effects, and largely independent of it; (2) have you already
> tried implementing other (approximate, faster) methods for the
> uncertainty on a small subset of the sites to convince yourself that you
> really need the full PB method?
>
> On 2018-11-09 6:28 p.m., Jonathan Miller wrote:
> > Thank you.  You are right the uncertainty of the fixed effects is
> > smaller than the others, but is of importance for my project. I
> > appreciate any thoughts you have when you have time to get to it.
> >
> > Jonathan
> >
> > On Fri, Nov 9, 2018, 5:17 PM Ben Bolker <bbolker using gmail.com> wrote:
> >
> >
> >       I will give this some thought when I get a chance (hopefully
> >     someone else will give it some thought and find some answers
> >     sooner ...)  In the meantime -- do you really need parametric
> >     bootstrapping/bootMer to get the confidence intervals you want?
> >     It's quite possible that a simpler approximation (e.g. assuming
> >     that the variation caused by uncertainty in the top-level
> >     random-effects parameters is small relative to other sources of
> >     variability) is adequate, given that you have thousands of
> >     samples ...
> >
> >     On 2018-11-09 4:15 p.m., Jonathan Miller wrote:
> >     > Dr. Bolker,
> >     >
> >     > I am a PhD student at NCSU and struggling with a coding issue. I am
> >     > bootstrapping some glmm model predictions in order to determine the
> >     > uncertainty associated with their fixed effects.  I read your
> >     > comments on https://github.com/lme4/lme4/issues/388 and have used
> >     > code similar to yours below (b3):
> >     >
> >     > ## param, RE, and conditional
> >     > b1 <- bootMer(fm1,FUN=sfun1,nsim=100,seed=101)
> >     > ## param and RE (no conditional)
> >     > b2 <- bootMer(fm1,FUN=sfun2,nsim=100,seed=101)
> >     > ## param only
> >     > b3 <- bootMer(fm1,FUN=function(x) predict(x,newdata=test,re.form=~0),
> >     >               ## re.form=~0 is equivalent to use.u=FALSE
> >     >               nsim=100,seed=101)
> >     >
> >     >
> >     > It has worked well for me but takes an extremely long time to run.
> >     > I am predicting 6 different water quality indicators for 1,423
> >     > sites, and the datasets range in size from 3,000 to 25,000 entries
> >     > each.  The small one runs relatively OK, but the others are
> >     > extremely slow. In my code (below), I also want to make more than
> >     > one prediction (current conditions, possible future conditions)
> >     > using the bootstrapping. Using "snow" in parallel doesn't seem to
> >     > speed things up.  I thought of two possibilities, but am unsure
> >     > how to implement them.
> >     >
> >     > for (s in 1:1423){
> >     >
> >     > bi <- bootMer(BI.mod,FUN=function(x)
> >     > predict(x,newdata=pred.sites[s,],re.form=~0,REML=TRUE),
> >     >               parallel="snow",nsim=1000,seed=101)
> >     > bi.5 <- bootMer(BI.mod,FUN=function(x)
> >     > predict(x,newdata=pred.sites.m5[s,],re.form=~0,REML=TRUE),
> >     >               parallel="snow",nsim=1000,seed=101)
> >     > }
> >     >
> >     > 1) Can I predict the bootstrapped model using two different
> >     > datasets at once to speed things up (i.e., pred.sites and
> >     > pred.sites.m5)?
> >     > 2) Can I use parallel processing of the initial loop (1,423 sites)
> >     > outside of bootMer (perhaps with doParallel and foreach) and then
> >     > run bootMer within that loop?  Though I have used foreach before,
> >     > I find it hard to compile the data that I really want on the
> >     > backend.
> >     >
> >     > Thank you for your time and any suggestions you might have.
> >     >
> >     > Sincerely,
> >     >
> >     > Jonathan
> >     >
> >     >       [[alternative HTML version deleted]]
> >     >
> >     > _______________________________________________
> >     > R-sig-mixed-models using r-project.org mailing list
> >     > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >     >
> >
> >
>
