[R] bootstrapping nlme fits (was boot function)

Frank E Harrell Jr fharrell at virginia.edu
Fri Aug 22 16:18:53 CEST 2003


On Fri, 22 Aug 2003 14:39:28 +0100 (BST)
Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:

> First, this has very little to do with boot: PLEASE use an informative
> subject line.  You need to work out how to resample in this situation: are
> you resampling subjects or observations?  If you are resampling subjects,
> you need to create a data frame containing just the resampled subjects and
> pass that to nlme.
> 
> However, you also need to think if this is valid.  If you resample 
> subjects, you will be fitting subjects twice or more as if they are 
> independent.  I know of no theoretical studies on resampling mixed-effects 
> models, and urge you to look for such results.
> 
> On Fri, 22 Aug 2003, Brunschwig, Hadassa {PDMM~Basel} wrote:

Hadassa - You may want to look at the slightly simpler case of generalized least squares with correlated observations.  For that I have a bootstrap option in the Design package's glsD function (which uses the nlme package).  There is an option to treat multiply-sampled subjects as if they were different subjects, or to pool them into one larger subject (I think the former is more correct, but I haven't gotten very far in this thinking).

You can do simulations with glsD to check the performance of the cluster-sampling bootstrap in this situation.  I have done limited simulations, and the bootstrap variance estimates seem to be close to the actual values, although not as close as information-matrix-based estimates when the model is true.  glsD attempts to implement the cluster bootstrap fairly efficiently, although it does not yet work for the case where an across-time covariance pattern is not assumed.
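
The subject-level resampling discussed above can be sketched in base R. This is only an illustration, not the glsD implementation: the function name resample_subjects, its id and pool arguments, and the toy data are all assumptions made up for the example. It draws subjects with replacement and builds the data frame that would then be passed to the model-fitting function, either relabelling each draw as a new subject or pooling repeated draws under the original label.

```r
# Illustrative sketch only; resample_subjects() and its arguments are
# hypothetical names, not part of boot, nlme, or Design.
resample_subjects <- function(data, id = "Subject", pool = FALSE) {
  ids   <- unique(data[[id]])
  # Draw as many subjects as in the original sample, with replacement.
  drawn <- ids[sample.int(length(ids), replace = TRUE)]
  pieces <- lapply(seq_along(drawn), function(k) {
    d <- data[data[[id]] == drawn[k], , drop = FALSE]
    # pool = FALSE: every draw gets a fresh label, so a subject drawn
    #               twice is treated as two different subjects.
    # pool = TRUE : repeated draws keep the original label and are
    #               merged into one larger subject.
    d[[id]] <- if (pool) as.character(drawn[k]) else paste0("boot", k)
    d
  })
  out <- do.call(rbind, pieces)
  out[[id]] <- factor(out[[id]])
  rownames(out) <- NULL
  out
}

# Toy data, simulated purely for illustration: 19 subjects, 5 days each.
set.seed(1)
toy <- data.frame(Subject = rep(1:19, each = 5),
                  Day     = rep(c(1, 2, 4, 8, 16), times = 19))
toy$Concentr <- 30 * (1 - exp(-0.3 * toy$Day)) + rnorm(nrow(toy))

b <- resample_subjects(toy)               # duplicates as distinct subjects
p <- resample_subjects(toy, pool = TRUE)  # duplicates pooled
```

Each resampled data frame would then be supplied as the data argument of the fit, with the grouping written simply as | Subject and no index inside the model formula. Whether the resulting bootstrap distribution is valid for a mixed-effects model is the open question raised in the quoted message, so simulation checks of the kind mentioned above remain advisable.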

Frank Harrell

> 
> > I skimmed through the archives and couldn't really find an answer to my
> > question. 
> 
> It's not an R question.
> 
> > One thing I don't understand in the description of the function
> > boot() is the second variable of the statistic function. I have a sample
> > of, say, 19 subjects; out of these, using boot(), I would like to generate,
> > say, 1000 samples. For each of these 1000 samples I'll fit an nlme() model
> > and use the 1000 estimates of a variable to make further calculations. 
> 
> Whether this is valid most likely depends on what those calculations are.
> 
> > Now
> > what I don't understand is where the index should be set. The nlme()
> > call looks like this:
> > 
> > nlme(Concentr~a*(1-exp(Day*(log(0.1,base=exp(1))/exp(logt09))))
> >                               ,data=data
> >                               ,fixed=a+logt09~1
> >                               ,random=a+logt09~1|Subject[ind]
> >                               ,start=list(fixed=c(a=30,logt09=1)))
> > 
> > My idea was to put the index (second variable of the statistic
> > function) 
> 
> What that variable means depends on the other arguments to boot, and you 
> haven't told us what those are.
> 
> > in the subject, as I want to generate different samples of
> > subjects. I get the error that the vector ind was not found. I would be
> > happy for any help concerning this problem.
> 
> 
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help


---
Frank E Harrell Jr              Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine  http://hesweb1.med.virginia.edu/biostat



