[R-sig-ME] Bootstrap the variance difference

Mon Feb 6 23:43:26 CET 2012

Hi Kevin,

add an indicator to each dataset and then rbind them. The indicator
variable then distinguishes between the sources. Pass it to boot as
the strata= argument.  Then also use that indicator in the varcomp
function to distinguish between the two datasets, e.g. using the
subset= arguments in lmer..

Cheers

Andrew

On Mon, Feb 06, 2012 at 03:32:35PM -0600, Kevin Spring wrote:
>  I asked this question on Stack Exchange, but I think it might be too
> specialized.  Hopefully someone in the mixed model group can help me out.
> 
> I want to be able to bootstrap the variance differences between two data
> sets obtained at different times while taking out the error in a random
> effect.
> 
> I have 2 sets of experimental data, where the data was measured at 2 time
> points (initial and final). I also have a set of simulation data. I want to
> compare the variance of the simulated date with the variance difference
> between the experimental data (final - initial). The idea is to get
> confidence intervals from the bootstrap to compare the experimental data
> with the simulation.
> 
> I am having trouble making the statistic for the bootstrap function in the
> boot package for R. So far I have.
> 
> varcomp <- function ( formula, data, indices ) {
>     d <- data[indices,] #sample for boot
>     fit <- lmer(formula, data=d) #linear model
>     res.var = (attr (VarCorr(fit), "sc")^2) # variance estimation
>     return(res.var)
>     }
> 
> But this function only returns the variance of a single data set. I want to
> be able to input 2 sets of data and have it return the difference between
> the two data sets' variance.
> 
> When I try something like:
> 
> varcomp <- function ( formula, data1, data2, indices ) {
> d1 <- data1[indices,] #sample for boot
> d2 <- data2[indices,] #sample for boot
> fit1 <- lmer(formula, data=d1) #linear model
> fit2 <- lmer(formula, data=d2) #linear model
> a = (attr (VarCorr(fit1), "sc")^2) #output variance estimation
> b = (attr (VarCorr(fit2), "sc")^2) #output variance estimation
> drv = a - b #difference between the variance estimations
> return(drv)
> }
> 
> I would then put it into boot such as:
> 
> ip1.boot <- boot ( data = ip1, statistic=varcomp, R=100,
> formula=CNPC~(1|Cell.line:DNA.extract)+Cell.line)
> 
> I can't do it this way because the boot function only allows for one data
> set to be inputted.
> 
> *Does anyone know how to create the correct statistic function for this?*
> 
> An example of the data can also be downloaded
> here<http://www.mediafire.com/file/68a3ro1cfneiy2r/data.zip.zip>(2 csv
> files zipped 1.22KB.)
> 
> My data looks something like the following:
> 
> Initial
> 
>        Cell.line    Time DNA.extract   Gene      CNPC
> 1          9 initial           1 atubP1 1778.4589
> 2          9 initial           1 atubP1 2108.0552
> 3          9 initial           1 atubP1 2118.6725
> 4          9 initial           2 atubP1 2018.6593
> 5          9 initial           2 atubP1 1935.9008
> 6          9 initial           2 atubP1 1749.9158
> 7          9 initial           3 atubP1 1524.7475
> 8          9 initial           3 atubP1 1532.9781
> 9          9 initial           3 atubP1 1693.3098
> 10        17 initial           1 atubP1 1076.4720
> 11        17 initial           1 atubP1 1101.3315
> 12        17 initial           1 atubP1 1185.3606
> 13        17 initial           2 atubP1 1131.1118
> 14        17 initial           2 atubP1  892.7087
> 15        17 initial           2 atubP1 1028.5465
> 16        17 initial           3 atubP1  887.9972
> 17        17 initial           3 atubP1  732.9646
> 18        17 initial           3 atubP1  680.6724
> 
> Final
> 
>    Cell.line  Time DNA.extract   Gene      CNPC
> 1          9 final           1 atubP1 1262.2378
> 2          9 final           1 atubP1 1261.9858
> 3          9 final           1 atubP1 1390.6873
> 4          9 final           2 atubP1 1539.7180
> 5          9 final           2 atubP1 1510.5405
> 6          9 final           2 atubP1 1443.1767
> 7          9 final           3 atubP1 1456.2050
> 8          9 final           3 atubP1 1578.6396
> 9          9 final           3 atubP1 1656.1822
> 10        17 final           1 atubP1 1462.5179
> 11        17 final           1 atubP1 1580.9956
> 12        17 final           1 atubP1 1255.9020
> 13        17 final           2 atubP1  886.7579
> 14        17 final           2 atubP1  581.8116
> 15        17 final           2 atubP1  722.0526
> 16        17 final           3 atubP1 4168.7895
> 17        17 final           3 atubP1 3266.2105
> 18        17 final           3 atubP1 4219.5645
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Andrew Robinson  
Deputy Director, ACERA 
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia               (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr              Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/

Forest Analytics with R (Springer, 2011) 
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009): 
http://www.ms.unimelb.edu.au/spuRs/