[R-sig-ME] Bootstrap the variance difference
Andrew Robinson
A.Robinson at ms.unimelb.edu.au
Mon Feb 6 23:43:26 CET 2012
Hi Kevin,
add an indicator to each dataset and then rbind them. The indicator
variable then distinguishes between the sources. Pass it to boot as
the strata= argument. Then also use that indicator in the varcomp
function to distinguish between the two datasets, e.g. using the
subset= arguments in lmer..
Cheers
Andrew
On Mon, Feb 06, 2012 at 03:32:35PM -0600, Kevin Spring wrote:
> I asked this question on Stack Exchange, but I think it might be too
> specialized. Hopefully someone in the mixed model group can help me out.
>
> I want to be able to bootstrap the variance differences between two data
> sets obtained at different times while taking out the error in a random
> effect.
>
> I have 2 sets of experimental data, where the data was measured at 2 time
> points (initial and final). I also have a set of simulation data. I want to
> compare the variance of the simulated date with the variance difference
> between the experimental data (final - initial). The idea is to get
> confidence intervals from the bootstrap to compare the experimental data
> with the simulation.
>
> I am having trouble making the statistic for the bootstrap function in the
> boot package for R. So far I have.
>
> varcomp <- function ( formula, data, indices ) {
> d <- data[indices,] #sample for boot
> fit <- lmer(formula, data=d) #linear model
> res.var = (attr (VarCorr(fit), "sc")^2) # variance estimation
> return(res.var)
> }
>
> But this function only returns the variance of a single data set. I want to
> be able to input 2 sets of data and have it return the difference between
> the two data sets' variance.
>
> When I try something like:
>
> varcomp <- function ( formula, data1, data2, indices ) {
> d1 <- data1[indices,] #sample for boot
> d2 <- data2[indices,] #sample for boot
> fit1 <- lmer(formula, data=d1) #linear model
> fit2 <- lmer(formula, data=d2) #linear model
> a = (attr (VarCorr(fit1), "sc")^2) #output variance estimation
> b = (attr (VarCorr(fit2), "sc")^2) #output variance estimation
> drv = a - b #difference between the variance estimations
> return(drv)
> }
>
> I would then put it into boot such as:
>
> ip1.boot <- boot ( data = ip1, statistic=varcomp, R=100,
> formula=CNPC~(1|Cell.line:DNA.extract)+Cell.line)
>
> I can't do it this way because the boot function only allows for one data
> set to be inputted.
>
> *Does anyone know how to create the correct statistic function for this?*
>
> An example of the data can also be downloaded
> here<http://www.mediafire.com/file/68a3ro1cfneiy2r/data.zip.zip>(2 csv
> files zipped 1.22KB.)
>
> My data looks something like the following:
>
> Initial
>
> Cell.line Time DNA.extract Gene CNPC
> 1 9 initial 1 atubP1 1778.4589
> 2 9 initial 1 atubP1 2108.0552
> 3 9 initial 1 atubP1 2118.6725
> 4 9 initial 2 atubP1 2018.6593
> 5 9 initial 2 atubP1 1935.9008
> 6 9 initial 2 atubP1 1749.9158
> 7 9 initial 3 atubP1 1524.7475
> 8 9 initial 3 atubP1 1532.9781
> 9 9 initial 3 atubP1 1693.3098
> 10 17 initial 1 atubP1 1076.4720
> 11 17 initial 1 atubP1 1101.3315
> 12 17 initial 1 atubP1 1185.3606
> 13 17 initial 2 atubP1 1131.1118
> 14 17 initial 2 atubP1 892.7087
> 15 17 initial 2 atubP1 1028.5465
> 16 17 initial 3 atubP1 887.9972
> 17 17 initial 3 atubP1 732.9646
> 18 17 initial 3 atubP1 680.6724
>
> Final
>
> Cell.line Time DNA.extract Gene CNPC
> 1 9 final 1 atubP1 1262.2378
> 2 9 final 1 atubP1 1261.9858
> 3 9 final 1 atubP1 1390.6873
> 4 9 final 2 atubP1 1539.7180
> 5 9 final 2 atubP1 1510.5405
> 6 9 final 2 atubP1 1443.1767
> 7 9 final 3 atubP1 1456.2050
> 8 9 final 3 atubP1 1578.6396
> 9 9 final 3 atubP1 1656.1822
> 10 17 final 1 atubP1 1462.5179
> 11 17 final 1 atubP1 1580.9956
> 12 17 final 1 atubP1 1255.9020
> 13 17 final 2 atubP1 886.7579
> 14 17 final 2 atubP1 581.8116
> 15 17 final 2 atubP1 722.0526
> 16 17 final 3 atubP1 4168.7895
> 17 17 final 3 atubP1 3266.2105
> 18 17 final 3 atubP1 4219.5645
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Andrew Robinson
Deputy Director, ACERA
Department of Mathematics and Statistics Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/
Forest Analytics with R (Springer, 2011)
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009):
http://www.ms.unimelb.edu.au/spuRs/
More information about the R-sig-mixed-models
mailing list