[R-sig-ME] power simulations for lmer

Beate Glaser b.glaser at bristol.ac.uk
Mon Jul 7 13:53:05 CEST 2008


Dear lmer users,

I have been reading the mailing list for a while and have come across much 
good advice and many useful suggestions in the correspondence. I was wondering 
if I could ask your opinion about a DNA methylation project we want to set up.

We have DNA collected at different time points (time) and want to determine 
associations between quantitative methylation status (met) and traits 
(height/weight etc). The data will be measured for different loci (locus) 
across the genome. The DNA was collected in different tubes (tube) and 
extracted with different methods (extr). As we are in the planning phase, I 
don't have real data yet, but this is what it will look like:

subj  locus  time  tube     extr    met  sex
1     1      0     Heparin  Phenol  0.2  1
1     1      40    EDTA     Phenol  0.4  1
1     1      60    EDTA     Phenol  0.6  1
1     1      80    EDTA     SO      0.7  1
1     1      100   CPD      Cells   0.8  1
1     2      0     Heparin  Phenol  0.2  1
1     2      40    EDTA     Phenol  0.4  1
1     2      60    EDTA     Phenol  0.5  1
1     2      80    EDTA     SO      0.4  1
1     2      100   CPD      Cells   0.4  1
1     3      0     Heparin  Phenol  0.2  1
1     3      40    EDTA     Phenol  0.6  1
1     3      60    EDTA     Phenol  0.3  1
1     3      80    EDTA     SO      0.4  1
1     3      100   CPD      Cells   0.5  1
1     4      0     Heparin  Phenol  0.3  1
1     4      40    EDTA     Phenol  0.4  1
1     4      60    EDTA     Phenol  0.4  1
1     4      80    EDTA     SO      0.7  1
1     4      100   CPD      Cells   0.3  1


1) We want to run a pilot project without trait analysis across many loci 
(100 - 1000) to assess the DNA handling effects. Analysis would be 
performed with a mixed model with crossed random effects:

library(lme4)  # in current lme4, REML = TRUE replaces method = "REML"
fit <- lmer(met ~ poly(time, 2) * sex + (1 | extr) + (1 | tube) + (1 | locus) +
            (poly(time, 2) | subj), data = data, REML = TRUE)

We hope to see that the correlation within individuals is stronger than the 
one between samples isolated with identical methods, and if not we need to 
account for this in our main experiment. For each factor combination in the 
pilot project we have around 5 individuals, and the only variable we can 
truly influence is the number of met loci we want to analyse. Could anyone 
point me in the right direction on how to set up power simulations to 
determine the number of met loci we would ideally need, in order to assess 
the effect of handling (extr and tube)?
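One possible way to set up such a simulation is sketched below in R. It is only a sketch under stated assumptions: the function names (simulate_pilot, power_handling), all standard deviations and fixed-effect values, and the random allocation of handling combinations to samples are placeholders and not taken from real data. To keep it fast, the model is simplified to a linear time effect with random intercepts only. The handling effect is tested with a likelihood-ratio test of the model with and without (1|extr) and (1|tube); because the variances are tested on the boundary, that test tends to be conservative.

```r
# Sketch of a simulation-based power analysis; every numeric value below is a
# placeholder assumption, not an estimate from real data.
set.seed(1)

# Handling combinations seen in the example layout (assumed to be the full set)
handling <- data.frame(
  tube = c("Heparin", "EDTA", "EDTA", "CPD"),
  extr = c("Phenol", "Phenol", "SO", "Cells")
)

# Simulate one pilot data set with n_loci loci (hypothetical helper)
simulate_pilot <- function(n_loci, n_subj = 20, times = c(0, 40, 60, 80, 100),
                           sd_extr = 0.05, sd_tube = 0.05, sd_locus = 0.10,
                           sd_subj = 0.05, sd_resid = 0.10) {
  d <- expand.grid(time = times, locus = seq_len(n_loci), subj = seq_len(n_subj))
  # Assumption: each subject-by-time sample gets one randomly chosen combination
  sample_id <- interaction(d$subj, d$time, drop = TRUE)
  combo <- sample(nrow(handling), nlevels(sample_id), replace = TRUE)
  combo <- combo[as.integer(sample_id)]
  d$tube <- factor(handling$tube[combo])
  d$extr <- factor(handling$extr[combo])
  # Draw one random intercept per level of a grouping factor
  re <- function(f, sd) {
    f <- factor(f)
    rnorm(nlevels(f), 0, sd)[as.integer(f)]
  }
  # Note: met is simulated on an unbounded scale, not restricted to [0, 1]
  d$met <- 0.3 + 0.002 * d$time +
    re(d$extr, sd_extr) + re(d$tube, sd_tube) +
    re(d$locus, sd_locus) + re(d$subj, sd_subj) +
    rnorm(nrow(d), 0, sd_resid)
  d
}

# Proportion of simulated data sets in which the handling terms are significant
power_handling <- function(n_loci, n_sim = 10, alpha = 0.05) {
  reject <- replicate(n_sim, {
    d <- simulate_pilot(n_loci)
    full <- lme4::lmer(met ~ time + (1 | extr) + (1 | tube) +
                         (1 | locus) + (1 | subj), data = d, REML = FALSE)
    null <- lme4::lmer(met ~ time + (1 | locus) + (1 | subj),
                       data = d, REML = FALSE)
    # Likelihood-ratio test; row 2 of the table is the larger (full) model
    anova(null, full)[2, "Pr(>Chisq)"] < alpha
  })
  mean(reject)
}

# Run only if lme4 is installed; n_sim is kept tiny here for speed
if (requireNamespace("lme4", quietly = TRUE)) {
  power_by_loci <- sapply(c(5, 20), power_handling)
  print(power_by_loci)
}
```

In practice n_sim would be raised to several hundred and the grid of n_loci widened to the 100 - 1000 range, with the assumed standard deviations varied to see how sensitive the answer is.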

Another problem is that the tube factor is not balanced although it is 
nested within extr, so (1|extr/tube) seemed unreasonable.
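How tube and extr relate can be checked by cross-tabulating them. The base R sketch below rebuilds the tube and extr columns of the example layout shown earlier; the object names (design, tab, nested) are only illustrative. In that layout as written, EDTA tubes occur with both Phenol and SO extraction, so at least there tube is only partially nested within extr, and the crossed terms (1|extr) + (1|tube) would cover that case.

```r
# Tube and extraction method at the five time points of the example layout
design <- data.frame(
  time = c(0, 40, 60, 80, 100),
  tube = c("Heparin", "EDTA", "EDTA", "EDTA", "CPD"),
  extr = c("Phenol", "Phenol", "Phenol", "SO", "Cells")
)

# Cross-tabulate: a tube level nested in extr has exactly one non-zero
# cell in its row
tab <- xtabs(~ tube + extr, data = design)
print(tab)

# TRUE only if every tube level occurs with a single extraction method
nested <- all(rowSums(tab > 0) == 1)
print(nested)
```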

2) For our main experiment we cannot afford to measure the DNA methylation 
status in all individuals across 1000 loci. We will concentrate on a 
specific locus (locus_of_interest) and determine its methylation pattern in 
many people who have a specific trait.

fit2 <- lmer(met ~ poly(time, 2) * sex * locus_of_interest * trait + (1 | extr) +
             (1 | tube) + (1 | locus_of_interest) + (poly(time, 2) | subj),
             data = data)

Would anyone know if there is a way to carry the more precise random-effect 
estimates for extr and tube from the pilot experiment over into the main 
model? (Would weights be a possibility?)

It would be great if you could let me know any comments, suggestions, 
questions or simplifications. This would help quite a lot.
Many thanks,

Beate


----------------------
Beate Glaser
Dept Social Medicine
Canynge Hall
Room 3.5
Whiteladies Road
Bristol BS8 2PR
UK

++44-117-331-3901



