[BioC] Design question: How to account for dependent samples?

Sun Feb 24 01:38:36 CET 2013

Dear Moritz,

See Section 8.7 "Multi-level Experiments" in the limma User's Guide:

http://www.bioconductor.org/packages/2.11/bioc/vignettes/limma/inst/doc/usersguide.pdf

Your experiment is a bit simpler than the example because you have only 
one factor other than patient.
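
For readers without the guide to hand, the approach in that section can be sketched roughly as below. This is only an illustration on simulated data, not code from this thread. The sample layout, the object names (patient, tissue, diagnosis, group, y) and the contrast names are all assumptions made for the example. The idea is to keep every sample in the model and treat patient as a blocking factor, estimating the within-patient correlation with duplicateCorrelation() and passing it to lmFit().

```r
library(limma)  # Bioconductor; duplicateCorrelation() also relies on the statmod package

set.seed(1)

# Hypothetical layout: 6 patients, 3 per diagnosis.
# Patients 2 and 5 have no healthy sample, so the data are unbalanced.
patient   <- rep(1:6, times = c(3, 2, 4, 3, 2, 3))
tissue    <- c("h", "d", "d",   "d", "d",   "h", "d", "d", "d",
               "h", "d", "d",   "d", "d",   "h", "d", "d")
diagnosis <- ifelse(patient <= 3, "C1", "C2")
group     <- factor(paste0(diagnosis, tissue),
                    levels = c("C1h", "C1d", "C2h", "C2d"))

# Simulated log-expression values: a shared patient effect plus independent noise.
n_genes <- 200
patient_effect <- matrix(rnorm(n_genes * 6), nrow = n_genes)[, patient]
y <- patient_effect +
  matrix(rnorm(n_genes * length(patient), sd = 0.5), nrow = n_genes)
rownames(y) <- paste0("gene", seq_len(n_genes))

# One coefficient per diagnosis/tissue combination.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Estimate the correlation between samples from the same patient,
# then fit the linear model with patient as a block.
corfit <- duplicateCorrelation(y, design, block = patient)
fit <- lmFit(y, design, block = patient,
             correlation = corfit$consensus.correlation)

# The four comparisons from the original question (contrast names are made up).
cm <- makeContrasts(
  C1dVsC1h = C1d - C1h,
  C2dVsC2h = C2d - C2h,
  C1hVsC2h = C1h - C2h,
  C1dVsC2d = C1d - C2d,
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, cm))
top <- topTable(fit2, coef = "C1dVsC1h", number = 5)
```

With this setup no averaging over samples is needed, and patients without a healthy sample still contribute to the estimates for the diseased groups.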

Best wishes
Gordon

--------------- original message -----------------
[BioC] Design question: How to account for dependent samples?
Moritz Kebschull endothel at gmail.com
Thu Feb 21 11:43:44 CET 2013

Dear list,

I am looking at a microarray dataset that consists of 'healthy' and 
'diseased' samples from patients with two different diagnoses.

We have several 'diseased' samples per patient. For many, but not all 
patients, a single healthy sample exists (therefore, I cannot do paired 
analyses within individual patients).

Thus far, since the multiple samples per patient are dependent on each 
other, we had aggregated them into a single 'diseased' sample mean for 
each patient.

# drop = FALSE keeps a one-column matrix when a patient has only one sample
edata_diseased_aggregated <- sapply(unique(patnumbers),
function(i) rowMeans(edata_diseased[, patnumbers == i, drop = FALSE]))

The design was basically

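# 'group' is assumed to be a per-sample factor with levels C1h, C1d, C2h, C2d
group <- factor(group, levels = c("C1h", "C1d", "C2h", "C2d"))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)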

with the following contrasts

contrastsMatrix=makeContrasts("C1d-C1h", "C2d-C2h", "C1h-C2h", "C1d-C2d",
levels=design)

This approach does, however, strongly reduce the power of the comparison.

I was wondering whether aggregation was in fact the correct thing to do
here.

What about a design that factors in the multiple samples per patient,
similar to technical (=within patient) and biological (=several patients
with the same diagnosis) replicates? How would you suggest implementing
this here?

Many thanks,
Moritz (Univ. of Bonn, Germany)
