[R] Incomplete, unbalanced design, and pseudoreplication?

Jennifer Mollon jennifer.mollon at kcl.ac.uk
Mon Nov 9 10:57:45 CET 2009


Hello,

I am trying to help someone who has carried out an experiment, and I am
finding it difficult to work out the appropriate model and how to code
it.

The response is a measurement: the amount of DNA extracted during the
experiment.  Two factors were tested: the condition under which the
experiment took place, and the type of DNA to be extracted.  Each
combination of factor levels was replicated, so that, for example,
condition A with DNA type A was measured twice using the same input
material.  Finally, the whole experiment was carried out twice, but in
one of the two runs there was not enough input material and one of the
DNA types (call it type D) was not tested at all, although all other
levels of that factor and of the condition factor were tested.  From
this, I think:

1. The replicates within each experiment are pseudoreplicates - there  
are pairs of measures with the same input material, and both factor  
levels are the same.
2. The 2 experiments can be treated as blocks, but they are not  
balanced or complete.
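
To make the layout concrete, here is a rough sketch of the design as a
data frame in base R.  The number of levels is an assumption (2
conditions, named "old" and "new" purely for illustration, and 4 DNA
types A to D), and all column names are hypothetical.

```r
# Hypothetical layout: 2 conditions and 4 DNA types are assumptions;
# the post does not state how many levels each factor has.
design <- expand.grid(
  rep_id     = 1:2,
  type       = c("A", "B", "C", "D"),
  condition  = c("old", "new"),
  experiment = c("exp1", "exp2"),
  stringsAsFactors = TRUE
)

# Type D was not run in the first experiment.
design <- subset(design, !(experiment == "exp1" & type == "D"))

# Cell counts: the zero for D in exp1 is what makes the blocks incomplete.
cell_counts <- xtabs(~ type + experiment, data = design)
print(cell_counts)

# Each experiment/condition/type combination shares one input sample,
# so the two measurements within it are not independent replicates.
design$sample <- interaction(design$experiment, design$condition,
                             design$type, drop = TRUE)
n_rows    <- nrow(design)        # number of measurements
n_samples <- nlevels(design$sample)  # number of distinct input samples
```

Under these assumptions there are twice as many measurements as
distinct input samples, which is the pseudoreplication worry in point 1.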

There are 2 questions of interest to the experimenter:

1. Does the amount of DNA extracted differ for the different DNA types  
under the different conditions?
2. One of the conditions is new, and of particular interest.  Under  
this condition, are there significantly different amounts of DNA  
extracted depending on DNA type?  There are 2 particular contrasts of  
interest here, call them DNA types B&C vs A, and B&C vs D.  DNA type D  
is only tested in the second experiment.
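
The two contrasts can be written down as weights on the four DNA type
means.  This is only a sketch: it assumes that "B&C" means the average
of types B and C, and the means used below are made-up numbers for
illustration, not data from the experiment.

```r
# Levels of the DNA type factor; the names A-D follow the post.
types <- c("A", "B", "C", "D")

# Each column is one contrast on the four type means.
# "B&C" is taken to be the average of B and C (an assumption).
contr <- cbind(
  BC_vs_A = c(-1, 0.5, 0.5,  0),
  BC_vs_D = c( 0, 0.5, 0.5, -1)
)
rownames(contr) <- types

# Made-up type means under the new condition, for illustration only.
means <- c(A = 10, B = 14, C = 12, D = 9)

# Contrast estimates: weighted sums of the type means.
estimates <- drop(t(contr) %*% means)
print(estimates)

# Off-diagonal entry is non-zero, so the two contrasts are not orthogonal.
print(crossprod(contr))
```

Because type D was only measured in the second experiment, the second
contrast rests on data from that experiment alone unless the model
adjusts for the experiment effect.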

I would be very grateful for comments about the analysis of this  
complicated data set. Are my beliefs above correct, regarding the  
design?  If so, which R packages and methods can help me with this  
analysis?  In particular, how should the error term be structured for  
this design?  And finally, are the 2 research questions best answered  
by 2 separate analyses (e.g. the second one looking at only the one  
condition in isolation), or can a single analysis of a full model  
answer both of these questions?
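
To illustrate the two options, here is a minimal sketch on simulated
data.  It is not a recommended analysis: it assumes 2 conditions and 4
DNA types, averages each pair of measurements, and treats the
experiment as a fixed block in lm(), which is one possible choice among
several and exactly the point in question.  All names and numbers are
hypothetical.

```r
set.seed(1)  # simulated data only; no numbers come from the experiment

d <- expand.grid(rep_id = 1:2, type = c("A", "B", "C", "D"),
                 condition = c("old", "new"),
                 experiment = c("exp1", "exp2"))
d <- subset(d, !(experiment == "exp1" & type == "D"))  # D missing in exp1
d$dna <- rnorm(nrow(d), mean = 10, sd = 1)

# Average the paired measurements so each input sample gives one value.
m <- aggregate(dna ~ experiment + condition + type, data = d, FUN = mean)

# Option 1: a single full model with the condition-by-type interaction.
fit_full <- lm(dna ~ experiment + condition * type, data = m)
print(anova(fit_full))

# Option 2: a separate model for the new condition in isolation.
fit_new <- lm(dna ~ experiment + type,
              data = subset(m, condition == "new"))
print(anova(fit_new))
```

Note that anova() on an lm fit gives sequential sums of squares, so
with unbalanced data the result depends on the order of the terms.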

Many thanks for your consideration and time,
Jen



