[BioC] Different levels of replicates and how to create a
correct targets file out of that.
Gordon Smyth
smyth at wehi.edu.au
Thu Apr 1 09:43:13 CEST 2004
Dear Johan,
Now I've had a chance to read your email more thoroughly, I think you
actually have a clever approach.
At 11:51 PM 30/03/2004, Johan Lindberg wrote:
>Sorry, I forgot to have a subject on the mail I sent before.
>
>Hello everyone.
>I would really appreciate some comments/hints/help with a pretty long
>question.
>
>I have an experiment consisting of 18 hybridizations. On the 30K cDNA
>arrays, knee joint biopsies (from different patients) taken before and
>after a certain treatment are hybridized. What I want to find out is the
>effect of the treatment, not the difference between the patients. The
>problem is how to deal with different levels of replicates and how to
>create a correct targets file, since I have no common reference.
>This is what the experimental set-up looks like.
>
>Patient  Hybridization  Cy3                        Cy5
>1        1A             Biopsy 1 before treatment  Biopsy 1 after treatment
>         1B             Biopsy 1 after treatment   Biopsy 1 before treatment
>3        2A             Biopsy 1 before treatment  Biopsy 1 after treatment
>         2B             Biopsy 1 after treatment   Biopsy 1 before treatment
>         3A             Biopsy 2 before treatment  Biopsy 2 after treatment
>         3B             Biopsy 2 after treatment   Biopsy 2 before treatment
>4        4A             Biopsy 1 before treatment  Biopsy 1 after treatment
>         4B             Biopsy 1 after treatment   Biopsy 1 before treatment
>         5A             Biopsy 2 before treatment  Biopsy 2 after treatment
>         5B             Biopsy 2 after treatment   Biopsy 2 before treatment
>5        6A             Biopsy 1 before treatment  Biopsy 1 after treatment
>         6B             Biopsy 1 after treatment   Biopsy 1 before treatment
>6        7A             Biopsy 1 before treatment  Biopsy 1 after treatment
>         7B             Biopsy 1 after treatment   Biopsy 1 before treatment
>7        8A             Biopsy 1 before treatment  Biopsy 1 after treatment
>         8B             Biopsy 1 after treatment   Biopsy 1 before treatment
>10       9A             Biopsy 1 before treatment  Biopsy 1 after treatment
>         9B             Biopsy 1 after treatment   Biopsy 1 before treatment
You have an unbalanced design with three error strata: patient, biopsy,
microarray. In principle one would like to treat this using a model with
nested random effects but, as recent discussion has indicated, this is not
so straightforward.
>As you can see, different patients have one or two biopsies taken from
>them. I realize it would be a mistake to include all of those in the
>targets file, because having more measurements of a certain patient
>would bias the ranking of the B-stat towards the patient having the most
>biopsies in the end, right? Or?
>Since the differentially expressed genes in the patient with more biopsies
>will get smaller variance?
>
>My solution to the problem was just to create an artificial M-matrix twice
>as long as the original MA object. For the patients with two biopsies I
>averaged over the technical replicates (dye-swaps) and put the values from
>biopsy one and then the values from biopsy two in the matrix. For
>patients with just a technical replicate I put the values from
>hybridization A and then hybridization B (e.g. 1A and then 1B) into the
>matrix.
>
>The M-values of that matrix object would look something like:
>
>                  patient 1         patient 3                           ....
>Rows 1-30000      Hybridization 1A  Average of hybridization 2A and 2B  ....
>Rows 30001-60000  Hybridization 1B  Average of hybridization 3A and 3B  ....
>
>After this I plan to use dupcor on the new matrix of M-values, as if I
>would have a slide with replicate spots on it.
>
>So far so good or? Is this a good way of treating replicates on different
>levels or has anyone else some better idea of how to do this. Comments
>please.....
This is actually very clever. You've got rid of one error stratum by
averaging, and then use duplicateCorrelation to handle the other. I think
your approach is a good one *but* you need to give double weight to
cases where you have averaged over two technical replicates. Use the
'weights' component of your MAList object to do this.
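
A minimal sketch of the stacking and weighting described above, in base R.
Everything here is a hypothetical stand-in: the toy matrix M, the small gene
count G (in place of 30000), and the names M_stacked and W_stacked are not
from the original analysis. It covers only patient 1 (one biopsy) and
patient 3 (two biopsies). It also assumes that the M-values of the
dye-swapped (B) hybridizations must have their sign flipped before averaging,
so that every entry estimates after minus before.

```r
# Hypothetical toy data: G genes, hybridizations for patients 1 and 3 only.
set.seed(1)
G <- 5                                    # stands in for the 30000 spots
hybs <- c("1A", "1B", "2A", "2B", "3A", "3B")
M <- matrix(rnorm(G * length(hybs)), nrow = G, dimnames = list(NULL, hybs))

# Assumption: B hybridizations are dye-swaps (Cy5 = before), so negate them
# to put every column on the same after-minus-before scale.
swaps <- c("1B", "2B", "3B")
M[, swaps] <- -M[, swaps]

# Stack two blocks of G rows per patient:
#   one-biopsy patient : block 1 = hybridization A, block 2 = hybridization B
#   two-biopsy patient : block 1 = mean of biopsy 1 pair, block 2 = mean of biopsy 2 pair
M_stacked <- cbind(
  patient1 = c(M[, "1A"], M[, "1B"]),
  patient3 = c(rowMeans(M[, c("2A", "2B")]), rowMeans(M[, c("3A", "3B")]))
)

# Weight 2 where an entry is an average of two technical replicates, else 1.
W_stacked <- cbind(
  patient1 = rep(1, 2 * G),
  patient3 = rep(2, 2 * G)
)
```

In a real analysis W_stacked would go into the 'weights' component of the
MAList object (or be passed as the weights argument) alongside the stacked
M-values.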
>And now, how to create a correct targets file since I have no common
>reference.
>I guess it would look something like this:
>
>SlideNumber Name FileName Cy3 Cy5
>1 pat1_p test1.gpr Before_p1 After_p1
>2 pat3_p test2.gpr Before_p2 After_p2
>3 pat4_p test3.gpr Before_p3 After_p3
>4 pat6_p test4.gpr Before_p4 After_p4
>5 pat7_p test5.gpr Before_p5 After_p5
>6 pat10_p test6.gpr Before_p6 After_p6
>
>But when I want to make my contrast matrix I am lost since I do not have
>anything to write as ref.
>design <- modelMatrix(targets, ref="????????")
If I have understood your approach, you don't need to do anything about the
targets file or the design matrix. Just use design <- rep(1,6). You now
have independent M-values estimating the same thing.
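
A rough sketch of how the rest of the analysis might then look, assuming the
stacked M-values and weights already exist. The data below are simulated
noise, purely so the code runs; G, P, M_stacked and W_stacked are
hypothetical names, and the choice of which columns carry weight 2 is
arbitrary. The limma calls use the ndups, spacing, correlation and weights
arguments as I understand them from the limma documentation; check
?duplicateCorrelation and ?lmFit for your installed version.

```r
# Hypothetical simulated input: 2*G stacked rows by P patients.
set.seed(2)
G <- 200                                  # stands in for the 30000 spots
P <- 6
M_stacked <- matrix(rnorm(2 * G * P), nrow = 2 * G)

# Weight 2 for patients whose entries are averages of a dye-swap pair
# (assumed here, for illustration only, to be columns 2 and 3).
W_stacked <- matrix(1, nrow = 2 * G, ncol = P)
W_stacked[, c(2, 3)] <- 2

# Every column estimates the same treatment effect, so the design is a
# single column of ones.
design <- rep(1, P)

if (requireNamespace("limma", quietly = TRUE)) {
  # Rows i and i + G belong to the same gene, so treat them as duplicate
  # spots that are G rows apart.
  corfit <- limma::duplicateCorrelation(M_stacked, design,
                                        ndups = 2, spacing = G,
                                        weights = W_stacked)
  fit <- limma::lmFit(M_stacked, design,
                      ndups = 2, spacing = G,
                      correlation = corfit$consensus.correlation,
                      weights = W_stacked)
  fit <- limma::eBayes(fit)
  top <- limma::topTable(fit, number = 10)  # genes ranked by B-statistic
}
```

With spacing = G the two blocks of the stacked matrix play the role of
within-array duplicate spots, which is what lets duplicateCorrelation
estimate the within-patient correlation.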
Gordon
>If I redo the matrix to
>
>SlideNumber Name FileName Cy3 Cy5
>1 pat1_p test1.gpr Before_p After_p
>2 pat3_p test2.gpr Before_p After_p
>3 pat4_p test3.gpr Before_p After_p
>4 pat6_p test4.gpr Before_p After_p
>5 pat7_p test5.gpr Before_p After_p
>6 pat10_p test6.gpr Before_p After_p
>
>wouldn't that be the same as treating this as a common reference design
>when it is not? And wouldn't that affect the variance of the experiment?
>How do I do this in a correct way?
>I looked at the zebrafish example in the LIMMA user guide, but isn't that
>wrong as well? Technical and biological replicates are treated the
>same way in the targets file of the zebrafish example.
>
>I realize that many of these questions should have been considered before
>conducting the lab part but unfortunately they were not. So I will not be
>surprised if someone sends me the same quote as I got yesterday from a friend:
>
>"To consult a statistician after an experiment is finished is often merely
>to ask him to conduct a post mortem examination. He can perhaps say what
>the experiment died of."
>- R.A. Fisher, Presidential Address to the First Indian Statistical
>Congress, 1938
>
>Best regards
>
>/Johan Lindberg