[BioC] Different levels of replicates and how to create a correct targets file out of that.

Naomi Altman naomi at stat.psu.edu
Thu Apr 1 00:23:37 CEST 2004


Material relevant to this discussion can be found under the thread with 
subject line:

technical replicates (again!): a summary
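
For what it is worth, here is a rough sketch in R of the paired treatment Gordon describes further down, applied to a direct before/after design with dye swaps. It is not taken from the earlier thread. The M-values are simulated, the object names (hyb, patient, dyeSign, M, perPatient) are made up for illustration, and I am assuming M = log2(Cy5/Cy3), so a slide with "after" in Cy5 gets +1 and its dye swap gets -1. It uses base R only and handles a single gene by hand; averaging within patient is just one simple way of dealing with the lower levels of replication.

```r
# Layout taken from the table in Johan's post: 18 hybridizations, 7 patients.
hyb     <- paste0(rep(1:9, each = 2), c("A", "B"))
patient <- factor(rep(c(1, 3, 3, 4, 4, 5, 6, 7, 10), each = 2))

# A slides: Cy3 = before, Cy5 = after -> +1; B slides are dye swaps -> -1.
dyeSign <- rep(c(1, -1), times = 9)

# One-column design: the coefficient is the after-vs-before log-ratio.
design <- cbind(AfterVsBefore = dyeSign)
rownames(design) <- hyb

# Simulated M-values for one gene (hypothetical true treatment effect of 1).
set.seed(1)
patEffect <- rnorm(nlevels(patient), sd = 0.3)  # patient-specific response
M <- dyeSign * (1 + patEffect[as.integer(patient)]) + rnorm(18, sd = 0.2)

# Average the sign-corrected log-ratios within patient, so that each
# patient counts once no matter how many biopsies or dye swaps it has.
perPatient <- tapply(dyeSign * M, patient, mean)

# One-sample (i.e. paired) t-test of the treatment effect across patients.
print(t.test(perPatient))
```

No reference sample is needed here, because the patient effect cancels within each slide. With limma itself, the analogous route would presumably be to pass a design like this to lmFit(), and, in a version that supports a block argument, to estimate the within-patient correlation with duplicateCorrelation(MA, design, block = patient) rather than averaging by hand.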


At 02:21 AM 3/31/2004, Johan Lindberg wrote:
>Thank you for the answer but I think that my situation is a little bit 
>different. First of all, I wonder about the answer that was given 
>in https://stat.ethz.ch/pipermail/bioconductor/2003-December/003277.html
>He has got 30 individuals with 4-6 replicates of each. This would mean 
>that 120-180 hybridizations have been done. The example targets file 
>that is given looks something like this:
>
>Cy3                Cy5
>Patient1     Control
>Control      Patient1
>Patient1     Control
>Patient2     Control
>Control       Patient2
>...
>
>Here is where I get confused, because it looks as if the technical 
>replicates are included in the targets file (on the same level as the 
>biological replicates) and should therefore also be included in the 
>contrast matrix that follows. But the contrast matrix given,
>cont.matrix <- matrix(1,30,1)
>is just a column of 30 ones (he had 30 patients in the study), which 
>suggests that only the true biological replicates would be included in 
>the B-stat analysis?
>Back to my experiment. My real problem I think is that I have no common 
>reference between the different samples. In the example above he has got 
>this "control" used in the hybridizations. But I have hybridized a biopsy 
>before and then after treatment for each individual.
>
>Cy3                Cy5
>Patient1 before     Patient1 after
>Patient1 after      Patient1 before
>Patient2 before     Patient2 after
>...
>
>But since the effect I am looking for is the effect of the treatment, not 
>the between patients effect, would it be correct to use the same approach 
>as the given example 
>https://stat.ethz.ch/pipermail/bioconductor/2003-December/003277.html
>even though I have no common reference?
>
>Another question that was not answered is how to treat replicates on 
>different levels, since I have 1-2 biopsies taken from each individual 
>plus technical replicates of each biopsy. Is there a way of dealing 
>with this in LIMMA? Should one just average over the lower levels of 
>replicates and then put only the true biological replicates in the 
>targets file/contrast matrix?
>
>Best regards
>
>/ Johan Lindberg
>
>At 10:32 2004-03-31 +1000, Gordon Smyth wrote:
>>At 11:51 PM 30/03/2004, Johan Lindberg wrote:
>>>Sorry, I forgot to have a subject on the mail I sent before.
>>>
>>>Hello everyone.
>>>I would really appreciate some comments/hints/help with a pretty long 
>>>question.
>>
>>This question has been asked on the list before. See:
>>
>>https://stat.ethz.ch/pipermail/bioconductor/2003-December/003277.html
>>
>>The simplest treatment in limma is simply to treat your experiment as 
>>having two factors, one factor having 10 levels indicating the patient 
>>and one taking two levels, before or after. This treatment is analogous 
>>to a paired-test or to a two-way analysis of variance.
>>
>>An alternative treatment would be to treat the patients as random 
>>effects. That would also be a correct treatment, and potentially a little 
>>more powerful, but also much more difficult and I don't think you gain 
>>very much.
>>
>>>I have an experiment consisting of 18 hybridizations. On the 30K cDNA 
>>>arrays, knee joint biopsies (from different patients) taken before and 
>>>after a certain treatment are hybridized. What I want to find out is 
>>>the effect of the treatment, not the difference between the patients. 
>>>The problem is how to deal with different levels of replicates and how 
>>>to create a correct targets file, since I have no common reference.
>>>This is what the experimental set-up looks like.
>>>
>>>Patient  Hybridization  Cy3                        Cy5
>>>1        1A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         1B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>3        2A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         2B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>         3A             Biopsy 2 before treatment  Biopsy 2 after treatment
>>>         3B             Biopsy 2 after treatment   Biopsy 2 before treatment
>>>4        4A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         4B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>         5A             Biopsy 2 before treatment  Biopsy 2 after treatment
>>>         5B             Biopsy 2 after treatment   Biopsy 2 before treatment
>>>5        6A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         6B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>6        7A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         7B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>7        8A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         8B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>10       9A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         9B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>
>>>As you can see, different patients have one or two biopsies taken from 
>>>them. I realize it would be a mistake to include all of those in the 
>>>targets file, because having more measurements from a certain patient 
>>>would bias the ranking of the B-stat towards the patients with the 
>>>most biopsies, right? Or?
>>>Since the differentially expressed genes in the patient with more 
>>>biopsies will get smaller variance?
>>>
>>>My solution to the problem was to create an artificial M matrix 
>>>twice as long as the original MA object. For the patients with two 
>>>biopsies I averaged over the technical replicates (dye-swaps) and put 
>>>the values from biopsy one and then the values from biopsy two in the 
>>>matrix. For patients with just a technical replicate I put the values 
>>>from hybridization 1A and then hybridization 1B into the matrix.
>>>
>>>The M-values of that matrix object would look something like:
>>>
>>>                  patient 1         patient 3                           ....
>>>Rows 1-30000      Hybridization 1A  Average of hybridization 2A and 2B  ....
>>>Rows 30001-60000  Hybridization 1B  Average of hybridization 3A and 3B  ....
>>>
>>>After this I plan to use dupcor on the new matrix of M-values, as if I 
>>>would have a slide with replicate spots on it.
>>>
>>>So far so good or? Is this a good way of treating replicates on 
>>>different levels or has anyone else some better idea of how to do this. 
>>>Comments please.....
>>>
>>>
>>>And now, how to create a correct targets file since I have no common 
>>>reference.
>>>I guess it would look something like this:
>>>
>>>SlideNumber     Name    FileName        Cy3     Cy5
>>>1       pat1_p  test1.gpr       Before_p1       After_p1
>>>2       pat3_p  test2.gpr       Before_p2       After_p2
>>>3       pat4_p  test3.gpr       Before_p3       After_p3
>>>4       pat6_p  test4.gpr       Before_p4       After_p4
>>>5       pat7_p  test5.gpr       Before_p5       After_p5
>>>6       pat10_p test6.gpr       Before_p6       After_p6
>>>
>>>But when I want to make my contrast matrix I am lost since I do not have 
>>>anything to write as ref.
>>>design <- modelMatrix(targets, ref="????????")
>>>
>>>If I redo the matrix to
>>>
>>>SlideNumber     Name    FileName        Cy3     Cy5
>>>1       pat1_p  test1.gpr       Before_p        After_p
>>>2       pat3_p  test2.gpr       Before_p        After_p
>>>3       pat4_p  test3.gpr       Before_p        After_p
>>>4       pat6_p  test4.gpr       Before_p        After_p
>>>5       pat7_p  test5.gpr       Before_p        After_p
>>>6       pat10_p test6.gpr       Before_p        After_p
>>>
>>>wouldn't that be the same as treating this as a common reference design 
>>>when it is not? And wouldn't that affect the variance of the experiment? 
>>>How do I do this in a correct way?
>>>I looked at the zebrafish example in the LIMMA user guide, but isn't 
>>>that wrong as well? Technical and biological replicates are treated 
>>>the same way in the targets file of the zebrafish example.
>>
>>Dye-swap pairs are not necessarily technical replicates.
>>
>>>I realize that many of these questions should have been considered 
>>>before conducting the lab part but unfortunately they were not. So I 
>>>will not be surprised if someone sends me the same quote as I got 
>>>yesterday from a friend:
>>>
>>>"To consult a statistician after an experiment is finished is often 
>>>merely to ask him to conduct a post mortem examination. He can perhaps 
>>>say what the experiment died of."
>>>- R.A. Fisher, Presidential Address to the First Indian Statistical 
>>>Congress, 1938
>>>
>>>Best regards
>>
>>Gordon
>>
>>>/Johan Lindberg
>>>
>>>_______________________________________________
>>>Bioconductor mailing list
>>>Bioconductor at stat.math.ethz.ch
>>>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111


