[BioC] Different levels of replicates and how to create a
correct targets file out of that.
Naomi Altman
naomi at stat.psu.edu
Thu Apr 1 00:23:37 CEST 2004
Material relevant to this discussion can be found under the thread with
subject line:
technical replicates (again!): a summary
At 02:21 AM 3/31/2004, Johan Lindberg wrote:
>Thank you for the answer but I think that my situation is a little bit
>different. First of all I wonder about the answer that was given
>in https://stat.ethz.ch/pipermail/bioconductor/2003-December/003277.html
>He has got 30 individuals with 4-6 replicates of each. This would mean
>that 120 - 160 hybridizations have been done. The example targets file
>that is given looks something like this:
>
>Cy3 Cy5
>Patient1 Control
>Control Patient1
>Patient1 Control
>Patient2 Control
>Control Patient2
>...
>
>Here is where I get confused, because it looks as if the technical
>replicates are included in the targets file (on the same level as the
>biological replicates) and should therefore also be included in the
>contrast matrix that follows. But the contrast matrix given,
>cont.matrix <- matrix(1,30,1)
>is just a column of 30 ones (he had 30 patients in the study), which indicates
>that only the true biological replicates would be included in the B-stat
>analysis???
>Back to my experiment. My real problem I think is that I have no common
>reference between the different samples. In the example above he has got
>this "control" used in the hybridizations. But I have hybridized a biopsy
>before and then after treatment for each individual.
>
>Cy3 Cy5
>Patient1 before Patient1 after
>Patient1 after Patient1 before
>Patient2 before Patient2 after
>...
>
>But since the effect I am looking for is the effect of the treatment, not
>the between patients effect, would it be correct to use the same approach
>as the given example
>https://stat.ethz.ch/pipermail/bioconductor/2003-December/003277.html
>even though I have no common reference?
>
>Another question that was not answered is how to treat replicates
>on different levels, since I have 1-2 biopsies taken from each
>individual plus technical replicates of each. Is there a way of dealing
>with this kind of stuff in LIMMA? Should one just average over the lower
>levels of replicates and then put only the true biological replicates in
>the targets file/contrast matrix?
>
>Best regards
>
>/ Johan Lindberg
>
>At 10:32 2004-03-31 +1000, Gordon Smyth wrote:
>>At 11:51 PM 30/03/2004, Johan Lindberg wrote:
>>>Sorry, I forgot to have a subject on the mail I sent before.
>>>
>>>Hello everyone.
>>>I would really appreciate some comments/hints/help with a pretty long
>>>question.
>>
>>This question has been asked on the list before. See:
>>
>>https://stat.ethz.ch/pipermail/bioconductor/2003-December/003277.html
>>
>>The simplest treatment in limma is simply to treat your experiment as
>>having two factors, one factor having 10 levels indicating the patient
>>and one taking two levels, before or after. This treatment is analogous
>>to a paired t-test or to a two-way analysis of variance.
>>
>>An alternative treatment would be to treat the patients as random
>>effects. That would also be a correct treatment, and potentially a little
>>more powerful, but also much more difficult and I don't think you gain
>>very much.
>>
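The paired-test analogy above can be made concrete with a minimal sketch in base R. It rests on assumptions that are not in the original messages: the patient labels and M-values are invented toy numbers for a single gene, and `slides`, `orient`, `effect` and `per_patient` are hypothetical names. Each slide is oriented as after minus before and then reduced to one value per patient, so that a patient with more hybridizations carries no extra weight.

```r
# Hypothetical toy data for ONE gene: M = log2(Cy5/Cy3) on six slides.
# Patient labels and M-values are invented for illustration only.
slides <- data.frame(
  patient = c("p1", "p1", "p3", "p3", "p3", "p3"),
  cy3     = c("before", "after", "before", "after", "before", "after"),
  cy5     = c("after", "before", "after", "before", "after", "before"),
  M       = c(1.0, -0.8, 0.5, -0.7, 0.6, -0.4),
  stringsAsFactors = FALSE
)

# Orient every slide as "after - before": dye-swapped slides get a sign flip.
orient <- ifelse(slides$cy5 == "after", 1, -1)
slides$effect <- orient * slides$M

# Collapse to one value per patient (the biological unit).
per_patient <- tapply(slides$effect, slides$patient, mean)

# With all patients included, a one-sample t-test on these per-patient means
# plays the role of the paired test for this gene:
# t.test(per_patient)
```

This is only the single-gene arithmetic; it says nothing about how limma moderates the variances across genes.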
>>>I have an experiment consisting of 18 hybridizations. On the 30K cDNA
>>>arrays, knee joint biopsies (from different patients) taken before and
>>>after a certain treatment are hybridized. What I want to find out is the
>>>effect of the treatment, not the difference between the patients. The
>>>problem is how to deal with different levels of replicates and how to
>>>create a correct targets file, since I have no common reference.
>>>This is what the experimental set-up looks like.
>>>
>>>Patient  Hybridization  Cy3                        Cy5
>>>1        1A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         1B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>3        2A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         2B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>         3A             Biopsy 2 before treatment  Biopsy 2 after treatment
>>>         3B             Biopsy 2 after treatment   Biopsy 2 before treatment
>>>4        4A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         4B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>         5A             Biopsy 2 before treatment  Biopsy 2 after treatment
>>>         5B             Biopsy 2 after treatment   Biopsy 2 before treatment
>>>5        6A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         6B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>6        7A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         7B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>7        8A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         8B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>10       9A             Biopsy 1 before treatment  Biopsy 1 after treatment
>>>         9B             Biopsy 1 after treatment   Biopsy 1 before treatment
>>>
>>>As you can see, different patients have one or two biopsies taken from
>>>them. I realize it would be a mistake to include all of those in the
>>>targets file, because having more measurements for a certain patient
>>>would bias the ranking of the B-stat towards the patient with the
>>>most biopsies, right? Or?
>>>Since the differentially expressed genes in the patient with more
>>>biopsies will get smaller variance?
>>>
>>>My solution to the problem was to create an artificial M matrix
>>>twice as long as the original MA object. For the patients with two
>>>biopsies I averaged over the technical replicates (dye-swaps) and put
>>>the values from biopsy one and then the values from biopsy two in the
>>>matrix. For patients with just a technical replicate I put the values
>>>from hybridization 1A and then hybridization 1B into the matrix.
>>>
>>>The M-values of that matrix object would look something like:
>>>
>>>                  patient 1         patient 3                           ....
>>>Rows 1-30000      Hybridization 1A  Average of hybridization 2A and 2B  ....
>>>Rows 30001-60000  Hybridization 1B  Average of hybridization 3A and 3B  ....
>>>
>>>After this I plan to use dupcor on the new matrix of M-values, as if I
>>>would have a slide with replicate spots on it.
>>>
>>>So far so good, or? Is this a good way of treating replicates on
>>>different levels, or does anyone else have a better idea of how to do this?
>>>Comments please.....
>>>
>>>
>>>And now, how to create a correct targets file since I have no common
>>>reference.
>>>I guess it would look something like this:
>>>
>>>SlideNumber Name FileName Cy3 Cy5
>>>1 pat1_p test1.gpr Before_p1 After_p1
>>>2 pat3_p test2.gpr Before_p2 After_p2
>>>3 pat4_p test3.gpr Before_p3 After_p3
>>>4 pat6_p test4.gpr Before_p4 After_p4
>>>5 pat7_p test5.gpr Before_p5 After_p5
>>>6 pat10_p test6.gpr Before_p6 After_p6
>>>
>>>But when I want to make my contrast matrix I am lost since I do not have
>>>anything to write as ref.
>>>design <- modelMatrix(targets, ref="????????")
>>>
>>>If I change the targets file to
>>>
>>>SlideNumber Name FileName Cy3 Cy5
>>>1 pat1_p test1.gpr Before_p After_p
>>>2 pat3_p test2.gpr Before_p After_p
>>>3 pat4_p test3.gpr Before_p After_p
>>>4 pat6_p test4.gpr Before_p After_p
>>>5 pat7_p test5.gpr Before_p After_p
>>>6 pat10_p test6.gpr Before_p After_p
>>>
>>>wouldn't that be the same as treating this as a common reference design
>>>when it is not? And wouldn't that affect the variance of the experiment?
>>>How do I do this in a correct way?
>>>I looked at the zebrafish example in the LIMMA user guide, but isn't that
>>>wrong as well? Technical and biological replicates are treated
>>>the same way in the targets file of the zebrafish example.
>>
>>Dye-swap pairs are not necessarily technical replicates.
>>
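On the targets-file question above, one possible reading is that a direct before/after design needs no common reference: one condition can serve as the baseline, and dye-swapped slides simply get the opposite sign. The base R sketch below illustrates that idea under stated assumptions: the file names, the `Patient` column, and the column name `AfterVsBefore` are hypothetical and do not come from the original post.

```r
# Hypothetical targets table: file names and the Patient column are illustrative.
targets <- data.frame(
  FileName = c("slide1A.gpr", "slide1B.gpr", "slide2A.gpr", "slide2B.gpr"),
  Patient  = c("pat1", "pat1", "pat3", "pat3"),
  Cy3      = c("Before", "After", "Before", "After"),
  Cy5      = c("After", "Before", "After", "Before"),
  stringsAsFactors = FALSE
)

# Every slide must compare the two conditions directly.
stopifnot(all(targets$Cy3 != targets$Cy5))

# Single coefficient "After - Before": +1 when After is in Cy5, -1 for a dye swap.
design <- matrix(ifelse(targets$Cy5 == "After", 1, -1), ncol = 1,
                 dimnames = list(targets$FileName, "AfterVsBefore"))

# Assumption: with limma loaded, modelMatrix(targets, ref = "Before") should give
# an equivalent column, and the patient grouping could be supplied as
# block = targets$Patient to duplicateCorrelation() and lmFit().
```

Keeping the patient in its own column, rather than encoding it in the Cy3/Cy5 labels, leaves the choice open between averaging within patient and modelling the within-patient correlation.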
>>>I realize that many of these questions should have been considered
>>>before conducting the lab part but unfortunately they were not. So I
>>>will not be surprised if someone sends me the same quote as I got
>>>yesterday from a friend:
>>>
>>>"To consult a statistician after an experiment is finished is often
>>>merely to ask him to conduct a post mortem examination. He can perhaps
>>>say what the experiment died of."
>>>- R.A. Fisher, Presidential Address to the First Indian Statistical
>>>Congress, 1938
>>>
>>>Best regards
>>
>>Gordon
>>
>>>/Johan Lindberg
>>>
>>>_______________________________________________
>>>Bioconductor mailing list
>>>Bioconductor at stat.math.ethz.ch
>>>https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
Naomi S. Altman 814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics 814-863-7114 (fax)
Penn State University 814-865-1348 (Statistics)
University Park, PA 16802-2111