[BioC] design in mixed ref and dye-swap experiment

Gordon K Smyth smyth at wehi.EDU.AU
Tue Mar 29 15:41:21 CEST 2005


> Date: Tue, 29 Mar 2005 11:22:22 +0200
> From: Silvano Piazza <piazza at lncib.it>
> Subject: Re: [BioC] design in mixed ref and dye-swap experiment
> To: Naomi Altman <naomi at stat.psu.edu>
> Cc: bioconductor at stat.math.ethz.ch
> Message-ID: <107fbaba4c9ce4b7c1ecc6051aac92ff at lncib.it>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
>
> Dear Naomi
>
> First of all, thank you very much for your answer.
>>
>>
>> As I have indicated elsewhere on this list, the "p-values" reported by
>> topTable are actually "q-values", i.e. p-values adjusted for multiple
>> testing.  Hence, if you have fewer "significant" genes than expected by
>> chance under the null hypothesis, the reported p-value is 1.0.
>>
>> e.g. Suppose you have 1000 genes.  Then if the number of genes
>> significant at level alpha is less than 1000*alpha for each alpha, your
>> topTable p-value will be 1.0 (i.e. all of the significant genes are
>> estimated to be false positives).
>>
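
The behaviour described above can be reproduced with base R's p.adjust(). This is only a sketch: it assumes that topTable(adjust.method="fdr") applies the same Benjamini-Hochberg adjustment, and the raw p-values are made up for illustration (the names p_null and p_signal are hypothetical).

```r
n <- 1000

# Hypothetical raw p-values, each slightly larger than the uniform quantile i/n,
# so fewer than n*alpha genes are significant at every level alpha.
p_null <- pmin(1, (seq_len(n) + 5) / n)
adj_null <- p.adjust(p_null, method = "BH")   # "fdr" is an alias for "BH"
range(adj_null)

# Same p-values, except that ten genes now have very small raw p-values.
p_signal <- p_null
p_signal[1:10] <- 1e-6
adj_signal <- p.adjust(p_signal, method = "BH")
head(sort(adj_signal), 12)
```

In the first case every adjusted value is capped at 1, however large the top t-statistics look. In the second case only the ten genes with genuinely small raw p-values fall below 1.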
>
> that's very clear now, thanks again.
>
>
>> Your experiment design is needlessly complex and also wasteful.  If
>> you have only 2 conditions, you should do one of the following:
>>
>> 1) hybridize both conditions to every array (in dye-swap pairs) with no
>>    technical replicates. (This is most efficient.)
>> 2) use a reference design with the reference sample always in the same
>>    channel. (This is simplest, but has 1/2 the efficiency.)
>>
>> Mixing these 2 designs, especially with a mix of biological and
>> technical replicates needlessly complicates your analysis.  It also
>> requires a mixed model ANOVA to take into account the different levels
>> of replication.
>>
>
> Yes, I know, I know...
> but unfortunately I could not decide, in this case, how the
> experiments were done. My situation is that these experiments are what
> is available at the moment and I have to find the DE genes. For this
> reason I was wondering whether there is a correct method for working
> with a "mixed" (exp vs ref and dye-swap) design, that is, one that
> extracts as much information as possible.

Your analysis is already correct, given the arrays that you have.
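
As a rough illustration of why fitting all six arrays in one linear model makes use of both kinds of array, here is a sketch in base R (no limma needed). It rebuilds a design from the targets table as posted and computes the variance of the estimated A-B log-ratio in units of sigma^2. It assumes independent arrays with equal variance, so it ignores the distinction between biological and technical replicates raised earlier in the thread. The helper contrast_var() is a hypothetical name, not a limma function.

```r
# Targets as posted; the log-ratio is M = log2(Cy5 / Cy3).
targets <- data.frame(
  Cy3 = c("ref", "ref", "ref", "ref", "A", "B"),
  Cy5 = c("A", "B", "B", "B", "B", "A"),
  row.names = paste0("array", 1:6),
  stringsAsFactors = FALSE
)

# +1 if the sample is in Cy5, -1 if it is in Cy3, 0 otherwise; "ref" is the baseline.
design <- sapply(c("A", "B"), function(s) {
  as.numeric(targets$Cy5 == s) - as.numeric(targets$Cy3 == s)
})
rownames(design) <- rownames(targets)

# Variance of a contrast estimate in units of sigma^2: c' (X'X)^-1 c
contrast_var <- function(X, contrast) {
  drop(crossprod(contrast, solve(crossprod(X), contrast)))
}

a_minus_b <- c(A = 1, B = -1)
v_all  <- contrast_var(design, a_minus_b)         # all six arrays together
v_ref  <- contrast_var(design[1:4, ], a_minus_b)  # reference arrays only
v_swap <- 1 / 2                                   # two direct A-vs-B arrays: sigma^2 / 2

round(c(all = v_all, ref_only = v_ref, dye_swap_only = v_swap), 3)
```

Under these simplifying assumptions the combined fit gives 4/11 (about 0.36), against 0.5 for the dye-swap pair alone and about 1.33 for the reference arrays alone, so the single model does draw on every array.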

If you are expecting to see differential expression here but aren't, you might revisit the
pre-processing and QC steps for this data.  Good pre-processing can make a spectacular difference
to differential expression results.

Gordon

> Thank you
>
> Silvano
>
>> --Naomi
>>
>> At 10:28 AM 3/25/2005, Silvano Piazza wrote:
>>> Hello everyone,
>>> the experiment that I have to consider is very simple:
>>>
>>> I want to find significant genes between 2 conditions, A and B, but I
>>> have only a few arrays, so I have to combine reference arrays (ref
>>> versus A or B) with dye-swap arrays (A versus B and B
>>> versus A).
>>>
>>> So the targets file is
>>> SlideNumber     Cy3     Cy5
>>> array1          ref     A
>>> array2          ref     B
>>> array3          ref     B
>>> array4          ref     B
>>> array5          A       B
>>> array6          B       A
>>>
>>> Of course array5 and array6 are the dye-swap pair.
>>>
>>> To set up the analysis, I followed the limma User's Guide (by Gordon
>>> Smyth), Chapter 14.5, Weaver Mutant Data.
>>>
>>> so
>>> >design <- modelMatrix(targets, ref = "ref")
>>>         Found unique target names:
>>>         B A ref
>>> >design
>>>         A       B
>>>         array1    0    1
>>>         array2    1    0
>>>         array3    1    0
>>>         array4    1    0
>>>         array5   -1    1
>>>         array6    1   -1
>>> >fit <- lmFit(MA,design)
>>> >cont.matrix <- makeContrasts(A.B=A-B,levels=design,weight=MA$weights)

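Why are you using 'weight='?  That is not an argument for makeContrasts(), which takes only the
contrast expressions and 'levels'.  Spot weights are used by lmFit(), not by makeContrasts().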
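
A minimal sketch of the same pipeline with the weights moved to the fitting step is given below. It runs only if limma is installed. The matrices M and w are simulated stand-ins for MA$M and MA$weights (hypothetical data), and the design is written out by hand from the targets table as posted. As far as I read the lmFit() help page, weights stored in an MAList are picked up automatically, so passing weights= explicitly is only needed for a plain matrix like this one.

```r
if (requireNamespace("limma", quietly = TRUE)) {
  library(limma)

  # Design derived from the targets table as posted ("ref" is the baseline).
  design <- cbind(A = c(1, 0, 0, 0, -1, 1),
                  B = c(0, 1, 1, 1, 1, -1))
  rownames(design) <- paste0("array", 1:6)

  # Simulated stand-ins for MA$M and MA$weights (hypothetical data).
  set.seed(1)
  M <- matrix(rnorm(100 * 6), nrow = 100,
              dimnames = list(NULL, rownames(design)))
  w <- matrix(1, nrow = 100, ncol = 6)

  fit <- lmFit(M, design, weights = w)   # weights belong to the fit
  cont.matrix <- makeContrasts(A.B = A - B, levels = design)
  fit2 <- eBayes(contrasts.fit(fit, cont.matrix))
  print(topTable(fit2, adjust.method = "BH"))
}
```

Here makeContrasts() receives only the contrast expression and levels=, so no stray argument is interpreted as an extra contrast.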

Gordon

>>> >fit2 <- contrasts.fit(fit, cont.matrix)
>>> > fit2 <- eBayes(fit2)
>>> >topTable(fit2,adjust.method="fdr")
>>>         ....[output omitted]...
>>>                  M         A         t   P.Value         B
>>>      209  3.801460  6.538782  8.315672 1.0000000 -4.209468
>>>     2328  1.184194  7.343676  6.717978 1.0000000 -4.228492
>>>     7877  1.904360  6.504330  6.114349 1.0000000 -4.239110
>>>    27187 -4.075949  3.771499 -5.783558 1.0000000 -4.246099
>>>     3709  3.434542  3.467492  5.639159 1.0000000 -4.249459
>>>     7561  2.002753  5.159913  5.616194 1.0000000 -4.250013
>>>     7130  2.580527  3.863867  5.600047 1.0000000 -4.250405
>>>    19983 -2.117624  6.836539 -5.567882 1.0000000 -4.251194
>>> So all genes have a P.Value equal to 1!
>>> In previous posts I read that this happens when you have to account
>>> for multiple testing, which I don't know how to handle, but anyway:
>>>
>>> 1) Am I doing something wrong in the design?
>>> 2) Am I doing something wrong in the subsequent evaluation steps?
>>> Any ideas?
>>>
>>>
>>>
>>> Thank you to all
>>>
>>> Silvano
>>>
>>> Dr.Silvano Piazza
>>> LNCIB,
>>> Area Science Park,
>>> Padriciano 99
>>> Trieste, ITALY
>>> Tel. +39040398992
>>> Fax +39040398990
>>>
>>
>> Naomi S. Altman                                814-865-3791 (voice)
>> Associate Professor
>> Bioinformatics Consulting Center
>> Dept. of Statistics                              814-863-7114 (fax)
>> Penn State University                         814-865-1348 (Statistics)
>> University Park, PA 16802-2111
>>
>>
>>


