[BioC] design in mixed ref and dye-swap experiment
Gordon K Smyth
smyth at wehi.EDU.AU
Tue Mar 29 15:41:21 CEST 2005
> Date: Tue, 29 Mar 2005 11:22:22 +0200
> From: Silvano Piazza <piazza at lncib.it>
> Subject: Re: [BioC] design in mixed ref and dye-swap experiment
> To: Naomi Altman <naomi at stat.psu.edu>
> Cc: bioconductor at stat.math.ethz.ch
> Message-ID: <107fbaba4c9ce4b7c1ecc6051aac92ff at lncib.it>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
>
> Dear Naomi
>
> First of all, thank you very much for your answer.
>>
>>
>> As I have indicated elsewhere on this list, the "p-values" reported by
>> TopTable are actually "q-values". Hence, if you have fewer
>> "significant" genes than expected by chance under the null hypothesis,
>> the reported p-value is 1.0.
>>
>> e.g. Suppose you have 1000 genes. Then if the number of genes
>> significant at alpha% is less than 1000*alpha for each alpha, your
>> TopTable p-value will be 1.0 (i.e. all of the significant genes are
>> estimated to be false positives).
>>
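The effect described above can be reproduced in base R with p.adjust(). This is only a minimal sketch: the p-values are made up for illustration and are not from this experiment, and it assumes a Benjamini-Hochberg style adjustment such as adjust.method="fdr".

```r
n <- 1000

# Hypothetical p-values that are never smaller than chance would predict:
# the i-th smallest p-value is at least i/n.
p_null <- pmin(1, 1.2 * seq_len(n) / n)
adj_null <- p.adjust(p_null, method = "BH")
all(adj_null == 1)          # every adjusted p-value is 1

# Same list, but with ten genes given very small (made-up) p-values.
p_signal <- p_null
p_signal[1:10] <- 1e-6
adj_signal <- p.adjust(p_signal, method = "BH")
sum(adj_signal < 0.05)      # only the ten small p-values survive adjustment
```

In the first case no gene stands out from what chance alone would produce, so every adjusted value is capped at 1, which matches the topTable output quoted later in the thread.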
>
> that's very clear now, thanks again.
>
>
>> Your experiment design is needlessly complex and also wasteful. If
>> you have only 2 conditions, you should do one of the following:
>>
>> hybridize both conditions to every array (in dye-swap pairs) with no
>> technical replicates (This is most efficient)
>> use a reference design with the reference sample always in the same
>> channel. (This is simplest, but has 1/2 the efficiency.)
>>
>> Mixing these 2 designs, especially with a mix of biological and
>> technical replicates needlessly complicates your analysis. It also
>> requires a mixed model ANOVA to take into account the different levels
>> of replication.
>>
>
> Yes, I know, I know...
> Unfortunately I did not get to decide how these experiments were
> designed. My situation is that these arrays are what is available at
> the moment and I have to find the DE genes. For this reason I was
> wondering whether there is a correct method for working with a "mixed"
> (exp vs ref and dye-swap) design, that is, to extract as much
> information as possible.
Your analysis is already correct, given the arrays that you have.
If you are expecting to see differential expression here but aren't seeing it, you might revisit the
pre-processing and QC steps for these data. Good pre-processing can make a spectacular difference
to differential expression results.
Gordon
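To see how the reference arrays and the dye-swap pair both contribute to the A versus B comparison, the design can be written out by hand from the targets table quoted further down. This is a rough base R sketch, not limma's own computation. It assumes each log-ratio is log2(Cy5/Cy3), that "ref" is the reference, and that the six arrays can be treated as independent, which ignores any technical replication.

```r
# Design matrix written by hand from the quoted targets table,
# assuming each log-ratio is log2(Cy5/Cy3) and "ref" is the reference.
design <- rbind(
  array1 = c(A =  1, B =  0),  # Cy3 ref, Cy5 A
  array2 = c(A =  0, B =  1),  # Cy3 ref, Cy5 B
  array3 = c(A =  0, B =  1),
  array4 = c(A =  0, B =  1),
  array5 = c(A = -1, B =  1),  # Cy3 A,   Cy5 B
  array6 = c(A =  1, B = -1)   # Cy3 B,   Cy5 A
)
contrast <- c(A = 1, B = -1)   # the A - B comparison

# Unscaled variance of the estimated contrast: c' (X'X)^{-1} c
xtx_inv <- solve(crossprod(design))
var_AB <- drop(t(contrast) %*% xtx_inv %*% contrast)
var_AB   # multiply by a gene's residual variance to get Var(A - B)
```

Under these assumptions the unscaled variance works out to 4/11, so all six arrays add information about A - B even though only two of them compare A and B directly.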
> Thank you
>
> Silvano
>
>> --Naomi
>>
>> At 10:28 AM 3/25/2005, Silvano Piazza wrote:
>>> Hello everyone,
>>> the experiment that I have to consider is very simple:
>>>
>>> I want to find significant genes between 2 conditions A and B, but I
>>> have only a few arrays, so I have to combine both reference arrays
>>> (ref versus A or B) and dye-swap arrays (A versus B and B versus A).
>>>
>>> so targets is
>>> SlideNumber Cy3 Cy5
>>> array1 ref A
>>> array2 ref B
>>> array3 ref B
>>> array4 ref B
>>> array5 A B
>>> array6 B A
>>>
>>> of course array5 and array6 are the dye-swap.
>>>
>>> So to design the procedure, I follow the LIMMA user guide (by Gordon
>>> Smyth), Chapter 14.5 Weaver Mutant Data.
>>>
>>> so
>>> >design <- modelMatrix(targets, ref = "ref")
>>> Found unique target names:
>>> B A ref
>>> >design
>>> A B
>>> array1 0 1
>>> array2 1 0
>>> array3 1 0
>>> array4 1 0
>>> array5 -1 1
>>> array6 1 -1
>>> >fit <- lmFit(MA,design)
>>> >cont.matrix <- makeContrasts(A.B=A-B,levels=design,weight=MA$weights)
Why are you using 'weights='? That is not an argument for makeContrasts().
Gordon
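For reference, a minimal sketch of the same pipeline with the spot weights passed to lmFit() instead of makeContrasts(). The log-ratios and weights here are simulated stand-ins for MA$M and MA$weights, so the numbers mean nothing. When an MAList is passed to lmFit(), any weights stored in it are normally picked up automatically, so the explicit weights= argument is only for illustration.

```r
library(limma)

set.seed(1)
n_genes <- 200

# Targets as in the quoted table
targets <- data.frame(
  Cy3 = c("ref", "ref", "ref", "ref", "A", "B"),
  Cy5 = c("A", "B", "B", "B", "B", "A"),
  row.names = paste0("array", 1:6)
)

# Simulated log-ratios and spot weights (placeholders for MA$M, MA$weights)
M <- matrix(rnorm(n_genes * 6), n_genes, 6)
w <- matrix(1, n_genes, 6)

design <- modelMatrix(targets, ref = "ref")
fit <- lmFit(M, design, weights = w)                        # weights go here
cont.matrix <- makeContrasts(A.B = A - B, levels = design)  # no weights argument
fit2 <- eBayes(contrasts.fit(fit, cont.matrix))
tt <- topTable(fit2, adjust.method = "BH")
tt
```

Recent versions of limma report the raw P.Value and the adjusted adj.P.Val in separate columns of the topTable output.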
>>> >fit2 <- contrasts.fit(fit, cont.matrix)
>>> > fit2 <- eBayes(fit2)
>>> >topTable(fit2,adjust.method="fdr")
>>> ....omissis...
>>>               M        A         t   P.Value         B
>>> 209    3.801460 6.538782  8.315672 1.0000000 -4.209468
>>> 2328   1.184194 7.343676  6.717978 1.0000000 -4.228492
>>> 7877   1.904360 6.504330  6.114349 1.0000000 -4.239110
>>> 27187 -4.075949 3.771499 -5.783558 1.0000000 -4.246099
>>> 3709   3.434542 3.467492  5.639159 1.0000000 -4.249459
>>> 7561   2.002753 5.159913  5.616194 1.0000000 -4.250013
>>> 7130   2.580527 3.863867  5.600047 1.0000000 -4.250405
>>> 19983 -2.117624 6.836539 -5.567882 1.0000000 -4.251194
>>> So all genes have P.Value equal to 1!
>>> In previous posts I read that this happens when you have to consider
>>> multivariate test, which I don't know how to manage, but anyway:
>>>
>>> 1) Am I doing something wrong in the design?
>>> 2) Am I doing something wrong in the subsequent evaluation steps?
>>> Any ideas
>>>
>>>
>>>
>>> Thank you to all
>>>
>>> Silvano
>>>
>>> Dr.Silvano Piazza
>>> LNCIB,
>>> Area Science Park,
>>> Padriciano 99
>>> Trieste, ITALY
>>> Tel. +39040398992
>>> Fax +39040398990
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>
>> Naomi S. Altman 814-865-3791 (voice)
>> Associate Professor
>> Bioinformatics Consulting Center
>> Dept. of Statistics 814-863-7114 (fax)
>> Penn State University 814-865-1348 (Statistics)
>> University Park, PA 16802-2111
>>
>>
>>
> Dr.Silvano Piazza
> LNCIB,
> Area Science Park,
> Padriciano 99
> Trieste, ITALY
> Tel. +39040398992
> Fax +39040398990