[BioC] (no subject)

Wed Mar 24 00:18:08 CET 2004

At 11:13 PM 23/03/2004, Eliana Lucchinetti Zaugg wrote:
>Hello!
>I have a question concerning data interpretation in LIMMA. First of all 
>I'll give you the commands:
>
>contrast.matrix<-makeContrasts(APC-ISCH,IPC-ISCH,levels=design)
>fitPRECOND2<-contrasts.fit(fit,contrast.matrix)
>fitPRECOND2<-eBayes(fitPRECOND2)
>clas3<-classifyTestsP(fitPRECOND2,p.value=0.01,method="fdr")
>vennDiagram(clas3,include="up")
>vennDiagram(clas3,include="down")
>
>So far, no problem! I've got nice Venn diagrams. After that my input was:
>
>topTable(fitPRECOND2,number=20,genelist=NULL,coef=1,adjust="fdr")
>
>which yielded:
>
>               M         t    P.Value         B
>2380 -1.0241489 -6.058845 0.01147946 5.0373154
>379  -0.8440311 -5.464009 0.02976549 3.6259203
>2382 -0.9576517 -5.261334 0.02976549 3.1387221
>8436 -0.6746846 -5.216491 0.02976549 3.0306287
>2383 -0.7343412 -5.064616 0.03638846 2.6639293
>7974 -1.0281853 -4.889561 0.04943332 2.2404780
>7925 -0.8519961 -4.833610 0.04952967 2.1050515
>2384 -0.7728411 -4.698536 0.05865728 1.7781412
>432   0.5732701  4.656867 0.05865728 1.6773365
>7680  0.5717780  4.640318 0.05865728 1.6373117
>7927 -0.7303740 -4.598890 0.05865728 1.5371476
>2569  0.5033168  4.579322 0.05865728 1.4898512
>1688  0.6904956  4.540636 0.06028835 1.3963860
>6603  0.5590289  4.506030 0.06162635 1.3128274
>5487 -0.7299432 -4.425216 0.07196002 1.1179075
>3247  0.6442329  4.377510 0.07698329 1.0030059
>5501 -0.8589314 -4.348990 0.07839810 0.9343814
>7434  0.7573137  4.293568 0.07860314 0.8011835
>7230  0.6309675  4.277457 0.07860314 0.7625059
>8527 -0.8768646 -4.252795 0.07860314 0.7033388
>
>It occurred to me that none of the p-palues in the ranking was below 
>p=0.01. Why were there significantly differentially regulated genes in my 
>Venn diagrams (constructed using p=0.01)?

This is a good question. The problem is that classifyTestsP() is doing two 
things which are probably unexpected. The first thing is really an error on 
my part. classifyTestsP() command is not extracting the degrees of freedom 
from the fitted model object, so it is computing p-values on the basis of 
normal distributions rather than t-distributions. You could give it the 
correct degrees of freedom as follows:

 > df <- fitPRECOND2$df.residual + fitPRECOND2$df.prior
 > clas3<-classifyTestsP(fitPRECOND2, df=df, p.value=0.01,method="fdr")

The second thing is more important. The three functions classifyTestsP(), 
classifyTestsT() and classifyTest() are all designed to control the false 
discovery rate *across contrasts* rather than across genes. This means that 
the 'fdr' adjustment is applied across rows of the p-values rather than 
down the columns. The idea is that you can do false discovery rate control 
across genes separately and simply input the resulting p-value that you 
want to cut on. You are supposed to change the p-value that you give to 
classifyTestsP() until you get the number of significant results that you 
want to have, e.g., to make one of the columns agree with topTable().

The above means that you can't get topTable() and classifyTestsP() to agree 
exactly. I agree that this is unintuitive and I'll give some thought to 
making a version of classifyTestsP() that can agree with topTable().

>Second question:
>Do the numbers in the first column correspont to the row number in the 
>eset (expression set, I used RMA to do the analysis)?

Yes. And to get gene names you could use

topTable(fitPRECOND2,number=20,genelist=geneNames(eset),coef=1,adjust="fdr")

Gordon

>THANKS IN ADVANCE!
>ELIANA
>
>===============================
>Eliana Lucchinetti Zaugg, PhD
>Institute of Pharmacology and Toxicology
>Section of Cardiovascular Research-Room 17 J 28
>University of Zurich
>Winterthurerstr. 190
>CH-8057 Zürich (Switzerland)
>
>Phone: +41-1-635 59 18
>Fax: +41-1-635 68 71
>e-mail: eliana.zaugg at pharma.unizh.ch