[BioC] diferent pvalues for a treatment when other contrasts areremoved from targets file

James MacDonald jmacdon at med.umich.edu
Tue Feb 17 16:31:41 CET 2009


Hi Agnieszka,

Neither approach is more correct than the other, but the second is very
likely to be more powerful.

In either case the numerator of the contrast will be the same. However,
the denominator of the contrast is based on the intra-group variability
for all of the groups in your linear model. So if you only include the
first treatment comparison in your model then the denominator will be
based on just those two groups. If you use all 20 of your slides, then
you increase the amount of data that can be used to compute the
intra-group variability. 

Since the accuracy of a variance estimate is dependent on the amount of
data (it gets more accurate as you increase the amount of data used),
you will probably have more power to detect differences if you include
all the data in your linear model. And more power is usually considered
to be better.

Best,

Jim


James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662
>>> Agnieszka Zmienko <akisiel at ibch.poznan.pl> 02/17/09 9:05 AM >>>
Hi!

I am "pure" biologist so I strictly follow the 
limma userguide commands in my analysis but I have a problem.
I have a set of microarrays with a common control 
channel. I have 4 biological replicates of each 
experiment. If I perform the simplest possible 
analysis for the first treatment 
(wt.NaCl/wt.pool) using only files 
043,046,048,077 as targets ("targetsA.txt"), I 
obtain different adjusted p.values (and different 
topTable) comparing to analysis with all twenty 
input files ("targetsB.txt") and extracting 
wt.NaCl contrast. Why? Which topTable is correct?


Agnieszka


Here are my targets files and the codes:

TargetsA.txt

SlideNumber     Name    FileName        Cy3     Cy5
43      wt.Na1  043.gpr wt.pool wt.NaCl
46      wt.Na2  046.gpr wt.pool wt.NaCl
48      wt.Na3  048.gpr wt.pool wt.NaCl
77      wt.Na4  077.gpr wt.pool wt.NaCl

TargetsB.txt

SlideNumber     Name    FileName        Cy3     Cy5
43      wt.Na1  043.gpr wt.pool wt.NaCl
46      wt.Na2  046.gpr wt.pool wt.NaCl
48      wt.Na3  048.gpr wt.pool wt.NaCl
77      wt.Na4  077.gpr wt.pool wt.NaCl
44      wt.Cd1  044.gpr wt.pool wt.CdCl2
47      wt.Cd2  047.gpr wt.pool wt.CdCl2
49      wt.Cd3  049.gpr wt.pool wt.CdCl2
78      wt.Cd4  078.gpr wt.pool wt.CdCl2
75      mu1.U1  075.gpr wt.pool mu1.U
70      mu1.U2  070.gpr wt.pool mu1.U
71      mu1.U3  071.gpr wt.pool mu1.U
80      mu1.U4  080.gpr wt.pool mu1.U
67      mu2.U1  067.gpr wt.pool mu2.U
74      mu2.U2  074.gpr wt.pool mu2.U
79      mu2.U3  079.gpr wt.pool mu2.U
68      mu2.U4  068.gpr wt.pool mu2.U
72      mu3.U1  072.gpr wt.pool mu3.U
73      mu3.U2  073.gpr wt.pool mu3.U
69      mu3.U3  069.gpr wt.pool mu3.U
88      mu3.U4  088.gpr wt.pool mu3.U

 > targetsA=readTargets("targetsA.txt")
 > 
RGA=read.maimages(targetsA,source="genepix",wt.fun=wtflags(weight=0, 
cutoff=-50))
Read 043.gpr
Read 046.gpr
Read 048.gpr
Read 077.gpr
 > spottypes=readSpotTypes("SpotTypes.txt")
 > RG$genes$Status=controlStatus(spottypes,RGA)
Matching patterns for: ID Name
Found 31200 cDNA
Found 48 no_change
Setting attributes: values Color
 > RGAb=backgroundCorrect(RGA, method="normexp", offset=50)
Green channel
Corrected array 1
Corrected array 2
Corrected array 3
Corrected array 4
Red channel
Corrected array 1
Corrected array 2
Corrected array 3
Corrected array 4
 > MAA=normalizeWithinArrays(RGAb)
 > fitA=lmFit(MAA)
Warning message:
In lmFit(MAA) :
   Some coefficients not estimable: coefficient interpretation may vary.
 > fitA=eBayes(fitA)
 > write.table(topTable(fitA,
+ 
number=100,adjust.method="BH",p.value=0.05,lfc=1,sort.by="P",resort.by="logFC"),"topTableA.txt")
 > write.fit(fitA, digits=6,F.adjust="BH",file="resultsA.txt")

---------------------------------
 > targetsB=readTargets("targetsB.txt")
 > 
RGB=read.maimages(targetsB,source="genepix",wt.fun=wtflags(weight=0,
cutoff=-50))
Read 043.gpr
Read 046.gpr
Read 048.gpr
ReRead 074.gpr
Read 079.gpr
Read 068.gpr
Read 072.gpr
Read 073.gpr
Read 069.gpr
Read 088.gpr
 > spottypes=readSpotTypes("SpotTypes.txt")
 > RG$genes$Status=controlStatus(spottypes,RGB)
Matching patterns for: ID Name
Found 31200 cDNA
Found 48 no_change
Setting attributes: values Color
 > RGBb=backgroundCorrect(RGB, method="normexp", offset=50)
Green channel
Corrected array 1
Corrected array 2
Corrected array 3
Corrected array 4
Corrected array 5
Corrected array 6
Corrected array 7
Corrected array 8
Corrected array 9
Corrected array 10
Corrected array 11
Corrected array 12
Corrected array 13
Corrected array 14
Corrected array 15
Corrected array 16
Corrected array 17
Corrected array 18
Corrected array 19
Corrected array 20
Red channel
Corrected array 1
Corrected array 2
Corrected array 3
Corrected array 4
Corrected array 5
Corrected array 6
Corrected array 7
Corrected array 8
Corrected array 9
Corrected array 10
Corrected array 11
Corrected array 12
Corrected array 13
Corrected array 14
Corrected array 15
Corrected array 16
Corrected array 17
Corrected array 18
Corrected array 19
Corrected array 20
 > MAB=normalizeWithinArrays(RGBb)
 > designB=modelMatrix(targetsB,ref="wt.pool")
Found unique target names:
  mu1.U mu2.U mu3.U wt.CdCl2 wt.NaCl wt.pool
 > designB
       mu1.U mu2.U mu3.U wt.CdCl2 wt.NaCl
  [1,]     0     0     0        0       1
  [2,]     0     0     0        0       1
  [3,]     0     0     0        0       1
  [4,]     0     0     0        0       1
  [5,]     0     0     0        1       0
  [6,]     0     0     0        1       0
  [7,]     0     0     0        1       0
  [8,]     0     0     0        1       0
  [9,]     1     0     0        0       0
[10,]     1     0     0        0       0
[11,]     1     0     0        0       0
[12,]     1     0     0        0       0
[13,]     0     1     0        0       0
[14,]     0     1     0        0       0
[15,]     0     1     0        0       0
[16,]     0     1     0        0       0
[17,]     0     0     1        0       0
[18,]     0     0     1        0       0
[19,]     0     0     1        0       0
[20,]     0     0     1        0       0
 > fitB=lmFit(MAB,designB)
Warning message:
In lmFit(MAB, designB) :
   Some coefficients not estimable: coefficient interpretation may vary.
 > contrast.matrix=makeContrasts(wt.NaCl,levels=designB)
 > contrast.matrix
           Contrasts
Levels     wt.NaCl
   mu1.U          0
   mu2.U          0
   mu3.U          0
   wt.CdCl2       0
   wt.NaCl        1
 > fitB=contrasts.fit(fitB,contrast.matrix)
 > fitB=eBayes(fitB)
 > write.table(topTable(fitB,
+ 
number=100,adjust.method="BH",p.value=0.05,lfc=1,sort.by="P",resort.by="logFC"),"topTableB.txt")
 > write.fit(fitB, digits=6,F.adjust="BH",file="resultsB.txt")



Dr Agnieszka ̄mieñko

Centrum Doskonalosci CENAT
Instytut Chemii Bioorganicznej Polskiej Akademii Nauk
Noskowskiego 12/14
61-704 Poznañ
tel. (61) 8528503 wew. 249
fax: (61) 8520532

Agnieszka Zmienko, Ph.D.

CENAT
Institute of Bioorganic Chemistry
Polish Academy of Sciences
Noskowskiego 12/14
61-704 Poznan, Poland
phone (0048) 61-8528503 ext. 249
fax: (0048) 61-8520532 

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives:
http://news.gmane.org/gmane.science.biology.informatics.conductor

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not
be used for urgent or sensitive issues



More information about the Bioconductor mailing list