[BioC] limma all adj.P.Val the same
James W. MacDonald
jmacdon at uw.edu
Mon May 21 22:54:43 CEST 2012
Hi Sam,
Please don't take things off-list. We like to think of the list archives
as a searchable resource, so any off-list conversations obviate that use.
On 5/21/2012 1:40 PM, Sam McInturf wrote:
> Jim,
> here is my target file and three topTables.
>> target
> FileName
> Timing Sample Rep
> 1 1 Wheat Genome 4-26-12_(wheat).CEL ANTH SPIKE A
> 2 10 Wheat Genome Array 5-3-12_(wheat).CEL ANTH PEDUNCLE A
> 3 11 Wheat Genome Array 5-3-12_(wheat).CEL ANTH STEM A
> 4 12 Wheat Genome Array 5-3-12_(wheat).CEL ANTH FLAGLEAF A
> 5 13 Wheat Genome Array 5-3-12_(wheat).CEL ANTH NODES A
> 6 14 Wheat Genome Array 5-3-12_(wheat).CEL DAA14 SPIKE A
> 7 15 Wheat Genome Array 5-3-12_(wheat).CEL DAA14 PEDUNCLE A
> 8 16 Wheat Genome Array 5-3-12_(wheat).CEL DAA14 STEM A
> 9 17 Wheat Genome Array 5-3-12_(wheat).CEL DAA14 FLAGLEAF A
> 10 18 Wheat Genome Array 5-3-12_(wheat).CEL DAA14 NODES A
> 11 19 Wheat Genome Array 5-8-12_(wheat).CEL ANTH SPIKE B
> 12 2 Wheat Genome 4-26-12_(wheat).CEL ANTH PEDUNCLE B
> 13 20 Wheat Genome Array 5-8-12_(wheat).CEL ANTH STEM B
> 14 21 Wheat Genome Array 5-8-12_(wheat).CEL ANTH FLAGLEAF B
> 15 22 Wheat Genome Array 5-8-12_(wheat).CEL ANTH NODES B
> 16 23 Wheat Genome Array 5-8-12_(wheat).CEL DAA14 SPIKE B
> 17 24 Wheat Genome Array 5-8-12_(wheat).CEL DAA14 PEDUNCLE B
> 18 25 Wheat Genome Array 5-8-12_(wheat).CEL DAA14 STEM B
> 19 26 Wheat Genome Array 5-8-12_(wheat).CEL DAA14 FLAGLEAF B
> 20 27 Wheat Genome Array 5-8-12_(wheat).CEL DAA14 NODES B
> 21 28 Wheat Genome Array 5-8-12_(wheat).CEL ANTH SPIKE C
> 22 29 Wheat Genome Array 5-10-12_(wheat).CEL ANTH PEDUNCLE C
> 23 3 Wheat Genome 4-26-12_(wheat).CEL ANTH STEM C
> 24 30 Wheat Genome Array 5-8-12_(wheat).CEL ANTH FLAGLEAF C
> 25 4 Wheat Genome 4-26-12_(wheat).CEL ANTH NODES C
> 26 5 Wheat Genome 4-26-12_(wheat).CEL DAA14 SPIKE C
> 27 6 Wheat Genome 4-26-12_(wheat).CEL DAA14 PEDUNCLE C
> 28 7 Wheat Genome 4-26-12_(wheat).CEL DAA14 STEM C
> 29 8 Wheat Genome Array 5-3-12_(wheat).CEL DAA14 FLAGLEAF C
> 30 9 Wheat Genome Array 5-3-12_(wheat).CEL DAA14 NODES C
>
>
>> topTable(fit, coef = "Spike")
> ID logFC AveExpr
> t P.Value adj.P.Val B
> 54657 TaAffx.78175.1.S1_at 0.4215108 3.216996 4.514525
> 0.0001717272 0.9999869 -3.269251
> 40437 TaAffx.13113.3.S1_at 0.4911042 3.912458 4.156361
> 0.0004119435 0.9999869 -3.402249
> 55069 TaAffx.79041.1.S1_at 0.3726910 2.666774 4.063081
> 0.0005172708 0.9999869 -3.438121
> 39227 TaAffx.12387.2.S1_at -0.7312518 3.934821 -3.968346
> 0.0006516554 0.9999869 -3.475043
> 33239 TaAffx.107821.1.S1_at 0.3374931 3.718697 3.957133
> 0.0006696985 0.9999869 -3.479445
> 53443 TaAffx.66663.2.S1_at 0.5686773 4.155875 3.925068
> 0.0007240767 0.9999869 -3.492069
> 43703 TaAffx.28502.1.S1_at 0.3998569 4.607502 3.915422
> 0.0007412758 0.9999869 -3.495877
> 36163 TaAffx.112860.1.S1_at 0.3813519 3.401501 3.899640
> 0.0007702914 0.9999869 -3.502118
> 35389 TaAffx.111828.1.S1_at 0.5061669 4.090638 3.799882
> 0.0009815673 0.9999869 -3.541854
> 33890 TaAffx.108878.1.S1_x_at 0.3026730 3.772136 3.779824
> 0.0010305082 0.9999869 -3.549902
>
Ugh. Stupid line wraps.
So, three things. First, with only three replications you won't be able
to reliably detect very small changes (and the fold changes you are
seeing here are very small, barely even 1.5-fold on the natural scale).
Second, the unadjusted p-values are pretty big, so it isn't surprising
that you have such large adjusted p-values.
Third, the average expression levels you are showing are really small. I
have no experience with the Wheat array, but for the mammalian versions
of the 3'-biased arrays I am used to seeing much larger expression
values. This may be expected, but if I were you I would be doing some
exploratory data analysis (image plots, density plots, etc, maybe use
arrayQualityMetrics package).
You could also do a PCA plot to see if your samples are grouping as you
expect. I often fit array weights in limma and append them to the PCA
plot to see if weighting might help (see plotPCA, c.f. addtext argument
in affycoretools).
Best,
Jim
>> topTable(fit, coef = "ABSURD")
> ID logFC AveExpr
> t P.Value adj.P.Val B
> 32597 TaAffx.106447.1.S1_at -0.7516767 3.714621 -4.451899 0.0002001067
> 0.9979636 -3.291944
> 1598 Ta.10910.1.A1_at -0.4429634 3.524978 -4.417483 0.0002176579
> 0.9979636 -3.304519
> 15208 Ta.25816.1.S1_at -0.4721049 6.752687 -4.246018 0.0003309222
> 0.9979636 -3.368238
> 57638 TaAffx.8440.1.S1_at 0.4725505 3.198740 4.106344 0.0004654457
> 0.9979636 -3.421423
> 8110 Ta.1735.2.S1_a_at 0.5125682 3.656730 4.019142 0.0005757797
> 0.9979636 -3.455186
> 50758 TaAffx.57221.1.S1_at -0.4975976 3.789495 -4.005497 0.0005952539
> 0.9979636 -3.460506
> 58501 TaAffx.85842.1.S1_at -0.4646994 3.862172 -4.004599 0.0005965581
> 0.9979636 -3.460857
> 30327 Ta.9283.2.S1_at -0.3770003 3.214838 -3.980915 0.0006320034
> 0.9979636 -3.470117
> 9090 Ta.18507.1.S1_s_at -0.4499434 2.592315 -3.962537 0.0006609422
> 0.9979636 -3.477323
> 44052 TaAffx.29281.1.S1_at -0.5103668 3.360355 -3.933707 0.0007090089
> 0.9979636 -3.488663
>
>> topTable(fit, coef = "ABSURD", p.value = 0.05)
> data frame with 0 columns and 0 rows
>
> Best!
>
> Samuel
>
>
> On Mon, May 21, 2012 at 12:22 PM, James W. MacDonald<jmacdon at uw.edu> wrote:
>> Hi Sam,
>>
>>
>> On 5/21/2012 12:12 PM, Sam McInturf wrote:
>>> Hello Everyone,
>>> I am working on expression profiling of wheat, where five tissue
>>> types were analyzed at anthesis (flowering) and 14 days after anthesis
>>> (DAA14). Because of the amount of development that occurs during the
>>> first two weeks of grain filling, we expect to see differentially
>>> expressed genes (Code below). I approached this as a 5x2 factorial
>>> problem and took my lead from the limma guide (ch 8.7). When calling
>>> topTable for any contrast, the list of P.Values contains a range of
>>> values (small small, some large). While the adj.P.Val column contains
>>> the same (large) value for every gene reported ( up to n = 10,000).
>>>
>>> The adj.P.Vals that comeback for each contrasts range from 0.87 to
>>> 0.9999..
>>>
>>> Note the "ABSURD" contrast, when comparing different tissue types, we
>>> expect a lot of DE genes, but still no DE reported
>>>
>>> I subsetted my data to only analyze one tissue across the two times,
>>> and then doing the analysis as below, and if it is approached it as a
>>> 'comparison of two groups', and still was returned adjusted p values
>>> that were all the same, high value.
>>>
>>> Any idea why my q-values are all the same?
>>
>> Without any more information it is difficult to say. The stock answer would
>> be that you don't have enough power to detect any differences. I have seen
>> this on occasion, especially when there isn't much replication.
>>
>> You could help us by giving your targets file and an example of one of your
>> topTables.
>>
>>
>>
>>
>>> ################################################
>>> library(limma)
>>> library(affy)
>>> #read in our data
>>> target<- readTargets("--------/TargetCEL.txt")
>>> data<- ReadAffy(filenames = target$FileName)
>>>
>>> eset<- rma(data)
>>> #Building our model. Let us model this as a 5x2 factorial problem
>>> (five tissues, two times)
>>> factors<- paste(target$Timing, target$Sample, sep = ".")
>>> factors<- factor(factors, levels = c("ANTH.SPIKE", "ANTH.PEDUNCLE",
>>> "ANTH.STEM", "ANTH.FLAGLEAF", "ANTH.NODES", "DAA14.SPIKE",
>>> "DAA14.PEDUNCLE", "DAA14.STEM", "DAA14.FLAGLEAF", "DAA14.NODES"))
>>> design<- model.matrix(~0+factors)
>>> colnames(design)<- levels(factors)
>>> contrastMat.TissueWise<- makeContrasts(
>>> Spike = DAA14.SPIKE - ANTH.SPIKE,
>>> PEDUNCLE = DAA14.PEDUNCLE - ANTH.PEDUNCLE,
>>> STEM = DAA14.STEM - ANTH.STEM,
>>> FLAGLEAF = DAA14.FLAGLEAF - ANTH.FLAGLEAF,
>>> NODES = DAA14.NODES - ANTH.NODES,
>>> ABSURD = ANTH.STEM - DAA14.SPIKE,
>>> levels = design
>>> )
>>>
>>> fit3<- lmFit(eset, design)
>>> fit2<- contrasts.fit(fit3, contrastMat.TissueWise)
>>> fit<- eBayes(fit2)
>>>
>>> topTable(fit2, coef = ___, n = 100)
>>
>> I am assuming this is a typo (fit2 rather than fit)?
>>
>> Best,
>>
>> Jim
>>
>>
>>> --
>>> Sam McInturf
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
More information about the Bioconductor
mailing list