[BioC] Understanding the columns in the limma results output
Jorge Miró
jorgma86 at gmail.com
Wed Aug 29 22:25:50 CEST 2012
Hi,
Thanks :) Then I will transform the PLIER data.
I have seen people that takes the ratio instead of the FC (
http://wiki.c2b2.columbia.edu/workbench/index.php/Fold_Change) and
that's why I wondered if I really should tranform my RMA data to their
linear form. I think I should but I am a little unsure.
Best regards
Jorge
On Wed, Aug 29, 2012 at 10:18 PM, James W. MacDonald <jmacdon at uw.edu> wrote:
> Hi Jorge,
>
>
> On 8/29/2012 4:15 PM, Jorge Miró wrote:
>>
>> Hi,
>>
>> My bad too :) What I really meant with fold change was the fold ratio
>> change between the groups. I though that was what was in topTable but now I
>> think I get it. The fold change that is printed in limma is the difference
>> between the means in log2 and not the ratio of the mean of the gene
>> expressions in their linear form.
>>
>> The data I'm analysing is from RMA and from PLIER so I have linear values
>> for the gene expressions summarized with PLIER. Should I log2-transform them
>> before passing them to the functions in limma?
>
>
> Yes.
>
>
>>
>> Also, if I now wanted to calculated the ratios between the RMA gene
>> expressions, do you recommend that I transform the expressions to linear
>> values before calculating the ratio or is it OK to take the ratio in log2?
>
>
> You don't take the ratio, you take the difference. Remember that log(x/y) =
> log(x) - log(y).
>
> Best,
>
> Jim
>
>
>>
>> Best regards
>> Jorge
>>
>>
>> On Wed, Aug 29, 2012 at 4:56 PM, James W. MacDonald <jmacdon at uw.edu
>> <mailto:jmacdon at uw.edu>> wrote:
>>
>> Hi Jorge,
>>
>> My bad. The coefficients for write.fit() are the fold changes, and
>> you do get a column for each. As an example, with fake data.
>>
>> > library(limma)
>> > mat <- matrix(rnorm(12e5), ncol = 12)
>> > design <- model.matrix(~0+factor(rep(1:4, each = 3)))
>> > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3)
>> > fit2 <- contrasts.fit(fit, contrast)
>> > fit2 <- eBayes(fit2)
>> > rslt <- decideTests(fit2)
>> > write.fit(fit2, rslt, "tmp.txt")
>> > dat <- read.table("tmp.txt", header=T)
>> > head(dat)
>> Coef.1 Coef.2 Coef.3 t.1 t.2 t.3 p.value.1 p.value.2
>> p.value.3 F
>> 1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53 0.30621 0.66699
>> 0.59932 0.35
>> 2 -1.200 -0.599 0.524 -1.47 -0.73 0.64 0.14148 0.46259
>> 0.52034 1.67
>> 3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85 0.21627 0.06176
>> 0.06492 1.54
>> 4 0.520 0.308 1.059 0.64 0.38 1.30 0.52390 0.70573
>> 0.19449 0.60
>> 5 -0.848 0.269 -0.611 -1.04 0.33 -0.75 0.29880 0.74132
>> 0.45410 0.81
>> 6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58 0.90847 0.76396
>> 0.56088 0.13
>> F.p.value Res.1 Res.2 Res.3
>> 1 0.78697 0 0 0
>> 2 0.17130 0 0 0
>> 3 0.20359 0 0 0
>> 4 0.61671 0 0 0
>> 5 0.48698 0 0 0
>> 6 0.94300 0 0 0
>>
>> > nrow(mat)
>> [1] 100000
>> > nrow(dat)
>> [1] 100000
>>
>> An alternative you might consider is the writeFit() function from
>> my affycoretools package, which outputs the data in a slightly
>> different format.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>
>> On 8/29/2012 2:55 AM, Jorge Miró wrote:
>>
>> Hi James,
>>
>> thanks for the explanation. I do not really understand the
>> columns yet. Shouldn't the FC be printed for every comparison
>> as is done for the Coef-columns? I just get one A-column.
>> Is there any way of printing the results to a file with the
>> same columns as in topTable()?
>> What I'm really want to have in the output is a p-value
>> column, an FC column and an adjusted p-value column.
>>
>> Best regards
>>
>>
>> On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald
>> <jmacdon at uw.edu <mailto:jmacdon at uw.edu> <mailto:jmacdon at uw.edu
>>
>> <mailto:jmacdon at uw.edu>>> wrote:
>>
>> Hi Jorge,
>>
>>
>> On 8/28/2012 11:20 AM, Jorge Miró wrote:
>>
>> Hi,
>>
>> I have run the commands below to get an analysis of
>> differential
>> expressions in my Affymetrix arrays
>>
>> #Prepare the design and contrast matrices for my
>> comparisons
>> of the three
>> groups in a loop-manner.
>>
>> design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3)))
>> colnames(design)<- c('GroupA', 'GroupB', 'GroupC')
>> contrast.matrix<- makeContrasts(GroupB-GroupA,
>> GroupC-GroupA,
>>
>> GroupC-GroupB, levels=design)
>>
>> #Check design and contrast matrices
>>
>> design
>>
>> GroupA GroupB GroupC
>> 1 1 0 0
>> 2 1 0 0
>> 3 1 0 0
>> 4 0 1 0
>> 5 0 1 0
>> 6 0 1 0
>> 7 0 0 1
>> 8 0 0 1
>> 9 0 0 1
>> attr(,"assign")
>> [1] 1 1 1
>> attr(,"contrasts")
>> attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))`
>> [1] "contr.treatment"
>>
>> contrast.matrix
>>
>> Contrasts
>> Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB
>> GroupA -1 -1 0
>> GroupB 1 0 -1
>> GroupC 0 1 1
>>
>> #Fitting the eset to to the design and contrast
>>
>> fit<- lmFit(exprs, design)
>> fit2<- contrasts.fit(fit, contrast.matrix)
>>
>> #Computing the statistics
>>
>> fit2<- eBayes(fit2)
>>
>>
>> Then I check the results with topTable and get the
>> following
>> columns in the
>> output
>>
>> topTable(fit2)
>>
>> GroupB...GroupA GroupC...GroupA GroupC...GroupB
>> AveExpr F
>> P.Value adj.P.Val
>> 25031 2.3602203 2.4273830 0.06716267
>> 5.021412
>> 29.06509
>> 7.844834e-05 0.9587773
>> 12902 -0.4572897 -0.5680943 -0.11080467
>> 7.516681
>> 25.41608
>> 1.365021e-04 0.9587773
>> 7158 -0.4478660 -0.4296077 0.01825833
>> 7.057833
>> 23.48871
>> 1.880100e-04 0.9587773
>> 18358 -0.1002647 0.3304903 0.43075500
>> 7.352807
>> 22.78417
>> 2.125096e-04 0.9587773
>> 28768 -0.7695883 -1.3837750 -0.61418667
>> 3.983044
>> 22.47514
>> 2.244612e-04 0.9587773
>> 28820 -0.1708800 -0.9939680 -0.82308800
>> 5.470826
>> 18.25071
>> 5.081473e-04 0.9587773
>> 15238 -0.4850297 -0.4658157 0.01921400
>> 7.071662
>> 17.15191
>> 6.440979e-04 0.9587773
>> 24681 -0.3759717 -0.3486450 0.02732667
>> 9.281578
>> 16.47813
>> 7.493077e-04 0.9587773
>> 19246 -0.8675393 -0.5082140 0.35932533
>> 8.123538
>> 16.27776
>> 7.845150e-04 0.9587773
>> 28808 0.2601277 0.6909140 0.43078633
>> 4.814602
>> 16.21283
>> 7.963487e-04 0.9587773
>>
>> I want to export my results and write
>>
>> results<- decideTests(fit2)
>> write.fit(fit2, results, "limma_results.txt",
>> adjust="BH")
>>
>> Now don't get the same columns as when using topTable
>> which is
>> quite
>> confusing. Why don't I get the FC for the comparisons
>> between
>> the different
>> groups as if I run topTable with the coef parameter (
>> "topTable(fit2,
>> coef=1)" )? The columns I get are the following
>>
>>
>> The simple answer is that they are two different functions
>> with
>> different goals. But note that you do get the same
>> information.
>>
>>
>>
>> A
>>
>> Coef.GroupB - GroupA
>> Coef.GroupC - GroupA
>> Coef.GroupC - GroupB
>>
>> t.GroupB - GroupA
>> t.GroupC - GroupA
>> t.GroupC - GroupB
>>
>> p.value.GroupB - GroupA
>> p.value.GroupC - GroupA
>> p.value.GroupC - GroupB
>>
>> p.value.adj.GroupB - GroupA
>> p.value.adj.GroupC - GroupA
>> p.value.adj.GroupC - GroupB
>>
>> F
>> F.p.value
>>
>> Res.GroupB - GroupA
>> Res.GroupC - GroupA
>> Res.GroupC - GroupB
>>
>>
>> Could some body please try to explain what do the
>> columns A,
>> Coef, F,
>> F.p.value and Res mean?
>>
>>
>> A - are your log fold change values
>> Coef - are your coefficients (you set up a cell means
>> model, so
>> these are the sample means)
>> F - is an F-statistic, which tests the null hypothesis
>> that none
>> of the sample means are different
>> F.p.value - is the p-value for the F-statistic
>> Res - is the results matrix you passed into write.fit(),
>> showing
>> which contrast(s) were significant
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>
>>
>> #Session info
>>
>> sessionInfo()
>>
>> R version 2.15.0 (2012-03-30)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=Swedish_Sweden.1252
>> LC_CTYPE=Swedish_Sweden.1252
>> LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
>> LC_TIME=Swedish_Sweden.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets
>> methods
>> base
>>
>> other attached packages:
>> [1] limma_3.12.1 Biobase_2.16.0
>> BiocGenerics_0.2.0
>>
>> loaded via a namespace (and not attached):
>> [1] affylmGUI_1.30.0 IRanges_1.14.4
>> oneChannelGUI_1.22.10
>> stats4_2.15.0 tcltk_2.15.0
>> Best regards
>> JMA
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>> <mailto:Bioconductor at r-project.org
>>
>> <mailto:Bioconductor at r-project.org>>
>>
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>> -- James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>>
>>
>> -- James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
More information about the Bioconductor
mailing list