[BioC] Understanding the columns in the limma results output

James W. MacDonald jmacdon at uw.edu
Wed Aug 29 16:56:45 CEST 2012


Hi Jorge,

My bad. The coefficients for write.fit() are the fold changes, and you 
do get a column for each. As an example, with fake data.

 > library(limma)
 > mat <- matrix(rnorm(12e5), ncol = 12)
 > design <- model.matrix(~0+factor(rep(1:4, each = 3)))
 > contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3)
 > fit2 <- contrasts.fit(fit, contrast)
 > fit2 <- eBayes(fit2)
 > rslt <- decideTests(fit2)
 > write.fit(fit2, rslt, "tmp.txt")
 > dat <- read.table("tmp.txt", header=T)
 > head(dat)
   Coef.1 Coef.2 Coef.3   t.1   t.2   t.3 p.value.1 p.value.2 p.value.3    F
1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53   0.30621   0.66699   0.59932 0.35
2 -1.200 -0.599  0.524 -1.47 -0.73  0.64   0.14148   0.46259   0.52034 1.67
3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85   0.21627   0.06176   0.06492 1.54
4  0.520  0.308  1.059  0.64  0.38  1.30   0.52390   0.70573   0.19449 0.60
5 -0.848  0.269 -0.611 -1.04  0.33 -0.75   0.29880   0.74132   0.45410 0.81
6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58   0.90847   0.76396   0.56088 0.13
   F.p.value Res.1 Res.2 Res.3
1   0.78697     0     0     0
2   0.17130     0     0     0
3   0.20359     0     0     0
4   0.61671     0     0     0
5   0.48698     0     0     0
6   0.94300     0     0     0

 > nrow(mat)
[1] 100000
 > nrow(dat)
[1] 100000

An alternative you might consider is the writeFit() function from my 
affycoretools package, which outputs the data in a slightly different 
format.

Best,

Jim



On 8/29/2012 2:55 AM, Jorge Miró wrote:
> Hi James,
>
> thanks for the explanation. I do not really understand the columns 
> yet. Shouldn't the FC be printed for every comparison as is done for 
> the Coef-columns? I just get one A-column.
> Is there any way of printing the results to a file with the same 
> columns as in topTable()?
> What I'm really want to have in the output is a p-value column, an FC 
> column and an adjusted p-value column.
>
> Best regards
>
>
> On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald <jmacdon at uw.edu 
> <mailto:jmacdon at uw.edu>> wrote:
>
>     Hi Jorge,
>
>
>     On 8/28/2012 11:20 AM, Jorge Miró wrote:
>
>         Hi,
>
>         I have run the commands below to get an analysis of differential
>         expressions in my Affymetrix arrays
>
>         #Prepare the design and contrast matrices for my comparisons
>         of the three
>         groups in a loop-manner.
>
>             design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3)))
>             colnames(design)<- c('GroupA', 'GroupB', 'GroupC')
>             contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC-GroupA,
>
>         GroupC-GroupB, levels=design)
>
>         #Check design and contrast matrices
>
>             design
>
>            GroupA GroupB GroupC
>         1      1      0      0
>         2      1      0      0
>         3      1      0      0
>         4      0      1      0
>         5      0      1      0
>         6      0      1      0
>         7      0      0      1
>         8      0      0      1
>         9      0      0      1
>         attr(,"assign")
>         [1] 1 1 1
>         attr(,"contrasts")
>         attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))`
>         [1] "contr.treatment"
>
>             contrast.matrix
>
>                  Contrasts
>         Levels   GroupB - GroupA GroupC - GroupA GroupC - GroupB
>            GroupA              -1              -1               0
>            GroupB               1               0              -1
>            GroupC               0               1               1
>
>         #Fitting the eset to to the design and contrast
>
>             fit<- lmFit(exprs, design)
>             fit2<- contrasts.fit(fit, contrast.matrix)
>
>         #Computing the statistics
>
>             fit2<- eBayes(fit2)
>
>
>         Then I check the results with topTable and get the following
>         columns in the
>         output
>
>             topTable(fit2)
>
>                GroupB...GroupA GroupC...GroupA GroupC...GroupB
>          AveExpr        F
>           P.Value adj.P.Val
>         25031       2.3602203       2.4273830      0.06716267 5.021412
>         29.06509
>         7.844834e-05 0.9587773
>         12902      -0.4572897      -0.5680943     -0.11080467 7.516681
>         25.41608
>         1.365021e-04 0.9587773
>         7158       -0.4478660      -0.4296077      0.01825833 7.057833
>         23.48871
>         1.880100e-04 0.9587773
>         18358      -0.1002647       0.3304903      0.43075500 7.352807
>         22.78417
>         2.125096e-04 0.9587773
>         28768      -0.7695883      -1.3837750     -0.61418667 3.983044
>         22.47514
>         2.244612e-04 0.9587773
>         28820      -0.1708800      -0.9939680     -0.82308800 5.470826
>         18.25071
>         5.081473e-04 0.9587773
>         15238      -0.4850297      -0.4658157      0.01921400 7.071662
>         17.15191
>         6.440979e-04 0.9587773
>         24681      -0.3759717      -0.3486450      0.02732667 9.281578
>         16.47813
>         7.493077e-04 0.9587773
>         19246      -0.8675393      -0.5082140      0.35932533 8.123538
>         16.27776
>         7.845150e-04 0.9587773
>         28808       0.2601277       0.6909140      0.43078633 4.814602
>         16.21283
>         7.963487e-04 0.9587773
>
>         I want to export my results and write
>
>             results<- decideTests(fit2)
>             write.fit(fit2, results, "limma_results.txt", adjust="BH")
>
>         Now don't get the same columns as when using topTable which is
>         quite
>         confusing. Why don't I get the FC for the comparisons between
>         the different
>         groups as if I run topTable with the coef parameter (
>         "topTable(fit2,
>         coef=1)" )? The columns I get are the following
>
>
>     The simple answer is that they are two different functions with
>     different goals. But note that you do get the same information.
>
>
>
>         A
>
>         Coef.GroupB - GroupA
>         Coef.GroupC - GroupA
>         Coef.GroupC - GroupB
>
>         t.GroupB - GroupA
>         t.GroupC - GroupA
>         t.GroupC - GroupB
>
>         p.value.GroupB - GroupA
>         p.value.GroupC - GroupA
>         p.value.GroupC - GroupB
>
>         p.value.adj.GroupB - GroupA
>         p.value.adj.GroupC - GroupA
>         p.value.adj.GroupC - GroupB
>
>         F
>         F.p.value
>
>         Res.GroupB - GroupA
>         Res.GroupC - GroupA
>         Res.GroupC - GroupB
>
>
>         Could some body please try to explain what do the columns A,
>         Coef, F,
>         F.p.value and Res mean?
>
>
>     A - are your log fold change values
>     Coef - are your coefficients (you set up a cell means model, so
>     these are the sample means)
>     F - is an F-statistic, which tests the null hypothesis that none
>     of the sample means are different
>     F.p.value - is the p-value for the F-statistic
>     Res - is the results matrix you passed into write.fit(), showing
>     which contrast(s) were significant
>
>     Best,
>
>     Jim
>
>
>
>
>
>         #Session info
>
>             sessionInfo()
>
>         R version 2.15.0 (2012-03-30)
>         Platform: i386-pc-mingw32/i386 (32-bit)
>
>         locale:
>         [1] LC_COLLATE=Swedish_Sweden.1252  LC_CTYPE=Swedish_Sweden.1252
>           LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
>           LC_TIME=Swedish_Sweden.1252
>
>         attached base packages:
>         [1] stats     graphics  grDevices utils     datasets  methods
>           base
>
>         other attached packages:
>         [1] limma_3.12.1       Biobase_2.16.0     BiocGenerics_0.2.0
>
>         loaded via a namespace (and not attached):
>         [1] affylmGUI_1.30.0      IRanges_1.14.4      
>          oneChannelGUI_1.22.10
>         stats4_2.15.0         tcltk_2.15.0
>         Best regards
>         JMA
>
>                 [[alternative HTML version deleted]]
>
>         _______________________________________________
>         Bioconductor mailing list
>         Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>         Search the archives:
>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>     -- 
>     James W. MacDonald, M.S.
>     Biostatistician
>     University of Washington
>     Environmental and Occupational Health Sciences
>     4225 Roosevelt Way NE, # 100
>     Seattle WA 98105-6099
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list