[BioC] Understanding the columns in the limma results output
James W. MacDonald
jmacdon at uw.edu
Wed Aug 29 16:56:45 CEST 2012
Hi Jorge,
My bad. The coefficients for write.fit() are the fold changes, and you
do get a column for each. As an example, with fake data.
> library(limma)
> mat <- matrix(rnorm(12e5), ncol = 12)
> design <- model.matrix(~0+factor(rep(1:4, each = 3)))
> contrast <- matrix(c(1,-1,0,0,1,0,-1,0,1,0,0,-1), ncol = 3)
> fit2 <- contrasts.fit(fit, contrast)
> fit2 <- eBayes(fit2)
> rslt <- decideTests(fit2)
> write.fit(fit2, rslt, "tmp.txt")
> dat <- read.table("tmp.txt", header=T)
> head(dat)
Coef.1 Coef.2 Coef.3 t.1 t.2 t.3 p.value.1 p.value.2 p.value.3 F
1 -0.837 -0.352 -0.430 -1.02 -0.43 -0.53 0.30621 0.66699 0.59932 0.35
2 -1.200 -0.599 0.524 -1.47 -0.73 0.64 0.14148 0.46259 0.52034 1.67
3 -1.014 -1.533 -1.515 -1.24 -1.87 -1.85 0.21627 0.06176 0.06492 1.54
4 0.520 0.308 1.059 0.64 0.38 1.30 0.52390 0.70573 0.19449 0.60
5 -0.848 0.269 -0.611 -1.04 0.33 -0.75 0.29880 0.74132 0.45410 0.81
6 -0.094 -0.245 -0.474 -0.11 -0.30 -0.58 0.90847 0.76396 0.56088 0.13
F.p.value Res.1 Res.2 Res.3
1 0.78697 0 0 0
2 0.17130 0 0 0
3 0.20359 0 0 0
4 0.61671 0 0 0
5 0.48698 0 0 0
6 0.94300 0 0 0
> nrow(mat)
[1] 100000
> nrow(dat)
[1] 100000
An alternative you might consider is the writeFit() function from my
affycoretools package, which outputs the data in a slightly different
format.
Best,
Jim
On 8/29/2012 2:55 AM, Jorge Miró wrote:
> Hi James,
>
> thanks for the explanation. I do not really understand the columns
> yet. Shouldn't the FC be printed for every comparison as is done for
> the Coef-columns? I just get one A-column.
> Is there any way of printing the results to a file with the same
> columns as in topTable()?
> What I'm really want to have in the output is a p-value column, an FC
> column and an adjusted p-value column.
>
> Best regards
>
>
> On Tue, Aug 28, 2012 at 5:41 PM, James W. MacDonald <jmacdon at uw.edu
> <mailto:jmacdon at uw.edu>> wrote:
>
> Hi Jorge,
>
>
> On 8/28/2012 11:20 AM, Jorge Miró wrote:
>
> Hi,
>
> I have run the commands below to get an analysis of differential
> expressions in my Affymetrix arrays
>
> #Prepare the design and contrast matrices for my comparisons
> of the three
> groups in a loop-manner.
>
> design<- model.matrix(~0+factor(c(1,1,1,2,2,2,3,3,3)))
> colnames(design)<- c('GroupA', 'GroupB', 'GroupC')
> contrast.matrix<- makeContrasts(GroupB-GroupA, GroupC-GroupA,
>
> GroupC-GroupB, levels=design)
>
> #Check design and contrast matrices
>
> design
>
> GroupA GroupB GroupC
> 1 1 0 0
> 2 1 0 0
> 3 1 0 0
> 4 0 1 0
> 5 0 1 0
> 6 0 1 0
> 7 0 0 1
> 8 0 0 1
> 9 0 0 1
> attr(,"assign")
> [1] 1 1 1
> attr(,"contrasts")
> attr(,"contrasts")$`factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3))`
> [1] "contr.treatment"
>
> contrast.matrix
>
> Contrasts
> Levels GroupB - GroupA GroupC - GroupA GroupC - GroupB
> GroupA -1 -1 0
> GroupB 1 0 -1
> GroupC 0 1 1
>
> #Fitting the eset to to the design and contrast
>
> fit<- lmFit(exprs, design)
> fit2<- contrasts.fit(fit, contrast.matrix)
>
> #Computing the statistics
>
> fit2<- eBayes(fit2)
>
>
> Then I check the results with topTable and get the following
> columns in the
> output
>
> topTable(fit2)
>
> GroupB...GroupA GroupC...GroupA GroupC...GroupB
> AveExpr F
> P.Value adj.P.Val
> 25031 2.3602203 2.4273830 0.06716267 5.021412
> 29.06509
> 7.844834e-05 0.9587773
> 12902 -0.4572897 -0.5680943 -0.11080467 7.516681
> 25.41608
> 1.365021e-04 0.9587773
> 7158 -0.4478660 -0.4296077 0.01825833 7.057833
> 23.48871
> 1.880100e-04 0.9587773
> 18358 -0.1002647 0.3304903 0.43075500 7.352807
> 22.78417
> 2.125096e-04 0.9587773
> 28768 -0.7695883 -1.3837750 -0.61418667 3.983044
> 22.47514
> 2.244612e-04 0.9587773
> 28820 -0.1708800 -0.9939680 -0.82308800 5.470826
> 18.25071
> 5.081473e-04 0.9587773
> 15238 -0.4850297 -0.4658157 0.01921400 7.071662
> 17.15191
> 6.440979e-04 0.9587773
> 24681 -0.3759717 -0.3486450 0.02732667 9.281578
> 16.47813
> 7.493077e-04 0.9587773
> 19246 -0.8675393 -0.5082140 0.35932533 8.123538
> 16.27776
> 7.845150e-04 0.9587773
> 28808 0.2601277 0.6909140 0.43078633 4.814602
> 16.21283
> 7.963487e-04 0.9587773
>
> I want to export my results and write
>
> results<- decideTests(fit2)
> write.fit(fit2, results, "limma_results.txt", adjust="BH")
>
> Now don't get the same columns as when using topTable which is
> quite
> confusing. Why don't I get the FC for the comparisons between
> the different
> groups as if I run topTable with the coef parameter (
> "topTable(fit2,
> coef=1)" )? The columns I get are the following
>
>
> The simple answer is that they are two different functions with
> different goals. But note that you do get the same information.
>
>
>
> A
>
> Coef.GroupB - GroupA
> Coef.GroupC - GroupA
> Coef.GroupC - GroupB
>
> t.GroupB - GroupA
> t.GroupC - GroupA
> t.GroupC - GroupB
>
> p.value.GroupB - GroupA
> p.value.GroupC - GroupA
> p.value.GroupC - GroupB
>
> p.value.adj.GroupB - GroupA
> p.value.adj.GroupC - GroupA
> p.value.adj.GroupC - GroupB
>
> F
> F.p.value
>
> Res.GroupB - GroupA
> Res.GroupC - GroupA
> Res.GroupC - GroupB
>
>
> Could some body please try to explain what do the columns A,
> Coef, F,
> F.p.value and Res mean?
>
>
> A - are your log fold change values
> Coef - are your coefficients (you set up a cell means model, so
> these are the sample means)
> F - is an F-statistic, which tests the null hypothesis that none
> of the sample means are different
> F.p.value - is the p-value for the F-statistic
> Res - is the results matrix you passed into write.fit(), showing
> which contrast(s) were significant
>
> Best,
>
> Jim
>
>
>
>
>
> #Session info
>
> sessionInfo()
>
> R version 2.15.0 (2012-03-30)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=Swedish_Sweden.1252 LC_CTYPE=Swedish_Sweden.1252
> LC_MONETARY=Swedish_Sweden.1252 LC_NUMERIC=C
> LC_TIME=Swedish_Sweden.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods
> base
>
> other attached packages:
> [1] limma_3.12.1 Biobase_2.16.0 BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] affylmGUI_1.30.0 IRanges_1.14.4
> oneChannelGUI_1.22.10
> stats4_2.15.0 tcltk_2.15.0
> Best regards
> JMA
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
More information about the Bioconductor
mailing list