[BioC] Paired limma analysis
James W. MacDonald
jmacdon at uw.edu
Fri Aug 3 15:30:09 CEST 2012
Hi mb3058 at columbia.edu,
On 8/2/2012 6:59 PM, mb3058 at columbia.edu wrote:
> I am running a paired analysis with limma.
> My platform is affymetrix mouse gene st.1.0 array
> I have a paired group (hi, lo) with duplicates per group.
> My script so far is this:
> exprs <- read.table("data.txt",as.is=T,sep="\t",header=T,row.names=1)
> targets<-readTargets("pairedTargetFile.txt")
> Pair <- factor(targets$Group)
> Treat <- factor(targets$Treatment, levels=c("hi","lo"))
> design <- model.matrix(~Pair+Treat)
> fit <- lmFit(exprs, design)
> fit <- eBayes(fit)
> #I want to print out the limma results for all the probes
> Results<- topTable(fit,number=241425,adjust.method="BH")
> write.table(Results,"paired_Results.txt",sep="\t")
>
> ID X.Intercept. Pair2 Treatlo AveExpr F P.Value
> adj.P.Val
> 10469363 10.590048 2.186194 -0.636504 11.364893
> 1566.532105 0.000000469 0.00000942
> 10365564 10.12985925 2.3047915 -0.2214415 11.17153425
> 1528.543391 0.000000495 0.00000942
> 10469367 9.8800415 2.440832 -0.652667 10.774124
> 1417.807515 0.000000583 0.00000942
> 10517519 10.1252575 2.183145 0.212485 11.3230725
> 1403.521849 0.000000596 0.00000942
> 10469380 9.61192575 2.9290725 -0.8228465 10.66503875
> 1396.225476 0.000000603 0.00000942
>
> Q1.I am trying to interpret some of the fields. Could someone help me
> understand what the fields X.Intercept. Pair2 Treatlo stand for?
Well, I can tell you what the coefficients are estimating, but make no
warranties concerning your understanding.
The X.Intercept can be thought of as a baseline. Technically it
estimates the mean expression of the hi treated samples for the first of
the pairs. The Pair2 coefficient estimates the difference between the
pairs, and the Treatlo coefficient estimates the difference between the
hi and lo treatment.
I would assume that you want to find those genes that are differentially
expressed between hi and lo, after adjusting for the pairing. For that
you want the Treatlo coefficient.
>
>
>
> Q2. When I run the following line to get results
> Results <- topTable(fit, coef="Treatlo"), I get the following results:
> ID logFC AveExpr t P.Value adj.P.Val B
> 10354461 4.390143 5.666826 14.06184 8.62E-05
> 0.5279716 -0.7055664
> 10583079 4.249853 6.16608 13.51989 1.02E-04
> 0.5279716 -0.7296509
> 10587318 4.180087 4.887135 13.34857 1.08E-04
> 0.5279716 -0.7378131
> 10370298 4.0321 5.502014 12.69299 1.33E-04 0.5279716
> -0.7717809
> 10392551 4.259277 4.614051 12.44214 1.45E-04
> 0.5279716 -0.7860399
> 10368320 4.387814 5.911313 12.36954 1.49E-04
> 0.5279716 -0.7903085
> 10587807 3.805685 4.826062 12.30147 1.52E-04
> 0.5279716 -0.7943703
> 10504815 3.793085 5.657937 12.27763 1.54E-04
> 0.5279716 -0.7958069
> 10357489 3.968974 4.666258 12.23582 1.56E-04
> 0.5279716 -0.7983445
> 10481521 3.757882 7.856306 12.15789 1.60E-04
> 0.5279716 -0.8031341
>
> However when I run the following command to get results for all the
> probes, I get different values for the same probes. See PValue, Adj. P
> value
Well, you don't show the command you use. I would assume you ran
topTable() without specifying a coefficient, in which case the following
portions from ?topTable might be instructive:
coef: column number or column name specifying which coefficient or
contrast of the linear model is of interest. For 'topTable',
can also be a vector of column subscripts, in which case the
gene ranking is by F-statistic for that set of contrasts.
and
'topTableF' ranks genes on the basis of moderated F-statistics for
one or more coefficients. If 'topTable' is called with 'coef' has
length greater than 1, then the specified columns will be
extracted from 'fit' and 'topTableF' called on the result.
'topTable' with 'coef=NULL' is the same as 'topTableF', unless the
fitted model 'fit' has only one column.
The default to topTable() is coef=NULL, in which case you are calling
topTableF(), and computing the F-statistic over all coefficients, which
I highly doubt is what you want to do.
Best,
Jim
> ID X.Intercept. Pair2 Treatlo AveExpr F P.Value
> adj.P.Val
> 10354461 2.963005 1.0175 4.390143 5.6668265
> 441.9193584 7.37E-06 1.48E-05
> 10583079 5.2467745 -2.411241 4.249853 6.1660805
> 516.5649337 5.25E-06 1.24E-05
> 10587318 2.88690775 -0.1796325 4.1800875 4.88713525
> 334.0547737 1.35E-05 2.19E-05
> 10370298 4.54191225 -2.1118965 4.0321005 5.50201425
> 408.7688335 8.74E-06 1.64E-05
> 10392551 2.4626765 0.043472 4.259277 4.614051
> 261.1260083 2.31E-05 3.27E-05
> 10368320 3.76314325 -0.0914745 4.3878135 5.91131275
> 377.3505032 1.04E-05 1.83E-05
> 10587807 2.69620575 0.4540285 3.8056845 4.82606225
> 325.4919972 1.43E-05 2.28E-05
> 10504815 5.27270925 -3.0226305 3.7930855 5.65793675
> 458.5242305 6.81E-06 1.42E-05
> 10357489 2.90779925 -0.4520565 3.9689735 4.66625775
> 286.4197106 1.89E-05 2.80E-05
> 10481521 6.637718 -1.320705 3.757882 7.8563065
> 794.2002708 2.06E-06 9.55E-06
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
More information about the Bioconductor
mailing list