Fwd: [R] problems with limma
Gordon K Smyth
smyth at wehi.EDU.AU
Tue Dec 21 23:18:04 CET 2004
On Wed, December 22, 2004 12:11 am, r.ghezzo at staff.mcgill.ca said:
> ----- Forwarded message from r.ghezzo at staff.mcgill.ca -----
> Date: Mon, 20 Dec 2004 15:45:11 -0500
> From: r.ghezzo at staff.mcgill.ca
> Reply-To: r.ghezzo at staff.mcgill.ca
> Subject: [R] problems with limma
> To: r-help at stat.math.ethz.ch
>
> I try to send this message To Gordon Smyth at smyth at vehi,edu.au but it bounced
> back, so here it is to r-help
Questions about limma should be sent to the Bioconductor mailing list (the address is given in the
introduction of the Limma User's Guide for example).
> I am trying to use limma, just downloaded it from CRAN. I use R 2.0.1 on Win XP
> see the following:
>> library(RODBC)
>> chan1 <- odbcConnectExcel("D:/Data/mgc/Chips/Chips4.xls")
>> dd <- sqlFetch(chan1,"Raw") # all data 12000
>> #
>> nzw <- cbind(dd$NZW1C,dd$NZW2C,dd$NZW3C,dd$NZW1T,dd$NZW2T,dd$NZW3T)
>> akr <- cbind(dd$AKR1C,dd$AKR2C,dd$AKR3C,dd$AKR1T,dd$AKR2T,dd$AKR3T)
>> bas <- cbind(dd$NZW1C,dd$NZW2C,dd$NZW3C,dd$AKR1C,dd$AKR2C,dd$AKR3C)
>> #
>> design<-matrix(c(1,1,1,1,1,1,0,0,0,1,1,1),ncol=2)
>> fit1 <- lmFit(nzw,design)
>> fit1 <- eBayes(fit1)
>> topTable(fit1,adjust="fdr",number=5)
> M t P.Value B
> 12222 3679.480 121.24612 7.828493e-06 -4.508864
> 1903 3012.405 118.32859 7.828493e-06 -4.508866
> 9068 1850.232 92.70893 1.178902e-05 -4.508889
> 10635 2843.534 91.99336 1.178902e-05 -4.508890
> 561 18727.858 90.17085 1.178902e-05 -4.508893
>> #
>> fit2 <- lmFit(akr,design)
>> fit2 <- eBayes(fit2)
>> topTable(fit2,adjust="fdr",number=5)
> M t P.Value B
> 88 1426.738 80.48058 5.839462e-05 -4.510845
> 1964 36774.167 73.05580 5.839462e-05 -4.510861
> 5854 7422.578 68.60316 5.839462e-05 -4.510874
> 11890 1975.316 66.54480 5.839462e-05 -4.510880
> 9088 2696.952 64.16343 5.839462e-05 -4.510889
>> #
>> fit3 <- lmFit(bas,design)
>> fit3 <- eBayes(fit3)
>> topTable(fit3,adjust="fdr",number=5)
> M t P.Value B
> 6262 1415.088 100.78933 2.109822e-05 -4.521016
> 5660 1913.479 96.40903 2.109822e-05 -4.521020
> 11900 4458.489 94.30738 2.109822e-05 -4.521022
> 9358 1522.330 80.46641 3.346749e-05 -4.521041
> 11773 1784.483 73.76620 3.346749e-05 -4.521053
>> # Now lets do all together in Anova
>> #
>> all <- cbind(nzw,akr)
>> ts <- c(1,1,1,2,2,2,3,3,3,4,4,4)
>> ts <- as.factor(ts)
>> levels(ts) <- c("nzwC","nzwT","akrC","akrT")
>> design <- model.matrix(~0+ts)
>> colnames(design) <- levels(ts)
>> fit4 <- lmFit(all,design)
>> cont.matrix <- makeContrasts(
> + Baseline = akrC - nzwC,
> + NZW_Smk = nzwT - nzwC,
> + AKR_Smk = akrT - akrC,
> + Diff = (akrT - akrC) - (nzwT - nzwC),
> + levels=design)
>> fit42 <- contrasts.fit(fit4,cont.matrix)
>> fit42 <- eBayes(fit42)
>> #
>> topTable(fit42,coef="Baseline",adjust="fdr",number=5)
> M t P.Value B
> 3189 942.0993 13.57485 0.004062283 -4.528799
> 8607 2634.1826 11.23476 0.006913442 -4.530338
> 10242 -942.2860 -10.99253 0.006913442 -4.530551
> 283 -609.0831 -10.79354 0.006913442 -4.530735
> 3224 -1564.2572 -10.19429 0.008089034 -4.531351
> ----------------------------------------------------
> ------------- Shouldn't this be equal to fit1 above?
> ----------------------------------------------------
>> topTable(fit42,coef="NZW_Smk",adjust="fdr",number=5)
> M t P.Value B
> 7724 -246.5956 -8.687324 0.1615395 -4.591133
> 1403 -307.8660 -7.063312 0.4066814 -4.591363
> 3865 -253.4899 -6.585582 0.4598217 -4.591457
> 3032 -509.2413 -5.841901 0.8294166 -4.591640
> 2490 -240.3259 -5.338679 0.9997975 -4.591795
> ----------------------------------------------------
> ------------- Shouldn't this be equal to fit2 above?
> ------------- The P.Value are unreal!!
> ----------------------------------------------------
>> topTable(fit42,coef="AKR_Smk",adjust="fdr",number=5)
> M t P.Value B
> 11547 151.6622 6.380978 0.917470 -4.595085
> 12064 324.0851 6.337235 0.917470 -4.595085
> 6752 964.5478 5.858994 0.952782 -4.595086
> 10251 152.7587 5.339843 0.952782 -4.595087
> 1440 189.6056 4.933151 0.952782 -4.595089
> ----------------------------------------------------
> ------------- Shouldn't this be equal to fit3 above?
> ------------- The P.Value are unreal!!
> ----------------------------------------------------
>> topTable(fit42,coef="Diff",adjust="fdr",number=5)
> M t P.Value B
> 7724 302.6892 7.540195 0.4102211 -4.593201
> 1403 419.4962 6.805495 0.4102211 -4.593265
> 10251 270.5269 6.686796 0.4102211 -4.593277
> 3270 409.8391 6.414966 0.4192042 -4.593307
> 10960 -511.4711 -5.469247 0.9652171 -4.593435
>> #
>>
> So the results I get from just pairwise comparisons are very significant, but
> when I try the Anova way, the significance completely dissapears.
> Am I doing something completely wrong?
Your commands look correct but your data looks crazy. The M-values are supposed to be log-fold
changes, and you're getting values in the 10s of thousands. Perhaps you are trying analyse data
on the raw intensity scale and have huge differences between arrays and groups or very large
outliers. Note that the B-statistics are telling you that there is absolutely no differential
expression throughout. Warning bells should go up when you see such large t-statistics with such
small B-statistics. Please look at your data and do some quality assessment. At very least you
probably need to log-transform.
Your 4-group approach should give the same M-values as the 2-group approach, but the standard
errors and t-statistics will change because your error estimates change.
Gordon
> This is data from Affimetrix mouse chips.
> Thanks for any help
> Heberto Ghezzo
> Ph.D.
> Meakins-Christie Labs
> McGill University
> Montreal - Canada
More information about the R-help
mailing list