[BioC] extremely low adjusted p-values in regression using limma
Gordon K Smyth
smyth at wehi.EDU.AU
Sun Mar 2 05:02:39 CET 2008
Dear Artur,
> Message: 14
> Date: Fri, 29 Feb 2008 14:53:40 -0500
> From: "Artur Veloso" <abveloso at gmail.com>
> Subject: [BioC] extremely low adjusted p-values in regression using
> limma
> To: "Bioconductor List" <bioconductor at stat.math.ethz.ch>
>
> Dear all,
>
> I am trying to run a regression analysis on my microarray data using the
> limma and all the genes are showing adjusted p-values lower than 10 ^ - 20,
> which points me to some sort of major mistake on the code that I am writing
> for this analysis. I searched the limma user's guide and the Bioconductor
> mail archive for explanations on how to run regressions on limma but could
> not figure out what I was doing wrong from them.
You are getting very small p-values because you are testing the intercept
equal to zero, which obviously it isn't. All the relevant examples in the
User's Guide advise you to use topTable(fit,coef=2), which would solve
your problem.
> The code that I am using
> goes below and any help with it would be extremely appreciated.
> Also, is it possible to get the r-squared values for each gene in the
> analysis?
R-squared is of no interest in ANOVA situations. Even in regression with
normally distributed covariates it is of limited interest, because it is
just a transformation of the F-statistic. You could express the moderated
F-statistic as an Rsquared proportion by:
fit2 <- eBayes(fit[,-1]) # remove the intercept
df.total <- fit2$df.residual+fit2$df.prior
Ratio <- fit2$F*fit$rank/df.total
Rsquared <- Ratio/(1+Ratio)
Best wishes
Gordon
> Thank you very much,
>
> Artur Veloso
> Masters in Marine Biology Candidate
> College of Charleston, SC, USA
>
>> concentrations
> [1] 9.020 1.740 0.960 9.740 2.460 1.810 1.478 1.380 2.760 0.860
> [11] 2.520 3.950 5.210 1.090 26.760 1.670 0.860 1.340 7.500 5.380
>> regression.design <- model.matrix(~log(concentrations))
>> regression.design
> (Intercept) log(concentrations)
> 1 1 2.19944433
> 2 1 0.55388511
> 3 1 -0.04082199
> 4 1 2.27624112
> 5 1 0.90016135
> 6 1 0.59332685
> 7 1 0.39068982
> 8 1 0.32208350
> 9 1 1.01523068
> 10 1 -0.15082289
> 11 1 0.92425890
> 12 1 1.37371558
> 13 1 1.65057986
> 14 1 0.08617770
> 15 1 3.28690824
> 16 1 0.51282363
> 17 1 -0.15082289
> 18 1 0.29266961
> 19 1 2.01490302
> 20 1 1.68268837
> attr(,"assign")
> [1] 0 1
>> dim(vsn.normalized)
> [1] 12426 20
>> data.correlation <- duplicateCorrelation(vsn.normalized,regression.design
> ,ndups=2,spacing=1)
>> data.regression <- lmFit(vsn.normalized,regression.design
> ,ndups=2,spacing=1,correlation=data.correlation$consensus)
>> tail(topTable(eBayes(data.regression),number=6200))
> ID
> 1808 CV133193 triacylglycerol lipase activity CV_HP_010_Plate_96_B8
> 5545 CD648266 none CV_UNI_001_Plate_96_G6_bottom
> 5267 CV088280 integral to membrane CV_HP_009_Plate_96_D9_bottom
> 4055 CV133023 none CV_HP_011_Plate_96_D11_bottom
> 5935 CV132240 none CV_HP_004_Plate_96_C6_bottom
> 4392 AJ565488 none CG_NS1_001_Plate_96_G5_bottom
> X.Intercept. log.concentrations. F P.Value
> 1808 11.87372 -0.0947231 250.6960 2.810474e-24
> 5545 13.32032 0.1710411 244.8938 4.447522e-24
> 5267 11.13640 -0.1087474 242.2680 5.492360e-24
> 4055 11.03485 -0.1456426 239.7445 6.740600e-24
> 5935 10.69073 0.9076729 235.8741 9.263810e-24
> 4392 11.59845 -0.7217251 234.0277 1.079942e-23
> adj.P.Val
> 1808 2.818640e-24
> 5545 4.459725e-24
> 5267 5.506541e-24
> 4055 6.756913e-24
> 5935 9.284731e-24
> 4392 1.082206e-23
>
>> sessionInfo()
> R version 2.6.1 (2007-11-26)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> base
>
> other attached packages:
> [1] statmod_1.3.3 vsn_3.2.1 affy_1.16.0
> preprocessCore_1.0.0 affyio_1.6.1
> [6] Biobase_1.16.2 limma_2.12.0
>
> loaded via a namespace (and not attached):
> [1] grid_2.6.1 lattice_0.17-2
More information about the Bioconductor
mailing list