[BioC] on the meaning and value of $stdev.unscaled in a limma's lmFitted MArrayLM object
Massimo Pinto
pintarello at gmail.com
Fri Dec 4 11:59:59 CET 2009
Greetings all,
I am writing to enquire about the meaning of some elements that are
provided in an limma::MArray-LM object resulting from a call to
limma::lmFit();
To avoid a verbose output, I tried to minimize the number of genes to
be fitted, by selecting just two genes from my expression set:
>eset.2167 <- eset.more[c(848,849)]
>fit.2167 <- lmFit(eset.2167, designo) # where 'designo' is my design matrix
I did try fitting a selection from my expression set containing just
one gene, but I understand that lmFit() does not like this, as it
yields:
"Error in fit$effects[(fit$rank + 1):narrays, , drop = FALSE] :
incorrect number of dimensions"
My concern, specifically, is that I am not sure how the standard
errors on the lmFit's coefficients are calculated, as they are
returned with a value that differs from what I get from a separate run
in MSExcel, using the built-in function linest(), with the same design
matrix and vector of y kept the same. Convincingly, the output of best
values of the fitted coefficients is identical in lmFit and MSExcel's
linest().
Take for example $stdev.unscaled which is defined (from ?MArrayLM) as
"$stdev.unscaled
matrix containing unscaled standard deviations of the coefficients or contrasts"
> fit.2167$stdev.unscaled
(Intercept) Dose1Gy Ageing6mo Dose1Gy:Ageing6mo
Ageing6mo:LabLNGS Dose1Gy:Ageing6mo:LabLNGS
A_23_P8812 0.5 0.7071068 0.7071068 1
0.7071068 1
A_24_P7642 0.5 0.7071068 0.7071068 1
0.7071068 1
Now to MSExcel, focusing on the first of the two genes (Gene ID 2167
mapped by probe A_23_P8812). Here are the standard errors on the
fitted coefficients, as results from a call to linest()
Dose1Gy:Ageing6mo:LabLNGS Ageing6mo:LabLNGS Dose1Gy:Ageing6mo Ageing6mo Dose1Gy (Intercept)
best estimates 0,50034025 -1,75687825 -0,3867455 3,055038 0,21681525 8,00081275
std errors
0,262705064 0,185760532 0,262705064 0,185760532 0,185760532 0,131352532
As one can see, the stderr on the (intercept), for example, is
0,131352532, whereas limma:lmFit() gave my a 0.5.
What is the reason for such apparent discrepancy?
Is there any a-priori mistake made here, since Excel fitted just one
gene while lmFit() focused on two genes in this specific example?
Thanking you in advance, here is my sessionInfo()
Yours,
Massimo
> sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-apple-darwin9.8.0
locale:
[1] C
attached base packages:
[1] grid tcltk tools stats graphics grDevices utils
datasets methods base
other attached packages:
[1] GOstats_2.12.0 graph_1.22.2 Category_2.12.0
affy_1.24.2 gplots_2.7.3 caTools_1.10
[7] bitops_1.0-4.1 gdata_2.6.1 gtools_2.6.1
hgug4112a.db_2.3.5 org.Hs.eg.db_2.3.6 RSQLite_0.7-3
[13] DBI_0.2-4 Agi4x44PreProcess_1.6.0 genefilter_1.28.0
annotate_1.24.0 AnnotationDbi_1.7.20 limma_3.2.1
[19] Biobase_2.6.0 svGUI_0.9-46 svSocket_0.9-48
svMisc_0.9-56
loaded via a namespace (and not attached):
[1] GO.db_2.3.5 GSEABase_1.8.0 RBGL_1.20.0
XML_2.6-0 affyio_1.13.5 preprocessCore_1.7.9
splines_2.10.0
[8] survival_2.35-7 xtable_1.5-6
>
Massimo Pinto
Post Doctoral Research Fellow
Enrico Fermi Centre and Italian Public Health Research Institute (ISS), Rome
http://claimid.com/massimopinto
More information about the Bioconductor
mailing list