[BioC] Unexpected results using limma with numerical factor
Gordon Smyth
smyth at wehi.edu.au
Thu Aug 26 09:37:48 CEST 2004
At 05:40 PM 25/08/2004, Matthew Hannah wrote:
>I've recently asked a similar question but have got no feedback, see
>point 2 onwards in the following.
>https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-August/005802.html
>
>As I now understand it (perhaps/probably wrongly) the standard approach
>(equivalent to ANOVA?) is to use a non-numeric factor for the fit.
>However, lm() in R is also capable of regression fits. Looking (but not
>understanding) lm.fit in limma, it appears to be an independent function
>(lm bit written in fortran?) and doesn't call lm() from stats. So the
>question is really if lm.fit can do numeric regression?
I am afraid that you've got things a bit skew. lm.fit() is not a function
in limma; rather, it is a low-level function in the stats package which is
called by lm(). The limma functions use lm.fit() and lm.wfit() just like
lm() does.
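To illustrate that lm.fit() handles a numeric covariate just as lm() does, here is a minimal sketch in base R. The data are simulated, and the names x, y, design, fit_low and fit_high are made up for this example; they do not come from the original post.

```r
# Simulated data (an assumption for illustration only):
# one numeric covariate, 3 replicates at each of 4 values.
set.seed(1)
x <- rep(c(1, 2, 3, 4), each = 3)
y <- 2 + 0.5 * x + rnorm(length(x), sd = 0.1)

# Design matrix with an intercept column and the numeric covariate.
design <- model.matrix(~ x)

# Low-level fit from the stats package, given the design matrix directly.
fit_low <- lm.fit(design, y)

# High-level fit through the formula interface; lm() calls lm.fit() internally.
fit_high <- lm(y ~ x)

# Both routes estimate the same intercept and slope.
print(fit_low$coefficients)
print(coef(fit_high))
```

Replicated covariate values are not a problem here: each replicate is simply another row of the design matrix.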
It isn't really helpful to think of ANOVA in the microarray context. ANOVA
is a special case of regression.
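The point that ANOVA is a special case of regression can be sketched in base R as follows. This is an illustration under stated assumptions: the data are simulated, and the names group, y, design, fit and means are hypothetical, not taken from the thread.

```r
# Simulated data (an assumption for illustration only):
# three groups with four observations each.
set.seed(2)
group <- factor(rep(c("A", "B", "C"), each = 4))
y <- c(rnorm(4, mean = 0), rnorm(4, mean = 1), rnorm(4, mean = 3))

# A factor is expanded into indicator columns of a design matrix,
# so the "ANOVA" model is an ordinary linear regression on those columns.
design <- model.matrix(~ group)
print(design)

fit <- lm(y ~ group)
means <- tapply(y, group, mean)

# With the default treatment contrasts, the intercept is the mean of
# group A and the other coefficients are differences from group A.
print(coef(fit))
print(means)

# The one-way ANOVA F-test is the overall F-test of this regression.
print(anova(fit))
print(summary(fit)$fstatistic)
```

Replacing the factor with a numeric covariate changes only the columns of the design matrix, not the fitting machinery.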
>Another consideration is that you wouldn't use a design (factor) but a
>numeric vector. My guess is that your design is being taken as a factor,
>(and if you look in the user guide) the -ve values may indicate dye
>swaps, which could be interesting! I've just tried lmFit with a numeric
>vector (but as there are replicates each number appears 3 times) and got
>meaningless results - all lods>30 and all tiny p-values 1e-24. So
>initially it looks like you can't use limma like this, but I'd like to
>hear an expert verdict as to whether the approach is completely wrong.
I don't think I understand your question. Have a talk with a statistician
at your own institution about factors, covariates and design matrices.
Gordon
>Anyway if you're looking for correlations then you could perhaps try
>Pearson (see my previous post about p-values and R2 from lm()).
>try-
>
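>library(Biobase)  # provides esApply()
>Test1 <- c(0.58,-2.36,-12.24,-14.84,0.15,-3.23,-11.66,-12.91)
># Pearson correlation of each gene's expression profile with the test scores
>Correl <- esApply(eset, 1, function(x) {
>  cor(x, Test1)
>})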
>
>be aware that you might get a lot of correlations by chance, particularly
>as your scores seem to be in 2 groups, close to zero and < -11. And a
>straight line between two groups gives a good pearson as it's sensitive
>to outliers.
>
>Perhaps it's best to change your test scores to factors anyway-
>Test1.high, Test1.low, Test2.high, Test2.low, and do a conventional
>limma analysis. Without a regular distribution of test scores,
>correlations are going to be largely meaningless.
>
>HTH, and I hope that someone can answer this more definitively.
>
>Cheers,
>Matt