[BioC] Unexpected results using limma with numerical factor

Gordon Smyth smyth at wehi.edu.au
Thu Aug 26 09:37:48 CEST 2004


At 05:40 PM 25/08/2004, Matthew  Hannah wrote:
>I've recently asked a similar question but have got no feedback, see
>point 2 onwards in the following.
>https://www.stat.math.ethz.ch/pipermail/bioconductor/2004-August/005802.html
>
>As I now understand it (perhaps/probably wrongly) the standard approach
>(equivalent to ANOVA?) is to use a non-numeric factor for the fit.
>However, lm() in R is also capable of regression fits. Looking (but not
>understanding) lm.fit in limma, it appears to be an independent function
>(lm bit written in Fortran?) and doesn't call lm() from stats. So the
>question is really whether lm.fit can do numeric regression?

I am afraid that you've got things a bit skew. lm.fit() is not a function 
in limma; rather, it is a low-level function in the stats package which is 
called by lm(). The limma functions use lm.fit() and lm.wfit() just like 
lm() does.
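
As a minimal sketch of that relationship, the lines below fit a straight
line to a numeric covariate twice, once through lm() and once through
lm.fit() with an explicit design matrix. The covariate values are the test
scores quoted further down; the response y is simulated and purely
hypothetical, as are the object names.

```r
# Covariate: the test scores quoted later in this thread.
x <- c(0.58, -2.36, -12.24, -14.84, 0.15, -3.23, -11.66, -12.91)

# Hypothetical response for a single gene (simulated, not real data).
set.seed(1)
y <- 8 + 0.1 * x + rnorm(length(x), sd = 0.2)

# High-level interface: a formula with a numeric covariate.
fit_lm <- lm(y ~ x)

# Low-level interface: an explicit design matrix with an intercept
# column and a slope column.
design <- cbind(Intercept = 1, x = x)
fit_low <- lm.fit(design, y)

# Both routes should estimate the same intercept and slope.
same_coef <- isTRUE(all.equal(unname(coef(fit_lm)),
                              unname(fit_low$coefficients)))
same_coef
```

Nothing here is specific to factors: lm.fit() only sees the columns of the
design matrix, so a numeric column is handled the same way as an indicator
column.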

It isn't really helpful to think of ANOVA in the microarray context. ANOVA 
is a special case of regression.
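
A small sketch of what "special case" means here, using simulated data for
three hypothetical groups: the F statistic from a one-way ANOVA can be
recovered from a regression on indicator columns. The group labels, means
and object names are assumptions made for illustration only.

```r
# Hypothetical data: three groups with three replicates each (simulated).
set.seed(2)
group <- factor(rep(c("A", "B", "C"), each = 3))
y <- rnorm(9, mean = rep(c(5, 6, 8), each = 3), sd = 0.3)

# Classical one-way ANOVA F statistic.
f_aov <- summary(aov(y ~ group))[[1]][["F value"]][1]

# The same model written as a regression on indicator columns.
design <- model.matrix(~ group)      # intercept plus two indicator columns
fit <- lm.fit(design, y)
rss_full <- sum(fit$residuals^2)     # residual sum of squares, full model
rss_null <- sum((y - mean(y))^2)     # residual sum of squares, intercept only

# F = (drop in RSS / extra parameters) / (full RSS / residual df)
f_reg <- ((rss_null - rss_full) / 2) / (rss_full / (9 - 3))

same_f <- isTRUE(all.equal(f_aov, f_reg))
same_f
```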

>Another consideration is that you wouldn't use a design (factor) but a
>numeric vector. My guess is that your design is being taken as a factor,
>(and if you look in the user guide) the negative values may indicate dye
>swaps, which could be interesting! I've just tried lmFit with a numeric
>vector (but as there are replicates each number appears 3 times) and got
>meaningless results - all lods>30 and all tiny p-values 1e-24. So
>initially it looks like you can't use limma like this, but I'd like to
>hear an expert verdict on whether the approach is completely wrong.

I don't think I understand your question. Have a talk with a statistician 
at your own institution about factors, covariates and design matrices.
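
For orientation only, here is a sketch of how the same numeric values give
different design matrices depending on whether they are treated as a
covariate or as a factor. The dose values are hypothetical, chosen so that
each number appears three times as in the situation described above; no
limma function is called.

```r
# Hypothetical dose values, each measured in triplicate.
dose <- rep(c(0, 1, 2, 4), each = 3)

# Treated as a covariate: an intercept column plus one slope column.
design_cov <- model.matrix(~ dose)

# Treated as a factor: an intercept column plus one indicator column
# for each non-reference level.
design_fac <- model.matrix(~ factor(dose))

dim(design_cov)  # 12 rows, 2 columns
dim(design_fac)  # 12 rows, 4 columns
```

The covariate version estimates a single slope across all doses, while the
factor version estimates a separate effect for each dose level.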

Gordon

>Anyway, if you're looking for correlations then you could perhaps try
>Pearson (see my previous post about p-values and R2 from lm()).
>Try:
>
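># esApply() comes from Biobase; eset is your own expression set
>library(Biobase)
>Test1 <- c(0.58,-2.36,-12.24,-14.84,0.15,-3.23,-11.66,-12.91)
># Pearson correlation of each gene (row) with the test scores
>Correl <- esApply(eset, 1, function(x) {
>  cor(x, Test1)
>})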
>
>Be aware that you might get a lot of correlations by chance, particularly
>as your scores seem to be in 2 groups, close to zero and < -11. And a
>straight line between two groups gives a good Pearson correlation, as it
>is sensitive to outliers.
>
>Perhaps it's best to change your test scores to factors anyway-
>Test1.high, Test1.low, Test2.high, Test2.low, and do a conventional
>limma analysis. Without a regular distribution of test scores,
>correlations are going to be largely meaningless.
>
>HTH, and I hope that someone can answer this more definitively.
>
>Cheers,
>Matt


