[BioC] limma question: direct two-color design & modeling individual subject effects
Gordon K Smyth
smyth at wehi.EDU.AU
Tue May 1 01:02:44 CEST 2007
Dear Paul,
Your description of the limma model you've fitted is very clear, but you haven't explained exactly
what is in your picture. The data values on the y-axis don't appear to be the M-values you used
to fit the linear model, because we don't see the up-down pattern we'd expect to see from
dye-swaps. How have you obtained "fitted values"? Note that M-values are already log-ratios, so
it doesn't make sense to write "log2M".
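
As a rough illustration of that up-down pattern, the fitted values for the first six arrays can be computed by hand from the coefficients and design rows quoted below. This is only a sketch: `beta`, `X` and `fitted6` are names chosen here for illustration, and the numbers are copied from the quoted message rather than recomputed from the real data.

```r
# Coefficients for probe 7346, copied from the quoted message
beta <- c("(Intercept)" = -3.62867124, maternalHigh = 7.49268173,
          motherm918 = 0.24858455, motherm920 = -0.02635289,
          childc135 = -0.67898282, childc140 = -0.24566235,
          childc372 = -0.24673763, childc413 = 0.10618603,
          childc425 = -0.37520911, childc451 = -0.02761610)

# First six rows of the design matrix shown in the quoted message
X <- matrix(0, nrow = 6, ncol = length(beta),
            dimnames = list(as.character(1:6), names(beta)))
X[, "(Intercept)"] <- 1
X[c(2, 4, 6), "maternalHigh"] <- 1   # arrays coded "High"
X[1:2, "motherm918"] <- 1            # arrays 1-2 involve m918
X[5:6, "childc135"] <- 1             # arrays 5-6 involve c135

# Fitted M-values are just the design rows times the coefficients
fitted6 <- drop(X %*% beta)
round(fitted6, 2)
```

The sign of the fitted M-value alternates between the two arrays of each dye-swap pair, which is the pattern one would expect to see if M-values were plotted.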
lmFit() simply does least squares regression. It gives the same coefficients that you would get
from lm() for each gene. I suggest that you extract the M-value data for one gene, and experiment
with fitting the data using lm(), until you're satisfied that you understand the parametrization
and fitted values.
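
A minimal sketch of that kind of experiment follows. It uses a small made-up design and simulated M-values, since the real data are not available here; the mother/child pairing, the simulated effect sizes and the object names (`pairing`, `design`, `y`, `fit1`) are all assumptions for illustration. With the real data, `y` would presumably be one row of the M-value matrix, for example `MA$M[7346, ]` if `MA` is an MAList.

```r
# Toy stand-in for the real targets: 6 mother/child pairs, each hybridized
# twice with the dyes swapped (hypothetical pairing, not the actual experiment)
pairing <- data.frame(mother = c("m918", "m836", "m836", "m920", "m920", "m918"),
                      child  = c("c073", "c073", "c135", "c135", "c140", "c140"))
design <- pairing[rep(seq_len(nrow(pairing)), each = 2), ]
design$maternal <- factor(rep(c("Low", "High"), times = nrow(pairing)),
                          levels = c("Low", "High"))
design$mother <- factor(design$mother)
design$child  <- factor(design$child)
rownames(design) <- NULL

# Simulated M-values for one gene (assumed effect sizes, for illustration only)
set.seed(1)
y <- -3.6 + 7.5 * (design$maternal == "High") + rnorm(nrow(design), sd = 0.3)

# Same formula as in the quoted message, fitted for a single gene with lm()
fit1 <- lm(y ~ maternal + mother + child, data = design)
coef(fit1)

# Plain least squares on the explicit design matrix gives the same numbers
X <- model.matrix(~ maternal + mother + child, design)
beta <- qr.solve(X, y)
all.equal(unname(coef(fit1)), unname(beta))

# Observed and fitted values side by side, to study the parametrization
cbind(design, y = round(y, 2), fitted = round(fitted(fit1), 2))
```

Changing the formula or the factor baselines and re-running `lm()` on the same `y` is a quick way to see how each coefficient should be read.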
Best wishes
Gordon
> [BioC] limma question: direct two-color design & modeling individual subject effects
> Paul Shannon pshannon at systemsbiology.org
> Mon Apr 30 05:08:15 CEST 2007
>
> I've been working on and off for a few months with limma on a set of 28 2-color
> arrays made up of 14 dye-swap pairs. The main contrast in the arrays is between
> malaria parasite RNA extracted from maternal and from juvenile hosts;
> all the arrays can be described in these terms. This is the main effect we
> are studying, and limma is very helpful in elucidating it.
>
> The arrays can be more specifically described as comparisons between specific
> maternal subjects and specific juvenile subjects -- between different
> combinations of three mothers (m918, m836, m920) with seven children (c073, c135,
> c140, c372, c451, c413, c425). I have trouble fitting models to some of these
> genes, failing to isolate the effects of individual subjects where their effects seem
> to be strong.
>
> (A good example can be seen at http://gaggle.systemsbiology.net/pshannon/tmp/7346.png,
> where the effect of m920 is pronounced, but apparently missed by my lmFit/eBayes model.)
>
> Here are the first few lines from each of the matrices I use that lead to that plot.
>
> ---- head (targets)
>
>   SlideNumber      Name            FileName      Cy3      Cy5 Mother Child
> 1        2254 slide2254 m918c073-cy3cy5.gpr maternal juvenile   m918  c073
> 2        2261 slide2261 m918c073-cy5cy3.gpr juvenile maternal   m918  c073
> 3        2258 slide2258 m836c073-cy3cy5.gpr maternal juvenile   m836  c073
> 4        2265 slide2265 m836c073-cy5cy3.gpr juvenile maternal   m836  c073
> 5        2341 slide2341 m836c135-cy3cy5.gpr maternal juvenile   m836  c135
> 6        2344 slide2344 m836c135-cy5cy3.gpr juvenile maternal   m836  c135
>
> ----- head (design)
>
>   mother child maternal
> 1   m918  c073      Low
> 2   m918  c073     High
> 3   m836  c073      Low
> 4   m836  c073     High
> 5   m836  c135      Low
> 6   m836  c135     High
>
> ---- create the model
>
> model <- model.matrix (~maternal + mother + child, design)
>
> head (model)
>   (Intercept) maternalHigh motherm918 motherm920 childc135 childc140 childc372 childc413 childc425 childc451
> 1           1            0          1          0         0         0         0         0         0         0
> 2           1            1          1          0         0         0         0         0         0         0
> 3           1            0          0          0         0         0         0         0         0         0
> 4           1            1          0          0         0         0         0         0         0         0
> 5           1            0          0          0         1         0         0         0         0         0
> 6           1            1          0          0         1         0         0         0         0         0
>
> ---- fit the data
>
> fit <- lmFit (MA, model)
> efit <- eBayes (fit)
>
> # one example of poor fit. with probe 7346, the m920 effect is very strong, but the coefficients
> # don't reflect that. instead, most of the influence is allocated to the maternal effect, which
> # nicely models all the comparisons except those involving m920. the fit there is strikingly
> # poor, with high residuals. I can't make sense of the tiny motherm920 coefficient:
>
> > efit$coef [7346,]
> (Intercept) maternalHigh  motherm918  motherm920   childc135   childc140   childc372   childc413   childc425   childc451
> -3.62867124   7.49268173  0.24858455 -0.02635289 -0.67898282 -0.24566235 -0.24673763  0.10618603 -0.37520911 -0.02761610
>
> The plot of the fitted & actual values can be found at
>
> http://gaggle.systemsbiology.net/pshannon/tmp/7346.png
>
> I may be over-interpreting, or mis-interpreting, or even misrepresenting all this. But after lots
> of head scratching, lots of reading and experiments, I can't get the coefficients to do what I
> think they should. Perhaps it's my failure to use a contrast matrix. Or something else.
>
> Any suggestions? I'll be really grateful for any advice.
>
> Thanks!
>
> - Paul