[BioC] A single linear model for all
Arne.Muller at aventis.com
Wed Apr 28 15:21:36 CEST 2004
Hello,
A while ago I posted "programming problem: running many ANOVAs" (I actually got a very sophisticated reply - too sophisticated for me :-( ...). Following that posting I came across another problem with linear models.
I usually run a simple linear model including all my factors (dose, time, batch) for each probeset on the array, i.e. I construct and run >12,000 linear models and ANOVAs. The model could be:
Value ~ batch + time + dose
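As a minimal sketch of the per-probeset approach described above: the data are simulated, and the object names (`toy`, `fits`, `anovas`) are hypothetical, chosen only for illustration. The column names follow the data frame layout used in this post.

```r
# Simulated toy data in long format (hypothetical values, 3 probesets only)
set.seed(1)
toy <- expand.grid(
  rep   = 1:3,                                   # replicate index, not used in the model
  batch = c("NEW", "OLD", "PRG"),
  time  = c("04h", "24h"),
  dose  = c("000mM", "025mM"),
  gene  = c("100001_at", "100002_at", "100003_at"),
  stringsAsFactors = TRUE
)
toy$Value <- rnorm(nrow(toy), mean = 6)

# Fit one linear model per probeset, then one ANOVA table per fit
fits <- lapply(split(toy, toy$gene),
               function(d) lm(Value ~ batch + time + dose, data = d))
anovas <- lapply(fits, anova)

# ANOVA table for the first probeset
print(anovas[["100001_at"]])
```

With >12,000 probesets the same `lapply` call simply runs over a longer list, one small model at a time.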
I was thinking about running just a single linear model that includes the probes (actually the probesets, i.e. the genes) as a factor:
Value ~ gene + batch + time + dose + gene*batch + gene*time + gene*dose
The gene (probeset) interacts with each main effect.
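A small sketch of what such a single model amounts to, on simulated data with hypothetical object names. In R's formula syntax `gene*batch` expands to `gene + batch + gene:batch`, so the model can be written compactly as `Value ~ gene * (batch + time + dose)`. Under that assumption, the single fit spans the same model space as the separate per-gene fits, which the comparison of residual sums of squares and residual degrees of freedom below is meant to illustrate.

```r
# Simulated toy data (hypothetical values, 3 probesets only)
set.seed(1)
toy <- expand.grid(
  rep   = 1:3,
  batch = c("NEW", "OLD", "PRG"),
  time  = c("04h", "24h"),
  dose  = c("000mM", "025mM"),
  gene  = c("100001_at", "100002_at", "100003_at"),
  stringsAsFactors = TRUE
)
toy$Value <- rnorm(nrow(toy), mean = 6)

# Single model: gene main effect plus gene interaction with every factor
fit_all <- lm(Value ~ gene * (batch + time + dose), data = toy)

# Separate models, one per gene
fits_by_gene <- lapply(split(toy, toy$gene),
                       function(d) lm(Value ~ batch + time + dose, data = d))

# Residual sum of squares and residual degrees of freedom
rss_all <- deviance(fit_all)
rss_sep <- sum(sapply(fits_by_gene, deviance))
df_all  <- df.residual(fit_all)
df_sep  <- sum(sapply(fits_by_gene, df.residual))

print(c(rss_all = rss_all, rss_sep = rss_sep))
print(c(df_all = df_all, df_sep = df_sep))
```

The fitted values are the same either way; what differs is that the single model pools the residual variance over all genes (one common error variance), whereas the separate fits estimate one residual variance per gene.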
The actual data frame would look like this:
Value batch time dose gene
5.225589 NEW 24h 000mM 100001_at
5.207835 NEW 24h 000mM 100001_at
4.138210 NEW 24h 000mM 100001_at
7.253535 OLD 24h 000mM 100001_at
...
4.018591 PRG 04h 025mM 100001_at
7.205778 PRG 04h 000mM 100001_at
8.191978 NEW 24h 000mM 100002_at
I'm absolutely not sure about this. There are several problems:
1. What about the degrees of freedom? They're huge.
2. I don't know how to interpret summary(fit).
3. It is computationally impossible (on my machine) ;-( ...
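To put rough numbers on points 1 and 3, here is a back-of-the-envelope sketch. The 12,000 probesets come from this post; the 36 arrays and the factor levels (3 batches, 2 times, 2 doses) are assumptions made only for illustration, and a dense design matrix of doubles is assumed.

```r
# Assumed sizes (only n_genes is taken from the post; the rest is hypothetical)
n_genes  <- 12000   # probesets on the array
n_arrays <- 36      # assumed number of arrays
n_batch  <- 3       # assumed number of batch levels
n_time   <- 2       # assumed number of time levels
n_dose   <- 2       # assumed number of dose levels

# Coefficients per gene: intercept + (levels - 1) for each factor
p_gene <- 1 + (n_batch - 1) + (n_time - 1) + (n_dose - 1)

# Single model with gene interacting with every factor
n_rows   <- n_genes * n_arrays   # one row per gene per array
n_cols   <- n_genes * p_gene     # one block of coefficients per gene
resid_df <- n_rows - n_cols      # residual degrees of freedom

# Memory for a dense design matrix of doubles (8 bytes each), in GB
design_gb <- n_rows * n_cols * 8 / 1e9

print(c(rows = n_rows, cols = n_cols, resid_df = resid_df))
print(design_gb)
```

Under these assumptions the dense design matrix alone would need on the order of 200 GB, which is consistent with the single model being infeasible on an ordinary machine, while each per-gene model has only a handful of coefficients.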
I'm more interested in whether anybody here has already tried this seriously, i.e. worked on the statistical theory + biological interpretation.
kind regards,
Arne
--
Arne Muller, Ph.D.
Toxicogenomics, Aventis Pharma
arne dot muller domain=aventis com