# [Bioc-devel] Making hypothesis testing easier with design matrices?

Ryan C. Thompson rct at thompsonclan.org
Tue Dec 11 06:41:34 CET 2012

```
Dear Gordon,

After a bit of pen-and-paper work, I see what you mean about additive
models. I constructed a simple 2x2 additive model (i.e. "~a+b" where a
and b each have 2 levels) and tried to solve for all 4 groups, and found
that it was impossible. The best that can be done is solving for two out
of the four, plus the mean of the other two. Clearly an interaction term
would be required to resolve the other two. So I see that my proposal is
indeed impossible to carry out in the general case, and in every case
where it is possible, one may as well use a no-intercept parametrization
and be done with it. Thanks for clarifying.

My remaining question concerns the case where one has an
additive model, but only one factor (or equivalently, one set of
interacting factors) is of interest and the rest are blocking factors.
For example, suppose the model is "~condition + donor", but donor is
just a blocking factor and only condition is of interest. If one used
the no-intercept formula "~0+condition+donor" and set "donor" to use
sum-to-zero contrasts, then am I correct in thinking that the
coefficients corresponding to levels of "condition" would then be usable
as estimates of the average logCPM for each condition? If so, would
these estimates be any better than simply computing logCPM individually
for each sample and taking the mean of all the samples in each group?

Sincerely,
-Ryan

On Mon 10 Dec 2012 05:56:07 PM PST, Gordon K Smyth wrote:
>
> Dear Ryan,
>
> Thanks for your suggestion. I think though that the attribute that
> you are thinking of implementing is not actually something that exists
> in general.
...
> This is so only for one-way designs, i.e., for single factor experiments.
...
> For additive models however, I think there is no shortcut to a user
> trying to understand what the fitted coefficients represent.
>
> Best wishes
> Gordon

```
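The two design-matrix points in Ryan's message can be checked with a little exact arithmetic. What follows is a minimal sketch, not code from the thread. It assumes treatment coding for "~a+b" and a hypothetical layout of two conditions ("ctrl", "trt") and three donors ("d1" to "d3") for "~0+condition+donor". It uses only the Python standard library and looks only at the design matrices, on the scale of the linear predictor. It fits no data and does not address whether such coefficients are better estimates than per-sample means.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix, by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Part 1: "~a+b" with two levels each, treatment coding (an assumption).
# Columns: intercept, a2, b2. One row per group (cell) of the 2x2 layout.
additive_2x2 = {
    ("a1", "b1"): [1, 0, 0],
    ("a2", "b1"): [1, 1, 0],
    ("a1", "b2"): [1, 0, 1],
    ("a2", "b2"): [1, 1, 1],
}
design_rank = rank(list(additive_2x2.values()))

# The interaction contrast (+1, -1, -1, +1) across the four groups combines
# the design rows to zero, so the four group means cannot vary freely.
signs = [1, -1, -1, 1]
interaction_contrast = [
    sum(s * row[j] for s, row in zip(signs, additive_2x2.values()))
    for j in range(3)
]

# Part 2: "~0+condition+donor" with sum-to-zero coding for donor.
# Condition and donor names are hypothetical, chosen for illustration.
conditions = ["ctrl", "trt"]
donor_sum_coding = {"d1": [1, 0], "d2": [0, 1], "d3": [-1, -1]}


def design_row(condition, donor):
    """Design row: one indicator per condition, then the donor contrasts."""
    indicator = [1 if condition == c else 0 for c in conditions]
    return indicator + donor_sum_coding[donor]


def donor_average(condition):
    """Equal-weight average of a condition's design rows over all donors."""
    rows = [design_row(condition, d) for d in donor_sum_coding]
    return [Fraction(sum(col), len(rows)) for col in zip(*rows)]


print("rank of 2x2 additive design:", design_rank)
print("interaction contrast of rows:", interaction_contrast)
for c in conditions:
    print(c, [str(x) for x in donor_average(c)])
```

The rank is 3 for four groups, which is why no reparametrization of the additive model can give one freely estimated coefficient per group. In the second part, the donor columns cancel when averaged over donors. Under these assumptions each condition coefficient is therefore the equal-weight average, over donors, of the linear predictor for that condition.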