[BioC] Modelling interaction using limma package

Ramon Diaz-Uriarte rdiaz at cnio.es
Wed Sep 22 11:29:41 CEST 2004


Dear Fangxin,


On Tuesday 21 September 2004 18:38, Fangxin Hong wrote:
> Dear Ramon;
>    Thank you very much for the detailed answer of my question, it works
> for me!!!

I am glad it was helpful.

>
> One more question, for one factor, if different individual (gene) has
> different number of levels (thus different design and constrast matrix).
> Theoretically, this is still doable using the proposed linear model with
> Emprical Bayes method according to the paper, but in practice, do you or
> anybdy on this list know how to do it?
>
> what I am thinking is to divide the whole data set into several groups,
> all individuals in one group has same number of levels, then apply limma
> package (lmFit and eBayes) as usual. Although this changes the original
> assumption that variance from all individuals are from the same
> distribution with the one that variance from individuals in each group are
> from the same distribution, it should still be valid analysis as each
> group will still have large amount of genes. Does anyone have better idea
> beside writing my own code of doing eBayes.


I do not know how I'd approach the issue, and you obviously have thought about 
it more than I. However, I am having a hard time seeing how you get different 
levels for different genes for the same array. This factor must be "gene 
specific", but then, does it make sense to try a "single unified analysis" 
across all genes? It would even seem that the null you are testing is not the 
same across genes. For instance, if your factor has levels 1, 2, 3, and in 
gene A you have levels 1 and 2, and in gene B you have levels 2 and 3, then: 
for A, H_o: \mu_1 = \mu_2, but for B, H_o: \mu_2 = \mu_3.  This seems very 
strange to me...

Best,

R.



>
> Thanks.
>
> Fangxin
>
> > Dear Fangxin,
> >
> > On Tuesday 14 September 2004 19:41, Fangxin Hong wrote:
> >> Does anyone know how to model interaction (two-way ANOVA) using limma
> >> package (lmFit)?
> >
> > You need to set up the design matrix and decide what contrasts you are
> > interested in. At the bottom there is a somewhat long example (you'll
> > have to
> > supply your own data; this is from a script I have); I have added a few
> > checks to make sure I was testing what I wanted to test. Please note
> > that, for some reason I don't recall now, I made some minor modification
> > to classifyTestsF because it was more convenient for my purposes.
> > [Disclaimer:
> > my example is probably overly complicated].
> >
> >> Also, it seems that eBayes can be applied to other model fit object,
> >> like
> >> 'lm.series', 'gls.series' and  'rlm.series' beside 'lmFit'. Does anyone
> >> use this before?
> >
> > Yes. I've used it with lm and rlm, and maybe also with gls. No major
> > differences. But of course, you might want to make sure that using gls or
> > rlm
> > (or lm) makes sense in your case; for instance, the MASS book (by
> > Venables &
> > Ripley) I think contains some cautionary points about using rlm and
> > related
> > methods with too small sample sizes.
> >
> >
> > Hope this helps.
> >
> >
> > ******************************
> >
> > The example; I can email you privately this as an attachement.
> >
> > ### treat is a factor, and age:centered is continuous.
> >
> >
> > options(contrasts = c("contr.sum", "contr.poly"))
> >
> > d.trt.BY.age <- model.matrix(~ -1 + treat + treat:age.centered)
> > colnames(d.trt.BY.age) <- c("Colon", "Mama", "Normal",
> >                          "Colon.by.age", "Mama.by.age", "Normal.by.age")
> > contrasts.d.trt.BY.age <-
> >     makeContrasts(Colon - Normal,
> >                   Mama - Normal,
> >                   Mama - Colon,
> >                   0.5*Colon + 0.5* Mama - Normal, ## testing "cancer" vs.
> > normal
> >                   Colon.by.age - Normal.by.age,
> >                   Mama.by.age - Normal.by.age,
> >                   Mama.by.age - Colon.by.age,
> >                   levels = d.trt.BY.age)
> >
> >
> > lm.trt.BY.age <- lm.series(d.clean, d.trt.BY.age)
> > lm.trt.BY.age.eb <- ebayes(lm.trt.BY.age)
> > lm.trt.BY.age.contrasts <- contrasts.fit(lm.trt.BY.age,
> >                                          contrasts.d.trt.BY.age)
> > lm.trt.BY.age.contrasts.eb <- ebayes(lm.trt.BY.age.contrasts)
> >
> >
> >
> >
> > d.trt.BY.age2 <- model.matrix(~ -1 + treat + age.centered +
> >                               treat*age.centered)
> > colnames(d.trt.BY.age2) <- c("Colon", "Mama", "Normal",
> >                              "age", "Colon.by.age", "Mama.by.age")
> > contrasts.d.trt.BY.age2 <- makeContrasts(Colon.by.age,
> >                                         Mama.by.age,
> >                                         levels = d.trt.BY.age2)
> >
> >
> >
> >
> > #### Get the interaction F-statistic
> > tmp <-
> >     classifyTestsF(lm.trt.BY.age2.contrasts.eb$t,
> >                    contrasts = contrasts.d.trt.BY.age2,
> >                    design = d.trt.BY.age2,
> >                    df = lm.trt.BY.age2$df +
> > lm.trt.BY.age2.contrasts.eb$df,
> >                    fstat.only = TRUE)
> > lm.interaction.Fstat <- cbind(F.stat =tmp$Fstatistic,
> >                               df1 = rep(2, length(tmp$Fstatistic)),
> >                               df2 = tmp$df2)
> > lm.interaction.Fstat <- cbind(lm.interaction.Fstat,
> >                               p.value = pf(lm.interaction.Fstat[,1],
> >                               lm.interaction.Fstat[,2],
> >                               lm.interaction.Fstat[,3],
> >                               lower.tail = FALSE))
> > lm.interaction.Fstat <- cbind(lm.interaction.Fstat,
> >                               fdr.p.value
> > p.adjust(lm.interaction.Fstat[, 4],
> >                                        method = "fdr"))
> > summary(lm.interaction.Fstat)
> >
> >
> >
> > #############    Checks
> >
> > ### check appropriate contrasts testing for testing interaction via
> > calssifyTestsF
> > l0 <- lm(d.clean[2, ] ~ -1 + treat + age.centered)
> > l1 <- lm(d.clean[2, ] ~ -1 + treat + age.centered + age.centered*treat)
> > l2 <- lm(d.clean[2, ] ~ -1 + treat + treat:age.centered)
> >
> > anova(l0)
> > anova(l1)
> > anova(l2)
> >
> > anova(l0, l1) ## this and the following are the same
> > anova(l0, l2)
> > anova(l1, l2) ## nothing, as expected
> >
> > ## just for checking!!! Note I modify ts, etc.
> > lm.trt.BY.age2.contrasts.eb$t[2,] <- summary(l1)$coefficients[5:6, 3]
> >
> > classifyTestsF(lm.trt.BY.age2.contrasts.eb$t[2, ],
> >                contrasts = contrasts.d.trt.BY.age2[2, ],
> >                design = d.trt.BY.age2,
> >                df = lm.trt.BY.age2$df[2, ] +
> >                lm.trt.BY.age2.contrasts.eb$df,
> >                fstat.only = TRUE)
> > ### OK, works as it should
> >
> >
> >
> >
> > ### changes to classifyTestsF
> >
> > ## I change the return value when fstat.only = TRUE
> > classifyTestsF <-
> > function (object, cor.matrix = NULL, design = NULL,
> >           contrasts = diag(ncol(design)),
> >     df = Inf, p.value = 0.01, fstat.only = FALSE)
> > {
> >     if (is.list(object)) {
> >         if (is.null(object$t))
> >             stop("tstat cannot be extracted from object")
> >         if (missing(design) && !is.null(object$design))
> >             design <- object$design
> >         if (missing(contrasts) && !is.null(object$contrasts))
> >             contrasts <- object$contrasts
> >         if (missing(df) && !is.null(object$df.prior) &&
> >             !is.null(object$df.residual))
> >             df <- object$df.prior + object$df.residual
> >         tstat <- as.matrix(object$t)
> >     }
> >     else {
> >         tstat <- as.matrix(object)
> >     }
> >     ngenes <- nrow(tstat)
> >     ntests <- ncol(tstat)
> >     if (ntests == 1) {
> >         qT <- qt(p.value/2, df, lower.tail = FALSE)
> >         return(sign(tstat) * (abs(tstat) > qT))
> >     }
> >     if (!is.null(cor.matrix) && !is.null(design))
> >         stop("Cannot specify both cor.matrix and design")
> >     if (!is.null(design)) {
> >         design <- as.matrix(design)
> >         R <- chol(crossprod(design))
> >         cor.matrix <- crossprod(backsolve(R, contrasts, transpose =
> > TRUE)) d <- sqrt(diag(cor.matrix))
> >         cor.matrix <- cor.matrix/(d %*% t(d))
> >     }
> >     if (is.null(cor.matrix)) {
> >         r <- ntests
> >         Q <- diag(r)/sqrt(r)
> >     }
> >     else {
> >         E <- eigen(cor.matrix, symmetric = TRUE)
> >         r <- sum(E$values/E$values[1] > 1e-08)
> >         Q <- matvec(E$vectors[, 1:r], 1/sqrt(E$values[1:r]))/sqrt(r)
> >     }
> >     if (fstat.only) {
> >         fstat <- list()
> >         fstat$Fstatistic <- drop((tstat %*% Q)^2 %*% array(1, c(r, 1)))
> >         fstat$df1 <- r
> >         fstat$df2 <- df
> >         return(fstat)
> >     }
> >     qF <- qf(p.value, r, df, lower.tail = FALSE)
> >     if (length(qF) == 1)
> >         qF <- rep(qF, ngenes)
> >     result <- matrix(0, ngenes, ntests, dimnames = dimnames(tstat))
> >     if (is.null(colnames(tstat)) && !is.null(colnames(contrasts)))
> >         colnames(result) <- colnames(contrasts)
> >     for (i in 1:ngenes) {
> >         x <- tstat[i, ]
> >         if (any(is.na(x)))
> >             result[i, ] <- NA
> >         else if (crossprod(crossprod(Q, x)) > qF[i]) {
> >             ord <- order(abs(x), decreasing = TRUE)
> >             result[i, ord[1]] <- sign(x[ord[1]])
> >             for (j in 2:ntests) {
> >                 bigger <- ord[1:(j - 1)]
> >                 x[bigger] <- sign(x[bigger]) * abs(x[ord[j]])
> >                 if (crossprod(crossprod(Q, x)) > qF[i])
> >                   result[i, ord[j]] <- sign(x[ord[j]])
> >                 else break
> >             }
> >         }
> >     }
> >     new("TestResults", result)
> > }
> >
> >
> >
> >
> >
> >
> >
> > --
> > Ramón Díaz-Uriarte
> > Bioinformatics Unit
> > Centro Nacional de Investigaciones Oncológicas (CNIO)
> > (Spanish National Cancer Center)
> > Melchor Fernández Almagro, 3
> > 28029 Madrid (Spain)
> > Fax: +-34-91-224-6972
> > Phone: +-34-91-224-6900
> >
> > http://ligarto.org/rdiaz
> > PGP KeyID: 0xE89B3462
> > (http://ligarto.org/rdiaz/0xE89B3462.asc)

-- 
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900

http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)



More information about the Bioconductor mailing list