[BioC] Limma time course analysis - defining comparisons
Helen Zhou
zhou.helen at yahoo.com
Tue Jun 17 19:01:01 CEST 2008
Dear James and Others
I have followed your suggestion from last week, about
trying to define all contrasts in a time course
analysis, and then using the nestedF option. However,
I am afraid that I am still a bit confused.
I run the analysis shown below, and try decideTests
both with and without method="nestedF". The rsult is
also shown below, but too summarize it is:
Without nested F:
summary(results)
# h24-h12 h12-h4 h24-h4
#-1 87 2 237
#0 21714 22280 21266
#1 482 1 780
With nested F
summary(resultsF)
# h24-h12 h12-h4 h24-h4
#-1 94 41 159
#0 21706 22210 21547
#1 483 32 577
I am not sure why I see such a big (relative) change
for 12hour versus 4hour. In both cases I have a direct
comparison. Acording to the help page for
classifyTestsF that is being called, it says:
"classifyTestsF uses a nested F-test approach giving
particular attention to correctly classifying genes
which have two or more significant t-statistics, i.e.,
are differential expressed under two or more
conditions. For each row of tstat, the overall
F-statistics is constructed from the t-statistics as
for FStat. At least one constrast will be classified
as significant if and only if the overall F-statistic
is significant. If the overall F-statistic is
significant, then the function makes a best choice as
to which t-statistics contributed to this result. The
methodology is based on the principle that any
t-statistic should be called significant if the F-test
is still significant for that row when all the larger
t-statistics are set to the same absolute size as the
t-statistic in question."
I think I understand that. But I am surprised that it
makes such a big difference. The difference is even
bigger with my real data set with 6 time points,
compared to the data from bioconductor that I am using
here. Also, which of the two approaches is more
trustworthy in the opinion of the bioconductor-list
users? With or without nestedF?
Thank you
Helen
# All commands
library(bronchialIL13)
data(HAHrma)
# Just for the IL13 samples
data <- HAHrma[,7:15]
# Design
targets <-
c("h12","h12","h12","h24","h24","h24","h4","h4","h4")
lev <- c("h12","h24","h4")
f <- factor(targets, levels=lev)
design <- model.matrix(~0+f)
colnames(design) <- lev
fit <- lmFit(data, design)
# 3-step contrasts, used to directly get all
comparisons
contrasts <- makeContrasts("h24-h12", "h12-h4",
"h24-h4", levels=design)
fit2 <- contrasts.fit(fit, contrasts)
fit2 <- eBayes(fit2)
# results
results <- decideTests(fit2, adjust="BH",
p.value=0.01)
resultsF <- decideTests(fit2, method="nestedF",
adjust="BH", p.value=0.01)
summary(results)
# h24-h12 h12-h4 h24-h4
#-1 87 2 237
#0 21714 22280 21266
#1 482 1 780
summary(resultsF)
# h24-h12 h12-h4 h24-h4
#-1 94 41 159
#0 21706 22210 21547
#1 483 32 577
--- "James W. MacDonald" <jmacdon at med.umich.edu>
wrote:
> Hi Helen,
>
> Helen Zhou wrote:
> > Dear Jim
> >
> > Thanks you for your answer. As I understand you
> are
> > recommending me to do direct comparisons between
> for
> > example 12-4h and 24-12h. However, could there not
> in
> > theory be a gene that was up a little bit in
> 12vs4h
> > and 24vs12h, so that the difference for neither of
> > these would be large enough to be significant, but
> for
> > 24vs4h the combined change might be significant?
> In
> > that case I guess I would need all 3 comparisons?
>
> Yes. The 'standard' linear modeling strategy is to
> fit the model and see
> if you have a significant F-statistic, then come
> back and do contrasts
> to see which particular comparison(s) is/are
> contributing to that
> significance. The general idea being that you want
> to limit the number
> of comparisons you are making so multiplicity is
> less of a problem.
>
> However, with microarray data this isn't usually
> what people do.
> Instead, they usually try to minimize the
> multiplicity by filtering out
> uninteresting reporters prior to making comparisons
> (based on variance,
> P/M/A calls, IQR, etc) and then computing all
> contrasts that they are
> interested in with the reporters that remain.
>
> You might consider using decideTests() using the
> 'nestedF' option - this
> is intended to correctly detect differential
> expression when two or more
> contrasts are significant, so in the case you
> mention may classify all
> three contrasts as significant.
>
> Best,
>
> Jim
>
>
> >
> > Thanks you
> > Helen
> >
> > --- "James W. MacDonald" <jmacdon at med.umich.edu>
> > wrote:
> >
> >> Hi Helen,
> >>
> >> Helen Zhou wrote:
> >>> Dear Sir/Madam
> >>>
> >>> I am trying to analyse a short time series,
> >> roughly
> >>> following section 8.8 in the limma user guide. I
> >> am
> >>> interested in differences between all time
> points.
> >> I
> >>> am not sure whether I have to make all the
> >> pariwise
> >>> comparisons directly, or whether they can be
> done
> >>> indirectly as well.
> >>>
> >>> For example, if I want to compare to time
> points,
> >> what
> >>> is the differences between the two methods
> listed
> >>> below.
> >>>
> >>> library(bronchialIL13)
> >>> # Just for the IL13 samples
> >>> data <- HAHrma[,7:15]
> >>> # Design
> >>> targets <-
> >>>
> >
>
c("h12","h12","h12","h24","h24","h24","h4","h4","h4")
> >>> lev <- c("h12","h24","h4")
> >>> f <- factor(targets, levels=lev)
> >>> design <- model.matrix(~0+f)
> >>> colnames(design) <- lev
> >>> fit <- lmFit(data, design)
> >>> # 2-step contrasts, used to indirectly get 24 to
> 4
> >>> hours as well as the other two comparisons
> >>> contrasts <- makeContrasts("h24-h12", "h12-h4",
> >>> levels=design)
> >>> fit2 <- contrasts.fit(fit, contrasts)
> >>> fit2 <- eBayes(fit2)
> >>> # Direct contrast of 24 to 4 hours
> >>> contrasts2 <- makeContrasts("h24-h4",
> >> levels=design)
> >>> fit3 <- contrasts.fit(fit, contrasts2)
> >>> fit3 <- eBayes(fit3)
> >>> # Comparison
> >>> topTable(fit2, coef=1:2)
> >>> topTable(fit3, coef=1)
> >> In the first case you are asking the question
> 'which
> >> reporters are
> >> different in either h24 vs h4 _or_ h12 vs h4',
> >> whereas in the second
> >> case you are asking 'which reporters are
> different
> >> between H24 and h4'.
> >>
> >> It is entirely possible that you could have a
> gene
> >> that isn't different
> >> between h24 and h4, but is different at h12. This
> >> would show up in the
> >> first comparison but not the second, so if you
> want
> >> to compare time
> >> points you are better off making direct contrasts
> >> rather than using the
> >> F-statistic for multiple contrasts (which will
> then
> >> require the
> >> additional step of figuring out which contrast(s)
> >> caused the statistic
> >> to be significant).
> >>
> >> Best,
> >>
> >> Jim
> >>
> >>
> >>> More or less the same probe sets are present,
> but
> >> in
> >>> different order and with different p values. Is
> >> the
> >>> difference because using coef=1:2 will go via an
> >>> F-test? And if I want the change from 24h-0h as
> >> well
> >>> as 42-12h and 12-4h, is it most correct for me
> to
> >>> specify that contrast directly? In my actual
> >>> experiment I have 4 time points, so will it be
> >> enough
> >>> for me with 3 possible comparisons, or will I
> have
> >> to
> >>> write all the 6 possible combinations?
> >>>
> >>> Thank you in advance for all your help.
> >>>
> >>> Yours truly
> >>> Mrs Helen Zhou
> >>>
> >>> P.S. I think this might have been mentioned on
> the
> >>> list before, but I could not find the email. In
> >> that
> >>> case, please excuse me for repeating this.
> >>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>>
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives:
> >
>
http://news.gmane.org/gmane.science.biology.informatics.conductor
> >> --
> >> James W. MacDonald, M.S.
> >> Biostatistician
> >> Affymetrix and cDNA Microarray Core
> >> University of Michigan Cancer Center
> >> 1500 E. Medical Center Drive
> >> 7410 CCGC
> >> Ann Arbor MI 48109
> >> 734-647-5623
> >>
> >
> >
> > __________________________________________________
> > Do You Yahoo!?
> protection around
> > http://mail.yahoo.com
> >
>
>
=== message truncated ===
More information about the Bioconductor
mailing list