[BioC] limma2annaffy output and heatmap questions

Fri Sep 26 21:46:00 CEST 2008

Hi Hui-Yi,

Hui-Yi Chu wrote:
> Hi Jim,
> 
> Thank you for the answer of p.value! and sorry for no codes in mail. The
> followings are one example of my data and codes.
> 
> *Example *(partially copied from my result table when coef=1, 1st probeID):
> t-statistic  P-value  Fold-change  mut.a         mut.a        wt1
> wt2
> 108.84      0           10.28           9.90436     9.88196   10.2249
> 10.1021

This isn't possible. You can't have  a fold change of 10.28 for these 
values. The actual number will be something like -0.2 (The mean of 9.9 
and 9.88 minus the mean of 10.22 and 10.1 *cannot* be 10.28).

> 
> 
> *Codes*:
> ## data importing
> library("affy")
> library("limma")
> targets <- readTargets("fed_total.txt")
> aaa <- ReadAffy(filenames = targets $ filename)
> eset <- rma(aaa)
> pData(eset)
> samples <- data.frame(genotype = c("wt", "wt", "mut.a", "mut.a","mut.b",
> "mut.b"),replicate =c(1,2,1,2,1,2))
> varInfo <- data.frame(labelDescription=c("wt, mut.a, mut.b", "arb
> number"))
> pd <- new("AnnotatedDataFrame", data = samples, varMetadata = varInfo)
> phenoData(eset) <- new("AnnotatedDataFrame", data = samples, varMetadata =
> varInfo)
> 
> ## filtration
> library("genefilter")
> f1 <- anyNA
> f2 <- pOverA(0.25, log2(100))
> ff <- filterfun(f1, f2)
> selected <- genefilter(eset, ff)
> esetsub <- eset[selected,]
> 
> ## limma fit and contrast
> library("limma")
> design <- model.matrix(~0+factor(c(1,1,2,2,3,3)))
> colnames(design) <- c("wt", "mut.a","mut.b")
> fit <- lmFit(ef, design)
> fit2 <- eBayes(fit)
> contrast.matrix <- makeContrasts("mut.a vs wt" = mut.a-wt,
>                                  "mut.b vs wt" = mut.b-wt,
>                                  "Diff" = (mut.a-wt)-(mut.b-wt),
>                                  levels=design)

You realize that your "Diff" is mut.a - mut.b, yes?

> fit2 <- contrasts.fit (fit, contrast.matrix)
> fit3 <- eBayes(fit2)
> 
> ## find DEG and draw heatmap
> x <- topTable(fit3, coef=1, adjust="fdr", sort.by="P", number=10000)
> y <- x[x$adj.P.Val < 0.01,]
> results <- decideTests (fit3, method="separate",
>                         adjust.method="BH",p.value=0.01,lfc=1)
> 
>  ## Cluster and heatmaps ------ *draw in three ways, but confused with
> results*
>          ##1.   heatmap(as.matrix(esetsub[y$ID,]), scale="none")
> 
>            #2.   library("Heatplus")
>                   heatmap_2(esetsub[y$ID,]),  scale="none")
>            #3.  library(gplots)
>                  heatmap.2(esetsub[y$ID,]), scale= "row", trace="none",
> density.info="none")
>                ## or
>                  heatmap.2(esetsub[y$ID,]), scale= "none", trace="none",
> density.info="none")
> 
> ## html file output
> library("affycoretools")
> library("yeast2.db")
> annotation(eset) <- "yeast2.db"
> limma2annaffy(eset, fit3, design, adjust = "fdr",
>         contrast.matrix, annotation(eset), pfilt = 0.01, fldfilt= 1)
> 
> 
> *Questions*:
> I have tons of questions about heatmap, and I sincerely appreciate if
> anybody can give me some idea.
>        1. what is the scale??  I know there are some discussion in the
> maillist, but I am still confused. The result from scale=row is easy to
> interpret since we can see some blocks across samples, but how it works??

 From ?heatmap.2

  scale: character indicating if the values should be centered and
           scaled in either the row direction or the column direction,
           or none.  The default is '"row"' if 'symm' false, and
           '"none"' otherwise.

You might look at ?scale as well.

>        2.  when I change the coef=2, the heatmap result is totally
> different, I know the coef=2 is comparing mut.b and wt, so what about the
> same group of genes in mut.a?

Not sure what this means. It's a different comparison, so you get 
different genes. If you want the significant genes from the first 
comparison, just get the significant genes and extract those from your 
ExpressionSet.

>        3. The result of limma contains logFC, AvrExprs, t-statistic, p
> value,etc., which value that is used to draw cluster?? logFC or AveExprs??
> Some papers their color key display fold change, does that mean I have to do
> something to get the ratio before I draw heatmap?

Again, not sure what you are getting at. You aren't using anything from 
limma except for the gene names that are being called significant. The 
data are the expression values from your ExpressionSet.

> 
> 4. same question with html table, how does the fold change generate?

It is the difference between the average expression for each group.

> 
> 
> Thank you very much!!!!!!!!!!
> Hui-Yi
> 
> 
> *sessionInfo()*
> R version 2.7.2 (2008-08-25)
> i386-pc-mingw32
> 
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] splines   tools     stats     graphics  grDevices utils     datasets
> [8] methods   base
> 
> other attached packages:
>  [1] yeast2.db_2.2.0      affycoretools_1.12.1 annaffy_1.12.1
>  [4] KEGG.db_2.2.0        biomaRt_1.14.1       RCurl_0.9-3
>  [7] GOstats_2.6.0        Category_2.6.0       RBGL_1.16.0
> [10] annotate_1.18.0      xtable_1.5-3         GO.db_2.2.0
> [13] AnnotationDbi_1.2.2  RSQLite_0.7-0        DBI_0.2-4
> [16] graph_1.18.1         affyPLM_1.16.0       gcrma_2.12.1
> [19] matchprobes_1.12.1   genefilter_1.20.0    survival_2.34-1
> [22] yeast2cdf_2.2.0      limma_2.14.6         affy_1.18.2
> [25] preprocessCore_1.2.1 affyio_1.8.1         Biobase_2.0.1
> 
> loaded via a namespace (and not attached):
> [1] cluster_1.11.11 XML_1.94-0.1
> 
> 
> 
> On Fri, Sep 26, 2008 at 8:32 AM, James W. MacDonald
> <jmacdon at med.umich.edu>wrote:
> 
>> Hi Hui-Yi,
>>
>> Hui-Yi Chu wrote:
>>
>>> Hi everyone,
>>>
>>> Apologize first for probably simple question below.
>>> Thanks for Jim's help, I've got some html files from limma2annaffy
>>> function,
>>> however, I am not pretty sure how to interpret the result of "fold
>>> change".
>>> Please correct me if I am wrong. Based on my knowledge, after limma, for
>>> example in coef=1, the fold change represents the contrast of the
>>> expression
>>> value means from two groups. But the html result made me confused:
>>>
>>> The header of html is like this:
>>> probes   Description   Pubmed   Gene Ontology   Pathway    t-statistic
>>> P-value    Fold-change    array1   array1  array2   array2
>>>
>>>
>>> The question is why the fold change is not really the value that (mean of
>>> group1)- (mean of group2) ?? If it is because of permutation??
>>>
>> The fold change *is* mean(goup1) - mean(group2). If you show us your code
>> and a snippet of the output we might be able to help (please see the posting
>> guide).
>>
>>  I've read limma user guide and ?limma2annaffycore, so I believe the value
>>> of
>>> array 1 and 2 are expression value.
>>> By the way, the function of pflt filter genes by p value, but if I want to
>>> filter gene by adj.p.val??
>>>
>> I don't know of any function pflt. However, if you mean the argument pfilt,
>> then it *does* filter by adjusted p-value if in fact you are adjusting the
>> p-value to begin with.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>>> Thank you very much!!
>>> Hui-Yi
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Hildebrandt Lab
>> 8220D MSRB III
>> 1150 W. Medical Center Drive
>> Ann Arbor MI 48109-0646
>> 734-936-8662
>>
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662