[BioC] (no subject)

Liu, XiaoChuan xiaochuan.liu at mssm.edu
Fri Mar 8 17:30:38 CET 2013

Dear Simon,
For question 1: Thanks very much for your suggestion! I will read it.
For question 2: Our biological question is whether we can find genes that are differentially expressed because of the two factors across these samples. How would you formulate this as a linear model? Like the one below?

fit5 = fitNbinomGLMs( cds, count ~ condition * treatment )
fit4 = fitNbinomGLMs( cds, count ~ 1 )
pvalsGLM6 = nbinomGLMTest( fit5, fit4 )
padjGLM6 = p.adjust( pvalsGLM6, method="BH" )

For question 3: I did the OLS ANOVA analysis on the vST data. Do you think it is correct? The code is attached.



-----Original Message-----
From: bioconductor-bounces at r-project.org [mailto:bioconductor-bounces at r-project.org] On Behalf Of Simon Anders
Sent: Friday, March 08, 2013 11:13 AM
To: bioconductor at r-project.org
Subject: Re: [BioC] (no subject)

Dear Leo

On 08/03/13 16:54, Liu, XiaoChuan wrote:
> 1.      What does "count ~ 1" mean? Is the 1 a cut-off, or does it
> mean something else? I saw an example like this in the DESeq
> Reference Manual, so I tried to follow it, but I do not know what
> it means for the test.

No, "~ 1" means that no factors except for the intercept should be used in the model. This formula notation is not specific to DESeq, it is a part of R and hence discussed in most textbooks covering R. (If you are unfamiliar with linear models and the associated concepts, the books by Dobson or Dalgaard might be a useful read.)
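
As a small illustration of the formula notation in base R (a sketch only; the 2x2 design and its level names below are made up, not taken from the poster's data), the design matrices for "~ 1" and for "~ condition * treatment" can be compared directly:

```r
# Hypothetical 2x2 design with two replicates per cell (illustration only)
design <- data.frame(
  condition = factor(rep(c("A", "B"), each = 4)),
  treatment = factor(rep(c("ctrl", "drug"), times = 4))
)

# "~ 1": intercept only, i.e. a single column of ones
X0 <- model.matrix(~ 1, data = design)
print(colnames(X0))

# "~ condition * treatment": intercept, both main effects, and the interaction
X1 <- model.matrix(~ condition * treatment, data = design)
print(colnames(X1))
```

The intercept-only matrix has one column, so a model fitted with it assumes the same expected value for every sample; the full model adds one column per main effect and one for the interaction.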

> 2.      In Part 2, I ran 6 different GLM tests with DESeq and also
> computed the overlap among the results. The overlap is sometimes very
> small. Why are they so different? How can this be explained? Could
> you give me some comments?

Sorry, I don't understand the question. You perform tests for different hypotheses and are surprised to get different results?

You will need to tell us more specifically what biological hypotheses you wish to test, and then we can maybe advise you how to formulate this as a linear model.
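
To illustrate why the choice of hypothesis matters, here is a minimal sketch in base R. It assumes simulated counts for a single hypothetical gene and uses a Poisson GLM as a stand-in for DESeq's negative binomial fit, so it is not the DESeq method itself. Each pair of nested models answers a different question:

```r
set.seed(1)

# Simulated counts for one hypothetical gene: only 'condition' has an effect
d <- expand.grid(replicate = 1:3,
                 condition = c("A", "B"),
                 treatment = c("ctrl", "drug"))
d$count <- rpois(nrow(d), lambda = ifelse(d$condition == "B", 200, 100))

# Nested models, from most to least complex
fit_full      <- glm(count ~ condition * treatment, family = poisson, data = d)
fit_additive  <- glm(count ~ condition + treatment, family = poisson, data = d)
fit_treatment <- glm(count ~ treatment,             family = poisson, data = d)
fit_null      <- glm(count ~ 1,                     family = poisson, data = d)

# Likelihood-ratio (analysis of deviance) p-value for a reduced vs. full model
lrt_p <- function(reduced, full) {
  anova(reduced, full, test = "Chisq")[2, "Pr(>Chi)"]
}

p_any         <- lrt_p(fit_null, fit_full)          # any effect of either factor?
p_interaction <- lrt_p(fit_additive, fit_full)      # does the treatment effect depend on condition?
p_condition   <- lrt_p(fit_treatment, fit_additive) # condition effect, adjusting for treatment?

print(c(any = p_any, interaction = p_interaction, condition = p_condition))
```

Comparing "count ~ condition * treatment" against "count ~ 1" corresponds to the first question only; gene lists from the other comparisons need not overlap much with it.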

> 3.      In Part 3, I also computed the overlap with the 6 results from
> Part 2, but the overlaps are very small. I wonder whether I misused
> the variance stabilizing transformation? If I want to use the ANOVA
> function in R directly to calculate co-factor P-values, can I use the
> raw counts? Or how should I normalize the raw counts so that I can
> use the ANOVA function in R?

No, ordinary least squares (OLS) ANOVA requires the data to be
homoscedastic, and count data are not. This is, after all, the whole
point of either using GLMs on the raw count data, or OLS ANOVA on
variance-stabilized data.
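
The heteroscedasticity of counts can be seen in a small simulation. This is a sketch under simplifying assumptions: the counts are simulated as Poisson, and the square root is used because it is the textbook variance-stabilizing transformation for Poisson data; it is not the transformation DESeq computes for negative binomial data.

```r
set.seed(42)

# Simulated Poisson counts at three expression levels (illustration only)
means <- c(10, 100, 1000)
raw <- sapply(means, function(m) rpois(5000, lambda = m))

# Raw counts: the variance grows with the mean (heteroscedastic)
v_raw <- apply(raw, 2, var)

# Square-root transformed counts: the variance is roughly constant, near 1/4
v_sqrt <- apply(sqrt(raw), 2, var)

print(rbind(mean = means, var_raw = v_raw, var_sqrt = v_sqrt))
```

On the raw scale the variance differs by orders of magnitude between low and high counts, which violates the OLS assumption; after the transformation it is approximately the same at all three levels.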

I would have expected some similarity between the results of a GLM
ANODEV analysis of the count data and the OLS ANOVA analysis on the vST
data, but as you did not post the code you used, it is hard to say
whether you may have made a mistake.


Bioconductor mailing list
Bioconductor at r-project.org
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: code.txt
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20130308/dd02123e/attachment.txt>
