[BioC] Questions about edgeR
Mark Robinson
mark.robinson at imls.uzh.ch
Tue Dec 20 09:51:37 CET 2011
Hi Rabe,
You could make it easier by giving us more information. It's always a good plan to describe your objects and send the output of sessionInfo(). Since I don't know what is in all of your objects (e.g. 'my_groups_factor', 'my_data', etc.), I'll make some guesses given what you've described.
Below is a snippet of code (taken almost entirely from the glmFit/glmLRT documentation -- have you read this?) that hard-codes dispersion at 0.1 for 5 samples from different groups with no replication:
--------------
nlibs <- 5
ntags <- 100
dispersion.true <- 0.1
# Make first a factor with 5 levels
x <- factor(1:5)
design <- model.matrix(~x)
# Generate count data
y <- rnbinom(ntags*nlibs,mu=8,size=1/dispersion.true)
y <- matrix(y,ntags,nlibs)
colnames(y) <- paste("s",1:5,sep="")
rownames(y) <- paste("Gene",1:ntags,sep="")
d <- DGEList(y)
# Normalize
d <- calcNormFactors(d)
# Fit the NB GLMs
fit <- glmFit(d, design, dispersion=0.1)
# Likelihood ratio test for differences b/w groups
results <- glmLRT(d, fit, coef=2:5)
topTags(results)
--------------
Perhaps, you can modify this to suit your purpose.
Of course, as I hope you can appreciate, this is not as ideal as having biological replicates and estimating the dispersion through the sophisticated methods available ...
Regards,
Mark
On 20.12.2011, at 09:31, Asma rabe wrote:
> Hi Mark,
>
> Thank you for help.
>
> when i tried :
>
> >mm<-model.matrix(~0+my_groups_factor)
> > fit<-glmfit(my_data,mm,dispresion_value)
> > fit<-glmLRT(my_data,fit)
> > topTags(fit)
> ---------------
>
> i got:
>
> Error in abs(object$table$logFC) :
> Non-numeric argument to mathematical function
> ----------------
> Any idea?
>
> Thank you,
> Rabe
>
>
> On Tue, Dec 20, 2011 at 5:02 AM, Mark Robinson <mark.robinson at imls.uzh.ch> wrote:
> Hi Rabe,
>
> Some comments below.
>
>
> On 19.12.2011, at 10:57, Asma rabe wrote:
>
> > Hi All,
> >
> > I have RNA-seq data for different time points 0hr,2hr,4hr,6hrs and 8hrs.
> >
> > I would like to use edgeR to get genes that are differentially expressed
> > throughout the experiment.
> > In a recent edgeR user's manual ,it has been suggested to use a dispersion
> > value of 0.1 in case of no replicates for organisms that are genetically
> > similar to humans.
> > so it is possible to be used for mouse data for example,what about other
> > organisms like yeast ??
>
> Yes, it is possible. But, as mentioned in the manual, none of these solutions are ideal.
>
>
> > As exact test is used to get DE genes across 2 samples and i need to get DE
> > genes across my time course experiment,i used GLM.
> >
> > when i used GLM model to find DE genes across my samples i ran the
> > following commands:
> >
> > mm<-model.matrix(~0+my_groups_factor)
> > fit<-glmfit(my_data,mm,dispresion_value)
> > fit<-glmLRT(data,fit)
> > topTags(fit)
> >
> > -----
> > i got:
> >
> > coefficient: group8
> > logConc logFC PValue FDR
> > gene1 ..... ..... ....
> > ....
> > gene2 .. ....
> > .... ....
> > ..
> > .
> > ..
> >
> >
> > ---
> > I have a basic question(sorry i'm not statistician)
> > Why the last gorup 8hrs was chooosen to be the coef.??
>
> Have a look at the documentation for glmLRT and specifically the 'coef' argument:
>
> ?glmLRT
>
> ----
> coef: … By default, the
> last column of the design matrix is dropped to form the
> design matrix for the null model.
> ----
>
> You have 5 levels for your time course (therefore, design matrix has 5 columns). To test for any difference between groups, I would suggest reparameterizing your design matrix and then specifying the 'coef' argument. Something like the following:
>
> mm <- model.matrix(~my_groups_factor)
> fit <- glmFit(my_data,mm,dispresion_value)
> fit <- glmLRT(data,fit,coef=2:5)
> topTags(fit)
>
> Hope that helps.
>
> Regards,
> Mark
>
>
> >
> > Thank you in advance.
> > Best Regards,
> > Rabe
>
>
> ----------
> Prof. Dr. Mark Robinson
> Bioinformatics
> Institute of Molecular Life Sciences
> University of Zurich
> Winterthurerstrasse 190
> 8057 Zurich
> Switzerland
>
> v: +41 44 635 4848
> f: +41 44 635 6898
> e: mark.robinson at imls.uzh.ch
> o: Y32-J-34
> w: http://tiny.cc/mrobin
>
>
>
----------
Prof. Dr. Mark Robinson
Bioinformatics
Institute of Molecular Life Sciences
University of Zurich
Winterthurerstrasse 190
8057 Zurich
Switzerland
v: +41 44 635 4848
f: +41 44 635 6898
e: mark.robinson at imls.uzh.ch
o: Y32-J-34
w: http://tiny.cc/mrobin
More information about the Bioconductor
mailing list