[BioC] edgeR, comparing models

Gordon K Smyth smyth at wehi.EDU.AU
Wed May 23 07:23:49 CEST 2012


Dear Alpesh,

I've commited a change to edgeR today, so that you can now produce the 
qq-plot of the goodness of fit statistics by

   gof(fit, plot=TRUE)

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
smyth at wehi.edu.au
http://www.wehi.edu.au
http://www.statsci.org/smyth

On Wed, 23 May 2012, Gordon K Smyth wrote:

> Dear Alpesh,
>
> What you're doing looks correct.
>
> However, if you are doing this analysis (Figure 2 from 
> http://nar.oxfordjournals.org/content/40/10/4288 ) to decide whether you need 
> to use tagwise dispersion for your data, note that we have never observed a 
> real data set for which tagwise estimation is not required. Hence it seems 
> unnecessary to make these plots as a routine diagnostic.
>
> Yes, dispersion=0 is Poisson.
>
> To plot highlighted blue points, see ?points as well as
>
>  xy <- qqnorm(pcom)
>
> etc.
>
> Best wishes
> Gordon
>
>> Date: Mon, 21 May 2012 10:15:37 -0500
>> From: Alpesh Querer <alpeshq at gmail.com>
>> To: Bioconductor mailing list <bioconductor at r-project.org>
>> Subject: [BioC] edgeR, comparing models
>> 
>> Hello edgeR patrons,
>> 
>> I have RNA-seq data for multiple samples with biological replicates, I want
>> to look at the goodness of fit for
>> fitting Poisson and NB models used by edgeR for common, trended and
>> tag-wise dispersion. Does setting dispersion=0
>> in glmFit use the Poisson model? Also I am using the following code to
>> generate and compare qqplots for the models(figure 2 of
>> the most recent edgeR manuscript),please let me know if I am using the
>> appropriate method. Also how to plot the blue points (genes with poor fit)?
>> 
>> y<-estimateGLMCommonDisp(y,design)
>> 
>> fitcommon <- glmFit(y,design)
>> 
>> y <- estimateGLMTrendedDisp(y,design)
>> 
>> fittrended <- glmFit(y,design)
>> 
>> y <- estimateGLMTagwiseDisp(y,design)
>> 
>> 
>> fittagwise <- glmFit(y,design)
>> 
>> gcom <- gof(fitcommon)
>> gtren <- gof(fittrended)
>> gtag <- gof(fittagwise)
>> 
>> zcom <- zscoreGamma(gcom$gof.statistics,shape=gcom$df/2,scale=2)
>> ztren <- zscoreGamma(gtren$gof.statistics,shape=gtren$df/2,scale=2)
>> ztag <- zscoreGamma(gtag$gof.statistics,shape=gtag$df/2,scale=2)
>> 
>> 
>> pcom=zcom[is.finite(zcom) ]
>> ptren=ztren[is.finite(ztren) ]
>> ptag=ztag[is.finite(ztag) ]
>> 
>> par(mfrow=c(3,1))
>> 
>> qqnorm(pcom)
>> qqline(pcom)
>> 
>> qqnorm(ptren)
>> qqline(ptren)
>> 
>> qqnorm(ptag)
>> qqline(ptag)
>> 
>> 
>>> sessionInfo()
>> R version 2.15.0 (2012-03-30)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>> 
>> locale:
>> [1] LC_COLLATE=English_United States.1252
>> [2] LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>> 
>> attached base packages:
>> [1] splines   stats     graphics  grDevices utils     datasets  methods
>> [8] base
>> 
>> other attached packages:
>> [1] edgeR_2.6.3         limma_3.12.0        BiocInstaller_1.4.4
>> 
>> loaded via a namespace (and not attached):
>> [1] tools_2.15.0
>> 
>> 
>> Thanks,
>> Al
>

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list