Thanks Gordon for your replay. I really appreciate it

If fact, I did the analysis with a different design having same results. But it's true that the example in the 3.3.1 is easier to understand. So if a change my edgeR code to use this design/contrast, the results are comparable with DESeq :

> library(limma)
> library(edgeR)
> rawdata<-read.delim(file="ALL_MTcD_9.5_NO_MTsD1",header=TRUE,row.names=1)
> y<-DGEList(counts=rawdata,genes=rawdata[,0])
> y
An object of class "DGEList"
$counts
                WTsD_1_9.5 WTsD_2_9.5 WTsD_3_9.5 WTcD_1_9.5 WTcD_2_9.5 WTcD_3_9.5 MTsD_2_9.5 MTsD_3_9.5 MTcD_1_9.5 MTcD_2_9.5 MTcD_3_9.5
mmu-let-7a-1-3p          4          7          1          5          4          4          5          3          8          5          2
mmu-let-7a-2-3p          2          1          0          2          1          0          0          0          0          0          1
mmu-let-7a-5p          195        310        263        748       1091        630        164        187        194        188        265
mmu-let-7b-3p            1          0          0          0          3          0          0          0          2          1          0
mmu-let-7b-5p            2          6          3         11         59         17          1          1         18         13          6
1268 more rows ...

$samples
           group lib.size norm.factors
WTsD_1_9.5     1   672649            1
WTsD_2_9.5     1   913140            1
WTsD_3_9.5     1   805774            1
WTcD_1_9.5     1   884400            1
WTcD_2_9.5     1   951267            1
6 more rows ...

$genes
data frame with 0 columns and 5 rows
1268 more rows ...

> y<-calcNormFactors(y)
> targets<-readTargets(row.names="FileName")
> targets
             FileName treatment mouseType
WTsD_1_9.5 WTsD_1_9.5 untreated        WT
WTsD_2_9.5 WTsD_2_9.5 untreated        WT
WTsD_3_9.5 WTsD_3_9.5 untreated        WT
WTcD_1_9.5 WTcD_1_9.5   treated        WT
WTcD_2_9.5 WTcD_2_9.5   treated        WT
WTcD_3_9.5 WTcD_3_9.5   treated        WT
MTsD_2_9.5 MTsD_2_9.5 untreated        MT
MTsD_3_9.5 MTsD_3_9.5 untreated        MT
MTcD_1_9.5 MTcD_1_9.5   treated        MT
MTcD_2_9.5 MTcD_2_9.5   treated        MT
MTcD_3_9.5 MTcD_3_9.5   treated        MT
> Group<-factor(paste(targets$treatment,targets$mouseType,sep="_"))
> cbind(targets,Group=Group)
             FileName treatment mouseType        Group
WTsD_1_9.5 WTsD_1_9.5 untreated        WT untreated_WT
WTsD_2_9.5 WTsD_2_9.5 untreated        WT untreated_WT
WTsD_3_9.5 WTsD_3_9.5 untreated        WT untreated_WT
WTcD_1_9.5 WTcD_1_9.5   treated        WT   treated_WT
WTcD_2_9.5 WTcD_2_9.5   treated        WT   treated_WT
WTcD_3_9.5 WTcD_3_9.5   treated        WT   treated_WT
MTsD_2_9.5 MTsD_2_9.5 untreated        MT untreated_MT
MTsD_3_9.5 MTsD_3_9.5 untreated        MT untreated_MT
MTcD_1_9.5 MTcD_1_9.5   treated        MT   treated_MT
MTcD_2_9.5 MTcD_2_9.5   treated        MT   treated_MT
MTcD_3_9.5 MTcD_3_9.5   treated        MT   treated_MT
> 
> design<-model.matrix(~0+Group)
> colnames(design)<-levels(Group)
> rownames(design)<-colnames(y)
> 
> #Overall dispersion of the dataset
> y<-estimateGLMCommonDisp(y,design,verbose=TRUE)
Disp = 0.01289 , BCV = 0.1135 
> #Estimation of the gene-wise dispersion
> y<-estimateGLMTrendedDisp(y,design)
> y<-estimateGLMTagwiseDisp(y,design)
> fit<-glmFit(y,design)
> contrast<-makeContrasts(
+   OverExOnMutants=(untreated_MT-treated_MT)-(untreated_WT-treated_WT)
+   ,levels=design)
> lrt<-glmLRT(fit,contrast=contrast[,"OverExOnMutants"])
> top<-topTags(lrt,n=100)
> head(top$table)
                    logFC    logCPM       LR       PValue          FDR
mmu-miR-222-3p  1.1568204  8.674346 34.79073 3.671169e-09 4.673398e-06
mmu-miR-423-5p -1.3875732  9.346753 32.55137 1.160833e-08 7.388699e-06
mmu-miR-92b-3p  0.8393514 14.808914 30.76576 2.911292e-08 1.235358e-05
mmu-let-7a-5p   1.4829445  8.738643 30.18508 3.927182e-08 1.249826e-05
mmu-miR-615-5p  2.0255706  5.640921 29.67739 5.102658e-08 1.299137e-05
mmu-miR-342-5p  2.5054672  5.530717 24.93025 5.944228e-07 1.260493e-04
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] grid      parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gplots_2.11.0.1      MASS_7.3-26          KernSmooth_2.23-10   caTools_1.14         gdata_2.12.0.2       gtools_2.7.1         BiocInstaller_1.10.1 RColorBrewer_1.0-5  
 [9] DESeq_1.12.0         lattice_0.20-15      locfit_1.5-9.1       edgeR_3.2.3          limma_3.16.5         AnnotationDbi_1.22.5 Biobase_2.20.0       BiocGenerics_0.6.0  

loaded via a namespace (and not attached):
 [1] annotate_1.38.0    bitops_1.0-5       DBI_0.2-7          genefilter_1.42.0  geneplotter_1.38.0 IRanges_1.18.1     RSQLite_0.11.4     splines_3.0.1     
 [9] stats4_3.0.1       survival_2.37-4    tools_3.0.1        XML_3.95-0.2       xtable_1.7-1  

Thank you very much !


On 5 Jun 2013, at 03:35, Gordon K Smyth <smyth@wehi.edu.au> wrote:

> Dear Eduardo,
> 
> The design matrix you've used for edgeR is not correct, as I think you've already come to understand in your discussions with Simon.
> 
> Your four group experiment is of the type covered by Section 3.3 in the edgeR User's guide.  Why not have a read and follow the instructions?
> 
> I think that trying to interpret model formula in R may be causing you some confusion.  I recommend the simpler more intuitive approach described in Section 3.3.1 of the edgeR User's Guide.  edgeR gives you a way to explicitly say that you want:
> 
>  (MTsD - MTcD) - (WTsD -WTcD)
> 
> instead of trying to understand the meaning of coefficients in linear models.
> 
> BTW, please show sessionInfo() output.  Are you using the latest version of Bioconductor?  If not, upgrade!
> 
> Best wishes
> Gordon
> 
>> Date: Mon, 3 Jun 2013 13:06:55 +0200
>> From: Eduardo Andr?s Le?n <eandres@cnio.es>
>> To: "<bioconductor@r-project.org>" <bioconductor@r-project.org>
>> Subject: [BioC] DESeq /edgeR design problem for a small RNASeq
>> 
>> Hi all,
>> 	I'm trying to analyse data coming from a Small RNAseq with a "complex" design.
>> 
>> I have 2 different kind of mice,a wild type and a mutant. The mutant has an insert to over express a gene of interest. and It's a different mouse from the wild type because the wild type with the insert did't born.
> 
>> Second, in order to over express a gene, we add a drug, so we have 4 possibilities :
>> 
>> WTsD (wild type no treated)
>> WTcD (wild type treated) . Just to see if adding the drug is also affecting the mouse expression profile
>> MTsD (mouse with the insertion) . In this case, to see if the insertion is also affecting the normal expression profile of miRNAs.
>> MTcD (mouse with the insertion + drug) . How the over expression affects.
>> 
>> Of course this is also a time course design because the drug is added at 7 days, 9 days and 11 days,
>> 
>> So first of all, I'd like to see DE miRNAs taking into account just the overexpression of the gene (that is by removing the changes caused for the drug and for the mouse type ).
>> 
>> I've tried with DESeq with the following code :
>> 
>> library(DESeq)
>> 
>> #Reading count data
>> data<-read.table(file="ALL_MTcD_9.5",header=TRUE,row.names=1)
>> 
>> #Design
>> Design<-data.frame(
>> row.names=colnames(data),
>> treatment =c("untreated","untreated","untreated","treated","treated","treated","untreated","untreated","untreated","treated","treated","treated"),
>> mouseType =c("WT","WT","WT","WT","WT","WT","MT","MT","MT","MT","MT","MT")
>> )
>> 
>> cdsFull<-newCountDataSet(data,Design)
>> cdsFull<-estimateSizeFactors(cdsFull)
>> cdsFull<-estimateDispersions(cdsFull)
>> 
>> #fit with the mouse and the drug
>> experimento<-fitNbinomGLMs(cdsFull,count ~ mouseType + treatment)
>> #fit with the mouse + drug + interaction from mouse:drug
>> experimento_todos_factores<-fitNbinomGLMs(cdsFull,count ~ mouseType + treatment + mouseType:treatment)
>> 
>> pvalsGLM<-nbinomGLMTest(experimento_todos_factores,experimento)
>> padjGLM<-p.adjust(pvalsGLM,method="BH")
>> 
>> experimento$pval<-pvalsGLM
>> experimento$adj.pval<-padjGLM
>> 
>> 
>> This gives me 21 DE miRNAs.
>> 
>> As I'm not really sure if this design is correct. I've never seen something like this. So I've also used the edgeR package  using the example 4.4 in the vignette.
>> 
>> library(edgeR)
>> 
>> Mouse<-factor(c("WT","WT","WT","WT","WT","WT","MT","MT","MT","MT","MT","MT"))
>> Treatment<-factor(c("Untreated","Untreated","Untreated","Treated","Treated","Treated","Untreated","Untreated","Untreated","Treated","Treated","Treated"))
>> data.frame(Sample=colnames(y),Mouse,Treatment)
>> design<-model.matrix(~Mouse+Treatment)
>> rownames(design)<-colnames(y)
>> 
>> y<-estimateGLMCommonDisp(y,design,verbose=TRUE)
>> #Estimation of the gene-wise dispersion
>> y<-estimateGLMTrendedDisp(y,design)
>> y<-estimateGLMTagwiseDisp(y,design)
>> 
>> #Diff expresion
>> fit<-glmFit(y,design)
>> lrt<-glmLRT(fit)
>> topedgeR<-topTags(lrt,n=100)
>> 
>> So I guess that I'm testing the drug adjusting for baseline differences between the 2 mouse types. And I have 26 DE miRNAs
>> 
>> The problem here is that there is only 5 common miRNAs between DESeq and edgeR.
>> 
>> Is the design correct ?what I'm doing wrong ?
>> 
>> 
>> Thanks in advance !
>> 
> 
> ______________________________________________________________________
> The information in this email is confidential and intended solely for the addressee.
> You must not disclose, forward, print or use it without the permission of the sender.
> ______________________________________________________________________

===================================================
Eduardo Andrés León
Tlfn: (+34) 91 732 80 00 / 91 224 69 00 (ext 5054/3063)
e-mail: eandres@cnio.es        Fax: (+34) 91 224 69 76
Unidad de Bioinformática       Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas
C.P.: 28029                Zip Code: 28029
C/. Melchor Fernández Almagro, 3    Madrid (Spain)
http://bioinfo.cnio.es	http://bioinfo.cnio.es/people/eandres
===================================================


	[[alternative HTML version deleted]]

