[BioC] EdgeR: getting CPM values after batch effect correction
Ryan C. Thompson
rct at thompsonclan.org
Wed Aug 28 10:02:13 CEST 2013
Hi Gilgi,
If you're just using edgeR for the CPM calculation, I believe there's no
need to estimate dispersions; they won't be used in calculating the CPM
values. You may wish to use normalized.lib.sizes=TRUE in the call to cpm
to get CPM values that take into account the normalization factors
computed by calcNormFactors (otherwise there's no point in doing that
either).
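
As a rough sketch of the above (the count matrix is simulated, and the
names counts, y, eff.lib and manual are made up for illustration, not
taken from your data), CPM values that respect the TMM factors can be
computed with no dispersion step at all:

```r
library(edgeR)

# Simulated counts, purely for illustration: 200 genes x 6 samples
set.seed(1)
counts <- matrix(rnbinom(200 * 6, mu = 50, size = 5), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))

y <- DGEList(counts = counts)
y <- calcNormFactors(y)   # TMM factors; no estimateGLM*Disp calls needed for CPM

# log2 CPM computed on the effective library sizes (lib.size * norm.factors)
logCPM <- cpm(y, normalized.lib.sizes = TRUE, log = TRUE, prior.count = 3)

# Non-log CPM is just the counts scaled by the effective library size
eff.lib <- y$samples$lib.size * y$samples$norm.factors
manual <- t(t(counts) / eff.lib) * 1e6
stopifnot(isTRUE(all.equal(unname(cpm(y)), unname(manual))))
```

The final check only confirms that the non-log CPM values depend on the
counts and the normalized library sizes and on nothing else.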
Second, according to the help text for removeBatchEffect, you're not
supposed to include the batch effect in the design matrix. The design
matrix should include all the experimental variables, while the batch
variable should indicate the technical batching.
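
A minimal sketch of that split, assuming a toy balanced layout (the
names treat, batch and design.treat and the simulated matrix are
hypothetical stand-ins for your own objects): the design passed to
removeBatchEffect holds only the treatment terms, and the batch goes in
through the batch argument.

```r
library(limma)

set.seed(1)
# Hypothetical layout: 2 treatments x 2 batches, 2 replicates each (8 samples)
treat <- factor(rep(c("A", "B"), each = 4))
batch <- factor(rep(c("1", "2"), times = 4))

# Stand-in for a logCPM matrix (100 genes), with an artificial shift in batch 2
logCPM <- matrix(rnorm(100 * 8, mean = 5), nrow = 100)
logCPM[, batch == "2"] <- logCPM[, batch == "2"] + 1

# Design contains only the experimental variable; batch is passed separately
design.treat <- model.matrix(~ treat)
logCPM.nobatch <- removeBatchEffect(logCPM, batch = batch, design = design.treat)

# Per-gene difference between batch means after the correction
batch.gap <- rowMeans(logCPM.nobatch[, batch == "2"]) -
             rowMeans(logCPM.nobatch[, batch == "1"])
```

In this balanced toy layout batch.gap comes out numerically zero for
every gene, while the treatment differences described by design.treat
are left in place.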
Finally, instead of doing a simple logCPM transform, you might also try
the variance-stabilizing transformation provided by the DESeq package,
which is intended for clustering and machine learning types of analyses.
-Ryan
On 08/27/2013 10:52 PM, Gilgi Friedlander wrote:
> Hi Ryan,
>
> Thanks a lot for the reply.
>
> I followed the edgeR user's manual and defined a model:
> y1<-DGEList(counts=countdata1,group=batch)
> y1<- calcNormFactors( y1 )
> design1 <- model.matrix(~batch+Treat1)
>
> batch has values 1 or 2, according to the batch of the experiment that was done.
>
> Treat has 10 different samples.
>
> In order to define a contrast I did the following:
> y1 <- estimateGLMCommonDisp(y1, design1, verbose=TRUE)
> y1 <- estimateGLMTrendedDisp(y1, design1)
> y1 <- estimateGLMTagwiseDisp(y1, design1)
> lrt <- glmLRT(fit,contrast=c(0,0,1,-1,0,0,0,0,0,0,0,0,0,0))
>
> I want now to get also the log counts after removal of the batch effect (for the purpose of clustering of the genes).
>
> Is it correct to obtain the batch removed log counts in the following way:
>
> logCPM <- cpm(y1, log=TRUE, prior.count=3)
> logCPM_batchRemoved<-removeBatchEffect(logCPM,batch=batch,design=design1)
>
> Many thanks,
> Gilgi
>
> -----Original Message-----
> From: Ryan [mailto:rct at thompsonclan.org]
> Sent: Wednesday, August 28, 2013 2:46 AM
> To: Gilgi Friedlander
> Cc: bioconductor at r-project.org
> Subject: Re: [BioC] EdgeR: getting CPM values after batch effect correction
>
> You would have to define what you mean by batch correction. Are you talking about fitting a model of the form "~ experimentalVar + batchEffect" and then subtracting out the batch effect coefficient?
>
> On Sun Aug 25 11:11:55 2013, Gilgi Friedlander wrote:
>> Dear list,
>>
>> In edgeR, is it possible to get CPM values after batch effect correction (and after TMM normalization)?
>>
>> Thanks a lot!
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor