[BioC] edgeR - GLM for multifactors

James W. MacDonald jmacdon at uw.edu
Tue Jun 11 16:06:19 CEST 2013


Hi Alan,

On 6/11/2013 12:14 AM, Alan Smith wrote:
> Hello List,
>
> I'm trying to perform DE analysis using edgeR on the following samples with
> 2 replicates per condition.
>
> WC - Wildtype or Normal control
> WI - Wildtype or Normal upon Infection
> LC - Dose1 control
> LI - Dose1 Infection
> HC - Dose2 control
> HI - Dose2 Infection
>
> MY design matrix:
>        HC HI LC LI WC WI
> WC1   0  0  0  0  1  0
> WC2   0  0  0  0  1  0
> WI1   0  0  0  0  0  1
> WI2   0  0  0  0  0  1
> LC1   0  0  1  0  0  0
> LC2   0  0  1  0  0  0
> LI1   0  0  0  1  0  0
> LI2   0  0  0  1  0  0
> HC1   1  0  0  0  0  0
> HC2   1  0  0  0  0  0
> HI1   0  1  0  0  0  0
> HI2   0  1  0  0  0  0
>
> Cond.contrasts<- makeContrasts(DoseEffect1=(LI-LC)-(WI-WC),
> DoseEffect2=(HI-HC)-(WI-WC), OverallDoseEffect=(HI+HC+LI+LC)-(WI+WC),
> InfectionEffect=(WI+HI+LI)-(WC+HC+LC), levels=design)
>
> With the code below I believe I managed to get DE genes in one to one
> comparison but I don't think I got through with the overall dose effect and
> infection effect (contrast made as above).
> Please go over the condition level contrasts and correct me if I'm wrong.

You are wrong ;-D

For a contrast to be valid, the coefficients are supposed to add up to 
zero. So as an example, let's look at your first contrast:

(LI-LC)-(WI-WC)

this can be written as

LI - LC - WI + WC

or more explicitly as

1*LI - 1*LC - 1*WI + 1*WC

and the coefficients are then

1-1-1+1

which adds up to zero. However in your third contrast, the coefficients 
don't add up to zero. In these instances (where you are trying to 
compare the mean of one group to the mean of another), it is easiest to 
just divide by the number of groups. So you would want

OverallDoseEffect=(HI+HC+LI+LC)/4-(WI+WC)/2

I think the fourth contrast is OK, as the coefficients add up to zero, 
but you can always divide each group by three, which will at least (in 
my mind) explicitly indicate that you are looking at the comparison of 
the means for each group.

Best,

Jim


>
> Hope this information is not too little/confusing to understand the problem.
>
> Appreciate your time and help.
>
> Thank you,
> Alan
>
>
> library(edgeR)
> xim<- read.delim("NreadCounts.txt", row.names=1, stringsAsFactors=FALSE)
> names(x)
> group = factor(c("WC", "WC", "WI", "WI", "LC", "LC", "LI", "LI", "HC",
> "HC", "HI", "HI"))
> y<- DGEList(counts=x, group=group)
> keep<- rowSums (cpm(y)>1)>= 3
> y2<- y[keep,]
> dim(y2)
> [1] 31865    12
> y2$samples$lib.size<- colSums(y2$counts)
> y3<- calcNormFactors (y2)
> levels (y3$samples$group)
> [1] "HC" "HI" "LC" "LI" "WC" "WI"
> design<- model.matrix(~0+group, data=y3$samples)
> colnames(design)<- levels(group)
> y3<- estimateGLMCommonDisp(y3, design, verbose=TRUE)
> y3<- estimateGLMTrendedDisp(y3, design)
> y3<- estimateGLMTagwiseDisp(y3, design)
> fit<- glmFit(y3, design)
>   OneONone.contrasts<- makeContrasts(HIvsHC=HI-HC, LIvsLC=LI-LC,
> WIvsWC=WI-WC, HCvsWC=HC-WC, LCvsWC=LC-WC, HIvsWI=HI-WI, LIvsWI=LI-WI,
> levels=design)
>> OneONone.contrasts
>        Contrasts
> Levels HIvsHC LIvsLC WIvsWC HCvsWC LCvsWC HIvsWI LIvsWI
>      HC     -1      0      0      1      0      0      0
>      HI      1      0      0      0      0      1      0
>      LC      0     -1      0      0      1      0      0
>      LI      0      1      0      0      0      0      1
>      WC      0      0     -1     -1     -1      0      0
>      WI      0      0      1      0      0     -1     -1
>
> Cond.contrasts<- makeContrasts(DoseEffect1=(LI-LC)-(WI-WC),
> DoseEffect2=(HI-HC)-(WI-WC), OverallDoseEffect=(HI+HC+LI+LC)-(WI+WC),
> InfectionEffect=(WI+HI+LI)-(WC+HC+LC), levels=design)
> Cond.contrasts
>        Contrasts
> Levels DoseEffect1 DoseEffect2 OverallDoseEffect InfectionEffect
>      HC           0          -1                 1              -1
>      HI           0           1                 1               1
>      LC          -1           0                 1              -1
>      LI           1           0                 1               1
>      WC           1           1                -1              -1
>      WI          -1          -1                -1               1
>> lrt.OverallDoseEffect<- glmLRT(fit,
> contrast=Cond.contrasts[,"OverallDoseEffect"])
>> summary(de<-decideTestsDGE(lrt.OverallDoseEffect))
>     [,1]
> -1 31865
> 0      0
> 1      0
>> lrt.InfectionEffect<- glmLRT(fit,
> contrast=Cond.contrasts[,"InfectionEffect"])
>> summary(de<-decideTestsDGE(lrt.InfectionEffect))
>     [,1]
> -1   270
> 0  31108
> 1    487
>
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> States.1252    LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C                           LC_TIME=English_United
> States.1252
>
> attached base packages:
> [1] splines   stats     graphics  grDevices utils     datasets  methods
> base
>
> other attached packages:
> [1] edgeR_3.2.3  limma_3.16.5
>
> 	[[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list