[BioC] Please help! How to specify factors for a RCBD in edgeR

Gordon K Smyth smyth at wehi.EDU.AU
Tue Jan 31 01:18:59 CET 2012


Dear Tilahun,

The first step is that you need to create a data frame containing the 
experimental factors, just as you would for a SAS analysis.  So you need 
to create three factors:

   Tissue
   Treatment
   Block

each containing 24 values, one for each RNA sample.  Then the design marix 
is formed by:

   design <- model.matrix(~Tissue*Treatment+Block)

Type colnames(design) to see how the coefficients are defined.  You will 
see that the interaction coefficients are coefficients 6 to 8.

After fitting your linear model, you could find genes that show 
significant Tissue*Treatment interaction (on 3df) by

   lrt <- glmLRT(d, fit, coef=6:8)

and so on.

I don't understand your question "How do I obtain differentially expressed 
genes in each Tissue*Stress combination?", so I can't give specific advice 
on that.  Differentially expressed compared to what?

Best wishes
Gordon

---------------- original message -------------------
[BioC] Please help! How to specify factors for a RCBD in edgeR
Tilahun Abebe tilahun.abebe at uni.edu
Wed Jan 25 22:16:41 CET 2012

Hi,

I am learning how to use edgeR to analyze RNA-seq data generated from
Illumina GAII. The experimental design is fairly complex.

I have a mixed 4x2 factorial randomized complete block design (RCBD)
consisting of:

4 tissues: A, B, C, D
2 treatments: control, stressed
3 blocks: Block1, Block2, Block3

Tissue and treatment are fixed effects and block is a random effect.

Here is the code I tried to use in edgeR:

> counts <- read.delim( file = "Mycounts.txt", header = TRUE)
> rownames <-counts [ , 1 ]
> d <- counts [, 2:25]    # counts are in columns 2-25
> d
> group <- c(rep("Control", 3), rep("Stress", 3), rep("Control", 3),
rep("Stress", 3), rep("Control", 3), rep("Stress", 3), rep("Control", 3),
rep("Stress", 3))
> d <- DGEList(counts = d, group = group)
> design <- model.matrix(~group)
> d <- calcNormFactors(d)
> d$samples
> d <- estimateGLMCommonDisp(d, design)
> d <- estimateGLMTagwiseDisp(d, design)
> d$common.dispersion
> fit <- glmFit(d, design)
> lrt <- glmLRT(d, fit, coef=2)
> topTags(lrt, n=4)

I am interested to know genes differentially expressed in each of the four
tissues under stress. However, I feel like I am not specifying the factors
correctly in the design statement. My questions are:

1) How do I specify the fixed effects Tissue, Stress, and Tissue*Stress
interaction in the model?
2) How do I tell edgeR to use block as a random effect?
3) How do I obtain differentially expressed genes in each Tissue*Stress
combination?

I appreciate your help.

Cheers.

Tilahun Abebe, Ph.D.
University of northern Iowa
Cedar Falls, IA

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list