[BioC] EdgeR GLM gene expression analysis of 3 groups

Shona Wood shona.wood at liverpool.ac.uk
Thu Mar 24 12:35:50 CET 2011


Hi,

I have been trying to compare RNA-seq data 3 groups; young, middle aged and old. 
I have done pairwise comparisons between these groups but i would like to 
compare all three together. I have been trying to use the GLM method to do this, 
following the tutorial, 11 Case study: Oral carcinomas vs matched normal tissue. 
i understand that this only compares two conditions but i tried to adapt it to 
my needs. Being new at this I am unsure if I am doing it right and I get an 
error when trying to estimateCRdisp. Though I think the problem is actually to 
do with the way i have specified the design because "middle" doesnt seem to be 
in the design. Please see below:

R version 2.12.2 (2011-02-25)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-pc-mingw32/i386 (32-bit)

library(edgeR)
setwd ("/Users/swood/Documents/rna-seq analysis")
targets<-read.delim (file="targets.txt",
+ stringsAsFactors = FALSE, row.names = "label")
Error in data[[rowvar]] : attempt to select less than one element
targets<-read.delim (file="targets.txt")
targets$age<-factor(targets$age)
targets$sample<-factor(targets$sample)
targets
   X  files    age sample
1 p1 p1.txt  young      1
2 p2 p2.txt  young      2
3 p3 p3.txt  young      3
4 p4 p4.txt    old      4
5 p5 p5.txt    old      5
6 p6 p6.txt    old      6
7 p7 p7.txt middle      7
8 p8 p8.txt middle      8
9 p9 p9.txt middle      9
 d<-readDGE(targets, skip=1, comment.char='#')
 colnames(d)<-row.names(targets)
 d<-calcNormFactors(d)
 d
An object of class "DGEList"
$samples
   X  files    age sample lib.size norm.factors
1 p1 p1.txt  young      1 674816.7    0.7598895
2 p2 p2.txt  young      2 648856.9    0.9095203
3 p3 p3.txt  young      3 449797.3    1.2774867
4 p4 p4.txt    old      4 699036.0    0.8532141
5 p5 p5.txt    old      5 441649.7    1.1919854
6 p6 p6.txt    old      6 670755.9    0.8147330
7 p7 p7.txt middle      7 699630.1    1.0262252
8 p8 p8.txt middle      8 505795.8    1.2979023
9 p9 p9.txt middle      9 478527.1    1.0262468

$counts
                          1        2        3       4        5       6       7
ENSRNOG00000000007 38.23700 37.89900 34.85820 43.6703 40.81260 43.4009 42.4969
ENSRNOG00000000008  0.00000  0.00000  0.00000  0.0000  0.00000  0.0000  0.0000
ENSRNOG00000000009  0.00000  0.00000  0.00000  0.0000  0.00000  0.0000  0.0000
ENSRNOG00000000010  1.40963  8.28797  2.09491 17.1853  1.76635 14.7567  1.0840
ENSRNOG00000000012  0.00000  0.00000  0.00000  0.0000  0.00000  0.0000  0.0000
                          8       9
ENSRNOG00000000007 45.81270 48.6937
ENSRNOG00000000008  0.00000  0.0000
ENSRNOG00000000009  0.00000  0.0000
ENSRNOG00000000010  5.44889 17.7295
ENSRNOG00000000012  0.00000  0.0000
29510 more rows ...

 d<-d[rowSums(d$counts)>9,]
 design <- model.matrix(~ sample + age, data = targets)
 design
  (Intercept) sample2 sample3 sample4 sample5 sample6 sample7 sample8 sample9
1           1       0       0       0       0       0       0       0       0
2           1       1       0       0       0       0       0       0       0
3           1       0       1       0       0       0       0       0       0
4           1       0       0       1       0       0       0       0       0
5           1       0       0       0       1       0       0       0       0
6           1       0       0       0       0       1       0       0       0
7           1       0       0       0       0       0       1       0       0
8           1       0       0       0       0       0       0       1       0
9           1       0       0       0       0       0       0       0       1
  ageold ageyoung
1      0        1
2      0        1
3      0        1
4      1        0
5      1        0
6      1        0
7      0        0
8      0        0
9      0        0
attr(,"assign")
 [1] 0 1 1 1 1 1 1 1 1 2 2
attr(,"contrasts")
attr(,"contrasts")$sample
[1] "contr.treatment"

attr(,"contrasts")$age
[1] "contr.treatment"

d<-estimateCRDisp(d, design)
Error in beta[i, ] <- z %*% X : 
  number of items to replace is not a multiple of replacement length
 

If you could help me out with the design or suggest a reason for the error i 
would really appreciate it.

Thanks!



More information about the Bioconductor mailing list