[BioC] edgeR glmFit

Jahn Davik jahn.davik at bioforsk.no
Fri Dec 20 22:43:32 CET 2013


Hi Mark,



-----Opprinnelig melding-----
Fra: Mark Robinson [mailto:mark.robinson at imls.uzh.ch] 
Sendt: 20. desember 2013 15:27
Til: Jahn Davik
Kopi: bioconductor at r-project.org mailman
Emne: Re: [BioC] edgeR glmFit

Hi Jahn,

A few things:

1. Please don't email me directly.  USE THE BIOC MAILING LIST.  This is the third time I've mentioned this.  I've cc'd it again.
SORRY ABOUT THAT. WON'T HAPPEN AGAIN. I GET YOUR REPLY AND I JUST PRESS THE 'ANSWER' BUTTON.

2. You are using exactTest() with GLM-based dispersion estimation.
I  CHECKED THAT PART. AND ACTUALLY, WHEN I USED THE 'NON-GLM-BASED'  ESTIMATION CALLS, THE DISCREPANCIES WERE NOT AS STRIKING.


This is probably ok, although a bit non-standard.  Did you read and follow the case study examples?
I DID FOLLOW THE EXAMPLES. THEY DO NOT TOTALLY COVER MY DATA. IN PARTICULAR THE TIME POINT ELEMENT AND THE FACT THAT SUCH SAMPLES SHOULD BE CORRELATED DOES NOT SEEM TREATED. A TWO-WAY ANOVA APPROACH HAS BEEN SUGGESTED, BUT THAT DOES NOT TAKE THIS CORRELATION ISSUE INTO CONSIDERATION. I WOULD LOVE TO HEAR YOUR TAKE ON THIS.

3. I think your code is ok. You are comparing J and E at time point 0 and using different subsets of the whole dataset, you get different results.  The reason is that the dispersion is estimated differently over different sets of replicates.  I can't really tell what is happening without seeing the MDS plot and the BCV curves, but perhaps some of your replicates are wonky, or at least more variable in some conditions than others.
I SUSPECTED SOMETHING LIKE THAT. ON THE MDS PLOTS THE REPS LOOK JUST A LITTLE DIFFERENT.

4. 620000 is a lot of transcripts.  Lots of issues here, such as adaptor trimming before assembly (among others):
THAT NUMBER OF TRANSCRIPTS IS PRETTY AVERAGE ACCORDING TO THE TRINITY PEOPLE. IN THE edgeR CONTEXT, HOWEVER, I FILTER OUT TRANSCRIPTS WITH LOW HIT RATE. 


http://pathogenomics.bham.ac.uk/blog/2013/04/adaptor-trim-or-die-experiences-with-nextera-libraries/

Plus, I'm not sure how your counts were done, but you may want to consolidate the transcripts into clusters for gene-level counting.  In my experience, this is never a clean cut process, but you might consider using Corset or something similar:
https://code.google.com/p/corset-project/
i'LL CHECK THIS OUT.
ADAPTOR TRIMMING, ETC, IS DONE BY OUR SEQUENCE PROVIDOR. IN ADDITION I'VE USE TRIMMOMATIC TO GET RID OF LOW QUALITY READS.


Best wishes, Mark
THANKS.
HAVE A GOOD HOLYDAY

----------
Prof. Dr. Mark Robinson
Bioinformatics, Institute of Molecular Life Sciences University of Zurich http://ow.ly/riRea






On 20.12.2013, at 10:13, Jahn Davik <jahn.davik at bioforsk.no> wrote:

> Hi Mark,
> Thanks for your feedback.
> I had to take a couple of steps back in order to get this working, and I guess I still have some way to go here. I come from the SAS world and R is a little cryptic to me. Anyway, following your advice I apparently got the code working.
> Still, there is confusion (in my head).
>  
> The data I eventually want to play with is the output from Trinity containing ~620000 transcripts and 30 experimental cells (-> two genotypes sampled at 5 time points with 3 biological replicates, giving a 620000 by 30 matrix).
> So, to build on your last input, I made a subsample of this matrix and did only one time point.
>  
> CODE RUN LIKE THIS:
>  
> cots<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix",he
> ader=T)
> head(cots)
>  
> data = as.matrix(cots);
> data = round(data);
> head(data)
> # write.table(data, file="H:/bip1/RNAseq/Data/round.matrix.txt", 
> quote=F, sep="\t")
>  
> keep<-rowSums(cpm(data)>2)>=3
> cots<-data[keep,]
> table(keep)
>  
> x<-cots[,c("E00R1","E00R2","E00R3","J00R1","J00R2","J00R3")]
> head(x)
>  
> dim(x)
>  
> Group<-factor(substring(colnames(x),1,1))
> Group
>  
> # QUICK START:
> # Classic part
>  
> y<-DGEList(counts=x,group=Group)
> y<-calcNormFactors(y)
>  
> plotMDS(y)
>  
> # estimating dispersion
> # over all genes
> y<-estimateGLMCommonDisp(y)
>  
> # genewise
> y<-estimateGLMTrendedDisp(y)
> y<-estimateGLMTagwiseDisp(y)
>  
> plotBCV(y)
>  
> et<-exactTest(y)
> topTags(et)
>  
>  
> YIELDING THIS OUTPUT AMONG FACTORS:
> > topTags(et)
> Comparison of groups:  J-E
>                      logFC logCPM   PValue   FDR
> comp406486_c0_seq1   -8.08   5.78 5.47e-05 0.117
> comp420150_c0_seq1    9.16   4.97 5.73e-05 0.117
> comp418672_c0_seq18 -16.39   6.73 6.11e-05 0.117
> comp391070_c0_seq1    7.70   6.70 6.20e-05 0.117
> comp421831_c2_seq2   -6.18   6.31 6.59e-05 0.117
> comp418672_c0_seq7   15.97   6.32 6.72e-05 0.117
> comp437190_c0_seq3   -5.80   6.79 7.13e-05 0.117
> comp431729_c0_seq18 -15.80   6.14 7.27e-05 0.117
> comp391070_c0_seq7    7.55   8.99 7.55e-05 0.117
> comp426823_c0_seq5  -15.55   5.89 8.29e-05 0.117
> Then I expanded the matrix with another time point (the last one in my experiment), and wanted to do the same comparison, e.g., comparing the J and E at time point 00.
> Here is the code I ran:
>  
> library("edgeR")
>  
> # setwd("H:/bip1/RNASeq/isoforms")
> # getwd()
>  
> options(digits=3)
>  
>  
> ## Read data
>  
> cots<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix",he
> ader=T)
> head(cots)
>  
> data = as.matrix(cots);
> data = round(data);
> head(data)
> # write.table(data, file="H:/bip1/RNAseq/Data/round.matrix.txt", 
> quote=F, sep="\t")
>  
> keep<-rowSums(cpm(data)>2)>=3
> cots<-data[keep,]
> table(keep)
>  
> x<-cots[,c("E00R1","E00R2","E00R3","J00R1","J00R2","J00R3","E96R1","E9
> 6R2","E96R3","J96R1","J96R2","J96R3")]
> head(x)
>  
> dim(x)
>  
> Group<-factor(substring(colnames(x),1,3))
> Group
>  
> # QUICK START:
> # Classic part
>  
> y<-DGEList(counts=x,group=Group)
> y$samples
>  
> y<-calcNormFactors(y)
>  
> plotMDS(y)
>  
> # estimating dispersion
> # over all genes
> y<-estimateGLMCommonDisp(y)
>  
> # genewise
> y<-estimateGLMTrendedDisp(y)
> y<-estimateGLMTagwiseDisp(y)
>  
> plotBCV(y)
>  
> et<-exactTest(y, pair=c("E00","J00"))
> topTags(et)
>  
>  
> GIVING THE FOLLOWING OUTPUT:
>  
> Comparison of groups:  J00-E00 
>                      logFC logCPM   PValue   FDR
> comp436540_c0_seq6  -13.48   4.66 1.54e-05 0.416
> comp437851_c0_seq10 -12.68   3.14 2.63e-05 0.416
> comp436243_c0_seq45  12.79   3.65 3.78e-05 0.416
> comp434611_c0_seq48  12.51   3.12 4.02e-05 0.416
> comp420222_c0_seq1   12.21   3.12 4.58e-05 0.416
> comp425919_c0_seq13 -11.99   2.75 5.65e-05 0.416
> comp435680_c0_seq21   6.35   4.84 5.83e-05 0.416
> comp435275_c0_seq9   -6.80   4.93 6.03e-05 0.416
> comp434001_c0_seq7  -12.18   2.79 6.44e-05 0.416
> comp437026_c0_seq13  12.81   3.68 8.56e-05 0.416
>  
>  
>  
>  
> Now, I wonder why not these two runs give the same results? One possibility is of course that I code erroneously, but I realize that there may also be statistical explanations.
> I would be very happy to hear from you.
>  
> Best regards
> Jahn
>  
>  
>  
>  
>  
>  
>  
>  
>  
>  
> -----Opprinnelig melding-----
> Fra: Mark Robinson [mailto:mark.robinson at imls.uzh.ch]
> Sendt: 19. desember 2013 22:15
> Til: Jahn Davik
> Kopi: bioconductor at r-project.org mailman
> Emne: Re: [BioC] edgeR glmFit
>  
> Hi Jahn,
>  
> I've cc'd the BioC mailing list; it's best to keep a public record of these exchanges, so others can benefit.
>  
> So, I believe your problem is a matter of the class of the object that you are sending to estimateGLMCommonDisp(), etc..  Here is an example:
>  
> > y <- matrix(rnbinom(1000,mu=10,size=10),ncol=4)
> > d <- DGEList(counts=y,group=c(1,1,2,2))
> > design <- model.matrix(~group, data=d$samples)
> > d1 <- estimateGLMCommonDisp(y, design, verbose=TRUE)
> Disp = 0.08132 , BCV = 0.2852
> > class(d1)
> [1] "numeric"
> > d1 <- estimateGLMCommonDisp(d, design, verbose=TRUE)
> Disp = 0.08132 , BCV = 0.2852
> > class(d1)
> [1] "DGEList"
> attr(,"package")
> [1] "edgeR"
>  
> (if you send a count table, you get a different returned object than 
> if you send a DGEList object)
>  
> Sorry, I didn't completely spell it out in my last email, but it's there in the Quick Start and also in the case studies; typically, the starting point is creating a 'DGEList' object of the count table, which is the container that stores all the downstream components as well.
>  
> So, I would change your code to (roughly):
>  
> [. read data in .]
>  
> y <- y[keep,]
>  
> d <- DGEList(counts=y,group=Group)
> d <- calcNormFactors(d)
>  
> [. define design matrix .]
>  
> plotMDS(d)  # it's always a good plan to look at this
>  
> d <- estimateGLMCommonDisp(d,design)
> d <- estimateGLMTrendedDisp(d,design)
> d <- estimateGLMTagwiseDisp(d,design)
>  
> And, so on .
>  
> Cheers, Mark
>  
> ----------
> Prof. Dr. Mark Robinson
> Bioinformatics, Institute of Molecular Life Sciences University of 
> Zurich http://ow.ly/riRea
>  
>  
>  
>  
>  
>  
> On 19.12.2013, at 13:55, Jahn Davik <jahn.davik at bioforsk.no> wrote:
>  
> > Mark,
> > I'm afraid I have to bother you again. Sorry about that.
> > Anyway. An indication of where I loose it is appreciated.
> > 
> > This is what I send to R:
> > 
> > y<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix",hea
> > de
> > r=T)
> > head(y)
> > y = as.matrix(cots);
> > y = round(data);
> > head(y)
> > 
> > dim(y)
> > 
> > targets<-read.table("H:/bip1/RNAseq/Data/targets.txt",header=T)
> > targets
> > 
> > #
> > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > XX
> > XXXXXXX
> > # Filtrering.
> > # Make sure how you come up with these number !!!
> > # z<-data
> > 
> > keep<-rowSums(cpm(y)>2)>=3
> > y<-y[keep,]
> > table(keep)
> > #
> > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > XX
> > XXXXxXX
> > 
> > 
> > ## Chapt 3.3.1 i Users Guide
> > 
> > # setting up a combined factor
> > 
> > Group<-factor(paste(targets$Treat,targets$Time,sep="."))
> > cbind(targets,Group=Group)
> > 
> > design<-model.matrix(~0+Group)
> > design
> > colnames(design)<-levels(Group)
> > design
> > 
> > y<-estimateGLMCommonDisp(y,design)
> > y<-estimateGLMTrendedDisp(y,design)
> > y<-estimateGLMTagwiseDisp(y,design)
> > fit<-glmFit(y,design)
> > 
> > 
> > # IT WORKS FINE (I BELIEVE) ALL THE WAY TO   estimate GLMTrendedDisp(y,design), WHERE IT RETURNS:
> > 
> > Error in return(NA, ntags) : multi-argument returns are not 
> > permitted In addition: Warning message:
> > In estimateGLMTrendedDisp.default(y, design) :
> >   No residual df: cannot estimate dispersion
> > 
> > 
> > THE OUTPUT IS LIKE THIS:
> > 
> > 
> > > y = as.matrix(cots);
> > > y = round(data);
> > > head(y)
> >                     E00R1 E00R2 E00R3 E01R1 E01R2 E01R3 E05R1 E05R2 E05R3 E48R2 E48R3 E96R1 E96R2 E96R3 J00R1 J00R2 J00R3 J01R1 J01R2 J01R3 J05R1 J05R2
> > comp168996_c1_seq1      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     1     0     1     0
> > comp442719_c0_seq1      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0    35     0
> > comp436057_c0_seq13     0     7    21     0     0     0    22     0     0     0     1     1    10    19     0     0     0     7     3     0     0     0
> > comp415319_c0_seq20    28     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
> > comp428135_c0_seq8    250   229   172    58   104   114   187    95    85   135   100     0    65    83   219    59   134    82    87   188   178   116
> > comp73458_c0_seq1       0     0     0     0     0     0     0     0     2     0     0     0     0     0     0     0     0     0     0     1     0     0
> >                     J05R3 J48R2 J48R3 J96R1 J96R2 J96R3
> > comp168996_c1_seq1      0     0     0     0     0     0
> > comp442719_c0_seq1      0     0     0     0     0     0
> > comp436057_c0_seq13     0    19     0     1     0     0
> > comp415319_c0_seq20     0     0     0     0     0     0
> > comp428135_c0_seq8     76    26     0    41   132   101
> > comp73458_c0_seq1       0     0     0     0     0     0
> > >
> > > dim(y)
> > [1] 621694     28
> > >
> > > targets<-read.table("H:/bip1/RNAseq/Data/targets.txt",header=T)
> > > targets
> >    Sample Treat Time
> > 1   E00R1   Els  00h
> > 2   E00R2   Els  00h
> > 3   E00R3   Els  00h
> > 4   E01R1   Els  01h
> > 5   E01R2   Els  01h
> > 6   E01R3   Els  01h
> > 7   E05R1   Els  05h
> > 8   E05R2   Els  05h
> > 9   E05R3   Els  05h
> > 10  E48R2   Els  48h
> > 11  E48R3   Els  48h
> > 12  E96R1   Els  96h
> > 13  E96R2   Els  96h
> > 14  E96R3   Els  96h
> > 15  J00R1   Jon  00h
> > 16  J00R2   Jon  00h
> > 17  J00R3   Jon  00h
> > 18  J01R1   Jon  01h
> > 19  J01R2   Jon  01h
> > 20  J01R3   Jon  01h
> > 21  J05R1   Jon  05h
> > 22  J05R2   Jon  05h
> > 23  J05R3   Jon  05h
> > 24  J48R2   Jon  48h
> > 25  J48R3   Jon  48h
> > 26  J96R1   Jon  96h
> > 27  J96R2   Jon  96h
> > 28  J96R3   Jon  96h
> > >
> > > #
> > > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > > XX
> > > XXXXXXXXX
> > > # Filtrering.
> > > # Make sure how you come up with these number !!!
> > > # z<-data
> > >
> > > keep<-rowSums(cpm(y)>2)>=3
> > > y<-y[keep,]
> > > table(keep)
> > keep
> > FALSE   TRUE
> > 558829  62865
> > > Group<-factor(paste(targets$Treat,targets$Time,sep="."))
> > > cbind(targets,Group=Group)
> >    Sample Treat Time   Group
> > 1   E00R1   Els  00h Els.00h
> > 2   E00R2   Els  00h Els.00h
> > 3   E00R3   Els  00h Els.00h
> > 4   E01R1   Els  01h Els.01h
> > 5   E01R2   Els  01h Els.01h
> > 6   E01R3   Els  01h Els.01h
> > 7   E05R1   Els  05h Els.05h
> > 8   E05R2   Els  05h Els.05h
> > 9   E05R3   Els  05h Els.05h
> > 10  E48R2   Els  48h Els.48h
> > 11  E48R3   Els  48h Els.48h
> > 12  E96R1   Els  96h Els.96h
> > 13  E96R2   Els  96h Els.96h
> > 14  E96R3   Els  96h Els.96h
> > 15  J00R1   Jon  00h Jon.00h
> > 16  J00R2   Jon  00h Jon.00h
> > 17  J00R3   Jon  00h Jon.00h
> > 18  J01R1   Jon  01h Jon.01h
> > 19  J01R2   Jon  01h Jon.01h
> > 20  J01R3   Jon  01h Jon.01h
> > 21  J05R1   Jon  05h Jon.05h
> > 22  J05R2   Jon  05h Jon.05h
> > 23  J05R3   Jon  05h Jon.05h
> > 24  J48R2   Jon  48h Jon.48h
> > 25  J48R3   Jon  48h Jon.48h
> > 26  J96R1   Jon  96h Jon.96h
> > 27  J96R2   Jon  96h Jon.96h
> > 28  J96R3   Jon  96h Jon.96h
> > > design<-model.matrix(~0+Group)
> > > design
> >    GroupEls.00h GroupEls.01h GroupEls.05h GroupEls.48h GroupEls.96h GroupJon.00h GroupJon.01h GroupJon.05h GroupJon.48h GroupJon.96h
> > 1             1            0            0            0            0            0            0            0            0            0
> > 2             1            0            0            0            0            0            0            0            0            0
> > 3             1            0            0            0            0            0            0            0            0            0
> > 4             0            1            0            0            0            0            0            0            0            0
> > 5             0            1            0            0            0            0            0            0            0            0
> > 6             0            1            0            0            0            0            0            0            0            0
> > 7             0            0            1            0            0            0            0            0            0            0
> > 8             0            0            1            0            0            0            0            0            0            0
> > 9             0            0            1            0            0            0            0            0            0            0
> > 10            0            0            0            1            0            0            0            0            0            0
> > 11            0            0            0            1            0            0            0            0            0            0
> > 12            0            0            0            0            1            0            0            0            0            0
> > 13            0            0            0            0            1            0            0            0            0            0
> > 14            0            0            0            0            1            0            0            0            0            0
> > 15            0            0            0            0            0            1            0            0            0            0
> > 16            0            0            0            0            0            1            0            0            0            0
> > 17            0            0            0            0            0            1            0            0            0            0
> > 18            0            0            0            0            0            0            1            0            0            0
> > 19            0            0            0            0            0            0            1            0            0            0
> > 20            0            0            0            0            0            0            1            0            0            0
> > 21            0            0            0            0            0            0            0            1            0            0
> > 22            0            0            0            0            0            0            0            1            0            0
> > 23            0            0            0            0            0            0            0            1            0            0
> > 24            0            0            0            0            0            0            0            0            1            0
> > 25            0            0            0            0            0            0            0            0            1            0
> > 26            0            0            0            0            0            0            0            0            0            1
> > 27            0            0            0            0            0            0            0            0            0            1
> > 28            0            0            0            0            0            0            0            0            0            1
> > attr(,"assign")
> > [1] 1 1 1 1 1 1 1 1 1 1
> > attr(,"contrasts")
> > attr(,"contrasts")$Group
> > [1] "contr.treatment"
> > 
> > > colnames(design)<-levels(Group)
> > > design
> >    Els.00h Els.01h Els.05h Els.48h Els.96h Jon.00h Jon.01h Jon.05h Jon.48h Jon.96h
> > 1        1       0       0       0       0       0       0       0       0       0
> > 2        1       0       0       0       0       0       0       0       0       0
> > 3        1       0       0       0       0       0       0       0       0       0
> > 4        0       1       0       0       0       0       0       0       0       0
> > 5        0       1       0       0       0       0       0       0       0       0
> > 6        0       1       0       0       0       0       0       0       0       0
> > 7        0       0       1       0       0       0       0       0       0       0
> > 8        0       0       1       0       0       0       0       0       0       0
> > 9        0       0       1       0       0       0       0       0       0       0
> > 10       0       0       0       1       0       0       0       0       0       0
> > 11       0       0       0       1       0       0       0       0       0       0
> > 12       0       0       0       0       1       0       0       0       0       0
> > 13       0       0       0       0       1       0       0       0       0       0
> > 14       0       0       0       0       1       0       0       0       0       0
> > 15       0       0       0       0       0       1       0       0       0       0
> > 16       0       0       0       0       0       1       0       0       0       0
> > 17       0       0       0       0       0       1       0       0       0       0
> > 18       0       0       0       0       0       0       1       0       0       0
> > 19       0       0       0       0       0       0       1       0       0       0
> > 20       0       0       0       0       0       0       1       0       0       0
> > 21       0       0       0       0       0       0       0       1       0       0
> > 22       0       0       0       0       0       0       0       1       0       0
> > 23       0       0       0       0       0       0       0       1       0       0
> > 24       0       0       0       0       0       0       0       0       1       0
> > 25       0       0       0       0       0       0       0       0       1       0
> > 26       0       0       0       0       0       0       0       0       0       1
> > 27       0       0       0       0       0       0       0       0       0       1
> > 28       0       0       0       0       0       0       0       0       0       1
> > attr(,"assign")
> > [1] 1 1 1 1 1 1 1 1 1 1
> > attr(,"contrasts")
> > attr(,"contrasts")$Group
> > [1] "contr.treatment"
> > 
> > > y<-estimateGLMCommonDisp(y,design)
> > > y<-estimateGLMTrendedDisp(y,design)
> > Error in return(NA, ntags) : multi-argument returns are not 
> > permitted In addition: Warning message:
> > In estimateGLMTrendedDisp.default(y, design) :
> >   No residual df: cannot estimate dispersion
> > > y<-estimateGLMTagwiseDisp(y,design)
> > Warning message:
> > In estimateGLMTagwiseDisp.default(y, design) :
> >   No residual df: setting dispersion to NA
> > >
> > 
> > 
> > Thank  you
> > jahn
> > 
> > 
> > 
> > 
> > 
> > -----Opprinnelig melding-----
> > Fra: Mark Robinson [mailto:mark.robinson at imls.uzh.ch]
> > Sendt: 18. desember 2013 16:12
> > Til: Jahn Davik [guest]
> > Kopi: bioconductor at r-project.org; Jahn Davik
> > Emne: Re: [BioC] edgeR glmFit
> > 
> > Hi Jahn,
> > 
> > Chapter 3 demonstrates the setup of design matrices; 3.1 says "In this chapter, we outline the principles for setting up the design matrix and forming contrasts for some typical experimental designs".  You would then combine that with what is shown in "1.4 Quick start" to do all the steps, something like:
> > 
> > > design <- model.matrix(~<add yours here>) y <-
> > > estimateGLMCommonDisp(y,design) y <-
> > > estimateGLMTrendedDisp(y,design) y <-
> > > estimateGLMTagwiseDisp(y,design) fit <- glmFit(y,design) lrt <- 
> > > glmLRT(fit,coef=<add yours here>)
> > > topTags(lrt)
> > 
> > In particular, you might have a look at some of the case studies in Chapter 4, for example 4.4 (but using your design matrix).
> > 
> > Hope that helps.
> > 
> > Best, Mark
> > 
> > 
> > ----------
> > Prof. Dr. Mark Robinson
> > Bioinformatics, Institute of Molecular Life Sciences University of 
> > Zurich http://ow.ly/riRea
> > 
> > 
> > 
> > 
> > 
> > 
> > On 18.12.2013, at 11:21, Jahn Davik [guest] <guest at bioconductor.org> wrote:
> > 
> > >
> > > I am working my way through the edgeR User's Guide and following section 3.3, I encounter a problem. I run the following commands - using my own counts data:
> > > ## Chapt 3.3 i Users Guide
> > >
> > > cots<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix
> > > ",
> > > he
> > > ader=T)
> > > head(cots)
> > > data = as.matrix(cots);
> > > y = round(data);
> > > head(y)
> > >
> > > dim(data)
> > >
> > > targets<-read.table("H:/bip1/RNAseq/Data/targets.txt",header=T)
> > > targets
> > >
> > > # setting up a combined factor
> > >
> > > Group<-factor(paste(targets$Treat,targets$Time,sep="."))
> > > cbind(targets,Group=Group)
> > >
> > > design<-model.matrix(~0+Group)
> > > colnames(design)<-levels(Group)
> > > design
> > >
> > > fit<-glmFit(y,design)
> > >
> > > sessionInfo()
> > >
> > > Resulting in 'Error in glmFit.default(y, design) : No dispersion values provided.'
> > >
> > > I do not quite see where my mistake is and would appreciate help.
> > > Thank you.
> > >
> > > jahn
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > -- output of sessionInfo():
> > >
> > > cots<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix
> > > ",
> > > he
> > > ader=T)
> > >> head(cots)
> > >                    E00R1  E00R2  E00R3 E01R1  E01R2  E01R3  E05R1 E05R2 E05R3  E48R2 E48R3 E96R1 E96R2 E96R3  J00R1 J00R2  J00R3 J01R1
> > > comp168996_c1_seq1    0.00   0.00   0.00  0.00   0.00   0.00   0.00  0.00  0.00   0.00  0.00  0.00  0.00  0.00   0.00  0.00   0.00  0.00
> > > comp442719_c0_seq1    0.00   0.00   0.00  0.00   0.00   0.00   0.00  0.00  0.00   0.00  0.00  0.00  0.00  0.00   0.00  0.00   0.00  0.00
> > > comp436057_c0_seq13   0.00   7.38  21.04  0.00   0.00   0.21  21.65  0.00  0.00   0.00  0.86  0.57  9.69 18.73   0.00  0.00   0.00  7.14
> > > comp415319_c0_seq20  28.17   0.00   0.00  0.00   0.00   0.00   0.00  0.00  0.00   0.00  0.00  0.00  0.00  0.00   0.00  0.00   0.00  0.00
> > > comp428135_c0_seq8  249.73 228.53 172.18 57.52 104.48 113.96 187.05 94.64 84.65 134.65 99.82  0.00 65.46 82.83 219.37 59.18 134.04 81.56
> > > comp73458_c0_seq1     0.00   0.00   0.00  0.00   0.00   0.00   0.00  0.00  2.00   0.00  0.00  0.00  0.00  0.00   0.00  0.00   0.00  0.00
> > >                   J01R2  J01R3  J05R1  J05R2 J05R3 J48R2 J48R3 J96R1  J96R2  J96R3
> > > comp168996_c1_seq1   1.00   0.00   1.00   0.00  0.00  0.00     0  0.00   0.00   0.00
> > > comp442719_c0_seq1   0.00   0.00  35.09   0.00  0.00  0.00     0  0.00   0.00   0.00
> > > comp436057_c0_seq13  3.07   0.00   0.05   0.00  0.00 19.09     0  0.60   0.00   0.00
> > > comp415319_c0_seq20  0.00   0.00   0.00   0.00  0.00  0.00     0  0.00   0.00   0.00
> > > comp428135_c0_seq8  87.00 188.03 177.74 115.65 75.66 26.50     0 41.02 131.62 101.48
> > > comp73458_c0_seq1    0.00   1.00   0.00   0.00  0.00  0.00     0  0.00   0.00   0.00
> > >> data = as.matrix(cots);
> > >> y = round(data);
> > >> head(y)
> > >                   E00R1 E00R2 E00R3 E01R1 E01R2 E01R3 E05R1 E05R2 E05R3 E48R2 E48R3 E96R1 E96R2 E96R3 J00R1 J00R2 J00R3 J01R1 J01R2
> > > comp168996_c1_seq1      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     1
> > > comp442719_c0_seq1      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
> > > comp436057_c0_seq13     0     7    21     0     0     0    22     0     0     0     1     1    10    19     0     0     0     7     3
> > > comp415319_c0_seq20    28     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
> > > comp428135_c0_seq8    250   229   172    58   104   114   187    95    85   135   100     0    65    83   219    59   134    82    87
> > > comp73458_c0_seq1       0     0     0     0     0     0     0     0     2     0     0     0     0     0     0     0     0     0     0
> > >                   J01R3 J05R1 J05R2 J05R3 J48R2 J48R3 J96R1 J96R2 J96R3
> > > comp168996_c1_seq1      0     1     0     0     0     0     0     0     0
> > > comp442719_c0_seq1      0    35     0     0     0     0     0     0     0
> > > comp436057_c0_seq13     0     0     0     0    19     0     1     0     0
> > > comp415319_c0_seq20     0     0     0     0     0     0     0     0     0
> > > comp428135_c0_seq8    188   178   116    76    26     0    41   132   101
> > > comp73458_c0_seq1       1     0     0     0     0     0     0     0     0
> > >>
> > >> dim(data)
> > > [1] 621694     28
> > >>
> > >> targets<-read.table("H:/bip1/RNAseq/Data/targets.txt",header=T)
> > >> targets
> > >  Sample Treat Time
> > > 1   E00R1   Els  00h
> > > 2   E00R2   Els  00h
> > > 3   E00R3   Els  00h
> > > 4   E01R1   Els  01h
> > > 5   E01R2   Els  01h
> > > 6   E01R3   Els  01h
> > > 7   E05R1   Els  05h
> > > 8   E05R2   Els  05h
> > > 9   E05R3   Els  05h
> > > 10  E48R2   Els  48h
> > > 11  E48R3   Els  48h
> > > 12  E96R1   Els  96h
> > > 13  E96R2   Els  96h
> > > 14  E96R3   Els  96h
> > > 15  J00R1   Jon  00h
> > > 16  J00R2   Jon  00h
> > > 17  J00R3   Jon  00h
> > > 18  J01R1   Jon  01h
> > > 19  J01R2   Jon  01h
> > > 20  J01R3   Jon  01h
> > > 21  J05R1   Jon  05h
> > > 22  J05R2   Jon  05h
> > > 23  J05R3   Jon  05h
> > > 24  J48R2   Jon  48h
> > > 25  J48R3   Jon  48h
> > > 26  J96R1   Jon  96h
> > > 27  J96R2   Jon  96h
> > > 28  J96R3   Jon  96h
> > >>
> > >> # setting up a combined factor
> > >>
> > >> Group<-factor(paste(targets$Treat,targets$Time,sep="."))
> > >> cbind(targets,Group=Group)
> > >  Sample Treat Time   Group
> > > 1   E00R1   Els  00h Els.00h
> > > 2   E00R2   Els  00h Els.00h
> > > 3   E00R3   Els  00h Els.00h
> > > 4   E01R1   Els  01h Els.01h
> > > 5   E01R2   Els  01h Els.01h
> > > 6   E01R3   Els  01h Els.01h
> > > 7   E05R1   Els  05h Els.05h
> > > 8   E05R2   Els  05h Els.05h
> > > 9   E05R3   Els  05h Els.05h
> > > 10  E48R2   Els  48h Els.48h
> > > 11  E48R3   Els  48h Els.48h
> > > 12  E96R1   Els  96h Els.96h
> > > 13  E96R2   Els  96h Els.96h
> > > 14  E96R3   Els  96h Els.96h
> > > 15  J00R1   Jon  00h Jon.00h
> > > 16  J00R2   Jon  00h Jon.00h
> > > 17  J00R3   Jon  00h Jon.00h
> > > 18  J01R1   Jon  01h Jon.01h
> > > 19  J01R2   Jon  01h Jon.01h
> > > 20  J01R3   Jon  01h Jon.01h
> > > 21  J05R1   Jon  05h Jon.05h
> > > 22  J05R2   Jon  05h Jon.05h
> > > 23  J05R3   Jon  05h Jon.05h
> > > 24  J48R2   Jon  48h Jon.48h
> > > 25  J48R3   Jon  48h Jon.48h
> > > 26  J96R1   Jon  96h Jon.96h
> > > 27  J96R2   Jon  96h Jon.96h
> > > 28  J96R3   Jon  96h Jon.96h
> > >>
> > >> design<-model.matrix(~0+Group)
> > >> colnames(design)<-levels(Group)
> > >> design
> > >  Els.00h Els.01h Els.05h Els.48h Els.96h Jon.00h Jon.01h Jon.05h Jon.48h Jon.96h
> > > 1        1       0       0       0       0       0       0       0       0       0
> > > 2        1       0       0       0       0       0       0       0       0       0
> > > 3        1       0       0       0       0       0       0       0       0       0
> > > 4        0       1       0       0       0       0       0       0       0       0
> > > 5        0       1       0       0       0       0       0       0       0       0
> > > 6        0       1       0       0       0       0       0       0       0       0
> > > 7        0       0       1       0       0       0       0       0       0       0
> > > 8        0       0       1       0       0       0       0       0       0       0
> > > 9        0       0       1       0       0       0       0       0       0       0
> > > 10       0       0       0       1       0       0       0       0       0       0
> > > 11       0       0       0       1       0       0       0       0       0       0
> > > 12       0       0       0       0       1       0       0       0       0       0
> > > 13       0       0       0       0       1       0       0       0       0       0
> > > 14       0       0       0       0       1       0       0       0       0       0
> > > 15       0       0       0       0       0       1       0       0       0       0
> > > 16       0       0       0       0       0       1       0       0       0       0
> > > 17       0       0       0       0       0       1       0       0       0       0
> > > 18       0       0       0       0       0       0       1       0       0       0
> > > 19       0       0       0       0       0       0       1       0       0       0
> > > 20       0       0       0       0       0       0       1       0       0       0
> > > 21       0       0       0       0       0       0       0       1       0       0
> > > 22       0       0       0       0       0       0       0       1       0       0
> > > 23       0       0       0       0       0       0       0       1       0       0
> > > 24       0       0       0       0       0       0       0       0       1       0
> > > 25       0       0       0       0       0       0       0       0       1       0
> > > 26       0       0       0       0       0       0       0       0       0       1
> > > 27       0       0       0       0       0       0       0       0       0       1
> > > 28       0       0       0       0       0       0       0       0       0       1
> > > attr(,"assign")
> > > [1] 1 1 1 1 1 1 1 1 1 1
> > > attr(,"contrasts")
> > > attr(,"contrasts")$Group
> > > [1] "contr.treatment"
> > >
> > >>
> > >> fit<-glmFit(y,design)
> > > Error in glmFit.default(y, design) : No dispersion values provided.
> > >>
> > >> sessionInfo()
> > > R version 3.0.2 (2013-09-25)
> > > Platform: x86_64-w64-mingw32/x64 (64-bit)
> > >
> > > locale:
> > > [1] LC_COLLATE=Norwegian (Bokmål)_Norway.1252  LC_CTYPE=Norwegian (Bokmål)_Norway.1252    LC_MONETARY=Norwegian (Bokmål)_Norway.1252
> > > [4] LC_NUMERIC=C                               LC_TIME=Norwegian (Bokmål)_Norway.1252  
> > >
> > > attached base packages:
> > > [1] parallel  stats     graphics  grDevices utils     datasets  methods   base   
> > >
> > > other attached packages:
> > > [1] edgeR_3.4.2        limma_3.18.6       DESeq_1.14.0       lattice_0.20-24    locfit_1.5-9.1     Biobase_2.22.0     BiocGenerics_0.8.0
> > >
> > > loaded via a namespace (and not attached):
> > > [1] annotate_1.40.0      AnnotationDbi_1.24.0 DBI_0.2-7            genefilter_1.44.0    geneplotter_1.40.0   grid_3.0.2        
> > > [7] IRanges_1.20.6       RColorBrewer_1.0-5   RSQLite_0.11.4       splines_3.0.2        stats4_3.0.2         survival_2.37-4   
> > > [13] tools_3.0.2          XML_3.98-1.1         xtable_1.7-1
> > >
> > > --
> > > Sent via the guest posting facility at bioconductor.org.
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at r-project.org
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives:
> > > http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list