[BioC] edgeR glmFit
Mark Robinson
mark.robinson at imls.uzh.ch
Fri Dec 20 15:27:23 CET 2013
Hi Jahn,
A few things:
1. Please don't email me directly. USE THE BIOC MAILING LIST. This is the third time I've mentioned this. I've cc'd it again.
2. You are using exactTest() with GLM-based dispersion estimation. This is probably ok, although a bit non-standard. Did you read and follow the case study examples?
3. I think your code is ok. You are comparing J and E at time point 0 and using different subsets of the whole dataset, you get different results. The reason is that the dispersion is estimated differently over different sets of replicates. I can't really tell what is happening without seeing the MDS plot and the BCV curves, but perhaps some of your replicates are wonky, or at least more variable in some conditions than others.
4. 620000 is a lot of transcripts. Lots of issues here, such as adaptor trimming before assembly (among others):
http://pathogenomics.bham.ac.uk/blog/2013/04/adaptor-trim-or-die-experiences-with-nextera-libraries/
Plus, I'm not sure how your counts were done, but you may want to consolidate the transcripts into clusters for gene-level counting. In my experience, this is never a clean cut process, but you might consider using Corset or something similar:
https://code.google.com/p/corset-project/
Best wishes, Mark
----------
Prof. Dr. Mark Robinson
Bioinformatics, Institute of Molecular Life Sciences
University of Zurich
http://ow.ly/riRea
On 20.12.2013, at 10:13, Jahn Davik <jahn.davik at bioforsk.no> wrote:
> Hi Mark,
> Thanks for your feedback.
> I had to take a couple of steps back in order to get this working, and I guess I still have some way to go here. I come from the SAS world and R is a little cryptic to me. Anyway, following your advice I apparently got the code working.
> Still, there is confusion (in my head).
>
> The data I eventually want to play with is the output from Trinity containing ~620000 transcripts and 30 experimental cells (-> two genotypes sampled at 5 time points with 3 biological replicates, giving a 620000 by 30 matrix).
> So, to build on your last input, I made a subsample of this matrix and did only one time point.
>
> CODE RUN LIKE THIS:
>
> cots<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix",header=T)
> head(cots)
>
> data = as.matrix(cots);
> data = round(data);
> head(data)
> # write.table(data, file="H:/bip1/RNAseq/Data/round.matrix.txt", quote=F, sep="\t")
>
> keep<-rowSums(cpm(data)>2)>=3
> cots<-data[keep,]
> table(keep)
>
> x<-cots[,c("E00R1","E00R2","E00R3","J00R1","J00R2","J00R3")]
> head(x)
>
> dim(x)
>
> Group<-factor(substring(colnames(x),1,1))
> Group
>
> # QUICK START:
> # Classic part
>
> y<-DGEList(counts=x,group=Group)
> y<-calcNormFactors(y)
>
> plotMDS(y)
>
> # estimating dispersion
> # over all genes
> y<-estimateGLMCommonDisp(y)
>
> # genewise
> y<-estimateGLMTrendedDisp(y)
> y<-estimateGLMTagwiseDisp(y)
>
> plotBCV(y)
>
> et<-exactTest(y)
> topTags(et)
>
>
> YIELDING THIS OUTPUT AMONG FACTORS:
> > topTags(et)
> Comparison of groups: J-E
> logFC logCPM PValue FDR
> comp406486_c0_seq1 -8.08 5.78 5.47e-05 0.117
> comp420150_c0_seq1 9.16 4.97 5.73e-05 0.117
> comp418672_c0_seq18 -16.39 6.73 6.11e-05 0.117
> comp391070_c0_seq1 7.70 6.70 6.20e-05 0.117
> comp421831_c2_seq2 -6.18 6.31 6.59e-05 0.117
> comp418672_c0_seq7 15.97 6.32 6.72e-05 0.117
> comp437190_c0_seq3 -5.80 6.79 7.13e-05 0.117
> comp431729_c0_seq18 -15.80 6.14 7.27e-05 0.117
> comp391070_c0_seq7 7.55 8.99 7.55e-05 0.117
> comp426823_c0_seq5 -15.55 5.89 8.29e-05 0.117
> Then I expanded the matrix with another time point (the last one in my experiment), and wanted to do the same comparison, e.g., comparing the J and E at time point 00.
> Here is the code I ran:
>
> library("edgeR")
>
> # setwd("H:/bip1/RNASeq/isoforms")
> # getwd()
>
> options(digits=3)
>
>
> ## Read data
>
> cots<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix",header=T)
> head(cots)
>
> data = as.matrix(cots);
> data = round(data);
> head(data)
> # write.table(data, file="H:/bip1/RNAseq/Data/round.matrix.txt", quote=F, sep="\t")
>
> keep<-rowSums(cpm(data)>2)>=3
> cots<-data[keep,]
> table(keep)
>
> x<-cots[,c("E00R1","E00R2","E00R3","J00R1","J00R2","J00R3","E96R1","E96R2","E96R3","J96R1","J96R2","J96R3")]
> head(x)
>
> dim(x)
>
> Group<-factor(substring(colnames(x),1,3))
> Group
>
> # QUICK START:
> # Classic part
>
> y<-DGEList(counts=x,group=Group)
> y$samples
>
> y<-calcNormFactors(y)
>
> plotMDS(y)
>
> # estimating dispersion
> # over all genes
> y<-estimateGLMCommonDisp(y)
>
> # genewise
> y<-estimateGLMTrendedDisp(y)
> y<-estimateGLMTagwiseDisp(y)
>
> plotBCV(y)
>
> et<-exactTest(y, pair=c("E00","J00"))
> topTags(et)
>
>
> GIVING THE FOLLOWING OUTPUT:
>
> Comparison of groups: J00-E00
> logFC logCPM PValue FDR
> comp436540_c0_seq6 -13.48 4.66 1.54e-05 0.416
> comp437851_c0_seq10 -12.68 3.14 2.63e-05 0.416
> comp436243_c0_seq45 12.79 3.65 3.78e-05 0.416
> comp434611_c0_seq48 12.51 3.12 4.02e-05 0.416
> comp420222_c0_seq1 12.21 3.12 4.58e-05 0.416
> comp425919_c0_seq13 -11.99 2.75 5.65e-05 0.416
> comp435680_c0_seq21 6.35 4.84 5.83e-05 0.416
> comp435275_c0_seq9 -6.80 4.93 6.03e-05 0.416
> comp434001_c0_seq7 -12.18 2.79 6.44e-05 0.416
> comp437026_c0_seq13 12.81 3.68 8.56e-05 0.416
>
>
>
>
> Now, I wonder why not these two runs give the same results? One possibility is of course that I code erroneously, but I realize that there may also be statistical explanations.
> I would be very happy to hear from you.
>
> Best regards
> Jahn
>
>
>
>
>
>
>
>
>
>
> -----Opprinnelig melding-----
> Fra: Mark Robinson [mailto:mark.robinson at imls.uzh.ch]
> Sendt: 19. desember 2013 22:15
> Til: Jahn Davik
> Kopi: bioconductor at r-project.org mailman
> Emne: Re: [BioC] edgeR glmFit
>
> Hi Jahn,
>
> I've cc'd the BioC mailing list; it's best to keep a public record of these exchanges, so others can benefit.
>
> So, I believe your problem is a matter of the class of the object that you are sending to estimateGLMCommonDisp(), etc.. Here is an example:
>
> > y <- matrix(rnbinom(1000,mu=10,size=10),ncol=4)
> > d <- DGEList(counts=y,group=c(1,1,2,2))
> > design <- model.matrix(~group, data=d$samples)
> > d1 <- estimateGLMCommonDisp(y, design, verbose=TRUE)
> Disp = 0.08132 , BCV = 0.2852
> > class(d1)
> [1] "numeric"
> > d1 <- estimateGLMCommonDisp(d, design, verbose=TRUE)
> Disp = 0.08132 , BCV = 0.2852
> > class(d1)
> [1] "DGEList"
> attr(,"package")
> [1] "edgeR"
>
> (if you send a count table, you get a different returned object than if you send a DGEList object)
>
> Sorry, I didn't completely spell it out in my last email, but it's there in the Quick Start and also in the case studies; typically, the starting point is creating a 'DGEList' object of the count table, which is the container that stores all the downstream components as well.
>
> So, I would change your code to (roughly):
>
> [… read data in …]
>
> y <- y[keep,]
>
> d <- DGEList(counts=y,group=Group)
> d <- calcNormFactors(d)
>
> [… define design matrix …]
>
> plotMDS(d) # it's always a good plan to look at this
>
> d <- estimateGLMCommonDisp(d,design)
> d <- estimateGLMTrendedDisp(d,design)
> d <- estimateGLMTagwiseDisp(d,design)
>
> And, so on …
>
> Cheers, Mark
>
> ----------
> Prof. Dr. Mark Robinson
> Bioinformatics, Institute of Molecular Life Sciences University of Zurich http://ow.ly/riRea
>
>
>
>
>
>
> On 19.12.2013, at 13:55, Jahn Davik <jahn.davik at bioforsk.no> wrote:
>
> > Mark,
> > I'm afraid I have to bother you again. Sorry about that.
> > Anyway. An indication of where I loose it is appreciated.
> >
> > This is what I send to R:
> >
> > y<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix",heade
> > r=T)
> > head(y)
> > y = as.matrix(cots);
> > y = round(data);
> > head(y)
> >
> > dim(y)
> >
> > targets<-read.table("H:/bip1/RNAseq/Data/targets.txt",header=T)
> > targets
> >
> > #
> > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > XXXXXXX
> > # Filtrering.
> > # Make sure how you come up with these number !!!
> > # z<-data
> >
> > keep<-rowSums(cpm(y)>2)>=3
> > y<-y[keep,]
> > table(keep)
> > #
> > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > XXXXxXX
> >
> >
> > ## Chapt 3.3.1 i Users Guide
> >
> > # setting up a combined factor
> >
> > Group<-factor(paste(targets$Treat,targets$Time,sep="."))
> > cbind(targets,Group=Group)
> >
> > design<-model.matrix(~0+Group)
> > design
> > colnames(design)<-levels(Group)
> > design
> >
> > y<-estimateGLMCommonDisp(y,design)
> > y<-estimateGLMTrendedDisp(y,design)
> > y<-estimateGLMTagwiseDisp(y,design)
> > fit<-glmFit(y,design)
> >
> >
> > # IT WORKS FINE (I BELIEVE) ALL THE WAY TO estimate GLMTrendedDisp(y,design), WHERE IT RETURNS:
> >
> > Error in return(NA, ntags) : multi-argument returns are not permitted
> > In addition: Warning message:
> > In estimateGLMTrendedDisp.default(y, design) :
> > No residual df: cannot estimate dispersion
> >
> >
> > THE OUTPUT IS LIKE THIS:
> >
> >
> > > y = as.matrix(cots);
> > > y = round(data);
> > > head(y)
> > E00R1 E00R2 E00R3 E01R1 E01R2 E01R3 E05R1 E05R2 E05R3 E48R2 E48R3 E96R1 E96R2 E96R3 J00R1 J00R2 J00R3 J01R1 J01R2 J01R3 J05R1 J05R2
> > comp168996_c1_seq1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
> > comp442719_c0_seq1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 35 0
> > comp436057_c0_seq13 0 7 21 0 0 0 22 0 0 0 1 1 10 19 0 0 0 7 3 0 0 0
> > comp415319_c0_seq20 28 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > comp428135_c0_seq8 250 229 172 58 104 114 187 95 85 135 100 0 65 83 219 59 134 82 87 188 178 116
> > comp73458_c0_seq1 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 1 0 0
> > J05R3 J48R2 J48R3 J96R1 J96R2 J96R3
> > comp168996_c1_seq1 0 0 0 0 0 0
> > comp442719_c0_seq1 0 0 0 0 0 0
> > comp436057_c0_seq13 0 19 0 1 0 0
> > comp415319_c0_seq20 0 0 0 0 0 0
> > comp428135_c0_seq8 76 26 0 41 132 101
> > comp73458_c0_seq1 0 0 0 0 0 0
> > >
> > > dim(y)
> > [1] 621694 28
> > >
> > > targets<-read.table("H:/bip1/RNAseq/Data/targets.txt",header=T)
> > > targets
> > Sample Treat Time
> > 1 E00R1 Els 00h
> > 2 E00R2 Els 00h
> > 3 E00R3 Els 00h
> > 4 E01R1 Els 01h
> > 5 E01R2 Els 01h
> > 6 E01R3 Els 01h
> > 7 E05R1 Els 05h
> > 8 E05R2 Els 05h
> > 9 E05R3 Els 05h
> > 10 E48R2 Els 48h
> > 11 E48R3 Els 48h
> > 12 E96R1 Els 96h
> > 13 E96R2 Els 96h
> > 14 E96R3 Els 96h
> > 15 J00R1 Jon 00h
> > 16 J00R2 Jon 00h
> > 17 J00R3 Jon 00h
> > 18 J01R1 Jon 01h
> > 19 J01R2 Jon 01h
> > 20 J01R3 Jon 01h
> > 21 J05R1 Jon 05h
> > 22 J05R2 Jon 05h
> > 23 J05R3 Jon 05h
> > 24 J48R2 Jon 48h
> > 25 J48R3 Jon 48h
> > 26 J96R1 Jon 96h
> > 27 J96R2 Jon 96h
> > 28 J96R3 Jon 96h
> > >
> > > #
> > > XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> > > XXXXXXXXX
> > > # Filtrering.
> > > # Make sure how you come up with these number !!!
> > > # z<-data
> > >
> > > keep<-rowSums(cpm(y)>2)>=3
> > > y<-y[keep,]
> > > table(keep)
> > keep
> > FALSE TRUE
> > 558829 62865
> > > Group<-factor(paste(targets$Treat,targets$Time,sep="."))
> > > cbind(targets,Group=Group)
> > Sample Treat Time Group
> > 1 E00R1 Els 00h Els.00h
> > 2 E00R2 Els 00h Els.00h
> > 3 E00R3 Els 00h Els.00h
> > 4 E01R1 Els 01h Els.01h
> > 5 E01R2 Els 01h Els.01h
> > 6 E01R3 Els 01h Els.01h
> > 7 E05R1 Els 05h Els.05h
> > 8 E05R2 Els 05h Els.05h
> > 9 E05R3 Els 05h Els.05h
> > 10 E48R2 Els 48h Els.48h
> > 11 E48R3 Els 48h Els.48h
> > 12 E96R1 Els 96h Els.96h
> > 13 E96R2 Els 96h Els.96h
> > 14 E96R3 Els 96h Els.96h
> > 15 J00R1 Jon 00h Jon.00h
> > 16 J00R2 Jon 00h Jon.00h
> > 17 J00R3 Jon 00h Jon.00h
> > 18 J01R1 Jon 01h Jon.01h
> > 19 J01R2 Jon 01h Jon.01h
> > 20 J01R3 Jon 01h Jon.01h
> > 21 J05R1 Jon 05h Jon.05h
> > 22 J05R2 Jon 05h Jon.05h
> > 23 J05R3 Jon 05h Jon.05h
> > 24 J48R2 Jon 48h Jon.48h
> > 25 J48R3 Jon 48h Jon.48h
> > 26 J96R1 Jon 96h Jon.96h
> > 27 J96R2 Jon 96h Jon.96h
> > 28 J96R3 Jon 96h Jon.96h
> > > design<-model.matrix(~0+Group)
> > > design
> > GroupEls.00h GroupEls.01h GroupEls.05h GroupEls.48h GroupEls.96h GroupJon.00h GroupJon.01h GroupJon.05h GroupJon.48h GroupJon.96h
> > 1 1 0 0 0 0 0 0 0 0 0
> > 2 1 0 0 0 0 0 0 0 0 0
> > 3 1 0 0 0 0 0 0 0 0 0
> > 4 0 1 0 0 0 0 0 0 0 0
> > 5 0 1 0 0 0 0 0 0 0 0
> > 6 0 1 0 0 0 0 0 0 0 0
> > 7 0 0 1 0 0 0 0 0 0 0
> > 8 0 0 1 0 0 0 0 0 0 0
> > 9 0 0 1 0 0 0 0 0 0 0
> > 10 0 0 0 1 0 0 0 0 0 0
> > 11 0 0 0 1 0 0 0 0 0 0
> > 12 0 0 0 0 1 0 0 0 0 0
> > 13 0 0 0 0 1 0 0 0 0 0
> > 14 0 0 0 0 1 0 0 0 0 0
> > 15 0 0 0 0 0 1 0 0 0 0
> > 16 0 0 0 0 0 1 0 0 0 0
> > 17 0 0 0 0 0 1 0 0 0 0
> > 18 0 0 0 0 0 0 1 0 0 0
> > 19 0 0 0 0 0 0 1 0 0 0
> > 20 0 0 0 0 0 0 1 0 0 0
> > 21 0 0 0 0 0 0 0 1 0 0
> > 22 0 0 0 0 0 0 0 1 0 0
> > 23 0 0 0 0 0 0 0 1 0 0
> > 24 0 0 0 0 0 0 0 0 1 0
> > 25 0 0 0 0 0 0 0 0 1 0
> > 26 0 0 0 0 0 0 0 0 0 1
> > 27 0 0 0 0 0 0 0 0 0 1
> > 28 0 0 0 0 0 0 0 0 0 1
> > attr(,"assign")
> > [1] 1 1 1 1 1 1 1 1 1 1
> > attr(,"contrasts")
> > attr(,"contrasts")$Group
> > [1] "contr.treatment"
> >
> > > colnames(design)<-levels(Group)
> > > design
> > Els.00h Els.01h Els.05h Els.48h Els.96h Jon.00h Jon.01h Jon.05h Jon.48h Jon.96h
> > 1 1 0 0 0 0 0 0 0 0 0
> > 2 1 0 0 0 0 0 0 0 0 0
> > 3 1 0 0 0 0 0 0 0 0 0
> > 4 0 1 0 0 0 0 0 0 0 0
> > 5 0 1 0 0 0 0 0 0 0 0
> > 6 0 1 0 0 0 0 0 0 0 0
> > 7 0 0 1 0 0 0 0 0 0 0
> > 8 0 0 1 0 0 0 0 0 0 0
> > 9 0 0 1 0 0 0 0 0 0 0
> > 10 0 0 0 1 0 0 0 0 0 0
> > 11 0 0 0 1 0 0 0 0 0 0
> > 12 0 0 0 0 1 0 0 0 0 0
> > 13 0 0 0 0 1 0 0 0 0 0
> > 14 0 0 0 0 1 0 0 0 0 0
> > 15 0 0 0 0 0 1 0 0 0 0
> > 16 0 0 0 0 0 1 0 0 0 0
> > 17 0 0 0 0 0 1 0 0 0 0
> > 18 0 0 0 0 0 0 1 0 0 0
> > 19 0 0 0 0 0 0 1 0 0 0
> > 20 0 0 0 0 0 0 1 0 0 0
> > 21 0 0 0 0 0 0 0 1 0 0
> > 22 0 0 0 0 0 0 0 1 0 0
> > 23 0 0 0 0 0 0 0 1 0 0
> > 24 0 0 0 0 0 0 0 0 1 0
> > 25 0 0 0 0 0 0 0 0 1 0
> > 26 0 0 0 0 0 0 0 0 0 1
> > 27 0 0 0 0 0 0 0 0 0 1
> > 28 0 0 0 0 0 0 0 0 0 1
> > attr(,"assign")
> > [1] 1 1 1 1 1 1 1 1 1 1
> > attr(,"contrasts")
> > attr(,"contrasts")$Group
> > [1] "contr.treatment"
> >
> > > y<-estimateGLMCommonDisp(y,design)
> > > y<-estimateGLMTrendedDisp(y,design)
> > Error in return(NA, ntags) : multi-argument returns are not permitted
> > In addition: Warning message:
> > In estimateGLMTrendedDisp.default(y, design) :
> > No residual df: cannot estimate dispersion
> > > y<-estimateGLMTagwiseDisp(y,design)
> > Warning message:
> > In estimateGLMTagwiseDisp.default(y, design) :
> > No residual df: setting dispersion to NA
> > >
> >
> >
> > Thank you
> > jahn
> >
> >
> >
> >
> >
> > -----Opprinnelig melding-----
> > Fra: Mark Robinson [mailto:mark.robinson at imls.uzh.ch]
> > Sendt: 18. desember 2013 16:12
> > Til: Jahn Davik [guest]
> > Kopi: bioconductor at r-project.org; Jahn Davik
> > Emne: Re: [BioC] edgeR glmFit
> >
> > Hi Jahn,
> >
> > Chapter 3 demonstrates the setup of design matrices; 3.1 says "In this chapter, we outline the principles for setting up the design matrix and forming contrasts for some typical experimental designs". You would then combine that with what is shown in "1.4 Quick start" to do all the steps, something like:
> >
> > > design <- model.matrix(~<add yours here>) y <-
> > > estimateGLMCommonDisp(y,design) y <-
> > > estimateGLMTrendedDisp(y,design) y <-
> > > estimateGLMTagwiseDisp(y,design) fit <- glmFit(y,design) lrt <-
> > > glmLRT(fit,coef=<add yours here>)
> > > topTags(lrt)
> >
> > In particular, you might have a look at some of the case studies in Chapter 4, for example 4.4 (but using your design matrix).
> >
> > Hope that helps.
> >
> > Best, Mark
> >
> >
> > ----------
> > Prof. Dr. Mark Robinson
> > Bioinformatics, Institute of Molecular Life Sciences University of
> > Zurich http://ow.ly/riRea
> >
> >
> >
> >
> >
> >
> > On 18.12.2013, at 11:21, Jahn Davik [guest] <guest at bioconductor.org> wrote:
> >
> > >
> > > I am working my way through the edgeR User's Guide and following section 3.3, I encounter a problem. I run the following commands - using my own counts data:
> > > ## Chapt 3.3 i Users Guide
> > >
> > > cots<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix",
> > > he
> > > ader=T)
> > > head(cots)
> > > data = as.matrix(cots);
> > > y = round(data);
> > > head(y)
> > >
> > > dim(data)
> > >
> > > targets<-read.table("H:/bip1/RNAseq/Data/targets.txt",header=T)
> > > targets
> > >
> > > # setting up a combined factor
> > >
> > > Group<-factor(paste(targets$Treat,targets$Time,sep="."))
> > > cbind(targets,Group=Group)
> > >
> > > design<-model.matrix(~0+Group)
> > > colnames(design)<-levels(Group)
> > > design
> > >
> > > fit<-glmFit(y,design)
> > >
> > > sessionInfo()
> > >
> > > Resulting in 'Error in glmFit.default(y, design) : No dispersion values provided.'
> > >
> > > I do not quite see where my mistake is and would appreciate help.
> > > Thank you.
> > >
> > > jahn
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > -- output of sessionInfo():
> > >
> > > cots<-read.table("H:/bip1/RNAseq/Data/transcriptsRed.counts.matrix",
> > > he
> > > ader=T)
> > >> head(cots)
> > > E00R1 E00R2 E00R3 E01R1 E01R2 E01R3 E05R1 E05R2 E05R3 E48R2 E48R3 E96R1 E96R2 E96R3 J00R1 J00R2 J00R3 J01R1
> > > comp168996_c1_seq1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> > > comp442719_c0_seq1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> > > comp436057_c0_seq13 0.00 7.38 21.04 0.00 0.00 0.21 21.65 0.00 0.00 0.00 0.86 0.57 9.69 18.73 0.00 0.00 0.00 7.14
> > > comp415319_c0_seq20 28.17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> > > comp428135_c0_seq8 249.73 228.53 172.18 57.52 104.48 113.96 187.05 94.64 84.65 134.65 99.82 0.00 65.46 82.83 219.37 59.18 134.04 81.56
> > > comp73458_c0_seq1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> > > J01R2 J01R3 J05R1 J05R2 J05R3 J48R2 J48R3 J96R1 J96R2 J96R3
> > > comp168996_c1_seq1 1.00 0.00 1.00 0.00 0.00 0.00 0 0.00 0.00 0.00
> > > comp442719_c0_seq1 0.00 0.00 35.09 0.00 0.00 0.00 0 0.00 0.00 0.00
> > > comp436057_c0_seq13 3.07 0.00 0.05 0.00 0.00 19.09 0 0.60 0.00 0.00
> > > comp415319_c0_seq20 0.00 0.00 0.00 0.00 0.00 0.00 0 0.00 0.00 0.00
> > > comp428135_c0_seq8 87.00 188.03 177.74 115.65 75.66 26.50 0 41.02 131.62 101.48
> > > comp73458_c0_seq1 0.00 1.00 0.00 0.00 0.00 0.00 0 0.00 0.00 0.00
> > >> data = as.matrix(cots);
> > >> y = round(data);
> > >> head(y)
> > > E00R1 E00R2 E00R3 E01R1 E01R2 E01R3 E05R1 E05R2 E05R3 E48R2 E48R3 E96R1 E96R2 E96R3 J00R1 J00R2 J00R3 J01R1 J01R2
> > > comp168996_c1_seq1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
> > > comp442719_c0_seq1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > > comp436057_c0_seq13 0 7 21 0 0 0 22 0 0 0 1 1 10 19 0 0 0 7 3
> > > comp415319_c0_seq20 28 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > > comp428135_c0_seq8 250 229 172 58 104 114 187 95 85 135 100 0 65 83 219 59 134 82 87
> > > comp73458_c0_seq1 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0
> > > J01R3 J05R1 J05R2 J05R3 J48R2 J48R3 J96R1 J96R2 J96R3
> > > comp168996_c1_seq1 0 1 0 0 0 0 0 0 0
> > > comp442719_c0_seq1 0 35 0 0 0 0 0 0 0
> > > comp436057_c0_seq13 0 0 0 0 19 0 1 0 0
> > > comp415319_c0_seq20 0 0 0 0 0 0 0 0 0
> > > comp428135_c0_seq8 188 178 116 76 26 0 41 132 101
> > > comp73458_c0_seq1 1 0 0 0 0 0 0 0 0
> > >>
> > >> dim(data)
> > > [1] 621694 28
> > >>
> > >> targets<-read.table("H:/bip1/RNAseq/Data/targets.txt",header=T)
> > >> targets
> > > Sample Treat Time
> > > 1 E00R1 Els 00h
> > > 2 E00R2 Els 00h
> > > 3 E00R3 Els 00h
> > > 4 E01R1 Els 01h
> > > 5 E01R2 Els 01h
> > > 6 E01R3 Els 01h
> > > 7 E05R1 Els 05h
> > > 8 E05R2 Els 05h
> > > 9 E05R3 Els 05h
> > > 10 E48R2 Els 48h
> > > 11 E48R3 Els 48h
> > > 12 E96R1 Els 96h
> > > 13 E96R2 Els 96h
> > > 14 E96R3 Els 96h
> > > 15 J00R1 Jon 00h
> > > 16 J00R2 Jon 00h
> > > 17 J00R3 Jon 00h
> > > 18 J01R1 Jon 01h
> > > 19 J01R2 Jon 01h
> > > 20 J01R3 Jon 01h
> > > 21 J05R1 Jon 05h
> > > 22 J05R2 Jon 05h
> > > 23 J05R3 Jon 05h
> > > 24 J48R2 Jon 48h
> > > 25 J48R3 Jon 48h
> > > 26 J96R1 Jon 96h
> > > 27 J96R2 Jon 96h
> > > 28 J96R3 Jon 96h
> > >>
> > >> # setting up a combined factor
> > >>
> > >> Group<-factor(paste(targets$Treat,targets$Time,sep="."))
> > >> cbind(targets,Group=Group)
> > > Sample Treat Time Group
> > > 1 E00R1 Els 00h Els.00h
> > > 2 E00R2 Els 00h Els.00h
> > > 3 E00R3 Els 00h Els.00h
> > > 4 E01R1 Els 01h Els.01h
> > > 5 E01R2 Els 01h Els.01h
> > > 6 E01R3 Els 01h Els.01h
> > > 7 E05R1 Els 05h Els.05h
> > > 8 E05R2 Els 05h Els.05h
> > > 9 E05R3 Els 05h Els.05h
> > > 10 E48R2 Els 48h Els.48h
> > > 11 E48R3 Els 48h Els.48h
> > > 12 E96R1 Els 96h Els.96h
> > > 13 E96R2 Els 96h Els.96h
> > > 14 E96R3 Els 96h Els.96h
> > > 15 J00R1 Jon 00h Jon.00h
> > > 16 J00R2 Jon 00h Jon.00h
> > > 17 J00R3 Jon 00h Jon.00h
> > > 18 J01R1 Jon 01h Jon.01h
> > > 19 J01R2 Jon 01h Jon.01h
> > > 20 J01R3 Jon 01h Jon.01h
> > > 21 J05R1 Jon 05h Jon.05h
> > > 22 J05R2 Jon 05h Jon.05h
> > > 23 J05R3 Jon 05h Jon.05h
> > > 24 J48R2 Jon 48h Jon.48h
> > > 25 J48R3 Jon 48h Jon.48h
> > > 26 J96R1 Jon 96h Jon.96h
> > > 27 J96R2 Jon 96h Jon.96h
> > > 28 J96R3 Jon 96h Jon.96h
> > >>
> > >> design<-model.matrix(~0+Group)
> > >> colnames(design)<-levels(Group)
> > >> design
> > > Els.00h Els.01h Els.05h Els.48h Els.96h Jon.00h Jon.01h Jon.05h Jon.48h Jon.96h
> > > 1 1 0 0 0 0 0 0 0 0 0
> > > 2 1 0 0 0 0 0 0 0 0 0
> > > 3 1 0 0 0 0 0 0 0 0 0
> > > 4 0 1 0 0 0 0 0 0 0 0
> > > 5 0 1 0 0 0 0 0 0 0 0
> > > 6 0 1 0 0 0 0 0 0 0 0
> > > 7 0 0 1 0 0 0 0 0 0 0
> > > 8 0 0 1 0 0 0 0 0 0 0
> > > 9 0 0 1 0 0 0 0 0 0 0
> > > 10 0 0 0 1 0 0 0 0 0 0
> > > 11 0 0 0 1 0 0 0 0 0 0
> > > 12 0 0 0 0 1 0 0 0 0 0
> > > 13 0 0 0 0 1 0 0 0 0 0
> > > 14 0 0 0 0 1 0 0 0 0 0
> > > 15 0 0 0 0 0 1 0 0 0 0
> > > 16 0 0 0 0 0 1 0 0 0 0
> > > 17 0 0 0 0 0 1 0 0 0 0
> > > 18 0 0 0 0 0 0 1 0 0 0
> > > 19 0 0 0 0 0 0 1 0 0 0
> > > 20 0 0 0 0 0 0 1 0 0 0
> > > 21 0 0 0 0 0 0 0 1 0 0
> > > 22 0 0 0 0 0 0 0 1 0 0
> > > 23 0 0 0 0 0 0 0 1 0 0
> > > 24 0 0 0 0 0 0 0 0 1 0
> > > 25 0 0 0 0 0 0 0 0 1 0
> > > 26 0 0 0 0 0 0 0 0 0 1
> > > 27 0 0 0 0 0 0 0 0 0 1
> > > 28 0 0 0 0 0 0 0 0 0 1
> > > attr(,"assign")
> > > [1] 1 1 1 1 1 1 1 1 1 1
> > > attr(,"contrasts")
> > > attr(,"contrasts")$Group
> > > [1] "contr.treatment"
> > >
> > >>
> > >> fit<-glmFit(y,design)
> > > Error in glmFit.default(y, design) : No dispersion values provided.
> > >>
> > >> sessionInfo()
> > > R version 3.0.2 (2013-09-25)
> > > Platform: x86_64-w64-mingw32/x64 (64-bit)
> > >
> > > locale:
> > > [1] LC_COLLATE=Norwegian (Bokmål)_Norway.1252 LC_CTYPE=Norwegian (Bokmål)_Norway.1252 LC_MONETARY=Norwegian (Bokmål)_Norway.1252
> > > [4] LC_NUMERIC=C LC_TIME=Norwegian (Bokmål)_Norway.1252
> > >
> > > attached base packages:
> > > [1] parallel stats graphics grDevices utils datasets methods base
> > >
> > > other attached packages:
> > > [1] edgeR_3.4.2 limma_3.18.6 DESeq_1.14.0 lattice_0.20-24 locfit_1.5-9.1 Biobase_2.22.0 BiocGenerics_0.8.0
> > >
> > > loaded via a namespace (and not attached):
> > > [1] annotate_1.40.0 AnnotationDbi_1.24.0 DBI_0.2-7 genefilter_1.44.0 geneplotter_1.40.0 grid_3.0.2
> > > [7] IRanges_1.20.6 RColorBrewer_1.0-5 RSQLite_0.11.4 splines_3.0.2 stats4_3.0.2 survival_2.37-4
> > > [13] tools_3.0.2 XML_3.98-1.1 xtable_1.7-1
> > >
> > > --
> > > Sent via the guest posting facility at bioconductor.org.
> > >
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor at r-project.org
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives:
> > > http://news.gmane.org/gmane.science.biology.informatics.conductor
More information about the Bioconductor
mailing list