[Bioc-sig-seq] Normalization in RNA-seq

Wed Nov 24 22:06:11 CET 2010

Dear Mayte

What do these plots look like when you make them separately for each 
subject? (In addition, you could colour the dots according to whether 
the differential expression analysis for the overall dataset calls them 
'significant'.)

Also, if you compute the M values for each patient separately, what does 
the pairs plot (scatterplot matrix) look like?
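
A rough sketch of what I mean, in base R. Everything here is an
assumption for illustration: 'counts' stands in for PScounts$counts, the
values are simulated, and the M values are plain differences of log2
counts per million (LS minus NL) within each patient, not what plotSmear
computes internally.

```r
# Illustrative stand-in for PScounts$counts (genes x samples); simulated values.
set.seed(1)
samples <- c("LS-25", "LS-28", "LS-29", "NL-25", "NL-28", "NL-29")
counts <- matrix(rnbinom(6 * 500, mu = 100, size = 5), ncol = 6,
                 dimnames = list(paste0("gene", 1:500), samples))

# log2 counts per million, with a small offset to avoid log(0)
lib.size <- colSums(counts)
logcpm <- log2(t(t(counts + 0.5) / (lib.size + 1)) * 1e6)

patients <- c("25", "28", "29")

# Per-patient M (LS minus NL) and A (mean of the two) values
M <- sapply(patients, function(p)
  logcpm[, paste0("LS-", p)] - logcpm[, paste0("NL-", p)])
A <- sapply(patients, function(p)
  (logcpm[, paste0("LS-", p)] + logcpm[, paste0("NL-", p)]) / 2)

# One MA plot per patient
par(mfrow = c(1, 3))
for (p in patients) {
  plot(A[, p], M[, p], pch = ".", xlab = "A", ylab = "M",
       main = paste("Patient", p))
  abline(h = 0, col = "red")
}

# Scatterplot matrix of the per-patient M values
pairs(M, pch = ".")
```

If the skew towards LS shows up in only one of the three panels, or one
patient's M values are poorly correlated with the other two in the pairs
plot, that would point to a single sample rather than to the
normalization method.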

On a completely unrelated note, I recently saw a movie about studies 
with 3 patients: http://www.xtranormal.com/watch/6878253

	Best wishes
	Wolfgang

On Nov/24/10 7:38 PM, Mayte Suarez-Farinas wrote:
> Dear All,
> I have been working with paired samples from 3 subjects using the edgeR
> package, and I am puzzled by the results of my normalization. After
> normalization, the data are skewed towards the LS group, and as a result
> I get many more up-regulated than down-regulated genes. We have studied
> this disease extensively in large samples with microarrays and this is
> not the case there, so now I am suspicious of my normalization.
> I am including the code and a PDF with the smear plots for the
> normalization options in edgeR. With all of them the data look worse
> after normalization than before.
> If someone could look at what I did and point out any mistake, I would
> really appreciate it.
> I don't know whether the problem is that I am deleting the unmapped reads
> before normalization.
> I was instructed to do so on the SEQanswers forum.
>
>
> ## Reading Files
> files<- dir(pattern="counts\\.txt$")
> files.pheno<-data.frame(files=files,
> group=factor(substr(files,1,2),levels=c("NL","LS")),
> Patient=factor(substr(files,3,4)))
> PScounts<-readDGE(files.pheno)
> colnames(PScounts)<-paste(PScounts$samples$group,PScounts$samples$Patient,sep='-')
>
> ## delete unmapped reads
> unmapped<-c('no_feature','ambiguous','not aligned','too low aQual')
> # logical indexing stays correct even if none of these rows is present
> PScounts<-PScounts[!(rownames(PScounts$counts)%in%unmapped),]
>
> #Calculate Normalizations
> d.PS<- calcNormFactors(PScounts)
> pdf('Normalization Plots.pdf',height=10,width=10)
> layout(matrix(1:4,2,2,byrow=TRUE))
> a<-plotSmear(PScounts,
> panel.first=grid(),smooth.scatter=FALSE,main='before normalization')
> ma.plot(a$A,a$M,plot.method='add',cex=0)
> b<-plotSmear(d.PS, panel.first=grid(),smooth.scatter=FALSE,main='after TMM')
> ma.plot(b$A,b$M,plot.method='add',cex=0)
> rm(b)
> d.PS.2<- calcNormFactors(PScounts,method='RLE')
> b<-plotSmear(d.PS.2, panel.first=grid(),smooth.scatter=FALSE,main='after RLE')
> ma.plot(b$A,b$M,plot.method='add',cex=0)
> rm(b)
> d.PS.3<- calcNormFactors(PScounts,method='quantile')
> b<-plotSmear(d.PS.3, panel.first=grid(),smooth.scatter=FALSE,
> main='after quantile')
> ma.plot(b$A,b$M,plot.method='add',cex=0)
> rm(b)
> dev.off()
>
>>  d.PS$sample ###(after TMM)
>       files            group Patient lib.size norm.factors
> LS-25 LS252.counts.txt LS    25      23067191 0.9085
> LS-28 LS287.counts.txt LS    28      20684675 0.9056
> LS-29 LS292.counts.txt LS    29      19881245 0.9965
> NL-25 NL251.counts.txt NL    25      19665929 1.0129
> NL-28 NL286.counts.txt NL    28      22938039 1.1554
> NL-29 NL291.counts.txt NL    29      20541691 1.0422
>>
>>  d.PS.2$sample ###after RLE
>       files            group Patient lib.size norm.factors
> LS-25 LS252.counts.txt LS    25      23067191 0.9495
> LS-28 LS287.counts.txt LS    28      20684675 0.9898
> LS-29 LS292.counts.txt LS    29      19881245 1.0385
> NL-25 NL251.counts.txt NL    25      19665929 0.9592
> NL-28 NL286.counts.txt NL    28      22938039 1.0572
> NL-29 NL291.counts.txt NL    29      20541691 1.0104
>
>>  d.PS.3$sample ###after quantiles
>       files            group Patient lib.size norm.factors
> LS-25 LS252.counts.txt LS    25      23067191 0.8659
> LS-28 LS287.counts.txt LS    28      20684675 0.9656
> LS-29 LS292.counts.txt LS    29      19881245 1.1302
> NL-25 NL251.counts.txt NL    25      19665929 0.8887
> NL-28 NL286.counts.txt NL    28      22938039 1.0885
> NL-29 NL291.counts.txt NL    29      20541691 1.0939
>
> Mayte Suarez-Farinas
> Research Associate, The Rockefeller University
> Biostatistician, The Rockefeller University Hospital
> 1230 York Ave, Box 178,
> New York, NY, 10065
> +1(212) 327-8213
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing