[Bioc-sig-seq] Normalization in RNA-seq

Mayte Suarez-Farinas farinam at mail.rockefeller.edu
Wed Nov 24 19:38:34 CET 2010


Dear All,
I have been working with paired samples from 3 subjects using the
edgeR package, and I am puzzled by the results of my normalization.
After normalization, the data are skewed towards the LS group and, as
a result, I get many more up-regulated than down-regulated genes. We
have studied this disease extensively in large samples with
microarrays and this is not the case there, so now I am suspicious of
my normalization.
I am including the code and a PDF with the smear plots for each of the
normalization options in edgeR. With all of them the data look worse
after normalization than before.
If someone could look at what I did and point out any mistake, I would
really appreciate it.
I don't know whether the problem is that I am deleting the unmapped
reads before normalization; I was instructed to do so on the
SEQanswers forum.


## Reading Files
files <- dir(pattern="\\.counts\\.txt$")
files.pheno <- data.frame(files=files,
                          group=factor(substr(files,1,2),levels=c("NL","LS")),
                          Patient=factor(substr(files,3,4)))
PScounts <- readDGE(files.pheno)
colnames(PScounts) <- paste(PScounts$samples$group,PScounts$samples$Patient,sep='-')

## delete unmapped reads
unmapped <- c('no_feature','ambiguous','not aligned','too low aQual')
## logical indexing is safe even if none of these row names are present
PScounts <- PScounts[!(rownames(PScounts$counts) %in% unmapped),]

#Calculate Normalizations
d.PS <- calcNormFactors(PScounts)
pdf('Normalization Plots.pdf',height=10,width=10)
layout(matrix(1:4,2,2,byrow=TRUE))
a<-plotSmear(PScounts, panel.first=grid(),smooth.scatter=FALSE,main='before normalization')
ma.plot(a$A,a$M,plot.method='add',cex=0)
b<-plotSmear(d.PS, panel.first=grid(),smooth.scatter=FALSE,main='after TMM')
ma.plot(b$A,b$M,plot.method='add',cex=0)
rm(b)
d.PS.2 <- calcNormFactors(PScounts,method='RLE')
b<-plotSmear(d.PS.2, panel.first=grid(),smooth.scatter=FALSE,main='after RLE')
ma.plot(b$A,b$M,plot.method='add',cex=0)
rm(b)
d.PS.3 <- calcNormFactors(PScounts,method='quantile')
b<-plotSmear(d.PS.3, panel.first=grid(),smooth.scatter=FALSE,main='after quantile')
ma.plot(b$A,b$M,plot.method='add',cex=0)
rm(b)
dev.off()
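
For readers unfamiliar with smear (MA) plots, the quantities on the two axes can be sketched in base R as follows. This is only an illustration of the usual definition of M (log-ratio) and A (average log-abundance) for two libraries; it is not edgeR's actual plotSmear implementation, which works on groups and treats zero counts specially. The function name ma_values and the toy counts are made up for this example; the library sizes and TMM factors are those of LS-25 and NL-25 in the output further down.

```r
# M and A values for two libraries, given counts and effective library sizes.
# ma_values is a hypothetical helper, not part of edgeR.
ma_values <- function(y1, y2, N1, N2) {
  p1 <- y1 / N1                     # relative abundance in library 1
  p2 <- y2 / N2                     # relative abundance in library 2
  list(M = log2(p1 / p2),           # log fold change, library 1 vs library 2
       A = (log2(p1) + log2(p2)) / 2)  # average log abundance
}

# Toy counts (invented for illustration only)
y_LS <- c(geneA = 120, geneB = 30, geneC = 600)
y_NL <- c(geneA = 100, geneB = 60, geneC = 580)

# Effective library size = lib.size * norm.factors
N_LS <- 23067191 * 0.9085   # LS-25, TMM
N_NL <- 19665929 * 1.0129   # NL-25, TMM

ma <- ma_values(y_LS, y_NL, N_LS, N_NL)
print(ma$M)
print(ma$A)
```

Because the normalization factor enters through the effective library size, changing a factor moves every M value for that pair of libraries up or down by the same constant.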

 > d.PS$sample ###(after TMM)
                  files group Patient lib.size norm.factors
LS-25 LS252.counts.txt    LS      25 23067191       0.9085
LS-28 LS287.counts.txt    LS      28 20684675       0.9056
LS-29 LS292.counts.txt    LS      29 19881245       0.9965
NL-25 NL251.counts.txt    NL      25 19665929       1.0129
NL-28 NL286.counts.txt    NL      28 22938039       1.1554
NL-29 NL291.counts.txt    NL      29 20541691       1.0422
 >
 > d.PS.2$sample  ###after RLE
                  files group Patient lib.size norm.factors
LS-25 LS252.counts.txt    LS      25 23067191       0.9495
LS-28 LS287.counts.txt    LS      28 20684675       0.9898
LS-29 LS292.counts.txt    LS      29 19881245       1.0385
NL-25 NL251.counts.txt    NL      25 19665929       0.9592
NL-28 NL286.counts.txt    NL      28 22938039       1.0572
NL-29 NL291.counts.txt    NL      29 20541691       1.0104

 > d.PS.3$sample   ###after quantiles
                  files group Patient lib.size norm.factors
LS-25 LS252.counts.txt    LS      25 23067191       0.8659
LS-28 LS287.counts.txt    LS      28 20684675       0.9656
LS-29 LS292.counts.txt    LS      29 19881245       1.1302
NL-25 NL251.counts.txt    NL      25 19665929       0.8887
NL-28 NL286.counts.txt    NL      28 22938039       1.0885
NL-29 NL291.counts.txt    NL      29 20541691       1.0939
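
As an aid to reading the tables above: edgeR uses norm.factors as multipliers on lib.size, so the effective library size is lib.size * norm.factors, and calcNormFactors scales the factors so that they multiply to 1. The base-R sketch below simply re-enters the TMM values printed above and summarizes them by group; the object names tmm and group_gm are made up for this example, and nothing here depends on edgeR being installed.

```r
# TMM output copied from d.PS$samples above
tmm <- data.frame(
  sample       = c("LS-25", "LS-28", "LS-29", "NL-25", "NL-28", "NL-29"),
  group        = c("LS", "LS", "LS", "NL", "NL", "NL"),
  lib.size     = c(23067191, 20684675, 19881245, 19665929, 22938039, 20541691),
  norm.factors = c(0.9085, 0.9056, 0.9965, 1.0129, 1.1554, 1.0422),
  stringsAsFactors = FALSE
)

# Effective library size used in downstream modelling
tmm$eff.lib.size <- tmm$lib.size * tmm$norm.factors

# The factors should multiply to (approximately) 1
print(prod(tmm$norm.factors))

# Geometric mean of the factors within each group
group_gm <- tapply(log(tmm$norm.factors), tmm$group,
                   function(x) exp(mean(x)))
print(group_gm)
print(tmm[, c("sample", "lib.size", "norm.factors", "eff.lib.size")])
```

With the TMM values, all three LS factors are below 1 and all three NL factors are above 1, so the normalization shifts the two groups in opposite directions.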



Mayte Suarez-Farinas
Research Associate, The Rockefeller University
Biostatistician, The Rockefeller University Hospital
1230 York Ave, Box 178,
New York, NY, 10065
+1(212) 327-8213





-------------- next part --------------
A non-text attachment was scrubbed...
Name: Normalization Plots.pdf
Type: application/pdf
Size: 10856331 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioc-sig-sequencing/attachments/20101124/5a1869c4/attachment-0001.pdf>