Dear List,

Hi! I am new to this list, so here is a brief introduction: my name is Vishal
and I am a postdoc at Cold Spring Harbor Lab working on ChIP-chip / ChIP-seq
data analysis. My background is in computer algorithms, so pardon me if I
make some errors with biological and statistical terminology.

Here is the problem that I am facing:

1) I have data from NimbleGen tiling arrays. I have 3 biological replicates,
each with 1 technical replicate. There are no dye swaps. Each array carries
duplicate spots. When I reconstructed the images from the data, I saw some
quite bad spots in the red channel, especially for biorep 2. I am sure most
of you have faced this: do you usually include such a replicate in your
analysis, or not? Either way, how do you handle the statistical confidence
of your results?

2) I want to use the duplicate spots on each array in my analysis. At the
moment I normalize, average the duplicate spots, and use the averages as
input to the lmFit() function. After averaging, the correlation between the
reps is better. I guess that is expected, but I am not satisfied with simply
averaging the spots. I believe there is a better way to do this, but I am
not aware of it. I tried the duplicateCorrelation() function in limma, which
gives me a correlation of -0.04, probably because the probe positions are
randomized (even the duplicates are). Can anyone tell me how to use these
duplicate spots in a better way than simply averaging them? I appreciate any
pointers.
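
To illustrate why I think the better correlation after averaging is expected,
here is a small base R simulation. It is only a sketch on made-up data, not my
real arrays: the names signal, spot and noise_sd are invented for the example,
and I assume independent Gaussian noise on every spot.

```r
# Toy simulation (invented data): two reps, each with duplicate spots.
set.seed(1)
n_probes <- 5000
noise_sd <- 1                       # assumed spot-level noise
signal   <- rnorm(n_probes)         # shared "true" log-ratio per probe

# One noisy measurement of the shared signal per spot.
spot <- function() signal + rnorm(n_probes, sd = noise_sd)
rep1 <- cbind(spot1 = spot(), spot2 = spot())
rep2 <- cbind(spot1 = spot(), spot2 = spot())

# Between-rep correlation using a single spot vs. the duplicate average.
cor_single <- cor(rep1[, "spot1"], rep2[, "spot1"])
cor_avg    <- cor(rowMeans(rep1), rowMeans(rep2))

cat("single spot:", round(cor_single, 2), "\n")
cat("averaged   :", round(cor_avg, 2), "\n")
```

Under these assumptions averaging two spots halves the noise variance, so the
expected between-rep correlation rises from about 0.5 to about 0.67.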


Source code for this:

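library(limma)

# rg is the RGList read from the arrays; design is the design matrix.
# Within-array loess (no background correction), then between-array quantile.
ma.loess    <- normalizeWithinArrays(rg, method="loess", bc.method="none")
ma.quantile <- normalizeBetweenArrays(ma.loess, method="quantile")

# Split the duplicate spots into two sets.
ma.spot1.quantile <- ma.quantile[grep("SPOT1",
    ma.quantile$genes$GENE_EXPR_OPTION),]
ma.spot2.quantile <- ma.quantile[grep("SPOT2",
    ma.quantile$genes$GENE_EXPR_OPTION),]

# Sort each object by spot set and position so the rows of the two sets
# line up before averaging.
ma.spot1.quantile <- ma.spot1.quantile[order(
    ma.spot1.quantile$genes$GENE_EXPR_OPTION,
    ma.spot1.quantile$genes$POSITION),]
ma.spot2.quantile <- ma.spot2.quantile[order(
    ma.spot2.quantile$genes$GENE_EXPR_OPTION,
    ma.spot2.quantile$genes$POSITION),]
ma.quantile <- ma.quantile[order(
    ma.quantile$genes$GENE_EXPR_OPTION,
    ma.quantile$genes$POSITION),]

# Average the M-values of the two duplicate spots, row by row.
ma.avr.quantile   <- ma.spot1.quantile
ma.avr.quantile$M <- (ma.spot1.quantile$M + ma.spot2.quantile$M)/2

# Fit on the averaged data and, for comparison, on all spots.
fit.avg <- lmFit(ma.avr.quantile, design)
fit     <- lmFit(ma.quantile, design)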
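
The averaging above relies on the SPOT1 and SPOT2 rows lining up after
sorting. As a sanity check, here is a base R sketch that pairs duplicates by
an explicit key instead. The data frame and its columns (position, spot, M)
are invented stand-ins for ma.quantile$genes and ma.quantile$M, so please
treat it as an illustration only.

```r
# Toy stand-in for one array: each position appears once per spot set,
# in shuffled order (invented values).
probes <- data.frame(
  position = c(300, 100, 200, 200, 300, 100),
  spot     = c("SPOT1", "SPOT1", "SPOT1", "SPOT2", "SPOT2", "SPOT2"),
  M        = c(0.9, 0.1, 0.5, 0.7, 1.1, 0.3)
)

s1 <- probes[probes$spot == "SPOT1", ]
s2 <- probes[probes$spot == "SPOT2", ]

# Pair duplicates by position rather than by row order.
idx <- match(s1$position, s2$position)
stopifnot(!anyNA(idx))              # every SPOT1 probe has a SPOT2 partner

# Average each SPOT1 value with its matched SPOT2 value.
avg <- data.frame(position = s1$position,
                  M        = (s1$M + s2$M[idx]) / 2)
avg <- avg[order(avg$position), ]
print(avg)
```

If the stopifnot() check fails on real data, the two spot sets do not contain
the same probes and a plain row-by-row average would mix different probes.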

--------------------------------
The duplicateCorrelation() call in limma was as follows:

biolrep=c(1,1,2,2)
corfit.avr=duplicateCorrelation(ma.avr.quantile, ndups=2, block=biolrep)
--------------------------------

This did not work: I got a negative correlation of -0.04.

I appreciate your time and help.

Sincerely,

Vishal

PS: Thank you, Gordon Smyth, for writing limma. I think it's a really great
tool to have and I am very appreciative of it.
