[BioC] duplicateCorrelation
Gordon Smyth
smyth at wehi.edu.au
Fri Nov 18 05:54:55 CET 2005
Dear Devin,
There are a couple of problems. Firstly, you've told us that your
replicates are 112 spots apart, but you haven't told limma this. So the
software is assuming that the replicates are side-by-side, which is the
default. You need instead:
> cor <- duplicateCorrelation(MA, design, ndups=3, spacing=112)
Secondly, two arrays is pretty minimal to estimate duplicate correlations.
The help page for duplicateCorrelation says:
    For this function to return statistically useful results, there
    must be at least two more arrays than the number of coefficients
    to be estimated, i.e., two more than the column rank of 'design'.
Your design has a single column, so its column rank is 1. Hence you need at
least 3 arrays to have confidence in your results, whereas you have only two.
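As a rough sketch of that arithmetic, the check below uses only base R and the design matrix from the original post. The variable names (n_arrays, n_coef, min_arrays, enough) are just illustrative, and I am assuming one row of the design per array:

```r
# Design from the original post: one coefficient, two dye-swapped arrays.
design <- cbind(c(1, -1))

n_arrays   <- nrow(design)       # assumes one row of the design per array
n_coef     <- qr(design)$rank    # column rank of the design matrix
min_arrays <- n_coef + 2         # "two more than the column rank"

# TRUE only if there are enough arrays by the help page's criterion.
enough <- n_arrays >= min_arrays
cat("arrays:", n_arrays, " needed:", min_arrays, " enough:", enough, "\n")
```

The same few lines can be rerun once more arrays are added to the design.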
If you want to check that duplicateCorrelation() is getting the right
input, the best way is to check that your replicates really are at the
spacing you think they are. Your data files (ScanArray?) almost certainly
contain a gene ID column. Let's assume this column is called "ID". Use
> RG <- read.maimages(..., annotation="ID")
Then
> unwrapdups(MA$genes$ID, ndups=3, spacing=112)
is a matrix which should have three identical columns. Does it?
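To show what "three identical columns" looks like, here is a small base-R sketch on made-up IDs. It uses a toy layout (ndups=3, spacing=4, not your real 112) and a hand-written helper, unwrap_sketch(), which is only my stand-in for the reshaping I would expect from unwrapdups() on a single vector; it is not limma's code. The helper all_cols_identical() is likewise hypothetical.

```r
# Stand-in helper (hypothetical, not from limma): put the ndups replicates
# of each probe, printed 'spacing' spots apart, side by side in one row.
unwrap_sketch <- function(x, ndups, spacing) {
  ngroups <- length(x) / (ndups * spacing)
  stopifnot(ngroups == round(ngroups))
  a <- array(x, dim = c(spacing, ndups, ngroups))
  a <- aperm(a, c(1, 3, 2))            # reorder to spot-in-block, block, replicate
  matrix(a, nrow = spacing * ngroups, ncol = ndups)
}

# Toy ID column: two blocks, each holding 4 probes printed 3 times.
ids <- c(rep(paste0("g", 1:4), times = 3),
         rep(paste0("g", 5:8), times = 3))

# TRUE if every row holds the same ID in all columns.
all_cols_identical <- function(m) all(m == m[, 1])

m_ok  <- unwrap_sketch(ids, ndups = 3, spacing = 4)  # correct spacing
m_bad <- unwrap_sketch(ids, ndups = 3, spacing = 1)  # side-by-side default

all_cols_identical(m_ok)   # columns agree when the spacing is right
all_cols_identical(m_bad)  # columns disagree when the spacing is wrong
```

If the same kind of check on your real ID column fails at spacing=112, the replicates are not laid out the way you think they are.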
Best wishes
Gordon
>[BioC] duplicateCorrelation
>Devin Scannell scannedr at tcd.ie
>Fri Nov 18 02:03:07 CET 2005
>
>Hi,
>
>this is not a very interesting question but it has given me enough
>trouble to get me to mail the list so I hope somebody has time to
>reply.
>
>I have several two-colour arrays to analyze. Each probe is present
>three times on each chip and they are spaced 112 spots apart (not my
>decision). The consensus correlation returned by duplicateCorrelation
>is typically around zero which is surprising since the spots are close
>together and the data looks good in MA plots (even before
>normalization). A histogram of the individual correlations
>(cor$all.correlations from duplicateCorrelation) supports the
>conclusion that the within-chip replicates are poorly correlated.
>
>I am concerned that the numbers that are being handed to
>duplicateCorrelation are incorrect somehow but I am not sure what I am
>doing wrong (code below). I have looked at the code for
>duplicateCorrelation and cannot follow it so I was wondering if anyone
>can suggest a way to verify the correlations it is calculating. Ideally
>I would like to be able to select a specific gene, calculate the
>correlation between replicates myself and verify that this is the same
>as I obtain from duplicateCorrelation.
>
>Thanks in advance,
>Devin
>
>library(limma)
>
>targets <- readTargets()
>
>targets
>   SlideNumber     Name FileName  Cy3  Cy5
>13          13 60H_9:12   13.csv  WT1 60H1
>17          17 60H_12:9   17.csv 60H1  WT1
>
>flag.check <- function(x) as.numeric(x$Flags >= 3)
>RG <- read.maimages(targets$FileName, sep=",",
>    columns=list(Rf="Ch1 Median", Gf="Ch2 Median",
>                 Rb="Ch1 B Median", Gb="Ch2 B Median"),
>    wt.fun=flag.check)
>
>RG$genes <- readGAL()
>RG$printer <- getLayout(RG$genes)
>
>RG.bgc <- backgroundCorrect(RG, method="normexp", offset=50)
>MA <- normalizeWithinArrays(RG.bgc, method="loess")
>
>design <- cbind(c(1,-1))
>cor <- duplicateCorrelation(MA, design, ndups=3)