[BioC] intraspotCorrelation vs duplicateCorrelation

Gordon K Smyth smyth at wehi.EDU.AU
Sat Feb 26 04:35:47 CET 2011


Dear Naomi,

duplicateCorrelation() calls mixedModel2Fit() (in the statmod package), 
which in turn calls glmgam.fit().  glmgam.fit() implements a 
Levenberg-Marquardt modification to the Fisher scoring algorithm to 
prevent divergence, so it is extremely reliable, much more so than a call 
to glm() would be.  On the other hand, intraspotCorrelation() calls 
remlscore() (also in the statmod package), which does ordinary Fisher 
scoring for a related model without any modification to prevent 
divergence.  So it is entirely possible that duplicateCorrelation() will 
work for a dataset for which intraspotCorrelation() does not.

However, to get a NULL result from intraspotCorrelation(), the model fit 
would have to fail for every single gene in your dataset.  Just failing 
for one or two would not cause a problem.

Best wishes
Gordon

> Date: Thu, 24 Feb 2011 20:20:12 -0500
> From: Naomi Altman <naomi at stat.psu.edu>
> To: bioconductor at r-project.org
> Subject: [BioC] intraspotCorrelation vs duplicateCorrelation
> Message-ID: <6.2.3.4.1.20110224200727.01ffea98 at imap.stat.psu.edu>
> Content-Type: text/plain; charset="us-ascii"; format=flowed
>
> I was having problems with intraspotCorrelation for a single channel
> analysis, so I decided to reorganize the data to use
> duplicateCorrelation.  I thought I would get the same answer, but I
> did not.  I hope someone can tell me why.  (As you will see below, I
> like the answer with duplicateCorrelation better!)
>
> MAp is the normalized 2-channel data.
> I have a targets list called targetp with the treatment
> information.  The code is below.
>
> ##############################################################
> # separate channel approach
> #############################################################
> targetSingle <- targetsA2C(targetp)
> rep=c(0,0,1,1,0,0,1,1,0,0,1,1)
> designp=model.matrix(~-1+targetSingle$Target+rep,levels=unique(c(cy3,cy5)))
> corfit =intraspotCorrelation(MAp, designp)
> corfit$cor
> NULL
> ##########################################################
> # the warning was
> #In remlscore(y, X, Z) : reml: Max iterations exceeded
> ##########################################################
>
> ############################################################
> # approach treating channels as separate arrays
> ############################################################
> RGp=RG.MA(MAp)
> Rp=log2(RGp$R)
> Gp=log2(RGp$G)
> exprP=cbind(Rp[,1],Gp[,1],Rp[,2],Gp[,2],Rp[,3],Gp[,3],Rp[,4],Gp[,4],Rp[,5],Gp[,5],Rp[,6],Gp[,6])
> corfitP = duplicateCorrelation(exprP, designp,  block =
> factor(c(1,1,2,2,3,3,4,4,5,5,6,6)))
> corfitP$cor
> [1] 0.4921254
>
> sessionInfo()
> R version 2.11.1 (2010-05-31)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> States.1252    LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C                           LC_TIME=English_United
> States.1252
>
> attached base packages:
> [1] tools     stats     graphics  grDevices
> utils     datasets  methods   base
>
> other attached packages:
> [1] statmod_1.4.8    rat2302cdf_2.6.0
> limma_3.4.5      affy_1.26.1      Biobase_2.8.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.16.0         preprocessCore_1.10.0
>
>
> Thanks,
> Naomi

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list