[BioC] Question about Correlation (Limma package)

Gordon Smyth smyth at wehi.edu.au
Fri Apr 8 13:05:14 CEST 2005


At 08:01 PM 8/04/2005, bioconductor-request at stat.math.ethz.ch wrote:
>Date: Thu, 07 Apr 2005 12:28:58 -0400
>From: yzou1971 at netscape.net
>Subject: [BioC] Question about Correlation (Limma package)
>To: bioconductor at stat.math.ethz.ch
>
>Hi,
>
>I have 4 replicate slides. Within each slide, there are 3 replicate spots 
>for each gene. After doing the within-array normalization, I ran 
>lmFit (limma) in two ways and got different correlation values, as follows:
>
>(1)lmFit1
>
> > ADlmFit1 <- lmFit(ADraw.NormWA, ADdesign, ndups=3)
> > ADlmFit1$correlation
>[1] 0.75
>
>
>(2)lmFit2
>
> > cor <- duplicateCorrelation(ADraw.NormWA,ADdesign, ndups=3)
> > cor$consensus.correlation
>[1] 0.5548806
> > ADlmFit2 <- lmFit(ADraw.NormWA,ADdesign, ndups=3, 
> correlation=cor$consensus.correlation)
> > ADlmFit2$correlation
>[1] 0.5548806
>
>Questions:
>
>(1) Why did I get a different correlation value when I ran 
>"duplicateCorrelation" first? For my slides with duplicate spots, should I 
>always calculate the correlation value using "duplicateCorrelation" before 
>running "lmFit"?

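Yes, of course. That is what duplicateCorrelation() is provided for, unless 
you just want to use the default correlation value. Your first fit did not 
supply a correlation, so lmFit() used its default; your second fit used the 
value estimated from your data.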

I prefer to keep the correlation estimation and the linear model fit in 
two separate functions because (i) both steps can be quite time 
consuming and (ii) I want you to look at the correlation value for 
reasonableness before entering it into the linear model fit.
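
The two-step workflow described above can be sketched as follows. This is 
only an illustration under stated assumptions: the data are simulated, and 
the names M, design and dupcor are hypothetical stand-ins for your own 
normalized data and design matrix. It assumes the 3 duplicate spots for each 
gene sit in adjacent rows (spacing=1) and that the 4 slides are plain 
replicates.

```r
library(limma)

set.seed(1)
ngenes  <- 100   # hypothetical number of genes
ndups   <- 3     # duplicate spots per gene within each array
narrays <- 4     # replicate slides

# Simulated log-ratios: a value shared by the duplicates of each gene on
# each array, plus independent spot-level noise. Rows are spots, with the
# duplicates of a gene in adjacent rows.
shared <- matrix(rnorm(ngenes * narrays), ngenes, narrays)
M <- shared[rep(seq_len(ngenes), each = ndups), ] +
     matrix(rnorm(ngenes * ndups * narrays), ngenes * ndups, narrays)

# Replicate slides: a single coefficient (the average log-ratio).
design <- matrix(1, narrays, 1)

# Step 1: estimate the correlation between duplicate spots.
dupcor <- duplicateCorrelation(M, design, ndups = ndups, spacing = 1)

# Step 2: look at the estimate before using it.
print(dupcor$consensus.correlation)

# Step 3: pass the estimate to the linear model fit explicitly.
fit <- lmFit(M, design, ndups = ndups, spacing = 1,
             correlation = dupcor$consensus.correlation)
print(fit$correlation)
```

Keeping the estimate in its own object means it can be inspected, and 
reused in later fits, without being recomputed.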

>Should I use "duplicateCorrelation" only when there are duplicate spots 
>within an array?

Yes, naturally. If there are no duplicates, there is no correlation to 
estimate.

>(2) When I ran "lmFit" only, I always got correlation 0.75. I also ran other 
>slides and got the same correlation 0.75. It looks as if 0.75 is the 
>default correlation value when I don't give a specific value for correlation. 
>Why does "lmFit" always give correlation 0.75 for different data analyses? 
>That does not look reasonable. I'm very confused.

So you've done a lot of experiments and have decided that 0.75 must be the 
default value for 'correlation'. An alternative and less circuitous path 
might have been to read the help page. Typing args(lmFit) or ?lmFit would 
have told you immediately that 0.75 is the default value.
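
For reference, the check mentioned above looks like this. It is a minimal 
sketch: the arguments and defaults that are printed depend on the installed 
limma version, so they may not match the 0.75 default discussed in this 
message.

```r
library(limma)

# Print the argument list of lmFit, including any default values.
args(lmFit)

# Confirm that 'correlation' is one of the formal arguments of lmFit.
has.correlation <- "correlation" %in% names(formals(lmFit))
print(has.correlation)

# The full help page can be opened interactively with:
# ?lmFit
```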

Gordon

>Thanks
>
>yi


