[BioC] duplicateCorrelation and design matrix
Carolyn Fitzsimmons
Carolyn.Fitzsimmons at imbim.uu.se
Sun Jul 3 12:13:29 CEST 2005
Hi Gordon, thanks for your reply. I have a few more questions:
Quoting Gordon K Smyth <smyth at wehi.EDU.AU>:
> > Date: Thu, 30 Jun 2005 11:44:02 +0000
> > From: Carolyn Fitzsimmons <Carolyn.Fitzsimmons at imbim.uu.se>
> > Subject: [BioC] duplicateCorrelation and design matrix
> > To: Bioconductor list <bioconductor at stat.math.ethz.ch>
> >
> > Hello,
> >
> > I need an explanation of how the design matrix influences the consensus
> > correlation of the duplicateCorrelation function when accounting for
> > technical replicates. Here is my specific example:
> >
> > Design matrix:
> >> design
> > RJf RJm WLf WLm
> > 1 0 0 0 1
> > 2 0 0 0 1
> > 3 0 0 0 1
> > 4 0 0 0 1
> > 5 0 0 0 1
> > 6 0 0 0 1
> > 7 0 0 0 1
> > 8 0 0 0 1
> > 9 0 0 1 0
> > 10 0 0 1 0
> > 11 0 0 1 0
> > 12 0 0 1 0
> > 13 0 0 1 0
> > 14 0 0 1 0
> > 15 0 0 1 0
> > 16 0 0 1 0
> > 17 0 1 0 0
> > 18 0 1 0 0
> > 19 0 1 0 0
> > 20 0 1 0 0
> > 21 0 1 0 0
> > 22 0 1 0 0
> > 23 0 1 0 0
> > 24 0 1 0 0
> > 25 1 0 0 0
> > 26 1 0 0 0
> > 27 1 0 0 0
> > 28 1 0 0 0
> > 29 1 0 0 0
> > 30 1 0 0 0
> > 31 1 0 0 0
> > 32 1 0 0 0
> > #
> > Each second slide is a replicate of the first (e.g. 1 and 2 are replicates,
> > then 3 and 4, etc.). There are also 4 groups that I want to compare, with 4
> > individuals in each group (each duplicated). So I continue with
> > duplicateCorrelation:
> > #
> >> cor <- duplicateCorrelation(Mmatrix_ny, design=design,
> > +   block=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15,16,16))
> >> cor$cor
> > [1] -0.03060575
> > #
> > which is a very low correlation, so, following the limma user guide, I
> > should probably just treat the technical replicates as biological
> > replicates. But in another comparison I want to put all the arrays into 2
> > groups; see the design matrix:
> >> designWLRJ
> > RJ WL
> > 1 0 1
> > 2 0 1
> > 3 0 1
> > 4 0 1
> > 5 0 1
> > 6 0 1
> > 7 0 1
> > 8 0 1
> > 9 0 1
> > 10 0 1
> > 11 0 1
> > 12 0 1
> > 13 0 1
> > 14 0 1
> > 15 0 1
> > 16 0 1
> > 17 1 0
> > 18 1 0
> > 19 1 0
> > 20 1 0
> > 21 1 0
> > 22 1 0
> > 23 1 0
> > 24 1 0
> > 25 1 0
> > 26 1 0
> > 27 1 0
> > 28 1 0
> > 29 1 0
> > 30 1 0
> > 31 1 0
> > 32 1 0
> > #
> > and then I run duplicateCorrelation again and get a different correlation.
> > #
> >> corWLRJ <- duplicateCorrelation(Mmatrix_ny, design=designWLRJ,
> > +   block=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15,16,16))
> >> corWLRJ$cor
> > [1] 0.01745252
> > #
> > Moreover, when I compute the consensus correlation without using a design
> > matrix I get 0.1073055. I know from looking through previous posts, and
> > with a lot of help from Johan L., that the way the blocking is set up and
> > the use of the design matrix in these situations is correct.
>
> You've used three different non-equivalent design matrices. No more than one
> of these can be correct.
But if I need to group the individuals differently to test for differential
expression between different groupings of individuals (i.e. between
WLm/WLf/RJm/RJf and between WL/RJ), the use of 2 different design matrices in
the duplicateCorrelation function is warranted, yes?
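To make the two groupings concrete, here is a minimal base-R sketch of how the two design matrices and the block vector above could be built. The object names follow my earlier post; using model.matrix and rep is only my assumption about a convenient way to construct them, not something taken from the limma documentation. It shows that the 2-group design is just the 4-group design with its columns summed pairwise.

```r
# Arrays 1-8 are WLm, 9-16 WLf, 17-24 RJm, 25-32 RJf (as in the design above).
group <- factor(rep(c("WLm", "WLf", "RJm", "RJf"), each = 8),
                levels = c("RJf", "RJm", "WLf", "WLm"))

# 4-group design: one indicator column per group.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# 2-group design: each breed column is the sum of its two sex columns.
designWLRJ <- cbind(RJ = design[, "RJf"] + design[, "RJm"],
                    WL = design[, "WLf"] + design[, "WLm"])

# Consecutive pairs of arrays are technical replicates of one individual.
block <- rep(1:16, each = 2)
```

So every fit that is possible with designWLRJ is also possible with design, but not the other way round.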
>
> > So how is the consensus correlation actually
> > being calculated in the above situations? (In loose mathematical terms if
> > possible, as you can probably tell from my question.)
>
> In loose terms the correlation measures the variability between blocks
> relative to the variation within blocks. Over-simplifying the design matrix
> will increase the between-blocks variation, because it will now reflect
> differences between your treatments as well as differences between
> biological replicates. Hence the estimated correlation increases.
>
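To check my understanding of the paragraph above, here is a small base-R simulation for a single made-up gene. It is only a sketch: the effect sizes and standard deviations are invented, and anova_icc is a hypothetical helper that uses a simple ANOVA-style intraclass correlation, not the actual code inside duplicateCorrelation. It assumes two arrays per block, blocks numbered in array order, and a design that is constant within each block.

```r
# Base-R sketch only; this is NOT limma's duplicateCorrelation() code.
set.seed(1)

n_blocks <- 16
block <- rep(seq_len(n_blocks), each = 2)   # arrays 1,2 -> block 1; 3,4 -> block 2
group <- factor(rep(c("WLm", "WLf", "RJm", "RJf"), each = 8))
breed <- factor(substr(as.character(group), 1, 2))   # "WL" or "RJ"

# Hypothetical group means for one gene: females differ from males by 3.
group_mean <- c(WLm = 0, WLf = 3, RJm = 0, RJf = 3)

y <- unname(group_mean[as.character(group)]) +
  rnorm(n_blocks, sd = 0.5)[block] +   # biological (between-block) variation
  rnorm(length(block), sd = 1)         # technical (within-block) variation

design_full   <- model.matrix(~ 0 + group)   # 4 groups
design_simple <- model.matrix(~ 0 + breed)   # 2 groups, sex ignored

# Hypothetical helper: ANOVA-style intraclass correlation for blocks of size 2.
anova_icc <- function(y, design, block) {
  k <- 2                                        # arrays per block
  X <- design[!duplicated(block), , drop = FALSE]   # one design row per block
  block_means <- as.numeric(tapply(y, block, mean))
  res <- lm.fit(X, block_means)$residuals       # block means after removing design effects
  msb <- k * sum(res^2) / (length(block_means) - ncol(X))   # between-block mean square
  d <- as.numeric(tapply(y, block, diff))       # within-block differences
  msw <- sum(d^2 / 2) / length(d)               # within-block mean square
  (msb - msw) / (msb + (k - 1) * msw)
}

icc_full   <- anova_icc(y, design_full, block)
icc_simple <- anova_icc(y, design_simple, block)
print(c(full = icc_full, simple = icc_simple))
```

The within-block mean square is identical for both designs, because the design does not vary inside a block. Only the between-block mean square changes: with the 2-group design the male/female difference is left in the block means, so the estimated correlation comes out larger.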
Okay. Now I believe I understand how it is calculated. When you use a design
matrix here you create blocks, then the blocking argument creates blocks within
blocks. (Correct me if this is wrong).
Best Regards, Carolyn