[R] Pooled Covariance Matrix
Murray Jorgensen
maj at waikato.ac.nz
Wed Sep 20 23:13:38 CEST 2006
Thank you, Professor Ripley.

Murray Jorgensen

Prof Brian Ripley wrote:
> On Wed, 20 Sep 2006, Murray Jorgensen wrote:
>
>> I am in a discriminant analysis situation with a frame containing
>> several variables and a grouping factor, if you like:
>>
>> set.seed(200906)
>> exampledf <- as.data.frame(matrix(rnorm(50,5,2),nrow=10,ncol=5))
>> exampledf$Group <- factor(rep(c(1,2,3),c(3,3,4)))
>> exampledf
>>
>> I'm sure there must be a simple way to get the within group pooled
>> covariance matrix but I haven't found it yet.
>
> There are two versions of this, weighted and unweighted, and the
> difference caused confusion in the early discriminant analysis
> literature. (See MASS4 p.333.) The weighted version is conventional.
>
> Suppose you have a matrix X and a grouping factor g. Then either of
>
> group.means <- rowsum(X, g)/as.vector(table(g))
> group.means <- tapply(X, list(rep(g, ncol(X)), col(X)), mean)
>
> gives the group means, and var(X - group.means[g,]) seems to be what you
> want.
>
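
For readers following along, here is a minimal sketch of the suggestion above applied to the example frame from the original question. The names X, g, centred, n, k, S.var, S.pooled and S.check are introduced here for illustration and do not appear in the original posts. One point worth keeping in mind: var() divides by n - 1, whereas the usual pooled within-group estimate divides by n - k (with k the number of groups), so the two differ by the constant factor (n - 1)/(n - k).

```r
# Rebuild the example data from the original question
set.seed(200906)
exampledf <- as.data.frame(matrix(rnorm(50, 5, 2), nrow = 10, ncol = 5))
exampledf$Group <- factor(rep(c(1, 2, 3), c(3, 3, 4)))

# Split into a numeric matrix X and the grouping factor g
X <- as.matrix(exampledf[, 1:5])
g <- exampledf$Group

# Group means, one row per level of g (rows come back in level order)
group.means <- rowsum(X, g) / as.vector(table(g))

# Centre each row on its own group's mean; as.integer(g) makes the
# reliance on the factor's level order explicit
centred <- X - group.means[as.integer(g), ]

n <- nrow(X)      # number of observations
k <- nlevels(g)   # number of groups

# var() on the centred data divides by n - 1 ...
S.var <- var(centred)

# ... whereas the usual pooled estimate divides by n - k
S.pooled <- crossprod(centred) / (n - k)

# Cross-check: sum the per-group covariances weighted by their
# degrees of freedom, then divide by n - k
S.check <- Reduce(`+`, lapply(split(as.data.frame(X), g),
                              function(d) (nrow(d) - 1) * var(d))) / (n - k)

# Both comparisons should report TRUE
all.equal(S.check, S.pooled)
all.equal(S.pooled, S.var * (n - 1) / (n - k))
```

If the n - 1 divisor is acceptable, var(centred) alone is enough; rescaling by (n - 1)/(n - k) recovers the n - k version.
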
>> I started thinking that one might begin by forming a frame with the same
>> dimensions but containing the group means. But then I found a thread
>> from two years back called "Getting the groupmean for each person" which
>> seemed to imply that doing this was a bit subtle even for ncol=1. Hence
>> I will risk a question to the list.
>
> That thread seems to be about efficiency for very large matrices in the R
> of two years ago.
>
--
Dr Murray Jorgensen http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: maj at waikato.ac.nz Fax 7 838 4155
Phone +64 7 838 4773 wk Home +64 7 825 0441 Mobile 021 1395 862