[R] A basic statistics question
(Ted Harding)
Ted.Harding at wlandres.net
Tue Aug 12 23:32:32 CEST 2014
On 12-Aug-2014 19:57:29 Ron Michael wrote:
> Hi,
>
> I would need to get a clarification on a quite fundamental statistics
> property, hope expeRts here would not mind if I post that here.
>
> I learnt that the variance-covariance matrix of the standardized data is
> equal to the correlation matrix for the unstandardized data. So I used the
> following data.
>
> Data <- structure(c(7L, 5L, 9L, 7L, 8L, 7L, 6L, 6L, 5L, 7L, 8L, 6L, 7L, 7L,
> 6L, 7L, 7L, 6L, 8L, 6L, 7L, 7L, 7L, 8L, 7L, 9L, 8L, 7L, 7L, 0L, 10L, 10L,
> 10L, 7L, 6L, 8L, 5L, 5L, 6L, 6L, 7L, 11L, 9L, 10L, 0L, 13L, 13L, 10L, 7L,
> 7L, 7L, 10L, 7L, 5L, 8L, 7L, 10L, 10L, 10L, 6L, 7L, 6L, 6L, 8L, 8L, 7L, 7L,
> 7L, 7L, 8L, 7L, 8L, 6L, 6L, 8L, 7L, 4L, 7L, 7L, 10L, 10L, 6L, 7L, 7L, 12L,
> 12L, 8L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 5L, 4L, 5L, 5L, 5L, 6L,
> 7L, 5L, 7L, 5L, 7L, 7L, 7L, 7L, 8L, 7L, 6L, 7L, 7L, 6L, 7L, 7L, 6L, 4L, 4L,
> 6L, 6L, 7L, 8L, 7L, 11L, 10L, 8L, 7L, 6L, 6L, 11L, 5L, 4L, 6L, 6L, 6L, 7L,
> 8L, 7L, 12L, 4L, 4L, 2L, 5L, 6L, 7L, 6L, 6L, 5L, 6L, 5L, 7L, 7L, 7L, 6L, 5L,
> 6L, 6L, 5L, 5L, 6L, 6L, 4L, 4L, 5L, 10L, 10L, 7L, 7L, 6L, 4L, 6L, 10L, 7L,
> 4L, 6L, 6L, 6L, 8L, 8L, 8L, 7L, 8L, 9L, 10L, 7L, 6L, 6L, 8L, 6L, 8L, 3L,
> 3L, 4L, 5L, 5L, 6L, 5L, 5L, 6L, 4L, 8L, 7L, 3L, 5L, 6L, 9L, 8L, 9L, 10L, 8L,
> 9L, 8L, 9L, 8L, 8L, 9L, 11L, 10L, 9L, 9L, 13L,
> 13L, 10L, 7L, 7L, 7L, 9L, 8L, 7L, 6L, 10L, 8L, 7L, 8L, 8L, 3L, 4L, 3L, 7L,
> 6L, 6L, 6L, 6L, 5L, 6L, 6L, 6L, 2L, 5L, 7L, 9L, 8L, 9L, 10L, 8L, 8L, 9L, 9L,
> 11L, 11L, 11L, 10L, 9L, 9L, 11L, 2L, 3L, 2L, 2L, 2L, 1L, 4L, 4L, 2L, 2L, 1L,
> 1L, 1L, 3L, 3L, 4L, 6L, 4L, 5L, 2L, 3L, 5L, 4L, 4L, 2L, 4L, 4L, 5L, 4L, 2L,
> 7L, 3L, 3L, 10L, 13L, 11L, 9L, 9L, 7L, 8L, 9L, 6L, 7L, 6L, 5L, 3L, 13L, 3L,
> 3L, 0L, 1L, 4L, 5L, 3L, 3L, 0L, 2L, 20L, 3L, 2L, 6L, 5L, 5L, 5L, 2L, 2L,
> 5L, 5L, 5L, 4L, 3L, 4L, 4L, 3L, 4L, 10L, 10L, 9L, 8L, 4L, 4L, 8L, 7L, 10L,
> 3L, 1L, 9L, 5L, 11L, 9L), .Dim = c(45L, 8L), .Dimnames = list(NULL, c("V1",
> "V7", "V13", "V19", "V25", "V31", "V37", "V43")))
>
> ____
> Data_Normalized <- apply(Data, 2, function(x) return((x - mean(x))/sd(x)))
>
> (t(Data_Normalized) %*% Data_Normalized)/dim(Data_Normalized)[1]
>
>
>
> The point is that I am not getting the exact correlation matrix. Can
> somebody point out what I am missing here?
>
> Thanks for your pointer.
Try:
Data_Normalized <- apply(Data, 2, function(x) return((x - mean(x))/sd(x)))
(t(Data_Normalized) %*% Data_Normalized)/(dim(Data_Normalized)[1]-1)
and compare the result with
cor(Data)
And why? Look at
?sd
and note that:
Details:
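Like 'var' this uses denominator n - 1.
So each standardized column has sum of squares n - 1, not n, and
dividing the cross-product by n gives (n - 1)/n times the correlation
matrix (here 44/45, since Data has 45 rows).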
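For anyone wanting to check this without pasting the full data set, here is a
minimal sketch on a small made-up matrix. X and the names Z, by_n and by_nm1
are purely illustrative and are not the poster's Data. It standardizes the
columns the same way as above and compares both denominators against cor().

```r
# Hypothetical toy data (NOT the poster's Data): 6 observations, 3 variables
X <- matrix(c(2, 4, 4, 5, 7, 8,
              1, 3, 2, 6, 5, 9,
              9, 7, 8, 4, 3, 1), ncol = 3)
n <- nrow(X)

# Standardize each column; sd() uses denominator n - 1
Z <- apply(X, 2, function(x) (x - mean(x)) / sd(x))

by_n   <- (t(Z) %*% Z) / n        # denominator used in the question
by_nm1 <- (t(Z) %*% Z) / (n - 1)  # denominator consistent with sd()

# Dividing by n - 1 should reproduce the correlation matrix
print(all.equal(by_nm1, cor(X), check.attributes = FALSE))

# Dividing by n should be off by the constant factor (n - 1)/n
print(all.equal(by_n, cor(X) * (n - 1) / n, check.attributes = FALSE))

# cov() also uses n - 1, so it should agree on the standardized data directly
print(all.equal(cov(Z), cor(X), check.attributes = FALSE))
```

The same comparison on the real data is cov(Data_Normalized) against
cor(Data), which avoids writing the denominator by hand.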
Hoping this helps,
Ted.
-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 12-Aug-2014 Time: 22:32:26
This message was sent by XFMail