[R] A basic statistics question
(Ted Harding)
Ted.Harding at wlandres.net
Wed Aug 13 00:22:13 CEST 2014
On 12-Aug-2014 21:41:52 Rolf Turner wrote:
> On 13/08/14 07:57, Ron Michael wrote:
>> Hi,
>>
>> I would need to get a clarification on a quite fundamental statistics
>> property, hope expeRts here would not mind if I post that here.
>>
>> I learnt that the variance-covariance matrix of the standardized data is
>> equal to the correlation matrix for the unstandardized data. So I used the
>> following data.
>
> <SNIP>
>
>> (t(Data_Normalized) %*% Data_Normalized)/dim(Data_Normalized)[1]
>>
>> The point is that I am not getting the exact CORR matrix. Can somebody
>> point out what I am missing here?
>
> You are using a denominator of "n" in calculating your "covariance"
> matrix for your normalized data. But these data were normalized using
> the sd() function which (correctly) uses a denominator of n-1 so as to
> obtain an unbiased estimator of the population standard deviation.
>
> If you calculated
>
> (t(Data_Normalized) %*% Data_Normalized)/(dim(Data_Normalized)[1]-1)
>
> then you would get the same result as you get from cor(Data) (to within
> about 1e-15).
>
> cheers,
> Rolf Turner
One could argue about "(correctly)"!
From the "descriptive statistics" point of view, if one is given a single
number x, then this dataset has no variation, so one could say that
sd(x) = 0. And this is what one would get with a denominator of "n".
But if the single value x is viewed as sampled from a distribution
(with positive dispersion), then the value of x gives no information
about the SD of the distribution. If you use denominator (n-1) then
sd(x) = NA, i.e. is indeterminate (as it should be in this application).
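A minimal sketch of the single-value case in R. The helper sd_n() is a hypothetical name introduced here for the denominator-n version (base R has no such function), and the value 3.7 is arbitrary:

```r
x <- 3.7   # a single, arbitrary observation

# Hypothetical helper: "descriptive" SD with denominator n (not part of base R)
sd_n <- function(v) sqrt(mean((v - mean(v))^2))

res_n   <- sd_n(x)   # 0: a single number shows no variation
res_nm1 <- sd(x)     # NA: R's sd() divides by n-1, which is 0 here

print(res_n)
print(res_nm1)
```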
The important thing when using pre-programmed functions is to know
which denominator is being used. R uses (n-1), and this can be found by
looking at
  ?sd
or (with more detail) at
  ?cor
Ron had assumed that the denominator was n, apparently not being aware
that R uses (n-1).
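Since Ron's data were snipped from the thread, here is a sketch of the effect with made-up data (the seed and the 10 x 4 dimensions are assumptions, chosen only for illustration). It uses scale(), which by default centres each column and divides by sd(), so it should match a normalization done by hand with sd():

```r
set.seed(1)                                       # arbitrary seed, for reproducibility
Data <- matrix(rnorm(40), nrow = 10, ncol = 4)    # made-up data, 10 rows x 4 columns

Data_Normalized <- scale(Data)    # centre each column, divide by sd() (n-1 denominator)
n <- nrow(Data_Normalized)

cp     <- crossprod(Data_Normalized)   # same as t(Data_Normalized) %*% Data_Normalized
by_n   <- cp / n                       # denominator n
by_nm1 <- cp / (n - 1)                 # denominator n-1

# The n-1 version should agree with cor(Data) up to rounding error
print(all.equal(by_nm1, cor(Data), check.attributes = FALSE))

# The n version is off by the constant factor (n-1)/n, so its diagonal
# is (n-1)/n rather than 1
print(diag(by_n))
```

With denominator n every entry comes out as (n-1)/n times the corresponding correlation, so the discrepancy shrinks as n grows but never quite vanishes.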
Just a few thoughts ...
Ted.
-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 12-Aug-2014 Time: 23:22:09
This message was sent by XFMail