[R] A basic statistics question
Rolf Turner
r.turner at auckland.ac.nz
Fri Aug 15 23:49:31 CEST 2014
On 16/08/14 01:29, Joshua Wiley wrote:
>
> On Wed, Aug 13, 2014 at 7:41 AM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>
> On 13/08/14 07:57, Ron Michael wrote:
>
> Hi,
>
> I would need to get a clarification on a quite fundamental
> statistics property, hope expeRts here would not mind if I post
> that here.
>
> I learnt that the variance-covariance matrix of the standardized data
> is equal to the correlation matrix of the unstandardized data.
> So I used the following data.
>
>
> <SNIP>
>
>
> (t(Data_Normalized) %*% Data_Normalized)/dim(Data_Normalized)[1]
>
>
>
> The point is that I am not getting the exact correlation matrix. Can
> somebody point out what I am missing here?
>
>
> You are using a denominator of "n" in calculating your "covariance"
> matrix for your normalized data. But these data were normalized
> using the sd() function which (correctly) uses a denominator of n-1
> so as to obtain an unbiased estimator of the population standard
> deviation.
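The denominator point can be checked with a minimal sketch. The original data were snipped, so this uses simulated data; the names X, Z, cov_n and cov_nm1 are illustrative and do not come from the original post.

```r
set.seed(1)
# Hypothetical data: 50 rows, 3 columns (the poster's data were snipped)
X <- matrix(rnorm(150), nrow = 50, ncol = 3)
X[, 2] <- X[, 2] + 0.5 * X[, 1]   # induce some correlation between columns
n <- nrow(X)

# scale() centres each column and divides by sd(), i.e. an n - 1 denominator
Z <- scale(X)

cov_n   <- crossprod(Z) / n        # denominator n: diagonal is (n - 1)/n, not 1
cov_nm1 <- crossprod(Z) / (n - 1)  # denominator n - 1: should match cor(X)

print(diag(cov_n))
print(all.equal(cov_nm1, cor(X), check.attributes = FALSE))
```

With the n denominator every entry is shrunk by the factor (n - 1)/n, which is why the result is close to, but not exactly, the correlation matrix.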
>
>
> As a small point, using n - 1 does not _quite_ give an unbiased
> estimator of the population SD; see Cureton (1968), Unbiased
> Estimation of the Standard Deviation, The American Statistician,
> 22(1).
>
> To see this in action:
>
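> library(parallel)
> # cluster setup was not shown in the original; 2 workers is an assumption
> cl <- makeCluster(2)
> res <- unlist(parLapply(cl, 1:1e7, function(i) sd(rnorm(10, mean = 0, sd = 1))))
> stopCluster(cl)
> # ratio sigma / E[s] for normal samples of size n
> correction <- function(n) {
>   gamma((n-1)/2) * sqrt((n-1)/2) / gamma(n/2)
> }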
> mean(res)
> # 0.972583
> mean(res * correction(10))
> # 0.9999216
>
> The sample variance is an unbiased estimator of the population
> variance, but the square root is a nonlinear function, and the
> square root of an unbiased estimator is not itself necessarily unbiased.
Aaaaarrrggghhh. Yes of course. I *know* that you don't get an unbiased
estimate of the sd by using n-1 in the denominator; you get an unbiased
estimate of the variance and as you say, sqrt() is a non-linear function
.....
I just didn't think carefully enuff before I wrote. Thanks for pulling
me up on this error.
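For anyone who wants to check the bias without setting up a cluster, here is a rough serial sketch. It assumes normal data with sigma = 1 and uses 1e5 replicates instead of 1e7; the names s and c4 are illustrative, and c4 is simply the reciprocal of the correction factor quoted above.

```r
set.seed(42)
n <- 10
# 1e5 replicates rather than 1e7, so that this runs serially
s <- replicate(1e5, sd(rnorm(n)))

# c4 = E[s] / sigma for normal samples of size n
c4 <- sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)

print(c4)            # theoretical E[s] when sigma = 1
print(mean(s))       # simulated E[s], slightly below 1
print(mean(s) / c4)  # bias-corrected estimate of sigma
print(mean(s^2))     # the sample variance itself is unbiased
```

The mean of s should sit close to c4 rather than 1, while the mean of s^2 should sit close to 1, up to simulation error.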
cheers,
Rolf
--
Rolf Turner
Technical Editor ANZJS