[R] Correlation discrepancy

(Ted Harding) ted.harding at wlandres.net
Tue Aug 23 13:38:41 CEST 2011


In addition, something has gone wrong, Vincy, with your data x,y
between evaluating cov(x,y) and evaluating your explicit formula.

If I repeat your commands, I get:

  x = c(44,46,46,47,45,43,45,44)
  y = c(44,43,41,41,46,48,44,43)
  cov(x, y)
  # [1] -2.428571

  sum((x-mean(x))*(y-mean(y)))/8
  # [1] -2.125

which has the right sign and, when changed to incorporate the
correct denominator (n-1 = 7) as suggested by Dimitris:

  sum((x-mean(x))*(y-mean(y)))/7
  # [1] -2.428571

gives exact agreement. With regard to your second formula, this
should correspondingly be:

  sum(x*y)/7 - (mean(x)*mean(y))*8/7
  # [1] -2.428571
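The factor 8/7 in that second form comes from the identity
sum((x-mean(x))*(y-mean(y))) = sum(x*y) - n*mean(x)*mean(y).
As a rough sketch for general n (the function names cov_dev and
cov_raw are my own, purely for illustration, and are not part of R):

```r
# Illustrative helpers (hypothetical names, not part of base R).
# Both use the n-1 denominator that cov() uses.
cov_dev <- function(x, y) {
  n <- length(x)
  sum((x - mean(x)) * (y - mean(y))) / (n - 1)
}

cov_raw <- function(x, y) {
  n <- length(x)
  # Uses sum((x-mean(x))*(y-mean(y))) == sum(x*y) - n*mean(x)*mean(y)
  sum(x * y) / (n - 1) - mean(x) * mean(y) * n / (n - 1)
}

x <- c(44, 46, 46, 47, 45, 43, 45, 44)
y <- c(44, 43, 41, 41, 46, 48, 44, 43)

all.equal(cov_dev(x, y), cov(x, y))
# [1] TRUE
all.equal(cov_raw(x, y), cov(x, y))
# [1] TRUE
```

Taking n from length(x) rather than typing 8 or 7 by hand avoids
this kind of denominator slip when the data change.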

again agreeing exactly. Your result:

>> covariance = sum((x-mean(x))*(y-mean(y)))/8   # no. of paired obs. = 8
>>
>> or
>>
>> covariance = sum(x*y)/8-(mean(x)*mean(y))
>>
>> gives
>>
>> covariance = 2.125

agrees in numerical magnitude with the "1/8" form, but has
the wrong sign. Or maybe you simply mis-typed "-2.125" as "2.125".
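To check the sign directly, here is a small sketch (the variable
name cov_n is my own): rescaling cov(x, y) by (n-1)/n gives the
"1/8" value, and your second formula as posted evaluates to the
same negative number.

```r
x <- c(44, 46, 46, 47, 45, 43, 45, 44)
y <- c(44, 43, 41, 41, 46, 48, 44, 43)
n <- length(x)

# Rescale the n-1 covariance from cov() to the 1/n convention.
cov_n <- cov(x, y) * (n - 1) / n
cov_n
# [1] -2.125

# The second formula as posted, with divisor n.
sum(x * y) / n - mean(x) * mean(y)
# [1] -2.125
```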

Hoping this helps,
Ted.

On 23-Aug-11 11:25:15, Dimitris Rizopoulos wrote:
> well, you don't have the correct denominator, i.e., n-1,
> with n denoting the sample size. Have a look at the *Details*
> section of the online help file for cov(), and try also
> 
> sum((x-mean(x))*(y-mean(y)))/7
> cov(x, y)
> 
> 
> I hope it helps.
> 
> Best,
> Dimitris
> 
> 
> On 8/23/2011 1:18 PM, Vincy Pyne wrote:
>> Dear R list, I have one very elementary question regarding correlation
>> between two variables.
>>
>> x = c(44,46,46,47,45,43,45,44)
>> y = c(44,43,41,41,46,48,44,43)
>>
>>> cov(x, y)
>> [1] -2.428571
>>
>> However, if I try to calculate the covariance using the formula as
>>
>>
>> covariance = sum((x-mean(x))*(y-mean(y)))/8       # no. of paired obs. = 8
>>
>> or
>>
>> covariance = sum(x*y)/8-(mean(x)*mean(y))
>>
>> gives
>>
>> covariance = 2.125
>>
>> I am not able to figure out where I am going wrong w.r.t. the
>> covariance formula. Kindly guide.
>>
>> Regards
>>
>> Vincy
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Dimitris Rizopoulos
> Assistant Professor
> Department of Biostatistics
> Erasmus University Medical Center
> 
> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
> Tel: +31/(0)10/7043478
> Fax: +31/(0)10/7043014
> Web: http://www.erasmusmc.nl/biostatistiek/
> 

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 23-Aug-11                                       Time: 12:38:36
------------------------------ XFMail ------------------------------


