[R] Correlation question
Stephane Vaucher
vauchers at iro.umontreal.ca
Thu Sep 9 21:30:50 CEST 2010
Hi everyone,
Thanks for the help.
On Thu, 9 Sep 2010, Peter Ehlers wrote:
> The first thing to do when you get results that you don't expect is
> to check the help page. The page for cor clearly states that its
> input is to a *numeric* vector, matrix or data frame (my emphasis).
> I would not be happy if R simply ignored non-numeric data. After all,
> it's trivial to ensure that you feed only numeric data to cor().
Indeed, the documentation states that it takes a numeric input. It
does not state how it reacts to an inappropriate input type. That's
why I expected it to produce either an error message or accurate results.
I did not expect an incorrect result. I should not have assumed that my
expectations would be correct.
> Having said that, I guess others have found cor() problematic when
> non-valid input is supplied and so R now (as of 2.11.0) issues an
> error message that "'x' must be numeric". You should always check the
> latest released version to see if changes have been made. The NEWS
> file for 2.11.0 contains this:
> cor() and cov() now test for misuse with non-numeric
> arguments, such as the non-bug report PR#14207.
> You're doing the right thing by asking here first before reporting.
> It would definitely not be a good idea to report a (non-)bug
> in an outdated version of R.
Since my manipulations were simple, I assumed that others would have
observed the same behaviour. In any case, I'm happy that the function
now checks its preconditions. Otherwise, it would have been good to add
a note to the documentation stating that, when given non-numeric data,
cor() can compute garbage.
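For illustration, here is a minimal sketch of feeding only numeric data to
cor(). The data frame `df` and its column names are made up for the example;
the second call assumes R 2.11.0 or later, where cor() is reported to reject
non-numeric input with an error.

```r
# Hypothetical data frame mixing numeric and non-numeric columns.
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 8.1, 9.8),
  label = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns before computing correlations.
is_num <- vapply(df, is.numeric, logical(1))
num_df <- df[, is_num, drop = FALSE]

r <- cor(num_df)
print(r)

# Assumption: on R >= 2.11.0, passing the full data frame raises an
# error instead of returning a result; capture the message if so.
res <- tryCatch(cor(df), error = function(e) conditionMessage(e))
print(res)
```

Filtering explicitly keeps the behaviour the same across R versions, whether
or not cor() itself checks its input.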
cheers,
Stephane
> -Peter Ehlers
>
> [rest snipped; not relevant to my comments.]
>