[R] Correlation question

Peter Ehlers ehlers at ucalgary.ca
Thu Sep 9 21:06:12 CEST 2010


On 2010-09-09 11:53, Stephane Vaucher wrote:
> Hi Josh,
>
> Initially, I was expecting R to simply ignore non-numeric data. I guess I
> was wrong... I copy-pasted what I observe, and I do not get an error when

The first thing to do when you get results that you don't expect is
to check the help page. The page for cor clearly states that its
input is to be a *numeric* vector, matrix or data frame (my emphasis).
I would not be happy if R simply ignored non-numeric data. After all,
it's trivial to ensure that you feed only numeric data to cor().
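
For example, something along these lines keeps only the numeric
columns before calling cor(). This is just a sketch: numeric_cols()
is a helper name I've made up, and the data frame is invented for
illustration (only the column names P3 and P7 echo your message).

```r
# Hypothetical helper: return only the numeric columns of a data frame.
numeric_cols <- function(df) {
  keep <- vapply(df, is.numeric, logical(1))  # TRUE for numeric columns
  df[, keep, drop = FALSE]                    # drop = FALSE keeps a data frame
}

# Invented example data: two numeric columns and one text column.
test <- data.frame(
  P3    = c(1.2, 2.4, 3.1, 4.8),
  P7    = c(2.0, 3.9, 6.1, 8.2),
  label = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

test.n <- numeric_cols(test)  # contains P3 and P7 only
r <- cor(test.n)              # 2 x 2 correlation matrix
print(r)
```

Note that is.numeric() is FALSE for factors, so factor columns are
dropped as well, which is usually what you want here.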

Having said that, I guess others have found cor() problematic when
invalid input is supplied, and so R now (as of 2.11.0) issues the
error message "'x' must be numeric". You should always check the
latest released version to see if changes have been made. The NEWS
file for 2.11.0 contains this:

   cor() and cov() now test for misuse with non-numeric
   arguments, such as the non-bug report PR#14207.
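
If you want to see which behaviour your installation has, a small
check like the following should do. It is only a sketch with made-up
input vectors; on 2.11.0 or later I would expect the call to fail
with the message quoted above.

```r
# Report the running R version and whether it is at least 2.11.0.
print(getRversion())
print(getRversion() >= "2.11.0")

# Call cor() with a character vector and capture the error message
# instead of stopping. The input values are invented for illustration.
msg <- tryCatch(
  cor(c("a", "b", "c"), c(1, 2, 3)),
  error = function(e) conditionMessage(e)
)
print(msg)
```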

> calculating correlations with text data. I can also do cor(test.n$P3,
> test$P7) without an error.
>
> If you have a function to select only numeric columns that
> you can share with me (and the list), that would be great. Of course, I'm
> wondering why your version of R produces different results from mine. I
> don't know if I should open a bug report. It would be good if someone

You're doing the right thing by asking here before reporting.
It would definitely not be a good idea to report a (non-)bug
in an outdated version of R.

   -Peter Ehlers

[rest snipped; not relevant to my comments.]


