[Rd] cor() fails with big dataframe

Mayeul KAUFFMANN mayeul.kauffmann at tiscali.fr
Thu Sep 16 14:33:15 CEST 2004


On Thu, 16 Sep 2004, Mayeul KAUFFMANN claimed:
> ?cor says it accepts data.frame. In fact, it does iff they have no (or

It actually says
       x: a numeric vector, matrix or data frame.
            ^^^^^^^
If you want to do the conversions as you say, you should be calling
data.matrix.
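
A minimal sketch of the data.matrix route suggested above, assuming a
made-up data frame `df` with one factor column (the column names and values
are illustrative only, not taken from the original data):

```r
# Hypothetical data frame: one factor column and two numeric columns.
df <- data.frame(
  g = factor(c("a", "b", "a", "c", "b")),
  u = c(1.0, 2.5, 3.1, 4.8, 5.2),
  v = c(2.1, 3.9, 6.2, 8.1, 9.7)
)

# data.matrix() returns a numeric matrix; factor columns are replaced
# by their internal integer codes (here a = 1, b = 2, c = 3).
m <- data.matrix(df)

# cor() now receives a purely numeric matrix.
r <- cor(m)
print(r)
```

Whether a correlation computed on factor codes is meaningful is a separate
question; the sketch only shows that the conversion has to be requested
explicitly.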

@@@@@@@@@@@@@@@@@@@@@@@@

Thanks a lot!
When I first read it, I mistranslated it in my mind as a phrase that
would mean
"a numeric vector, a matrix or a data frame." (I'm not a native English
speaker.) Sorry for the noise.

*But* let's admit that the following two calls are not treated identically:
cor(x[,4],x[,5])
cor(x[,4:5])
In the first case, the non-numeric vector is converted to a numeric one;
in the second case, the (partially) non-numeric data frame is not
converted to a numeric one.

To be more exact,
the doc should not say
       x: a numeric vector, matrix or data frame.
            ^^^^^^^
but
       x: a vector that can be coerced to numeric, a numeric matrix or a
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^
          numeric data frame.
          ^^^^^^^
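
To illustrate what "numeric data frame" would mean in practice, here is a
small sketch that checks the columns before calling cor(). The helper
`all_numeric` and the data frame `df` are hypothetical names introduced for
this example; they are not part of R.

```r
# Hypothetical helper (not part of R): TRUE when every column is numeric.
all_numeric <- function(d) all(vapply(d, is.numeric, logical(1)))

# Made-up data frame with two numeric columns and one character column.
df <- data.frame(
  a = c(1, 2, 3, 4),
  b = c(2, 4, 5, 9),
  s = c("p", "q", "r", "s")
)

# The full data frame is not a "numeric data frame" in the above sense.
print(all_numeric(df))

# Keep only the numeric columns, then compute the correlation matrix.
num_cols <- vapply(df, is.numeric, logical(1))
r <- cor(df[, num_cols, drop = FALSE])
print(r)
```

Dropping the non-numeric columns is only one possible policy; converting
them with data.matrix is the alternative mentioned in the reply.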


Cheers,
Mayeul

PS:
By the way, if someone changes the doc,
the claim 'The default is equivalent to 'y = x' (but more efficient).'
is inexact, as shown by the following:
> X <- data.frame(x=rep(1,5),y=1:5)
> cor(X,X)
   x  y
x NA NA
y NA  1
Warning message:
The standard deviation is zero in: cor(x, y, na.method, method ==
"kendall")
> cor(X)
   x  y
x  1 NA
y NA  1
Warning message:
The standard deviation is zero in: cor(x, y, na.method, method ==
"kendall")


