[Rd] cor() fails with big dataframe

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Sep 16 14:39:26 CEST 2004


We do not in general say things like `can be coerced'.  It's taken for 
granted, and hard to be precise (your phrase is not precise, for there are 
non-numeric matrices that will be coerced, too).

We do expect what is stated as valid input to work, and do encourage users
to coerce objects themselves.
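
A minimal sketch of coercing explicitly before calling cor(), using the
data.matrix function mentioned in the quoted reply below. The data frame `df`
and its column names are made up for illustration. Whether correlating a
factor's integer codes is meaningful is for the caller to decide.

```r
# Hypothetical data frame with one non-numeric (factor) column.
df <- data.frame(a = c(1.5, 2.0, 3.5, 4.0),
                 b = c(2L, 4L, 5L, 9L),
                 g = factor(c("lo", "hi", "lo", "hi")))

# data.matrix() converts every column to numeric; a factor is replaced
# by its internal integer codes.
m <- data.matrix(df)

# cor() now receives a plain numeric matrix.
r <- cor(m)
print(r)
```

Doing the conversion in user code makes the treatment of non-numeric
columns visible rather than leaving it to cor().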

On Thu, 16 Sep 2004, Mayeul KAUFFMANN wrote:

> On Thu, 16 Sep 2004, Mayeul KAUFFMANN claimed:
> > ?cor says it accepts data.frame. In fact, it does iff they have no (or
> 
> It actually says
>        x: a numeric vector, matrix or data frame.
>             ^^^^^^^
> If you want to do the conversions as you say, you should be calling
> data.matrix.
> 
> @@@@@@@@@@@@@@@@@@@@@@@@
> 
> Thanks a lot !!!
> When I first read it, I mistranslated it in my mind into a phrase that
> would mean
> "a numeric vector, a matrix or a data frame." (I'm not a native English
> speaker). Sorry for all that stuff.
> 
> *But* let's admit that the following two calls are not treated identically:
> cor(x[,4],x[,5])
> cor(x[,4:5])

Where does it say that would be?

> In the first case, the non-numeric vector is transformed to a numeric one;
> in the second case, the (partially) non-numeric data frame is not
> transformed to a numeric one.
> 
> To be more exact,
> the doc should not say
>        x: a numeric vector, matrix or data frame.
>             ^^^^^^^
> but
>        x: a vector that can be coerced to numeric,
>                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>           a numeric matrix or a numeric data frame.
>             ^^^^^^^             ^^^^^^^
> 
> 
> Cheers,
> Mayeul
> 
> PS:
> By the way, if someone changes the doc,
> the claim 'The default is equivalent to 'y = x' (but more efficient).'
> is inexact, as evidenced by the following:
> X <- (data.frame(x=rep(1,5),y=1:5))
> > cor(X,X)
>    x  y
> x NA NA
> y NA  1
> Warning message:
> The standard deviation is zero in: cor(x, y, na.method, method ==
> "kendall")
> > cor(X)
>    x  y
> x  1 NA
> y NA  1
> Warning message:
> The standard deviation is zero in: cor(x, y, na.method, method ==
> "kendall")
> 
> 
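
The comparison in the PS can be re-run with a short script. This is only a
sketch: the object names `one_arg` and `two_arg` are made up here, and the
entry for the constant column may differ between R versions, so the code
prints both results rather than assuming what they are.

```r
# Same data as in the PS: column x is constant, so its standard deviation is zero.
X <- data.frame(x = rep(1, 5), y = 1:5)

# cor() warns about the zero standard deviation; the warning is suppressed
# here so that only the matrices are shown.
one_arg <- suppressWarnings(cor(X))     # default, y missing
two_arg <- suppressWarnings(cor(X, X))  # y supplied explicitly

print(one_arg)
print(two_arg)

# TRUE only if both calls put NA in exactly the same cells.
print(identical(is.na(one_arg), is.na(two_arg)))
```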

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


