[R] cor and missing values. Bug?
Frank E Harrell Jr
feh3k at spamcop.net
Wed May 26 21:16:38 CEST 2004
On 27 May 2004 00:20:17 +0200
Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:
> "Robert W. Baer, Ph.D." <rbaer at atsu.edu> writes:
>
> > > Not to put too fine a point on it, but did you consider checking the
> > > NEWS file for the most recent version (1.9.0,
> > > http://cran.r-project.org/src/base/NEWS)?
> > >
> > > o The cor() function did not remove missing values in the
> > > non-Pearson case.
> >
> >
> > There is still something a little strange in version 1.9.0. What is
> > the source of the discrepancy between cor() and cor.test()?
>
> One ranks x and y before removing missing values, the other one
> removes them first and then ranks. It is not really desirable, but a
> better solution is nontrivial (esp. in the "pairwise.complete.obs"
> case) and we did document it in ?cor:
>
> Notice also that the ranking is (currently) done
> removing only cases that are missing on the variable itself,
> which may not be what you expect if you let 'use' be
> '"complete.obs"' or '"pairwise.complete.obs"'.
>
>
> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
>
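To make the order-of-operations point concrete, here is a minimal sketch
in base R. The toy vectors are made up for illustration, and the two
computations mimic "rank, then drop incomplete pairs" and "drop incomplete
pairs, then rank" by hand, rather than relying on what any particular
version of cor() or cor.test() does internally.

```r
# Toy data, invented for illustration: x is missing in case 2
x <- c(1, NA, 2, 3, 4)
y <- c(1, 2.5, 2, 4, 3)

# (a) Rank each variable on its own non-missing values,
#     then drop the incomplete pairs
rx <- rank(x, na.last = "keep")   # NA stays NA
ry <- rank(y, na.last = "keep")   # case 2 still takes a rank slot in y
ok <- complete.cases(rx, ry)
r_rank_first <- cor(rx[ok], ry[ok])

# (b) Drop the incomplete pairs first, then rank what is left
ok <- complete.cases(x, y)
r_drop_first <- cor(rank(x[ok]), rank(y[ok]))

r_rank_first   # about 0.849
r_drop_first   # 0.8
```

The two values differ because in (a) the case with missing x still
occupies a rank position in y, so the surviving y ranks are 1, 2, 5, 4
instead of 1, 2, 4, 3.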
Some of you may want to look at the old rcorr function in the Hmisc
package. It deletes missing values pairwise (the "pairwise.complete.obs"
approach), uses some C code for Spearman correlation, and is fast for
large matrices.
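A short usage sketch, assuming the Hmisc package is installed. The matrix
and the placement of the NAs are made up for illustration; rcorr() wants
more than four observations, so the example uses 20 rows.

```r
library(Hmisc)  # assumes Hmisc is installed

# Invented data: 20 cases, 3 variables, a few missing values
set.seed(1)
m <- matrix(rnorm(60), ncol = 3, dimnames = list(NULL, c("a", "b", "c")))
m[c(2, 7), "a"] <- NA
m[5, "b"] <- NA

fit <- rcorr(m, type = "spearman")

fit$r  # matrix of Spearman correlations
fit$n  # number of cases used for each pair of variables
fit$P  # matrix of p-values
```

Checking fit$n is a quick way to see how many cases each pairwise
correlation is actually based on.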
Frank
---
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University