[Rd] Incorrect handling of NA's in cor() (PR#6750)
msa at biostat.mgh.harvard.edu
Fri Apr 9 19:22:43 CEST 2004
Dear Uwe,
You are wrong. First, I read the help file before
submitting the report. For two variables,
use="pairwise.complete.obs" and use="complete.obs" should be
equivalent, shouldn't they? Of course, the results can
differ when there are more than two variables. Second, with the
call you proposed I also get an incorrect result:
> cor(x, y, use="pairwise.complete.obs", method="s")
[1] -0.1428571
The correct result is -0.4, as calculated by
cor.test().
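To make the expected value easy to check, here is a minimal sketch that computes Spearman's rho by hand on the four complete pairs. The variable names (ok, rx, ry, rho_all) are only illustrative. The last step is a guess at where -0.1428571 could come from, not something established in this thread: that number equals -1/7, which is what you get if the missing values are ranked last and all six positions are kept.

```r
x <- c(1, 2, 3, NA, 5, 6)
y <- c(4, NA, 2, 5, 1, 3)

# Keep only the pairs in which both values are observed.
ok <- !is.na(x) & !is.na(y)
rx <- rank(x[ok])
ry <- rank(y[ok])

# Spearman's rho is Pearson's correlation of the ranks.
rho <- cor(rx, ry)

# With no ties, the classical formula gives the same value.
n <- sum(ok)
rho_formula <- 1 - 6 * sum((rx - ry)^2) / (n * (n^2 - 1))
print(c(rho, rho_formula))

# Assumption (a guess at the mechanism): rank the NAs last instead of
# dropping them, then correlate all six ranks.
rho_all <- cor(rank(x, na.last = TRUE), rank(y, na.last = TRUE))
print(rho_all)
```

The first two values agree with each other and with -0.4, while the third reproduces -1/7, so the discrepancy is at least consistent with the missing values being ranked rather than removed before the correlation is taken.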
Regards
Marek Ancukiewicz
> X-Original-To: msa at biostat.mgh.harvard.edu
> Date: Fri, 09 Apr 2004 19:06:47 +0200
> From: Uwe Ligges <ligges at statistik.uni-dortmund.de>
> Organization: Fachbereich Statistik, Universitaet Dortmund
> X-Accept-Language: en-us, en, de-de, de
> Cc: R-bugs at biostat.ku.dk
>
> msa at biostat.mgh.harvard.edu wrote:
> > Full_Name: Marek Ancukiewicz
> > Version: 1.8.1
> > OS: Linux
> > Submission from: (NULL) (132.183.12.87)
> >
> >
> > Function cor() incorrectly handles missing observations with method="spearman":
> >
> >
> >>x <- c(1,2,3,NA,5,6)
> >>y <- c(4,NA,2,5,1,3)
> >>cor(x,y,use="complete.obs",method="s")
> >
> > [1] -0.1428571
> >
> >>cor(x[!is.na(x)&!is.na(y)],y[!is.na(x)&!is.na(y)],method="s")
> >
> > [1] -0.4
> >
> > These two results should be the same.
> >
>
>
> No! Please read at least the help file, ?cor, before submitting a bug
> report:
>
>
> "If use is "complete.obs" then missing values are handled by casewise
> deletion. Finally, if use has the value "pairwise.complete.obs" then the
> correlation between each pair of variables is computed using all
> complete pairs of observations on those variables."
>
>
> Hence
> cor(x, y, use="pairwise.complete.obs", method="s")
> is what you expect ...
>
> Uwe Ligges
>