[R] Weird differing results when using the Wilcoxon-test
peter dalgaard
pdalgd at gmail.com
Wed Aug 18 15:47:09 CEST 2010
On Aug 18, 2010, at 11:55 AM, Cedric Laczny wrote:
> I was able to trace down the unexpected behavior to the following line
> SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) -
> sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y -
> 1))))
> My calculations of the Z-score for the normal approximation were based on
> using the standard deviation for ranks _without_ ties. The above formula seems
> to account for ties and thus, yields a slightly different z-score. However, the
> data seems to include at most 1 tie (based on rnorm), so it would be the same
> result as if it contained no tie (1^3 - 1 has the same result as 0^3 - 0,
> obviously ;) ) and thus I would expect the result to be the same as when using
> the formula for the standard deviation without ties.
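Note the definition NTIES <- table(r): it counts the number of observations tied at each rank, so it is all ones if and only if there are NO ties in the data. Only then is sum(NTIES^3 - NTIES) zero, so that the formula reduces to the one without ties. A single tied pair gives an entry of 2, which contributes 2^3 - 2 = 6 to the sum, not 0.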
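To make the point concrete, here is a minimal sketch of the tie count on a small made-up sample. The vectors x and y are hypothetical and not taken from the thread, and r is assumed to be the joint midrank of the pooled data, which is how I read the code around the quoted line.

```r
# Hypothetical data (not from the thread): the value 2.3 occurs in both
# samples, so the pooled data contain exactly one tied pair.
x <- c(1.1, 2.3, 3.8, 4.0)
y <- c(0.5, 2.3, 5.2)
n.x <- length(x); n.y <- length(y)

r <- rank(c(x, y))      # joint midranks; the tied pair shares rank 3.5
NTIES <- table(r)       # number of observations at each distinct rank

# Every untied rank contributes 1^3 - 1 = 0; the tied pair contributes 2^3 - 2.
tie_term <- sum(NTIES^3 - NTIES)

# Tie-corrected standard deviation, as in the quoted line.
sigma_ties <- sqrt((n.x * n.y / 12) *
                   ((n.x + n.y + 1) -
                    tie_term / ((n.x + n.y) * (n.x + n.y - 1))))

# Standard deviation from the formula that ignores ties.
sigma_no_ties <- sqrt(n.x * n.y * (n.x + n.y + 1) / 12)

print(NTIES)
print(c(tie_term = tie_term, with_ties = sigma_ties, no_ties = sigma_no_ties))
```

The two standard deviations differ as soon as any entry of NTIES exceeds 1, which would explain a slightly different z-score.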
(If you are in paper-and-pencil mode, these formulas are fairly easily worked out once you realize that you only need the mean and variance of the rank of a single observation -- the covariances are C(R1,R2) = -1/(N-1) V(R1) because of symmetry and the fact that the sum of all N ranks is fixed.)
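The paper-and-pencil argument can also be checked numerically. The sketch below again uses made-up data, and the names v1, c12 and var_W are mine. It builds the variance of the rank sum from the variance of a single rank and the covariance relation above, then compares it with the tie-corrected formula from the quoted line.

```r
# Hypothetical pooled sample with one tied pair (not from the thread).
x <- c(1.1, 2.3, 3.8, 4.0)
y <- c(0.5, 2.3, 5.2)
n.x <- length(x); n.y <- length(y); N <- n.x + n.y

r <- rank(c(x, y))                  # midranks of the pooled data

# Variance of the rank of a single observation under the null hypothesis:
# the population variance of the N midranks (divide by N, not N - 1).
v1 <- mean(r^2) - mean(r)^2

# Covariance of two distinct ranks, from the fact that the rank sum is fixed.
c12 <- -v1 / (N - 1)

# Variance of the sum of the n.x ranks that belong to x.
var_W <- n.x * v1 + n.x * (n.x - 1) * c12

# The same quantity from the tie-corrected formula.
NTIES <- table(r)
SIGMA <- sqrt((n.x * n.y / 12) *
              ((N + 1) - sum(NTIES^3 - NTIES) / (N * (N - 1))))

print(all.equal(sqrt(var_W), as.numeric(SIGMA)))
```

Subtracting a constant such as n.x * (n.x + 1) / 2 from the rank sum does not change its variance, so the same value applies to the shifted statistic.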
--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com