[R] Mann-Whitney U
Lucke, Joseph F
Joseph.F.Lucke at uth.tmc.edu
Wed Aug 15 21:35:35 CEST 2007
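R and SPSS are using different but equivalent statistics. R reports the
rank sum of group1 minus its minimum possible value, n1*(n1+1)/2. SPSS
reports the rank sum of group2 minus its minimum possible value,
n2*(n2+1)/2. The two results always add up to n1*n2, so either one
determines the other.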
Example.
> G1=group1
> G2=group2[-length(group2)] #get rid of the NA
> n1=length(G1) #n1=28
> n2=length(G2) #n2=27
# convert to ranks
> W=rank(c(G1,G2))
> R1=W[1:n1] #put the ranks back into the groups
> R2=W[n1+1:n2]
#Get the sum of the ranks for each group
> W1=sum(R1)
> W2=sum(R2)
#Subtract the minimum possible rank sum for group 1
> W1-n1*(n1+1)/2
[1] 405.5
#Subtract the minimum possible rank sum for group 2
> W2-n2*(n2+1)/2
[1] 350.5
W1-n1*(n1+1)/2 gives R's result; W2-n2*(n2+1)/2 gives SPSS's result.
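The equivalence can be checked directly: the two adjusted rank sums add
up to n1*n2, so each one is n1*n2 minus the other. A minimal sketch,
assuming the data from the original post with the NA dropped; the names
U1 and U2 are illustrative and do not appear in either program's output.

```r
# Data from the original post (trailing NA in group2 already removed)
G1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,
        2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
G2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,
        1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94)
n1 <- length(G1)
n2 <- length(G2)

# Midranks of the pooled sample; the first n1 entries belong to G1
r <- rank(c(G1, G2))
U1 <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2       # what R reports as W
U2 <- sum(r[n1 + seq_len(n2)]) - n2 * (n2 + 1) / 2  # what SPSS reports as U

U1 + U2 == n1 * n2   # TRUE: the two statistics carry the same information
```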
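Ties throw a wrench in the works: with tied values R cannot compute an
exact p-value and falls back on a normal approximation, hence the
warning in the output below. R applies a continuity correction to that
approximation by default, SPSS does not.
Taking out the continuity correction,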
> wilcox.test(G1,G2,correct=FALSE)
Wilcoxon rank sum test
data: G1 and G2
W = 405.5, p-value = 0.6433
alternative hypothesis: true location shift is not equal to 0
Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(G1, G2,
correct = FALSE)
This p-value is the same as SPSS's.
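To see where the two p-values come from, the normal approximation can
be reproduced by hand. The sketch below assumes the standard
tie-corrected variance of the rank sum statistic; the helper p_norm and
its argument cc are illustrative names, not part of R.

```r
# Data from the original post (trailing NA in group2 already removed)
G1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,1.96,
        2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
G2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,1.9,
        1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,2.94)
n1 <- length(G1)
n2 <- length(G2)
N  <- n1 + n2

r <- rank(c(G1, G2))
W <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2   # the statistic R reports

# Standard deviation of W under the null hypothesis, reduced for ties
ties  <- table(r)
sigma <- sqrt(n1 * n2 / 12 *
              ((N + 1) - sum(ties^3 - ties) / (N * (N - 1))))

# Two-sided p-value; cc = TRUE shrinks |W - n1*n2/2| by 0.5
# (the continuity correction)
p_norm <- function(cc) {
  d <- W - n1 * n2 / 2
  if (cc) d <- d - sign(d) * 0.5
  2 * pnorm(-abs(d / sigma))
}

round(p_norm(cc = FALSE), 4)  # compare with the uncorrected p-value above
round(p_norm(cc = TRUE), 4)   # compare with R's default (corrected) p-value
```

The only difference between the two calls is the 0.5 subtracted from
the numerator, which is why the test statistic itself is unaffected.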
Consult a serious non-parametrics text. I used
Lehmann, E. L., Nonparametrics: Statistical methods based on ranks.
1975. Holden-Day. San Francisco, CA.
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Natalie O'Toole
Sent: Wednesday, August 15, 2007 1:07 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] Mann-Whitney U
Hi,
I do want to use the Mann-Whitney test which ranks my data and then uses
those ranks rather than the actual data.
Here is the R code I am using:
> group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,
+   1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
> group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,
+   1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,
+   2.94,NA)
> result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95,
+   na.action)
paired = FALSE so that the Wilcoxon rank sum test, which is equivalent
to the Mann-Whitney test, is used (my samples are NOT paired).
conf.level = 0.95 specifies the confidence level. na.action is there
because I have an NA value (I suspect I am not using na.action in the
correct manner).
When I use this code I get the following error message:
Error in arg == choices : comparison (1) is possible only for atomic and
list types
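The error appears to arise because the default method of wilcox.test
has no na.action argument: the unnamed na.action is matched by position
to the alternative argument, and a function is not a valid choice
there. A hedged sketch of two ways to handle the NA, assuming the data
from this post; the data frame d and its column names are illustrative.

```r
group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,1.9,
            1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,1.86,
            1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,
            2.94,NA)

# Option 1: drop the missing value explicitly before the call.
# exact = FALSE requests the normal approximation up front, which avoids
# the "cannot compute exact p-value with ties" warning.
res1 <- wilcox.test(group1, group2[!is.na(group2)],
                    paired = FALSE, exact = FALSE)

# Option 2: the formula interface does accept na.action
d <- data.frame(
  value = c(group1, group2),
  group = factor(rep(c("g1", "g2"),
                     times = c(length(group1), length(group2))))
)
res2 <- wilcox.test(value ~ group, data = d, na.action = na.omit,
                    exact = FALSE)

res1$statistic  # W for group1
res2$statistic  # same W, because "g1" is the first factor level
```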
When I use this code:
> group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,
+   1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,3.07,3.3)
> group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,
+   1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,2.77,2.9,
+   2.94,NA)
> result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95)
I get the following result:
Wilcoxon rank sum test with continuity correction
data: group1 and group2
W = 405.5, p-value = 0.6494
alternative hypothesis: true location shift is not equal to 0
Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(group1,
group2, paired = FALSE, conf.level = 0.95)
The W value here is 405.5 with a p-value of 0.6494.
In SPSS, I am ranking my data and then performing a Mann-Whitney U by
selecting analyze - non-parametric tests - 2 independent samples and
then checking off the Mann-Whitney U test.
For the Mann-Whitney test in SPSS I am getting the following results:
Mann-Whitney U = 350.5
2-tailed p value = 0.643
I think maybe the discrepancy has to do with the specification of the NA
values in R, but I'm not sure.
If anyone has any suggestions, please let me know!
I hope I have provided enough information to convey my problem.
Thank-you,
Nat
__________________
Natalie,
It's best to provide at least a sample of your data. Your field names
suggest that your data might be collected in units of mm^2 or some
similar measurement of area. Why do you want to use Mann-Whitney, which
will rank your data and then use those ranks rather than your actual
data? Unless your sample is quite small, why not use a two sample
t-test? Also, are your samples paired? If they aren't, did you use the
"paired = FALSE" option?
JWDougherty
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
------------------------------------------------------------------------
------------------------------------------------
This communication is intended for the use of the recipient to which it
is
addressed, and may
contain confidential, personal, and or privileged information. Please
contact the sender
immediately if you are not the intended recipient of this communication,
and do not copy,
distribute, or take action relying on it. Any communication received in
error, or subsequent
reply, should be deleted or destroyed.