[R] Mann-Whitney U

Lucke, Joseph F Joseph.F.Lucke at uth.tmc.edu
Wed Aug 15 21:35:35 CEST 2007


R and SPSS report different but equivalent statistics.  R reports the
rank sum of group1 minus its minimum possible value, n1*(n1+1)/2.  SPSS
reports the same quantity for group2.  The two statistics always add up
to n1*n2, so either one determines the other.

Example.
> G1=group1
> G2=group2[-length(group2)] #get rid of the NA
> n1=length(G1) #n1=28
> n2=length(G2) #n2=27
# convert to ranks
> W=rank(c(G1,G2))
> R1=W[1:n1] #put the ranks back into the groups
> R2=W[n1+(1:n2)] #positions n1+1 to n1+n2; n1+1:n2 parses the same way
#Get the sum of the ranks for each group
> W1=sum(R1)
> W2=sum(R2)
#Subtract the minimum possible rank sum for group 1
> W1-n1*(n1+1)/2
[1] 405.5
#Subtract the minimum possible rank sum for group 2
> W2-n2*(n2+1)/2
[1] 350.5

W1-n1*(n1+1)/2 gives R's result; W2-n2*(n2+1)/2 gives SPSS's result.
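
As a check, here is a minimal self-contained sketch of the same computation using the data from the original message (with the NA already dropped). The names U1 and U2 are labels chosen for this sketch, not names used by R or SPSS. It also confirms that the two statistics sum to n1*n2.

```r
# Data from the original message; the trailing NA in group2 is omitted.
group1 <- c(1.34, 1.47, 1.48, 1.49, 1.62, 1.67, 1.7, 1.7, 1.7, 1.73,
            1.81, 1.84, 1.9, 1.96, 2, 2, 2.19, 2.29, 2.29, 2.41,
            2.41, 2.46, 2.5, 2.6, 2.8, 2.8, 3.07, 3.3)
group2 <- c(0.98, 1.18, 1.25, 1.33, 1.38, 1.4, 1.49, 1.57, 1.72, 1.75,
            1.8, 1.82, 1.86, 1.9, 1.97, 2.04, 2.14, 2.18, 2.49, 2.5,
            2.55, 2.57, 2.64, 2.73, 2.77, 2.9, 2.94)

n1 <- length(group1)
n2 <- length(group2)

# Rank the pooled sample; rank() assigns midranks to tied values.
ranks <- rank(c(group1, group2))

# Rank sums per group.
W1 <- sum(ranks[seq_len(n1)])
W2 <- sum(ranks[n1 + seq_len(n2)])

# Subtract the minimum possible rank sum of each group.
U1 <- W1 - n1 * (n1 + 1) / 2   # the value R prints as W
U2 <- W2 - n2 * (n2 + 1) / 2   # the value SPSS prints as U

c(U1 = U1, U2 = U2, sum = U1 + U2, n1n2 = n1 * n2)
```

Here U1 + U2 = 405.5 + 350.5 = 756 = 28 * 27.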

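Ties complicate matters.  With ties R cannot compute an exact p-value
and falls back on a normal approximation, to which it applies a
continuity correction by default.  SPSS does not apply one.
Taking out the continuity correction,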
> wilcox.test(G1,G2,correct=FALSE)

        Wilcoxon rank sum test

data:  G1 and G2 
W = 405.5, p-value = 0.6433
alternative hypothesis: true location shift is not equal to 0 

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(G1, G2,
correct = FALSE) 

This p-value is the same as SPSS's.
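
The following sketch, under stated assumptions, illustrates two points: swapping the groups changes the statistic but not the two-sided p-value, and the uncorrected p-value can be reproduced by hand from the tie-corrected normal approximation. The hand formula is my reading of what wilcox.test does in this case, and exact = FALSE is passed explicitly so that the normal approximation is used regardless of R version. The names fit_12, fit_21, fit_cc, and p_manual are made up for this sketch.

```r
G1 <- c(1.34, 1.47, 1.48, 1.49, 1.62, 1.67, 1.7, 1.7, 1.7, 1.73,
        1.81, 1.84, 1.9, 1.96, 2, 2, 2.19, 2.29, 2.29, 2.41,
        2.41, 2.46, 2.5, 2.6, 2.8, 2.8, 3.07, 3.3)
G2 <- c(0.98, 1.18, 1.25, 1.33, 1.38, 1.4, 1.49, 1.57, 1.72, 1.75,
        1.8, 1.82, 1.86, 1.9, 1.97, 2.04, 2.14, 2.18, 2.49, 2.5,
        2.55, 2.57, 2.64, 2.73, 2.77, 2.9, 2.94)
n1 <- length(G1)
n2 <- length(G2)
N  <- n1 + n2

# Same test with the groups in either order, no continuity correction.
fit_12 <- wilcox.test(G1, G2, exact = FALSE, correct = FALSE)
fit_21 <- wilcox.test(G2, G1, exact = FALSE, correct = FALSE)

# R's default: continuity correction switched on.
fit_cc <- wilcox.test(G1, G2, exact = FALSE, correct = TRUE)

# Hand computation of the uncorrected normal approximation.
ranks     <- rank(c(G1, G2))
U1        <- sum(ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2
tie_sizes <- as.vector(table(ranks))   # size of each group of tied ranks
sigma     <- sqrt(n1 * n2 / 12 *
                  ((N + 1) - sum(tie_sizes^3 - tie_sizes) / (N * (N - 1))))
z         <- (U1 - n1 * n2 / 2) / sigma
p_manual  <- 2 * pnorm(-abs(z))

c(W_12 = unname(fit_12$statistic), W_21 = unname(fit_21$statistic))
c(p_12 = fit_12$p.value, p_21 = fit_21$p.value,
  p_manual = p_manual, p_corrected = fit_cc$p.value)
```

The continuity correction shrinks |z| slightly, which is why the corrected p-value is a little larger than the uncorrected one.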


Consult a serious non-parametrics text.  I used
Lehmann, E. L., Nonparametrics: Statistical methods based on ranks.
1975. Holden-Day. San Francisco, CA.
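
On the na.action question in the original message: the sketch below assumes that the default method of wilcox.test silently drops missing values, which is consistent with W = 405.5 being reported both with and without the NA. The error most likely arises because the unnamed na.action is matched by position to the alternative argument; that diagnosis is my assumption, not something stated in the thread. The formula interface does accept na.action by name. The names d, with_na, without_na, via_formula, and bad_call are made up for this sketch.

```r
group1 <- c(1.34, 1.47, 1.48, 1.49, 1.62, 1.67, 1.7, 1.7, 1.7, 1.73,
            1.81, 1.84, 1.9, 1.96, 2, 2, 2.19, 2.29, 2.29, 2.41,
            2.41, 2.46, 2.5, 2.6, 2.8, 2.8, 3.07, 3.3)
group2 <- c(0.98, 1.18, 1.25, 1.33, 1.38, 1.4, 1.49, 1.57, 1.72, 1.75,
            1.8, 1.82, 1.86, 1.9, 1.97, 2.04, 2.14, 2.18, 2.49, 2.5,
            2.55, 2.57, 2.64, 2.73, 2.77, 2.9, 2.94, NA)

# Default method: the NA can stay in, or be removed explicitly.
with_na    <- wilcox.test(group1, group2, exact = FALSE)
without_na <- wilcox.test(group1, group2[!is.na(group2)], exact = FALSE)

# Formula method: na.action is a named argument here.
d <- data.frame(
  value = c(group1, group2),
  group = factor(rep(c("group1", "group2"),
                     times = c(length(group1), length(group2))))
)
via_formula <- wilcox.test(value ~ group, data = d,
                           na.action = na.omit, exact = FALSE)

# The original call, wrapped so the error message is captured, not raised.
bad_call <- tryCatch(
  wilcox.test(group1, group2, paired = FALSE, conf.level = 0.95, na.action),
  error = function(e) conditionMessage(e)
)

c(with_na = unname(with_na$statistic),
  without_na = unname(without_na$statistic),
  via_formula = unname(via_formula$statistic))
bad_call
```

The wording of the captured error message may differ between R versions.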


-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Natalie O'Toole
Sent: Wednesday, August 15, 2007 1:07 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] Mann-Whitney U

Hi,

I do want to use the Mann-Whitney test which ranks my data and then uses
those ranks rather than the actual data.

Here is the R code I am using:

> group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,
+             1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,
+             3.07,3.3)
> group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,
+             1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,
+             2.77,2.9,2.94,NA)
> result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95,
+                       na.action)

paired = FALSE so that the Wilcoxon rank sum test, which is equivalent
to the Mann-Whitney test, is used (my samples are NOT paired).
conf.level = 0.95 to specify the confidence level.
na.action is used because I have an NA value (I suspect I am not using
na.action in the correct manner).

When I use this code I get the following error message:

Error in arg == choices : comparison (1) is possible only for atomic and
list types

When I use this code:

> group1 <- c(1.34,1.47,1.48,1.49,1.62,1.67,1.7,1.7,1.7,1.73,1.81,1.84,
+             1.9,1.96,2,2,2.19,2.29,2.29,2.41,2.41,2.46,2.5,2.6,2.8,2.8,
+             3.07,3.3)
> group2 <- c(0.98,1.18,1.25,1.33,1.38,1.4,1.49,1.57,1.72,1.75,1.8,1.82,
+             1.86,1.9,1.97,2.04,2.14,2.18,2.49,2.5,2.55,2.57,2.64,2.73,
+             2.77,2.9,2.94,NA)
> result <- wilcox.test(group1, group2, paired=FALSE, conf.level = 0.95)

I get the following result:

  Wilcoxon rank sum test with continuity correction

data:  group1 and group2
W = 405.5, p-value = 0.6494
alternative hypothesis: true location shift is not equal to 0 

Warning message:
cannot compute exact p-value with ties in: wilcox.test.default(group1,
group2, paired = FALSE, conf.level = 0.95) 

The W value here is 405.5 with a p-value of 0.6494


In SPSS, I am ranking my data and then performing a Mann-Whitney U by
selecting Analyze - Non-parametric Tests - 2 Independent Samples and
then checking off the Mann-Whitney U test.

For the Mann-Whitney test in SPSS I am getting the following results:

Mann-Whitney U = 350.5
2-tailed p value = 0.643

I think maybe the discrepancy has to do with the specification of the NA
values in R, but I'm not sure.


If anyone has any suggestions, please let me know!

I hope I have provided enough information to convey my problem.

Thank-you, 

Nat
__________________


Natalie,

It's best to provide at least a sample of your data.  Your field names
suggest that your data might be collected in units of mm^2 or some
similar measurement of area.  Why do you want to use Mann-Whitney, which
will rank your data and then use those ranks rather than your actual
data?  Unless your sample is quite small, why not use a two sample
t-test?  Also, are your samples paired?  If they aren't, did you use the
"paired = FALSE" option?

JWDougherty

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



------------------------------------------------------------------------

This communication is intended for the use of the recipient to which it
is addressed, and may contain confidential, personal, and or privileged
information. Please contact the sender immediately if you are not the
intended recipient of this communication, and do not copy, distribute,
or take action relying on it. Any communication received in error, or
subsequent reply, should be deleted or destroyed.


