[R] nonparametric multiple sample comparison

Liaw, Andy andy_liaw at merck.com
Mon Apr 26 02:06:29 CEST 2004


Have you looked at the npmc package on CRAN?

HTH,
Andy

> From: aolinto_r at bignet.com.br
> 
> Hello all,
> 
> Here goes one of my first functions.
> 
> I want to make a nonparametric multiple sample comparison 
> with unequal sample 
> sizes (see Zar's Biostatistical Analysis, 3rd. Ed., pg. 201 
> Example 10.11, pg. 
> 288 Example 11.10). In the real world, I want to compare 
> samples of fish 
> length captured with different fishing gears.
> 
> After using the Kruskal-Wallis test I want to check the 
> differences between 
> each two samples. 
> 
> I wrote the function multcomp (see below) and it's working OK 
> but I still have 
> some doubts. To use it one must have a two-column dataframe 
> with groups and 
> measurements.
> 
> 1) In line 20 (Results <- data.frame...) I create a dataframe 
> to store the final 
> results. It has a fixed row number but it would be better to 
> have a variable 
> number, i.e., the number of combination of the groups taken 2 
> by 2 (4 groups = 
> 5 combinations, 6 groups = 15 combinations). Which function 
> in R returns the 
> number of combinations? I couldn't find it!
> 
> 2) In lines 35 to 39 I print the result tables. How can I 
> make a "clean" 
> print, without row numbers and quotation marks?
> 
> 3) Also I'd like to calculate the critical values "q" and 
> p-values but I 
> couldn't find the distribution indicated in Zar's Table B.15 
> (App107) "Critical Values of Q for Nonparametric Multiple 
> Comparison Testing". 
> Is it possible to calculate these numbers in R?
> 
> Thanks in advance for any help or comments to improve this function.
> 
> Best regards,
> 
> Antonio Olinto
> Sao Paulo Fisheries Institute
> Brazil
> 
> multcomp <- function(DataSet) {
> dat.multcomp <- DataSet
> names(dat.multcomp) <- c("VarCat","VarNum")
> attach(dat.multcomp)
> dat.multcomp$Rank <- rank(VarNum)
> attach(dat.multcomp)
> RankList <- aggregate(Rank,list(Rank=Rank),FUN=length)
> t <- length(RankList$Rank)
> st <- 0
> for (i in 1:t) if (RankList[i,2]>1) 
> st<-st+(RankList[i,2]^3-RankList[i,2])
> LevCat <- levels(dat.multcomp$VarCat)
> NLevCat <- aggregate(VarCat,list(LevCat=VarCat),FUN=length)
> RLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=sum)
> MLevCat <- aggregate(Rank,list(LevCat=VarCat),FUN=mean)
> SampleSummary <- 
> data.frame(LevCat,RLevCat[,2],NLevCat[,2],MLevCat[,2])
> names(SampleSummary)<-c("Samples","RSum","N","RMean")
> SampleSummary <- 
> SampleSummary[order(SampleSummary$RMean,decreasing=T),]
> NCat <- length(LevCat)
> N <- length(dat.multcomp$VarCat)
> Results <- data.frame(rep(NA,6),rep(NA,6),rep(NA,6),rep(NA,6))
> names(Results) <- c("Comparison","Difference","SE","Q")
> l <- 1
> for (i in 1:(NCat-1)) {
>    for (j in NCat:(i+1)) {
>    SE <- sqrt(((N*(N+1)/12)-(st/(12*(N-1))))*((1/SampleSummary[i,3])+
> (1/SampleSummary[j,3])))
>    Dif <- SampleSummary[i,4]-SampleSummary[j,4]
>    Q=Dif/SE
> Results[l,1] <- paste(SampleSummary[i,1],"vs",SampleSummary[j,1])
> Results[l,2] <- round(Dif,4)
> Results[l,3] <- round(SE,4)
> Results[l,4] <- round(Q,4)
> l <-l+1
>    }
> }
> print("Sample summary ranked by mean ranks")
> print(SampleSummary)
> print("")
> print("Table of multiple comparisons")
> print(Results)
> }
> 
> ===========
> 
> Data example taken from Zar (Biostatiscical Analysis):
> 
> dat.limno <- data.frame(scan(what = list (Pound=" ", pH=0), sep=" "))
> 1 7.68
> 1 7.69
> 1 7.7
> 1 7.7
> 1 7.72
> 1 7.73
> 1 7.73
> 1 7.76
> 2 7.71
> 2 7.73
> 2 7.74
> 2 7.74
> 2 7.78
> 2 7.78
> 2 7.8
> 2 7.81
> 3 7.74
> 3 7.75
> 3 7.77
> 3 7.78
> 3 7.8
> 3 7.81
> 3 7.84
> 4 7.71
> 4 7.71
> 4 7.74
> 4 7.79
> 4 7.81
> 4 7.85
> 4 7.87
> 4 7.91
> 
> kruskal.test(pH~Pound)
> 
>         Kruskal-Wallis rank sum test
> 
> data
> Kruskal-Wallis chi-squared = 11.9435, df = 3, p-value = 0.007579
> 
> multcomp(dat.limno)
> 
> [1] "Sample summary ranked by mean ranks"
>   Samples  RSum N    RMean
> 3       3 145.0 7 20.71429
> 4       4 163.5 8 20.43750
> 2       2 132.5 8 16.56250
> 1       1  55.0 8  6.87500
> [1] ""
> [1] "Table of multiple comparisons"
>   Comparison Difference     SE      Q
> 1     3 vs 1    13.8393 4.6923 2.9493
> 2     3 vs 2     4.1518 4.6923 0.8848
> 3     3 vs 4     0.2768 4.6923 0.0590
> 4     4 vs 1    13.5625 4.5332 2.9918
> 5     4 vs 2     3.8750 4.5332 0.8548
> 6     2 vs 1     9.6875 4.5332 2.1370
> 
> 
> -------------------------------------------------
> WebMail Bignet - O seu provedor do litoral
> www.bignet.com.br
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> 


------------------------------------------------------------------------------
Notice:  This e-mail message, together with any attachments,...{{dropped}}




More information about the R-help mailing list