[R] how to perform multiple comparison?

David Winsemius dwinsemius at comcast.net
Fri May 20 03:48:26 CEST 2016


> On May 19, 2016, at 5:19 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
> 
> Hi laomeng_3,
> Have a look at the padjust function (stats).
> 
> Jim
> 
> 
> On Fri, May 20, 2016 at 1:56 AM, laomeng_3 <laomeng_3 at 163.com> wrote:
>> Hi all:
>> For ANOVA, we can perform multiple comparisons via TukeyHSD.
>> But for a chi-square test on a frequency table, how can multiple comparisons be performed?
>> 
>> For example, suppose I want to compare the proportions in 3 samples (the data have 3 rows, one per sample, and 2 columns giving the positive and negative counts respectively).
>> 
>> 
>> dat<-matrix(c(6,30,8,23,14,3),nrow=3)
>> dat
>>      [,1] [,2]
>> [1,]    6   23
>> [2,]   30   14
>> [3,]    8    3
>> 
>> 
>> 
>> chisq.test(dat)
>> 
>>       Pearson's Chi-squared test
>> 
>> data:  dat
>> X-squared = 17.9066, df = 2, p-value = 0.0001293
>> 
>> 
>> The result shows that the difference between the 3 samples is significant. But if I want to perform multiple comparisons to find out which pairs of samples are significantly different, which function should be used?
>> 

It appears your question is which row(s) contribute most to the overall test of independence. The object returned by `chisq.test(.)` (which holds more than its print method shows) has a component named residuals. (Read the help page: ?chisq.test)

x2 <- chisq.test(dat)
x2$residuals
           [,1]       [,2]
[1,] -2.3580463  2.4731398
[2,]  1.4481733 -1.5188569
[3,]  0.9323855 -0.9778942
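
As a minimal sketch of where those numbers come from: the residuals component holds Pearson residuals, (observed - expected) / sqrt(expected), with expected counts taken from the row and column totals. The names `expected` and `pearson` below are illustrative only.

```r
# Same table as in the original question
dat <- matrix(c(6, 30, 8, 23, 14, 3), nrow = 3)

# Expected counts under independence: (row total * column total) / grand total
expected <- outer(rowSums(dat), colSums(dat)) / sum(dat)

# Pearson residuals, computed by hand
pearson <- (dat - expected) / sqrt(expected)

# Compare with what chisq.test() stores in its result
x2 <- chisq.test(dat)
all.equal(pearson, x2$residuals)

# The squared residuals add up to the X-squared statistic
sum(pearson^2)
```

Because the squared residuals sum to the overall statistic, each cell's (or row's) share shows how much it contributes to the test result.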



The row sums of the squared residuals should be approximately distributed as chi-squared statistics with one degree of freedom each, but since testing every row separately inflates the chance of a type I error, it would be sensible to adjust the p-values using the function that Jim Lemon misspelled, p.adjust (which defaults to Holm's method):

rowSums(x2$residuals^2)
[1] 11.676803  4.404132  1.825620

p.adjust(1 - pchisq(rowSums(x2$residuals^2), 1))
[1] 0.001898526 0.071703921 0.176645786
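
The same adjustment can be sketched with the correction method spelled out rather than left to the default. This assumes the one-degree-of-freedom approximation described above; the names `row_stat`, `p_raw`, `p_holm` and `p_bonf` are illustrative, and the Bonferroni line is included only for comparison.

```r
dat <- matrix(c(6, 30, 8, 23, 14, 3), nrow = 3)
x2 <- chisq.test(dat)

# Per-row contribution to the overall statistic
row_stat <- rowSums(x2$residuals^2)

# Upper-tail probability; lower.tail = FALSE avoids computing 1 - pchisq(...)
p_raw <- pchisq(row_stat, df = 1, lower.tail = FALSE)

# Holm is what p.adjust() uses when no method is given
p_holm <- p.adjust(p_raw, method = "holm")

# Bonferroni is never less conservative than Holm
p_bonf <- p.adjust(p_raw, method = "bonferroni")

cbind(row_stat, p_raw, p_holm, p_bonf)
```

Naming the method makes it clear in the script which correction was applied.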

So row 1 is the only group that is "significantly different at the conventional level" from what would be expected given the pooled sample. I also seem to remember that there is a function named CrossTable (in a package whose name I'm forgetting) that will deliver a SAS-style tabulation of row and column chi-squared statistics.
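
If the aim is literally to compare each pair of samples, one possible approach is to test every pair of rows as its own 2 x 2 table and then adjust the resulting p-values. The sketch below is an assumption about what might suit the question, not the residual-based method described above: it swaps in Fisher's exact test (chosen here because some counts are small) and Holm's correction. The names `pairs`, `p_raw` and `p_adj` are illustrative.

```r
dat <- matrix(c(6, 30, 8, 23, 14, 3), nrow = 3)

# All pairs of row indices: one column per pair
pairs <- combn(nrow(dat), 2)

# Fisher's exact test on the 2 x 2 sub-table formed by each pair of rows
p_raw <- apply(pairs, 2, function(idx) fisher.test(dat[idx, ])$p.value)
names(p_raw) <- apply(pairs, 2, paste, collapse = " vs ")

# Adjust for the number of pairwise tests
p_adj <- p.adjust(p_raw, method = "holm")
p_adj
```

The result is a named vector with one adjusted p-value per pair of rows.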

-- 
David.

>> 
>> Many thanks for your help.
>> 
>> My best
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA


