[BioC] Minimal group sizes in permutation tests

Robert Gentleman rgentlem at fhcrc.org
Tue Nov 13 17:45:51 CET 2007


Hi,

Claus-Dieter Mayer wrote:
> Hi Benjamin,
> 
> Benjamin Otto wrote:
>> Hey all,
>>
>> 1. Statistical tests usually come with a recommended minimum group size
>> for them to work properly. For a chi-square test, for example, the lower limit
>> was 10 observations in each cell of the table, if I remember correctly.
>> What about permutation tests? Is there some kind of minimum recommendation
>> for group sizes? I can't find any hint on that.
>>   
> The group sizes determine how many different permutations there 
> are, e.g. with 3 samples in each of 2 groups you only have 20 permutations. 
> If you used a 1-sided permutation test in that situation, the 
> smallest possible p-value would be 5% (1 in 20), i.e. you have no chance of 
> ever getting a p-value below the 5% level (for a 2-sided test you 
> wouldn't even be able to get below 10%). In my opinion the number of 
> different permutations should be at least in the hundreds, so for a 
> 2-group comparison I wouldn't use a permutation test for anything less 
> than 5 per group (in which case you have 252 permutations). For other 
> designs you would have to calculate the number of possible permutations 
> to see whether it makes sense.
> Apart from that I see few other constraints on using a permutation 
> test, as long as you are sure that under the null hypothesis you are 
> testing the variables are "i.i.d." (= independent and identically distributed).
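
The counting argument above can be checked with a short sketch. It is written in Python purely for illustration; the function names are hypothetical and not taken from any package mentioned in this thread. The 2-sided bound of 2 / count assumes equal group sizes and a symmetric statistic, so that swapping the two group labels gives the mirror-image value.

```python
from math import comb


def two_group_permutation_count(n1, n2):
    """Number of distinct ways to split n1 + n2 samples into groups of size n1 and n2."""
    return comb(n1 + n2, n1)


def smallest_p_values(n1, n2):
    """Smallest attainable 1-sided and 2-sided p-values under full enumeration.

    Assumption: the 2-sided bound 2 / count holds for equal group sizes and a
    symmetric statistic (swapping the labels mirrors the statistic).
    """
    count = two_group_permutation_count(n1, n2)
    return 1 / count, min(1.0, 2 / count)


for n in (3, 4, 5):
    count = two_group_permutation_count(n, n)
    one_sided, two_sided = smallest_p_values(n, n)
    print(n, count, round(one_sided, 4), round(two_sided, 4))
```

For 3 per group this gives 20 permutations with bounds of 0.05 and 0.1, and for 5 per group it gives 252 permutations, matching the numbers quoted above.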

   Unless you have more possible permutations than you can compute, 
please make sure that you just enumerate all permutations, compute the 
test statistic for each and use that for your reference distribution. It 
makes no sense to sample from this distribution by generating random 
permutations if the number of permutations is small.
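
A minimal sketch of such a full enumeration for a 2-group comparison follows. The difference in group means as test statistic, the 1-sided "larger" alternative, and the function name `exact_permutation_test` are all assumptions made for illustration, not something prescribed in this thread.

```python
from itertools import combinations


def exact_permutation_test(x, y):
    """1-sided exact permutation test on mean(x) - mean(y).

    Enumerates every way of assigning len(x) of the pooled samples to the
    first group and returns (p_value, number_of_permutations).
    """
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(pooled) - n_x
    total = sum(pooled)

    def mean_difference(indices):
        sum_x = sum(pooled[i] for i in indices)
        return sum_x / n_x - (total - sum_x) / n_y

    # The observed labelling is the first n_x indices of the pooled data.
    observed = mean_difference(range(n_x))
    stats = [mean_difference(idx)
             for idx in combinations(range(len(pooled)), n_x)]

    # Count labellings at least as extreme as the observed one; the small
    # tolerance guards against floating-point ties being missed.
    n_extreme = sum(1 for s in stats if s >= observed - 1e-12)
    return n_extreme / len(stats), len(stats)


p_value, n_perm = exact_permutation_test([5.1, 6.2, 7.3], [1.0, 2.1, 3.2])
print(p_value, n_perm)
```

Because the observed labelling is itself one of the enumerated permutations, the p-value can never be zero; in this example the two groups are perfectly separated, so only 1 of the 20 labellings is as extreme as the observed one.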


>> 2. As far as I understand, the permutation p-value is given by the quantile
>> describing the position of the native p-value in the permutation p-value
>> distribution. So for 100 permutations and 5 values smaller than the native
>> one the new p-value would be 0.05. What happens when the original p-value is
>> the absolute minimum? Is there such a thing as a p-value equal to zero?
>>   
> The classical definition of the p-value is the probability of 
> observing an outcome as extreme as, or more extreme than, the observed one. 
> So if the observed value is the smallest of the 100 permutations, your 
> p-value would be 1%.
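
The "as extreme or more extreme" definition can be sketched as below, here for a statistic where smaller means more extreme (such as the native p-value in the question). Both function names are hypothetical. The second function uses the common (b + 1) / (B + 1) convention for randomly sampled permutations, which is an addition for illustration and not something stated in this thread.

```python
def lower_tail_p_value(observed, reference):
    """Proportion of reference values as small as or smaller than observed.

    Assumes `reference` is the full permutation distribution and therefore
    contains the observed value itself, so the result is at least 1/len.
    """
    n_extreme = sum(1 for value in reference if value <= observed)
    return n_extreme / len(reference)


def monte_carlo_p_value(observed, random_stats):
    """(b + 1) / (B + 1) for B randomly drawn permutations.

    The observed labelling is counted as one extra permutation, which keeps
    the estimate strictly above zero.
    """
    b = sum(1 for value in random_stats if value <= observed)
    return (b + 1) / (len(random_stats) + 1)


# Observed value is the smallest of 100 permutation values.
print(lower_tail_p_value(1, list(range(1, 101))))
# Observed value is smaller than all 99 randomly drawn values.
print(monte_carlo_p_value(1, list(range(2, 101))))
```

Both calls give 0.01, so under either convention the smallest of 100 values gets a p-value of 1% and never 0.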
>> 3. Given a design of 3x3 samples (20 permutations), will the test return
>> reasonable values? Doesn't look like it to me.
>>   
> For 2x3 samples (2 groups of 3) it would be 20 permutations, but for 3x3 the 
> number will be bigger.
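
For designs with more than two groups, the number of distinct group assignments is a multinomial coefficient. The sketch below assumes that "3x3" means three groups of three samples each; the function name is hypothetical.

```python
from math import factorial


def group_assignment_count(group_sizes):
    """Number of distinct ways to assign samples to labelled groups.

    Computes n! / (n1! * n2! * ... * nk!) with exact integer arithmetic.
    """
    count = factorial(sum(group_sizes))
    for size in group_sizes:
        count //= factorial(size)
    return count


print(group_assignment_count([3, 3]))     # 2 groups of 3
print(group_assignment_count([3, 3, 3]))  # 3 groups of 3
```

This gives 20 for two groups of three and 1680 for three groups of three. If the test statistic does not depend on which group carries which label, several assignments yield the same value, so the number of distinct statistic values can be smaller than this count.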
> 
> Claus
>> Best regards,
>>
>> Benjamin Otto
>>
>>
>> ======================================
>> Benjamin Otto
>> University Hospital Hamburg-Eppendorf
>> Institute For Clinical Chemistry
>> Martinistr. 52
>> D-20246 Hamburg
>>
>> Tel.: +49 40 42803 1908
>> Fax.: +49 40 42803 4971
>> ======================================
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


