[R] Monte Carlo p-value (was "question")
Spencer Graves
spencer.graves at pdf.com
Tue Mar 9 04:43:00 CET 2004
What is the standard convention for a Monte Carlo p-value when the
observed outcome is more extreme than all simulations? The example
provided by Cédric Finet produced a Monte Carlo p-value, according to
Kjetil Halvorsen, of "< 2.2e-16 (based on 2000 replicates)". This seems
inappropriate to me: 2000 replicates cannot resolve a p-value much
below 1/2000.
Reading the code, I found that in this case
    PVAL <- sum(tmp$results >= STATISTIC)/B
where STATISTIC is the observed chi-square statistic and "tmp$results"
is a vector of length B of chi-square statistics from Monte Carlo
simulated tables with the same marginals. Thus, PVAL can take only the
B+1 values in seq(0, 1, length=(B+1)).
For the observed table, presumably, PVAL = 0. The function "chisq.test"
apparently returns an object of class "htest", and "print.htest" calls
"format.pval(x$p.value, digits = digits)", for which "format.pval(0, 4)"
is "< 2.2e-16".
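
To make the computation above concrete, here is a minimal sketch in R. It
is not the code inside "chisq.test"; it only mimics the PVAL line quoted
above. The table "obs", the helper "chisq_stat", and the seed are
hypothetical choices for illustration (this is not the matrix from the
thread). "r2dtable" from the stats package is used to draw random tables
with fixed row and column totals.

```r
# Hypothetical 3 x 3 table with a strong association (illustration only)
set.seed(1)
obs <- matrix(c(12,  1,  0,
                 1, 10,  2,
                 0,  2, 11), nrow = 3, byrow = TRUE)
B <- 2000

# Pearson chi-square statistic of a table against fixed expected counts
chisq_stat <- function(tab, expected) sum((tab - expected)^2 / expected)

# Expected counts depend only on the margins, which every simulated
# table shares with the observed one
expected  <- outer(rowSums(obs), colSums(obs)) / sum(obs)
STATISTIC <- chisq_stat(obs, expected)

# B random tables with the same row and column totals as obs
sims    <- r2dtable(B, rowSums(obs), colSums(obs))
results <- vapply(sims, chisq_stat, numeric(1), expected = expected)

# Same form as the line quoted above: always a multiple of 1/B
PVAL <- sum(results >= STATISTIC) / B
PVAL

# How the value would be displayed; the post reports "< 2.2e-16"
# for format.pval(0, 4)
format.pval(PVAL, 4)
```

If no simulated statistic reaches STATISTIC, PVAL is exactly 0, and the
printed value then reflects machine precision rather than the number of
replicates.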
Can someone provide an appropriate reference or sense of the
literature on the appropriate number to report for a Monte Carlo p-value
when the observed is more extreme than all the simulations? A value of
0 or "< 2.2e-16" violates my sense of the logic of this situation. If
the appropriate number is 0.5/B, then the line "PVAL <- sum(tmp$results
>= STATISTIC)/B" could be followed immediately by something like the
following:
if(PVAL==0) PVAL <- 0.5/B
Comments?
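
For comparison, the sketch below tabulates three ways of turning a count
of exceedances into a reported p-value: the raw proportion, the 0.5/B
floor proposed above, and an alternative convention that treats the
observed table as one of B + 1 draws, giving (count + 1)/(B + 1). The
last one is not from this post and is included only as an assumption
about what the literature might suggest; the variable names are
hypothetical.

```r
B     <- 2000
count <- 0:3   # number of simulated statistics >= the observed one

raw      <- count / B                        # can be exactly 0
floored  <- ifelse(raw == 0, 0.5 / B, raw)   # adjustment proposed above
plus_one <- (count + 1) / (B + 1)            # alternative (assumption, not from the post)

data.frame(count, raw, floored, plus_one)
```

Both adjusted versions are strictly positive, so neither would be printed
as "< 2.2e-16"; they differ in whether only the zero case is changed or
every count is shifted.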
Best Wishes,
Spencer Graves
kjetil at entelnet.bo wrote:
>On 8 Mar 2004 at 16:38, cfinet at ens-lyon.fr wrote:
>
>>I cannot get Fisher's exact test to run on the following matrix:
>
>You can consider using chisq.test() with the argument sim=TRUE:
>
>>mat <- matrix(scan(), 7, 12, byrow=TRUE)
>>
>>
>1: 1 3 0 1 2 9 0 0 2 5 8 6
>13: 0 3 3 0 0 0 0 5 0 3 0 0
>25: 0 0 0 0 0 10 0 0 0 0 10 0
>37: 0 2 0 2 6 14 0 0 6 0 10 6
>49: 0 5 0 0 0 7 0 0 0 2 8 0
>61: 0 0 1 9 4 7 2 1 4 2 12 5
>73: 0 6 0 0 0 0 0 0 0 5 1 3
>85:
>Read 84 items
>
>>chisq.test(mat, sim=TRUE)
>>
>>
>
> Pearson's Chi-squared test with simulated p-value (based
> on 2000 replicates)
>
>data: mat
>X-squared = 218.3366, df = NA, p-value = < 2.2e-16
>
>Kjetil Halvorsen
>
>>     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
>>[1,]    1    3    0    1    2    9    0    0    2     5     8     6
>>[2,]    0    3    3    0    0    0    0    5    0     3     0     0
>>[3,]    0    0    0    0    0   10    0    0    0     0    10     0
>>[4,]    0    2    0    2    6   14    0    0    6     0    10     6
>>[5,]    0    5    0    0    0    7    0    0    0     2     8     0
>>[6,]    0    0    1    9    4    7    2    1    4     2    12     5
>>[7,]    0    6    0    0    0    0    0    0    0     5     1     3
>>
>>but I do not understand why it does not work, since I get the
>>following error message:
>>
>>>fisher.test(enfin.matrix)
>>>
>>>
>>Error in fisher.test(enfin.matrix) : Bug in FEXACT: gave negative key
>>
>>thank you for considering my application
>>
>>cédric finet
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide!
>>http://www.R-project.org/posting-guide.html