[R] Discrepancy with p.value from t.test
Peter Langfelder
peter.langfelder at gmail.com
Wed Nov 2 00:32:46 CET 2011
On Tue, Nov 1, 2011 at 12:40 PM, Jonathan Thayn <jthayn at ilstu.edu> wrote:
> Sometimes the p.value returned by t.test() is the same that I calculate using pt() and sometimes it's not. I don't understand the difference. I'm sure there is a simple explanation but I haven't been able to find it, even after looking at the code for t.test.default. I apologize if this is a basic and obvious question. For example:
>
>> data(sleep)
>> t.test(extra~group,data=sleep,var.equal=T)
>
> # the p.value returned is 0.07939
>
>> 2*pt(-1.8608,18) # using the t.statistic and the df returned above
> [1] 0.0791887
>
> These p.values are the same. However, they are different when I use a different dataset:
>
>> data(beavers)
>> b1 <- beaver1$temp
>> b2 <- beaver2$temp
>> t.test(b1,b2,var.equal=T)
>
> # the p.value returned is 2.2e-16
>
>> 2*pt(-15.9366,212) # using the t.statistic and the df returned above
> [1] 4.10686e-38
>
>
If you read the output of t.test carefully, you will find something like
p-value < 2.2e-16
not
p-value = 2.2e-16
so the results are not inconsistent. Not sure why t.test is coded that
way, perhaps the p-value calculation is not very reliable below
roughly 2e-16. This issue could also come up if the function doesn't
use lower/upper tail of the distribution function as needed and then
must subtract the calculated results from 1 to obtain the returned
value.
Here's an example:
> x = rnorm(100)
> y = x<0
> t.test(x~y)
Welch Two Sample t-test
data: x by y
t = 12.9463, df = 97.424, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
1.397253 1.903200
sample estimates:
mean in group FALSE mean in group TRUE
0.7596083 -0.8906181
Now do a naive pt:
> pt(12.9463, df = 97.424)
[1] 1
my desired p-value is 1-pt(12.9463, df = 97.424) but that's zero. Of
course, I can get the p-value in a more intelligent way,
> pt(12.9463, df = 97.424, lower.tail = FALSE)
[1] 3.394337e-23
Peter
More information about the R-help
mailing list