[R] Paired t-tests

Sun Aug 15 21:48:00 CEST 2010

On Aug 15, 2010, at 3:31 PM, Peter Dalgaard wrote:

> Marc Schwartz wrote:
>> On Aug 15, 2010, at 9:05 AM, R Help wrote:
>>
>>> Hello List,
>>>
>>> I'm trying to do a paired t-test, and I'm wondering if it's consistent
>>> with equations.  I have a dataset that has a response and two
>>> treatments (here's an example):
>>>
>>>  ID trt order          resp
>>> 17  1   0     1  0.0037513592
>>> 18  2   0     1  0.0118723051
>>> 19  4   0     1  0.0002610251
>>> 20  5   0     1 -0.0077951450
>>> 21  6   0     1  0.0022339952
>>> 22  7   0     2  0.0235195453
>>>
>>> The subjects were randomized and assigned to receive either the
>>> treatment or the placebo first, then the other.  I know I'll
>>> eventually have to move on to a GLM or something that incorporates the
>>> order, but for now I wanted to start with a simple t.test.  My problem
>>> is that, if I get the responses into two vectors x and y (sorted by
>>> ID) and do a t.test, and then compare that to a formula t.test, they
>>> aren't the same.
>>>
>>>> t.test(x,y,paired=TRUE)
>>> 	Paired t-test
>>>
>>> data:  x and y
>>> t = -0.3492, df = 15, p-value = 0.7318
>>> alternative hypothesis: true difference in means is not equal to 0
>>> 95 percent confidence interval:
>>> -0.010446921  0.007505966
>>> sample estimates:
>>> mean of the differences
>>>          -0.001470477
>>>
>>>> t.test(resp~trt,data=dat1[[3]],paired=TRUE)

Since neither resp nor trt would be in dat1[[3]], wouldn't the fact that
no error was reported imply either that dat1 had been attached (and we
were not informed of that prior attach()-ment) or that resp and trt
are also object names besides being column names inside dat1?
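
To illustrate the lookup I have in mind, here is a minimal sketch. It assumes the formula method builds its data through model.frame(), and every object name below (other, with_col, the toy values) is made up for illustration, not taken from Sam's session. Variables in a formula are searched for in `data` first and then in the environment of the formula, so a missing column does not necessarily produce an error:

```r
# Hypothetical objects sitting in the workspace, not columns of any data frame
resp <- c(0.8, 1.1, 0.9, 1.4)
trt  <- factor(c(0, 0, 1, 1))

# A data frame that contains neither 'resp' nor 'trt'
other <- data.frame(z = 1:4)

# No error: both variables are picked up from the formula's environment
mf <- model.frame(resp ~ trt, data = other)
names(mf)

# A column of the same name inside 'data' takes precedence over the workspace copy
with_col <- data.frame(resp = c(10, 20, 30, 40))
mf2 <- model.frame(resp ~ trt, data = with_col)
mf2$resp
```

If something like this happened here, the formula call may have been using different values than the x and y vectors.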

>>> 	Paired t-test
>>>
>>> data:  resp by trt
>>> t = -0.3182, df = 15, p-value = 0.7547
>>> alternative hypothesis: true difference in means is not equal to 0
>>> 95 percent confidence interval:
>>> -0.007096678  0.005253173
>>> sample estimates:
>>> mean of the differences
>>>         -0.0009217521
>>>
>>> What I'm assuming is that the equation isn't retaining the inherent
>>> order of the dataset, so the pairing isn't matching up (even though
>>> the dataset is ordered by ID).  Is there a way to make the t.test
>>> retain the correct ordering?
>>>
>>> Thanks,
>>> Sam
>>
>>
>> See this thread from just 2 days ago:
>>
>>  https://stat.ethz.ch/pipermail/r-help/2010-August/249068.html
>>
>> perhaps focusing on Thomas' reply, which is the next post in the thread.
>>
>> Bottom line, don't use the formula method for a paired t test.
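
One way to follow that advice is to build the two vectors with an explicit match on ID, so the pairing never depends on row order. This is only a sketch on invented data: the column names ID, trt and resp mirror the example above, but the values, the object names and the ".trt0"/".trt1" suffixes are my own assumptions.

```r
# Hypothetical long-format data, deliberately not sorted by ID
dat <- data.frame(
  ID   = c(3, 1, 2, 2, 3, 1),
  trt  = c(0, 1, 0, 1, 1, 0),
  resp = c(0.30, 0.12, 0.20, 0.25, 0.28, 0.10)
)

# Split by treatment, then join on ID so each row holds one subject's pair
arm0 <- dat[dat$trt == 0, c("ID", "resp")]
arm1 <- dat[dat$trt == 1, c("ID", "resp")]
paired_wide <- merge(arm0, arm1, by = "ID", suffixes = c(".trt0", ".trt1"))

# The default (vector) method, with the pairing guaranteed by the merge
res <- t.test(paired_wide$resp.trt0, paired_wide$resp.trt1, paired = TRUE)
res$estimate
```

A subject missing one of the two measurements drops out of the merge, which is usually what one wants for a paired analysis.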
>
> Yes. I'm not sure the same problem is afoot here, though. In particular,
> I'm puzzled by the fact that there are 15DF in both cases, but different
> average difference. This kind of suggests to me that maybe the x and y
> are not computed correctly. (If only the ordering was scrambled, the
> average difference should be the same, but the variance typically
> inflated.)
>
> -- 
> Peter Dalgaard
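
Peter's point about scrambled ordering can be checked with a small simulation. This is a sketch on made-up numbers (16 subjects to match the 15 DF, everything else assumed): permuting one vector leaves the mean of the differences unchanged, because the mean difference is just mean(x) - mean(y), while the spread of the differences grows once the subject effect no longer cancels.

```r
set.seed(1)
n <- 16                              # 16 pairs, hence 15 degrees of freedom

subject <- rnorm(n, sd = 1)          # hypothetical per-subject effect
x <- subject + rnorm(n, sd = 0.1)    # first measurement
y <- subject + rnorm(n, sd = 0.1)    # second measurement

y_scrambled <- sample(y)             # same values, pairing destroyed

d_ok  <- x - y
d_bad <- x - y_scrambled

# Mean difference is the same either way (up to floating point)
c(mean(d_ok), mean(d_bad))

# Standard deviation of the differences is inflated by the scrambling
c(sd(d_ok), sd(d_bad))
```

So a changed mean difference with unchanged DF, as in the two outputs above, suggests the two calls were not fed the same numbers.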
-- 

David Winsemius, MD
West Hartford, CT