[R] data analysis

Tue Aug 7 05:20:35 CEST 2007

On Tue, 7 Aug 2007, ted.harding at nessie.mcc.ac.uk wrote:

> On 06-Aug-07 19:26:59, lamack lamack wrote:
>> Dear all, I have a factorial design where the
>> response is an ordered categorical response.
>>
>> treatment (two levels: 1 and 2)
>> time (four levels: 30, 60,90 and 120)
>> ordered response (0,1,2,3)
>>
>> could someone suggest a correct analysis or some references?
>
> For your data below, I would be inclined to start from here,
> which gives the counts for the different responses:
>
>
>               Response
>         --------------------
> Trt Time   0    1    2    3
> --------+--------------------+----
> Tr1  30 |       1         3  |  4
>     60 |       2    1    1  |  4
>     90 |       3    1       |  4
>    120 |       3    1       |  4
> --------+--------------------+---
> Tr2  30 |       2         2  |  4
>     60 |       3    1       |  4
>     90 |       3         1  |  4
>    120 |  1    2    1       |  4
> =================================
> Tr1     |  0    9    3    4  | 16
> --------+--------------------+---
> Tr2     |  1   10    2    3  | 16
> =================================
>
> This suggests that, if anything is happening there at all,
> it is a tendency for high response to occur at shorter times,
> and low response at longer times, with little if any difference
> between the treatments.
>
> To approach this formally, I would consider adopting a
> "re-randomisation" approach, re-allocating the outcomes at
> random in such a way as to preserve the marginal totals,
> and evaluating a statistic T, defined in terms of the counts
> and such as to be sensitive to the kind of effect you seek.
>
> Then situate the value of T obtained from the above counts
> within the distribution of T obtained by this re-randomisation.
>
> There must be, somewhere in R, routines which can perform this
> kind of constrained re-randomisation,but I'm not sufficiently
> familiar with that area of R to know for sure about them.

?r2dtable  for 2D tables.  But there is a classic way to do this without 
using randomization and holding the time*treatment marginals fixed: 
log-linear models.

> I hope other readers who know about this area in R can come
> up with suggestions!

However, that approach is not taking into account that the response is 
ordered. First make sure the variables are factors: here in data frame 
'dat'.

dat <- read.table("...", header=TRUE, colClasses="factor")
library(MASS)
summary(polr(response ~ time*treatment, data = dat))

suggests there is nothing very significant here, and dropping the 
interaction

> summary(polr(response ~ time+treatment, data = dat))

Re-fitting to get Hessian

Call:
polr(formula = response ~ time + treatment, data = dat)

Coefficients:
                 Value Std. Error   t value
time60     -1.7030709  1.0323027 -1.649779
time90     -2.1833059  1.0959290 -1.992196
time120    -2.7900588  1.1703586 -2.383935
treatment2 -0.8168075  0.7663541 -1.065836

shows a marginal effect of time:

> stepAIC(polr(response ~ time*treatment, data = dat))

selects a model with just 'time' as an explanatory variable.

> anova(polr(response ~ time, dat), polr(response ~ 1, dat))
Likelihood ratio tests of ordinal regression models

Response: response
   Model Resid. df Resid. Dev   Test    Df LR stat.    Pr(Chi)
1     1        29   66.58130
2  time        26   59.68091 1 vs 2     3 6.900383 0.07514162

again suggests that the effect of time is marginal.

References: obviously this is covered in MASS (see the R FAQ).

>
> best wishes,
> Ted.
>
>> subject treatment  time   response
>> 1       1            30       3
>> 2       1            30       3
>> 3       1            30       1
>> 4       1            30       3
>> 5       1            60       3
>> 6       1            60       1
>> 7       1            60       1
>> 8       1            60       2
>> 9       1            90       2
>> 10      1            90       1
>> 11      1            90       1
>> 12      1            90       1
>> 13      1           120       2
>> 14      1           120       1
>> 15      1           120       1
>> 16      1           120       1
>> 17      2            30       3
>> 18      2            30       3
>> 19      2            30       1
>> 20      2            30       1
>> 21      2            60       1
>> 22      2            60       2
>> 23      2            60       1
>> 24      2            60       1
>> 25      2            90       1
>> 26      2            90       1
>> 27      2            90       1
>> 28      2            90       3
>> 29      2           120       1
>> 30      2           120       2
>> 31      2           120       0
>> 32      2           120       1

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595