[R] Ordinal data - Regression Trees & Proportional Odds

John Fieberg John.Fieberg at dnr.state.mn.us
Wed May 28 23:26:03 CEST 2003

I have a data set with an ordinal response that takes one of 10
categories. I am considering using polr (from the MASS package) to fit a
cumulative logit model. I previously fit the model in SAS using proc
logistic, which provides a test of the proportional odds assumption; that
test rejected the assumption for these data (p < 0.001). Are there simple
diagnostic plots that can be used to examine the validity of this
assumption and possibly help with modifying the model as appropriate?
Any references or examples of useful R code for addressing the
proportional odds assumption would be much appreciated!
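
As an illustration of the kind of informal check being asked about, here
is a minimal sketch on simulated data (not the data described above). It
assumes a single hypothetical covariate x and a simulated 10-category
response y. The idea is to fit one binary logistic regression per cut
point, P(Y > k), and compare the slopes with the single slope from polr;
under proportional odds the slopes should be roughly constant across cut
points. This is an eyeball diagnostic, not a formal test.

```r
library(MASS)  # provides polr()

set.seed(1)
n <- 500
x <- rnorm(n)  # hypothetical covariate (assumption, not from the post)

# Simulate a 10-category ordinal response from a latent logistic variable,
# so proportional odds holds by construction (true slope = 1)
latent <- 1.0 * x + rlogis(n)
cuts <- quantile(latent, probs = seq(0.1, 0.9, by = 0.1))
y <- cut(latent, breaks = c(-Inf, cuts, Inf), labels = 1:10,
         ordered_result = TRUE)
dat <- data.frame(y = y, x = x)

# Cumulative logit (proportional odds) fit
fit <- polr(y ~ x, data = dat)

# One binary logistic regression per cut point: logit P(Y > k), k = 1..9
slopes <- sapply(1:9, function(k) {
  coef(glm(I(as.integer(y) > k) ~ x, family = binomial, data = dat))["x"]
})
names(slopes) <- paste0("Y>", 1:9)

print(round(slopes, 2))
print(coef(fit))

# Plot the cut-point-specific slopes against the common polr slope
plot(1:9, slopes, type = "b", xlab = "Cut point k",
     ylab = "Slope for x in logit P(Y > k)")
abline(h = coef(fit)["x"], lty = 2)
```

With real data, a clear trend or large scatter in the slopes across cut
points would point to the covariates for which the assumption is most
doubtful.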

I also used a regression tree approach to explore this data set. In
doing so, I treated the response as numeric and fit the tree with the
rpart package. I am rather new to regression trees and wondered about the
validity of this approach. I used cross-validation to prune the tree, but
plots of the response clearly indicate that the data are non-normal and
do not have equal variance (the responses are concentrated in the larger
categories, values of 8-10). I have seen some people suggest that the
tree approach is essentially non-parametric, but I have also seen
references suggesting examination of residual plots and possible
transformations of the response to ensure homogeneity of variance. For
this data set it will be difficult to find an appropriate
transformation, given the large number of responses near 10: because the
data are constrained to be less than or equal to 10, the residual plots
look strange.
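
For concreteness, the tree workflow described above might look like the
following minimal sketch. The data are simulated, and the covariates x1
and x2 are hypothetical; the response is built to pile up near 10 as in
the data described. The sketch fits the tree with the response treated as
numeric (method = "anova"), prunes at the complexity parameter with the
smallest cross-validated error, and, for comparison, fits a
classification tree that treats the 10 categories as unordered classes
(which avoids the numeric-scale assumption but ignores the ordering).

```r
library(rpart)

set.seed(2)
n <- 400
x1 <- runif(n)  # hypothetical covariates (assumptions, not from the post)
x2 <- runif(n)

# Hypothetical response on a 1-10 scale, truncated so values pile up near 10
y <- pmin(10, pmax(1, round(7 + 3 * (x1 > 0.5) + rnorm(n, sd = 1.5))))
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Regression tree: response treated as numeric
tree_num <- rpart(y ~ x1 + x2, data = dat, method = "anova")

# Prune at the cp value with the smallest cross-validated error (xerror)
cp_tab <- tree_num$cptable
best_cp <- cp_tab[which.min(cp_tab[, "xerror"]), "CP"]
pruned <- prune(tree_num, cp = best_cp)

# Alternative: classification tree on the categories (ordering is ignored)
tree_cls <- rpart(factor(y) ~ x1 + x2, data = dat, method = "class")

# Residuals versus fitted values for the pruned regression tree
res <- residuals(pruned)
plot(predict(pruned), res, xlab = "Fitted node mean", ylab = "Residual")
abline(h = 0, lty = 2)
```

In the residual plot, the upper bound at 10 shows up as a hard ceiling on
the residuals in nodes with high fitted means, which is the "strange"
pattern mentioned above.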

Any help is much appreciated!

John Fieberg, Ph.D.
Wildlife Biometrician, Minnesota DNR
5463-C W. Broadway
Forest Lake, MN 55434
