[R] Regression model for predicting ranks of the dependent variable
Frank Harrell
f.harrell at Vanderbilt.Edu
Sun Sep 15 10:52:44 CEST 2013
require(rms)
?orm # ordinal regression model
For a case study see Handouts in
http://biostat.mc.vanderbilt.edu/CourseBios330
Since you have lost the original values, one part of the case study will
not apply: the use of Mean().
Frank
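
A minimal sketch of the suggestion above, on simulated data, since the
poster's dataset is not shown. The predictors x1 and x2, the hidden
score formula, and the reversed-rank variable y are all assumptions made
for illustration. Only orm() and predict() come from rms. Because every
rank is unique, orm() estimates one intercept per distinct value except
the lowest, which is the situation it is designed to handle.

```r
library(rms)  # provides orm()

set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)

# Hypothetical hidden score; in the real data this is never observed
score <- 2 * x1 - x2 + rnorm(n)

# Rank 1 = highest score, as described in the question
d <- data.frame(x1 = x1, x2 = x2, rank = rank(-score))

# Reverse the ranks so that larger y means a higher score; this only
# flips the sign of the coefficients and makes them easier to read
d$y <- n + 1 - d$rank

# Ordinal (proportional odds) model with one level per observation
fit <- orm(y ~ x1 + x2, data = d)
print(fit)

# Linear predictor: a larger value indicates a higher predicted position
lp <- predict(fit, newdata = d, type = "lp")

# Rank agreement between the linear predictor and the observed ordering
rho <- cor(lp, d$y, method = "spearman")
```

The linear predictor orders the rows without assuming equal spacing
between adjacent ranks.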
-------------
I have a dataset which has several predictor variables and a dependent
variable, "score" (which is numeric). The score for each row is
calculated using a formula which uses some of the predictor variables.
However, the "score" values themselves are not given in the dataset;
only their ranks are (1, 2, 3, 4, etc., where rank 1 means that the
row had the highest score, 2 means it had the second highest score,
and so on). So, if the data has 100 rows, the output has ranks
from 1 to 100.
I don't think it would be proper to treat the output column as a numeric
one, since it is an ordinal variable, and the distance (difference in
scores) between ranks 1 and 2 may not be the same as that between ranks
2 and 3. However, most R regression models for ordinal regression are
made for output such as (high, medium, low), where each level of the
output does not necessarily correspond to a unique row. In my case, each
output (rank) corresponds to a unique row.
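
The point about unequal distances can be illustrated with a short
base-R sketch. The scores here are simulated from an exponential
distribution purely as an assumption. Adjacent ranks always differ by
1, while the gaps between the underlying scores vary.

```r
set.seed(2)

# Hypothetical scores, sorted from highest to lowest
score <- sort(rexp(10), decreasing = TRUE)

# Rank 1 = highest score
rank_of_score <- rank(-score)

# Differences between adjacent ranks: constant
rank_gaps <- diff(rank_of_score)

# Differences between adjacent scores: not constant
score_gaps <- -diff(score)

print(rank_gaps)
print(score_gaps)
```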
Could you suggest which models I could use for this problem? Will
treating the output as numeric instead of ordinal be a reasonable
approximation? Or will the usual models for ordinal regression work on
this dataset as well?