John Haart
another83 at me.com
Fri Oct 1 13:24:02 CEST 2010
Frank,
That's great, thanks for the advice. I appreciate that the Brier score, AUC, etc. are better methods of validation and discrimination, but when it comes to predictions for new data:
> d <- data.frame(x1=c(.1,.5),x2=c(.5,.15))
> predict(f, d, type="fitted.ind")
>
>      y=good  y=better    y=best
> 1 0.3199710 0.3560355 0.3239935
> 2 0.4153257 0.3437086 0.2409657
>
> predict mean(y) using codes 1,2,3
>
>
>> predict(f, d, type='mean', codes=TRUE)
>
>        1        2
> 2.004022 1.825640
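
As a rough check on what the predicted mean represents, the sketch below recomputes it in base R as the probability-weighted average of the codes 1, 2, 3. The matrix `p` is copied by hand from the output quoted above; the name `p` and the hand-typed values are assumptions for illustration, not objects from the fitted model `f`.

```r
# Individual category probabilities, typed in from the quoted output
# (assumption: these stand in for predict(f, d, type="fitted.ind"))
p <- rbind("1" = c(good = 0.3199710, better = 0.3560355, best = 0.3239935),
           "2" = c(good = 0.4153257, better = 0.3437086, best = 0.2409657))

# Integer codes for the ordered categories good < better < best
codes <- 1:3

# Mean of y on the code scale: sum over categories of code * probability
mean_y <- as.vector(p %*% codes)
names(mean_y) <- rownames(p)
print(round(mean_y, 4))
```

The values agree with the quoted means to the precision shown, so a mean near 2 here reflects probability spread fairly evenly across the three categories, not strong evidence for "better".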
How do I use this information to assign each (x1, x2) observation to a category on the response scale (good, better, best)?
Thanks
John
On 1 Oct 2010, at 12:14, Frank Harrell wrote:
John,
Don't conclude that one category is the most probable when its probability
of being equaled or exceeded is a maximum. The first category would always
be the winner if that were the case.
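
A small base R sketch of this point, using hand-typed probabilities (the matrix `p` and the column labels are assumptions for illustration): converting individual category probabilities into probabilities of being equaled or exceeded always puts the maximum on the first category, whereas the individual probabilities can peak anywhere.

```r
# Individual category probabilities for two observations (illustrative values)
p <- rbind(c(0.3199710, 0.3560355, 0.3239935),
           c(0.4153257, 0.3437086, 0.2409657))
colnames(p) <- c("good", "better", "best")

# P(Y >= j): sum the individual probabilities from category j upward
exceed <- t(apply(p, 1, function(r) rev(cumsum(rev(r)))))
colnames(exceed) <- paste0("y>=", colnames(p))

# Choosing the largest exceedance probability always picks the first category,
# because P(Y >= lowest category) is 1
winner_exceed <- colnames(exceed)[max.col(exceed, ties.method = "first")]

# The largest individual probability is a different question
winner_ind <- colnames(p)[max.col(p, ties.method = "first")]

print(round(exceed, 4))
print(winner_exceed)
print(winner_ind)
```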
When you say y=best remember that you are dealing with a probability model.
Nothing is forcing you to classify an observation, and unless the category's
probability is high, this may be dangerous. You might do well to consider a
smoother approach, such as using the generalized ROC area (C-index) or its
related rank correlation measure Dxy. Odds ratios are another option.
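
If a category label is still wanted, one hedged way to respect the caution above is to report the most probable category together with its probability, and to assign a label only when that probability clears a cutoff. The sketch below is base R only; the cutoff of 0.8 and the names `p`, `threshold`, and `assigned` are hypothetical choices for illustration, not recommendations from this thread.

```r
# Individual category probabilities, typed in by hand (illustrative values)
p <- rbind("1" = c(good = 0.3199710, better = 0.3560355, best = 0.3239935),
           "2" = c(good = 0.4153257, better = 0.3437086, best = 0.2409657))

threshold <- 0.8  # hypothetical cutoff; choose it from the costs of misclassification

# Index and value of the most probable category in each row
top <- max.col(p, ties.method = "first")
top_prob <- p[cbind(seq_len(nrow(p)), top)]

# Assign a label only when the top probability is high enough, otherwise NA
assigned <- ifelse(top_prob >= threshold, colnames(p)[top], NA_character_)

result <- data.frame(most_probable = colnames(p)[top],
                     prob = round(top_prob, 3),
                     assigned = assigned)
print(result)
```

With these probabilities neither observation is assigned, since no category's probability comes close to the cutoff.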
Frank
Frank Harrell
Department of Biostatistics, Vanderbilt University
