[R] aucRoc in caret package [SEC=UNCLASSIFIED]

Thu Jun 2 03:24:38 CEST 2011

Please note that predicted1 and predicted2 are two sets of predictions, not predictors. As you can see, the predictions have only two levels: 1 is for hard and 2 is for soft. I need to assess which set is more accurate. I hope this is clear now. Thanks.
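
One way to put the two sets side by side is overall accuracy, computed from the two table() outputs quoted below. This is only a minimal sketch in base R; the names tab1, tab2 and accuracy are made up for illustration, and it assumes 1 stands for hard and 2 for soft, as stated above.

```r
# Confusion counts copied from the two table() outputs in the quoted message.
# Rows are the predicted level (1 or 2), columns the observed class in trainy.
# matrix() fills column by column, so the first two numbers are the "hard" column.
dn <- list(predicted = c("1", "2"), trainy = c("hard", "soft"))
tab1 <- matrix(c(27, 11, 0, 99), nrow = 2, dimnames = dn)
tab2 <- matrix(c(27, 11, 2, 97), nrow = 2, dimnames = dn)

# Overall accuracy: share of cases on the diagonal (1 with hard, 2 with soft).
# "accuracy" is a hypothetical helper, not a caret function.
accuracy <- function(tab) sum(diag(tab)) / sum(tab)

acc1 <- accuracy(tab1)  # 126 correct out of 137
acc2 <- accuracy(tab2)  # 124 correct out of 137
print(c(predicted1 = acc1, predicted2 = acc2))
```

On these counts predicted1 gets 126 of 137 cases right and predicted2 gets 124 of 137.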
Jin

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net] 
Sent: Thursday, 2 June 2011 10:55 AM
To: Li Jin
Cc: R-help at r-project.org
Subject: Re: [R] aucRoc in caret package [SEC=UNCLASSIFIED]

Using AUC for discrete predictor variables with only two levels
doesn't seem very sensible. What are you planning to do with this
measure?
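
To make the point concrete: with a two-level prediction the ROC curve has only one operating point between (0,0) and (1,1), so the trapezoidal area under it reduces to (sensitivity + specificity)/2. What follows is a minimal sketch in base R, using the counts from the tables quoted below and treating hard as the positive class (an assumption). auc_single_point is a hypothetical helper, not a caret function.

```r
# Trapezoidal area under a ROC "curve" made of the three points
# (0,0), (1 - specificity, sensitivity), (1,1).
# Hypothetical helper for illustration; not part of caret.
auc_single_point <- function(tp, fn, fp, tn) {
  sens <- tp / (tp + fn)  # positives (hard) predicted as level 1
  spec <- tn / (tn + fp)  # negatives (soft) predicted as level 2
  (sens + spec) / 2
}

# Counts read off the two quoted tables, with hard as the positive class.
auc1 <- auc_single_point(tp = 27, fn = 11, fp = 0, tn = 99)
auc2 <- auc_single_point(tp = 27, fn = 11, fp = 2, tn = 97)
print(c(predicted1 = auc1, predicted2 = auc2))
```

For predicted2 this gives about 0.8452, in line with the 0.8451621 quoted below; for predicted1 it gives about 0.8553. This hand calculation says nothing about how aucRoc arrives at its own values.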

-- 
David.

On Jun 1, 2011, at 8:47 PM, <Jin.Li at ga.gov.au> wrote:

> Hi all,
> I used the following code and data to get auc values for two sets of  
> predictions:
> library(caret)
>> table(predicted1, trainy)
>   trainy
>    hard soft
>  1   27    0
>  2   11   99
>> aucRoc(roc(predicted1, trainy))
> [1] 0.5
>
>
>> table(predicted2, trainy)
>   trainy
>    hard soft
>  1   27    2
>  2   11   97
>> aucRoc(roc(predicted2, trainy))
> [1] 0.8451621
>
> predicted1:
> 1 1 2 2 2 1 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 1 2  
> 2 2 2 2 1 2 2 2 2 1 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2  
> 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 1 1 1 2 2 1 1 1 2 2 2 2 2 1 1 2 2  
> 2 2 2 2 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>
> predicted2:
> 1 1 2 1 2 1 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 1 1 2  
> 2 2 2 2 1 2 2 2 2 1 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2  
> 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 1 1 1 2 2 1 1 1 2 2 2 2 2 1 1 2 2  
> 2 2 2 2 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>
> trainy:
> hard hard hard soft soft hard hard hard hard soft soft soft soft  
> soft soft hard soft soft soft soft soft soft hard soft soft soft  
> soft soft soft soft soft soft hard soft soft soft soft soft hard  
> soft soft soft soft hard hard soft soft soft hard soft hard soft  
> soft soft soft soft hard soft soft soft soft soft soft soft soft  
> hard soft soft soft soft soft hard soft soft soft soft soft soft  
> soft hard soft soft soft hard hard hard hard hard soft soft hard  
> hard hard soft hard soft soft soft hard hard soft soft soft soft  
> soft hard hard hard hard hard hard hard soft soft soft soft soft  
> soft soft soft soft soft soft soft soft soft soft soft hard soft  
> soft soft soft soft soft soft soft
> Levels: hard soft
>
>> Sys.info()
>                       sysname                       release
>                     "Windows"                          "XP"
>                       version                      nodename
>  "build 2600, Service Pack 3"                    "PC-60772"
>                       machine
>                         "x86"
>
> I would expect predicted1 to be more accurate than predicted2. But
> the auc values show the opposite. I was wondering whether this is a
> bug or whether I have done something wrong. Thanks for your help in advance!
>
> Cheers,
>
> Jin
> ____________________________________
> Jin Li, PhD
> Spatial Modeller/Computational Statistician
> Marine & Coastal Environment
> Geoscience Australia
> GPO Box 378, Canberra, ACT 2601, Australia
>
> Ph: 61 (02) 6249 9899; email: jin.li at ga.gov.au
> _______________________________________
>

David Winsemius, MD
West Hartford, CT