[R] How to compare areas under ROC curves calculated with ROCR package
Frank Samuelson
expiregmane0306.m.cudgle at neverbox.com
Thu Mar 23 15:54:02 CET 2006
The seROC routine you included is a very good approximation to the
standard error of the Mann-Whitney-Wilcoxon/Area under the ROC curve
statistic. It is derived from negative exponential models, but works
very well in general (see Hanley and McNeil, Radiology,
1982, v. 143, p. 29).
A more general estimator of the variance is given by Campbell,
Douglas and Bailey (Proc. Computers in Cardiology, 1988, p. 267).
I've implemented that in the R code included below. It is not an unbiased
estimator, but it is very close.
The cROC function is probably not what you want, however.
It assumes that the data from the two different area measures
are independent. You said your measures are "from the same dataset."
Your different AUC measures will be highly correlated.
A number of methods exist for dealing with correlated ROC curves.
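
To see why the independence assumption matters, here is a minimal sketch
of the standard error of the difference as the quoted cROC computes it.
The helper name se_diff and the two standard errors are made up for
illustration; r is the correlation between the two AUC estimates, and
r = 0 corresponds to treating them as independent.

```r
# Standard error of the difference of two AUC estimates whose
# correlation is r (same expression as `sed` in the quoted cROC).
se_diff <- function(se1, se2, r) {
  sqrt(se1^2 + se2^2 - 2 * r * se1 * se2)
}

# Hypothetical standard errors, for illustration only.
se1 <- 0.04
se2 <- 0.05

se_indep <- se_diff(se1, se2, r = 0)    # AUCs treated as independent
se_corr  <- se_diff(se1, se2, r = 0.8)  # strongly correlated AUCs

# With positive correlation the difference is estimated more precisely,
# so assuming independence overstates its standard error.
print(c(independent = se_indep, correlated = se_corr))
```

With positively correlated AUCs, plugging in r = 0 makes the z statistic
too small, so a real difference is easier to miss.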
If you are interested in performing hypothesis testing on the difference
in AUC of two parameters, I would suggest a permutation test.
Permuting the ranks of the data between parameters is
simple and works well.
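
One way such a test might look is sketched below. This is my reading of
"permuting the ranks between parameters", not a definitive
implementation: each parameter is converted to ranks, and for every
subject the two ranks are swapped with probability 1/2. The function
names auc_rank and perm_test_auc, the argument is_pos (a logical vector
marking the positive cases) and the default n_perm are all assumptions
made for this sketch; none of them come from ROCR.

```r
# Empirical AUC from the rank sum of the positives (ties get mid-ranks).
auc_rank <- function(score, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(rank(score)[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Paired permutation test for the difference in AUC of two parameters
# measured on the same subjects (hypothetical helper).
perm_test_auc <- function(score1, score2, is_pos, n_perm = 2000) {
  # Put both parameters on a common scale by ranking each one.
  r1 <- rank(score1)
  r2 <- rank(score2)
  observed <- auc_rank(r1, is_pos) - auc_rank(r2, is_pos)
  n <- length(is_pos)
  perm_diff <- replicate(n_perm, {
    swap <- runif(n) < 0.5        # per-subject coin flip
    p1 <- ifelse(swap, r2, r1)    # swap the two ranks where swap is TRUE
    p2 <- ifelse(swap, r1, r2)
    auc_rank(p1, is_pos) - auc_rank(p2, is_pos)
  })
  # Two-sided p-value; the observed labelling counts as one permutation.
  # The small tolerance guards against floating-point ties.
  p_value <- (1 + sum(abs(perm_diff) >= abs(observed) - 1e-12)) / (n_perm + 1)
  list(diff = observed, p.value = p_value)
}

# Usage on simulated data (for illustration only).
set.seed(1)
is_pos <- rep(c(FALSE, TRUE), each = 20)
param1 <- c(rnorm(20), rnorm(20, mean = 2))  # informative parameter
param2 <- rnorm(40)                          # uninformative parameter
res <- perm_test_auc(param1, param2, is_pos)
print(res)
```

Swapping within subjects keeps the pairing intact, so the correlation
between the two parameters is preserved under the null hypothesis.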
-Frank
##################################################################
AuROC <- function(neg, pos) {
  # Empirical area under the ROC curve (Wilcoxon-Mann-Whitney statistic)
  # and its empirical variance. Goes as O(n*log(n)).
  nx <- length(neg)
  ny <- length(pos)
  nall <- nx + ny
  rankall <- rank(c(neg, pos))       # ranks of all samples with respect to one another
  rankpos <- rankall[(nx + 1):nall]  # ranks of the positive cases
  ranksum <- sum(rankpos) - ny * (ny + 1) / 2  # sum of ranks of positives among negatives
  ranky <- rank(pos)                 # ranks of just the y's (positives) among themselves
  rankyx <- rankpos - ranky          # ranks of the y's among the x's (negatives)
  p21 <- sum(rankyx * rankyx - rankyx) / nx / (nx - 1) / ny  # term in variance
  rankx <- rank(neg)                 # ranks of the x's (negatives) among each other
  # reverse ranks of the x's with respect to the y's
  rankxy <- ny - rankall[1:nx] + rankx
  p12 <- sum(rankxy * rankxy - rankxy) / nx / ny / (ny - 1)  # another variance term
  a <- ranksum / ny / nx             # the empirical area
  v <- (a * (1 - a) + (ny - 1) * (p12 - a * a) + (nx - 1) * (p21 - a * a)) / nx / ny
  c(a, v)  # return vector containing the Mann-Whitney statistic and its variance
}
####################################################
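
As a sanity check on the rank-based routine above, the same two
quantities can be computed by direct pair counting. This is a sketch
only: the name auc_bruteforce is made up here, it costs O(nx*ny) time
and memory, and like AuROC it needs at least two cases in each group.

```r
# Brute-force version of the area and variance estimate: compare every
# negative with every positive instead of using ranks.
auc_bruteforce <- function(neg, pos) {
  nx <- length(neg)
  ny <- length(pos)
  # kernel[i, j] is 1 if pos[j] > neg[i], 0.5 if they are tied, else 0.
  kernel <- outer(neg, pos, function(x, y) (y > x) + 0.5 * (y == x))
  a <- mean(kernel)            # empirical area
  below_y <- colSums(kernel)   # negatives below each positive
  above_x <- rowSums(kernel)   # positives above each negative
  p21 <- sum(below_y * (below_y - 1)) / (nx * (nx - 1) * ny)
  p12 <- sum(above_x * (above_x - 1)) / (nx * ny * (ny - 1))
  v <- (a * (1 - a) + (ny - 1) * (p12 - a^2) + (nx - 1) * (p21 - a^2)) / (nx * ny)
  c(area = a, variance = v)
}

# Small example: 8 of the 9 (negative, positive) pairs are ordered correctly.
check <- auc_bruteforce(neg = c(1, 2, 3), pos = c(2.5, 4, 5))
print(check)

# If AuROC has been sourced, all.equal(unname(check), AuROC(c(1, 2, 3),
# c(2.5, 4, 5))) should report agreement up to rounding.
```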
Laurent Fanchon wrote:
> Dear all,
>
> I try to compare the performances of several parameters to diagnose
> lameness in dogs.
> I have several ROC curves from the same dataset.
> I plotted the ROC curves and calculated AUC with the ROCR package.
>
> I would like to compare the AUC.
> I used the following program I found on R-help archives :
>
> From: Bernardo Rangel Tura
> Date: Thu 16 Dec 2004 - 07:30:37 EST
>
> seROC<-function(AUC,na,nn){
> a<-AUC
> q1<-a/(2-a)
> q2<-(2*a^2)/(1+a)
> se<-sqrt((a*(1-a)+(na-1)*(q1-a^2)+(nn-1)*(q2-a^2))/(nn*na))
> se
> }
>
> cROC<-function(AUC1,na1,nn1,AUC2,na2,nn2,r){
> se1<-seROC(AUC1,na1,nn1)
> se2<-seROC(AUC2,na2,nn2)
>
> sed<-sqrt(se1^2+se2^2-2*r*se1*se2)
> zad<-(AUC1-AUC2)/sed
> p<-2*pnorm(-abs(zad)) # two-sided p-value; dnorm(zad) gives the normal density, not a p-value
> a<-list(zad,p)
> a
> }
>
> The author of this script says: "The first function (seROC) calculate the standard error of ROC curve, the
> second function (cROC) compare ROC curves."
>
> What do you think of this script?
> Is there any function to do it better in ROCR?
>
> Any help would be greatly appreciated.
>
> Laurent Fanchon
> DVM, MS
> Ecole Nationale Vétérinaire d'Alfort
> FRANCE
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>