[R] convenient way to calculate specificity, sensitivity and accuracy from raw data
Gabor Grothendieck
ggrothendieck at gmail.com
Mon Sep 1 13:33:34 CEST 2008
Some junk got in at the beginning. It should be:
Lines <- "video 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
3 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
8 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
9 9 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 0 0 0 1 0
10 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
12 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
13 13 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
14 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
15 15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
16 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
17 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
18 18 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
19 19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
20 20 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
21 21 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
22 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
23 23 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 0 1 0 0 0 0
24 24 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 0 0 1
25 25 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0
26 26 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
27 27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
28 28 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
29 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
30 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
31 31 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
32 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
33 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
34 34 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
35 35 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
36 36 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
37 37 0 1 1 0 1 0 0 1 0 0 0 0 1 1 1 0 1 0 0 1 1
38 38 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
39 39 0 1 0 0 1 0 0 1 0 1 1 0 1 1 0 0 1 1 0 1 1
40 40 1 1 1 1 1 0 1 0 0 0 0 1 1 1 1 0 0 1 0 0 1
41 41 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
42 42 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0"
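DF <- read.table(textConnection(Lines), header = TRUE)
# columns 2:21 are raters 1-20; column 22 is the gold standard (rater 21),
# which data.frame() recycles once per rater
pairs <- data.frame(pred = factor(unlist(DF[2:21])), lab = factor(DF[,22]))
# predictions and gold standard reference labels
pred <- pairs$pred
lab <- pairs$lab
# confusion matrix
table(pred, lab)
library(caret)
# by default the first factor level (here "0") is treated as positive
sensitivity(pred, lab)
specificity(pred, lab)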
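
The original question also asked for accuracy and for "1" as the positive
answer. A minimal base-R sketch of how all three quantities could be computed
directly from the raw vectors follows. The helper name diag_stats and the toy
data are hypothetical (they are not part of caret), and the sketch assumes
there are no missing values:

```r
# Hypothetical helper (not part of caret): sensitivity, specificity and
# accuracy from raw predictions and gold-standard labels.
diag_stats <- function(pred, lab, positive = "1") {
  pred <- as.character(pred)
  lab  <- as.character(lab)
  tp <- sum(pred == positive & lab == positive)  # true positives
  tn <- sum(pred != positive & lab != positive)  # true negatives
  fp <- sum(pred == positive & lab != positive)  # false positives
  fn <- sum(pred != positive & lab == positive)  # false negatives
  c(sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    accuracy    = (tp + tn) / (tp + tn + fp + fn))
}

# Toy data (made up for illustration): 3 videos in rows,
# 2 raters plus a gold standard in columns
toy <- data.frame(r1 = c(1, 0, 0), r2 = c(1, 1, 0), gold = c(1, 0, 0))

# Pooled over both raters; the gold standard is repeated once per rater
pooled <- diag_stats(unlist(toy[c("r1", "r2")]), rep(toy$gold, 2))

# One column of statistics per rater
per_rater <- sapply(toy[c("r1", "r2")], diag_stats, lab = toy$gold)
```

With the data above, diag_stats(pred, lab) should give the pooled figures and
sapply(DF[2:21], diag_stats, lab = DF[, 22]) one column per rater.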
On Mon, Sep 1, 2008 at 7:31 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
>
> Lines <- "video 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
> 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
> 3 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 5 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
> 6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
> 7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 8 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
> 9 9 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 0 0 0 1 0
> 10 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 11 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 12 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 13 13 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 14 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 15 15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 16 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 17 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 18 18 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
> 19 19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 20 20 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 21 21 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
> 22 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 23 23 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 0 1 0 0 0 0
> 24 24 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 0 0 1
> 25 25 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0
> 26 26 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
> 27 27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 28 28 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 29 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 30 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 31 31 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 32 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 33 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 34 34 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 35 35 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 36 36 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 37 37 0 1 1 0 1 0 0 1 0 0 0 0 1 1 1 0 1 0 0 1 1
> 38 38 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 39 39 0 1 0 0 1 0 0 1 0 1 1 0 1 1 0 0 1 1 0 1 1
> 40 40 1 1 1 1 1 0 1 0 0 0 0 1 1 1 1 0 0 1 0 0 1
> 41 41 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
> 42 42 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0"
>
> DF <- read.table(textConnection(Lines), header = TRUE)
>
> pairs <- data.frame(pred = factor(unlist(DF[2:21])), lab = factor(DF[,22]))
> head(pairs) # look at first few rows
>
> # predictions and gold standard reference labels
> pred <- pairs$pred
> lab <- pairs$lab
>
> # confusion matrix
> table(pred, lab)
>
> library(caret)
> sensitivity(pred, lab)
> specificity(pred, lab)
>
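> See ?sensitivity and ?specificity. By default the first factor level (here
> "0") is treated as positive; specify the third argument (positive for
> sensitivity, negative for specificity) if you want the second level to
> represent positive rather than the first.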
>
> On Mon, Sep 1, 2008 at 5:27 AM, drflxms <drflxms at googlemail.com> wrote:
>> Dear R-colleagues,
>>
>> this is a question from an R-newbie medical doctor:
>>
>> I am evaluating data on inter-observer-reliability in endoscopy. 20
>> medical doctors judged 42 videos filling out a multiple choice survey
>> for each video. The overall-data is organized in a classical way:
>> observations (items from the multiple choice survey) as columns, each
>> case (identified by the two columns "number of medical doctor" and
>> "number of video") in a row. In addition there is a medical doctor
>> number 21 who is assumed to be a gold-standard.
>>
>> As measure of inter-observer-agreement I calculated kappa according to
>> Fleiss and simple agreement in percent using the routines
>> "kappam.fleiss" and "agree" from the irr-package. Everything worked fine
>> so far.
>>
>> Now I'd like to calculate specificity, sensitivity and accuracy for each
>> item (compared to the gold-standard), as these are well-known and easy
>> to understand quantities for medical doctors.
>>
>> Unfortunately I haven't found a feasible way to do this in R so far. All
>> solutions I found describe calculation of specificity, sensitivity and
>> accuracy from a contingency-table / confusion-matrix only. For me it is
>> very difficult to create such contingency-tables / confusion-matrices
>> from the raw data I have.
>>
>> So I started to do it in Excel by hand - a lot of work! If I keep
>> on doing this, I'll miss the deadline. So maybe someone can help me out:
>>
>> It would be very convenient if there were a way to calculate specificity,
>> sensitivity and accuracy from the very same data.frames I created for
>> the calculation of kappa and agreement. In these data.frames, which were
>> generated from the overall-data-table described above using the
>> "reshape" package, we have the judging medical doctor in the columns and
>> the videos in the rows. In the cells there are the coded answer-options
>> from the multiple choice survey. Please see a simple example with
>> answer-options 0/1 (copied from R console) below:
>>
>> video 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
>> 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
>> 3 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 5 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
>> 6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
>> 7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 8 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
>> 9 9 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 0 0 0 1 0
>> 10 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 11 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 12 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 13 13 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 14 14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 15 15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 16 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 17 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 18 18 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
>> 19 19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 20 20 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 21 21 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
>> 22 22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 23 23 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 0 1 0 0 0 0
>> 24 24 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 0 0 1
>> 25 25 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0
>> 26 26 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
>> 27 27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 28 28 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 29 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 30 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 31 31 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 32 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 33 33 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 34 34 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 35 35 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 36 36 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 37 37 0 1 1 0 1 0 0 1 0 0 0 0 1 1 1 0 1 0 0 1 1
>> 38 38 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>> 39 39 0 1 0 0 1 0 0 1 0 1 1 0 1 1 0 0 1 1 0 1 1
>> 40 40 1 1 1 1 1 0 1 0 0 0 0 1 1 1 1 0 0 1 0 0 1
>> 41 41 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
>> 42 42 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>
>> What I did in Excel is: Creating the very same tables using
>> pivot-charts. Comparing columns 1-20 to column 21 (gold-standard),
>> summing up the count of values that are identical to 21. I repeated this
>> for each answer-option. From the results, one can easily calculate
>> specificity, sensitivity and accuracy.
>>
>> How can I do this, or something similar leading to the same results, in R?
>> I'd appreciate any kind of help very much!
>>
>> Greetings from Munich,
>> Felix
>>
>