[R] convenient way to calculate specificity, sensitivity and accuracy from raw data

drflxms drflxms at googlemail.com
Mon Sep 1 11:27:43 CEST 2008


Dear R-colleagues,

this is a question from an R-newbie medical doctor:

I am evaluating data on inter-observer-reliability in endoscopy. 20
medical doctors judged 42 videos filling out a multiple choice survey
for each video. The overall-data is organized in a classical way:
observations (items from the multiple choice survey) as columns, each
case (identified by the two columns "number of medical doctor" and
"number of video") in a row. In addition there is a medical doctor,
number 21, who is assumed to be the gold-standard.

As a measure of inter-observer-agreement I calculated kappa according to
Fleiss and simple agreement in percent using the routines
"kappam.fleiss" and "agree" from the irr-package. Everything worked fine
so far.

Now I'd like to calculate specificity, sensitivity and accuracy for each
item (compared to the gold-standard), as these are well-known and easy
to understand quantities for medical doctors.

Unfortunately I haven't found a feasible way to do this in R so far. All
solutions I found describe the calculation of specificity, sensitivity and
accuracy from a contingency-table / confusion-matrix only. For me it is
very difficult to create such contingency-tables / confusion-matrices
from the raw data I have.
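
For illustration, here is a minimal sketch of how such a confusion-matrix
might be built from two raw answer vectors with base R's table(). The
vectors "rater" and "gold" are hypothetical toy data, not taken from the
survey, and the code assumes answer-options coded 0/1 with 1 as the
positive finding.

```r
# Hypothetical toy data: one rater's answers and the gold-standard answers
rater <- c(0, 1, 0, 1, 1, 0, 0, 1)
gold  <- c(0, 1, 0, 0, 1, 0, 1, 1)

# Fixing the factor levels keeps the table 2 x 2 even if a rater
# never chose one of the answer-options
confusion <- table(
  rater = factor(rater, levels = c(0, 1)),
  gold  = factor(gold,  levels = c(0, 1))
)
print(confusion)

# Cells are indexed as confusion[rater answer, gold answer]
tp <- confusion["1", "1"]  # true positives
tn <- confusion["0", "0"]  # true negatives
fp <- confusion["1", "0"]  # false positives
fn <- confusion["0", "1"]  # false negatives

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
accuracy    <- (tp + tn) / sum(confusion)
```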

So I started to do it in Excel by hand - a lot of work! If I keep
on doing this, I'll miss the deadline. So maybe someone can help me out:

It would be very convenient if there were a way to calculate specificity,
sensitivity and accuracy from the very same data.frames I created for
the calculation of kappa and agreement. In these data.frames, which were
generated from the overall-data-table described above using the
"reshape" package, we have the judging medical doctor in the columns and
the videos in the rows. In the cells there are the coded answer-options
from the multiple choice survey. Please see a simple example with
answer-options 0/1 (copied from R console) below:

   video 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
1      1 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
2      2 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  1
3      3 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
4      4 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
5      5 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  1  0
6      6 0 0 0 0 0 0 0 0 0  0  0  0  0  1  0  0  0  0  0  0  0
7      7 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
8      8 0 0 0 0 0 0 0 0 0  0  0  0  0  0  1  0  0  0  0  0  0
9      9 0 0 0 0 0 0 0 0 0  1  0  1  1  0  1  1  0  0  0  1  0
10    10 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
11    11 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
12    12 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
13    13 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
14    14 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
15    15 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
16    16 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
17    17 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
18    18 0 0 0 0 1 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  1
19    19 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
20    20 0 1 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
21    21 0 0 0 0 0 0 1 0 0  0  0  0  0  0  0  0  0  0  0  0  1
22    22 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
23    23 0 1 0 0 1 0 1 0 0  1  0  0  1  1  0  0  1  0  0  0  0
24    24 0 0 0 0 0 0 0 0 0  0  0  0  1  1  1  1  0  1  0  0  1
25    25 0 0 0 0 0 0 0 0 0  0  0  1  0  0  1  1  0  0  0  0  0
26    26 0 0 0 0 0 0 0 0 0  0  0  1  0  0  0  0  0  0  0  0  0
27    27 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
28    28 0 1 0 1 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
29    29 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
30    30 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
31    31 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
32    32 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
33    33 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
34    34 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
35    35 0 0 0 0 0 0 1 0 0  0  0  0  0  0  0  0  0  0  0  0  0
36    36 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
37    37 0 1 1 0 1 0 0 1 0  0  0  0  1  1  1  0  1  0  0  1  1
38    38 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
39    39 0 1 0 0 1 0 0 1 0  1  1  0  1  1  0  0  1  1  0  1  1
40    40 1 1 1 1 1 0 1 0 0  0  0  1  1  1  1  0  0  1  0  0  1
41    41 0 0 0 0 0 0 0 0 0  1  0  0  0  0  0  0  0  0  0  0  1
42    42 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0

What I did in Excel was this: I created the very same tables using
pivot-charts, compared columns 1-20 to column 21 (gold-standard), and
counted the values that are identical to those in column 21. I repeated
this for each answer-option. From the resulting counts, one can easily
calculate specificity, sensitivity and accuracy.
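
The Excel procedure described above could be sketched in R roughly as
follows. This is only an illustration under stated assumptions: the data
frame "ratings" is a hypothetical miniature of the table above (6 videos,
raters "1" and "2", gold-standard in column "21"), and the function name
diagnostic_stats and its arguments are made up for this example. The
"positive" argument selects the answer-option being evaluated, so the
call can be repeated for each answer-option.

```r
# Hypothetical miniature of the table above
ratings <- data.frame(
  video = 1:6,
  "1"   = c(0, 1, 0, 0, 1, 0),
  "2"   = c(0, 1, 1, 0, 0, 0),
  "21"  = c(0, 1, 0, 0, 1, 1),
  check.names = FALSE  # keep the numeric column names as they are
)

# Sensitivity, specificity and accuracy of every rater column compared
# to the gold-standard column, for one answer-option (`positive`)
diagnostic_stats <- function(df, gold_col = "21", positive = 1,
                             id_col = "video") {
  gold_pos <- df[[gold_col]] == positive
  raters   <- setdiff(names(df), c(id_col, gold_col))
  stats <- sapply(raters, function(r) {
    rater_pos <- df[[r]] == positive
    c(sensitivity = sum(rater_pos & gold_pos)   / sum(gold_pos),
      specificity = sum(!rater_pos & !gold_pos) / sum(!gold_pos),
      accuracy    = mean(rater_pos == gold_pos))
  })
  t(stats)  # one row per rater, one column per quantity
}

result <- diagnostic_stats(ratings)
print(round(result, 3))
```

With this toy data, rater "1" gets sensitivity 2/3, specificity 1 and
accuracy 5/6. If the gold-standard never shows the chosen answer-option,
sum(gold_pos) is 0 and the sensitivity comes out as NaN, which would need
separate handling.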

How can I do this, or something similar leading to the same results, in R?
I'd appreciate any kind of help very much!

Greetings from Munich,
Felix


