# [R] Statistical Significance of an observation

Emmanuel Charpentier emmanuel.charpentier at sap.ap-hop-paris.fr
Tue Apr 18 16:44:05 CEST 2000

```
According to your (curiously entitled) question to the R-help list:

> I would like to use a function in order to know the value of Specificity
> and Sensibility of data ( True/False Negative/Positive measurement).

That's an easy one ... at first sight. The computation is quite easy; the
conditions of use are *not*.

I'll use the vocabulary of my primary field (Biostatistics) and treat your
subject accordingly.

First, let us suppose that your variable is a boolean one (you are supposed to
use an undescribed device answering S = "True" or "False" to the question under
examination). Suppose furthermore that you have another means of getting the
"real" answer, i.e. the value of the real ("gold standard") answer M to the
question (e.g., in medicine, pathology results).

By definition, Sensitivity is Pr(S|M) and is estimated, from a (supposedly
perfect) sample, by N(S & M)/N(M). In R parlance, if Sample is a data frame
with boolean vectors S and M, Sens <- sum(Sample$S & Sample$M)/sum(Sample$M).
(Resp. Specificity is Spec <- sum((!Sample$S) & (!Sample$M))/sum(!Sample$M).)
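
# A minimal worked example of the two estimators above, on a small
# invented sample (all values are made up for illustration):
Sample <- data.frame(S = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE),
                     M = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE))
Sens <- sum(Sample$S & Sample$M) / sum(Sample$M)      # Pr(S|M)   = 3/5
Spec <- sum(!Sample$S & !Sample$M) / sum(!Sample$M)   # Pr(!S|!M) = 2/3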

Now, what happens in the mayhem called "the Real Life"? There are missing data.
And that's where big mistakes can be made. The pattern of missing data should
be looked at very closely, and explanations should be obtained. This is often
as important as the sensitivity/specificity estimation itself.

IMHO, missing data in M and in S should be treated differently. Cases where M
is missing can be discarded unless there is some reason for their absence that
might be related to the S value. Cases with S missing are different. In our
example, a "clinical" sensitivity (what is the sensitivity of a given test when
diagnosing a given disease) is given by the same formula as above, after first
discarding cases where either S or M could not be obtained. If you are
interested in the efficiency of the test, however, you should look at the
sample Sample[!is.na(Sample$M),] : patients undergoing the test with no
effective result are indeed consuming resources.

Conversely, if you're an epidemiologist looking at the overall efficiency of
a screening policy, the relevant set might well be Sample[!is.na(Sample$S),]
(hopefully the whole sample), with the denominator being nrow(Sample) : people
for whom the M status remains unknown (screened negative) are probably a much
larger number than people diagnosed (screened positive then diagnosed either
positive or negative, or diagnosed for other reasons). But there, what you're
looking at is no longer sensitivity in the strict sense ...
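
# A sketch of the three denominators discussed above, on an invented
# sample with NAs in both S and M (names and values are illustrative):
Sample <- data.frame(S = c(TRUE, NA, FALSE, TRUE, TRUE, NA, FALSE, TRUE),
                     M = c(TRUE, TRUE, TRUE, NA, TRUE, FALSE, FALSE, TRUE))
# "Clinical" sensitivity: complete cases only
cc <- !is.na(Sample$S) & !is.na(Sample$M)
SensClin <- sum(Sample$S[cc] & Sample$M[cc]) / sum(Sample$M[cc])      # 3/4
# Efficiency of the test: keep every case with a known gold standard;
# a missing S counts against the test
km <- !is.na(Sample$M)
SensEff <- sum(Sample$S[km] & Sample$M[km], na.rm = TRUE) /
           sum(Sample$M[km])                                          # 3/5
# Screening view: detected cases over everyone screened
# (no longer sensitivity in the strict sense)
Detected <- sum(Sample$S & Sample$M, na.rm = TRUE) / nrow(Sample)     # 3/8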

When you're looking at non-boolean signs (e.g. a biochemical dosage of a blood
sample), sensitivity and specificity no longer have any meaning per se. You
have to choose a threshold value S0, below which S is noted S- and above which
it is noted S+ (thus recreating a boolean variable).

Rather than studying sensitivity and specificity for a given value of S0, it is
often interesting to study the relationship between Sens(S0) and Spec(S0)
(always monotone non-increasing). More precisely, it is sufficient to have S
be a value extracted out of a totally ordered set (that is, for any pair S1,
S2, you have either S1<S2, S1>S2 or S1=S2; in S/R parlance, S is a numerical
variable or an *ordered* factor). For each value of S, you get a value of Sens
and a value of Spec.
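
# Sens and Spec as functions of a cutoff S0, for a numeric S
# (SensSpec is an invented helper; S >= S0 is read as "positive"):
SensSpec <- function(S, M, S0) {
  pos <- S >= S0
  c(Sens = sum(pos & M) / sum(M),
    Spec = sum(!pos & !M) / sum(!M))
}
S <- c(0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.5)
M <- c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
SensSpec(S, M, 0.5)    # Sens = 0.75, Spec = 0.75 on this toy data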

The plot of Sens(S0) against 1-Spec(S0) is called a ROC (Receiver Operating
Characteristic) curve, and is characteristic of the test, not of the
distribution of values in the sample. In particular, its integral (the area
under the curve, or AUC) is considered a good point value for assessing the
diagnostic value of the test.

Computation of this curve is left as an exercise for the reader (that means
that I know how to compute it, but my solution is ugly (loops ...) and I feel
that it should be enhanced before publication :-). I might post a smallish
library for ROC analysis someday, but not soon: I have some other things to
attend to first ...
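
# Meanwhile, one loop-free way of computing the ROC points and the
# area under the curve (trapezoidal rule); a mere sketch, not the
# library alluded to above:
roc <- function(S, M) {
  cuts <- sort(unique(S))
  sens <- sapply(cuts, function(s0) sum(S >= s0 & M) / sum(M))
  fpr  <- sapply(cuts, function(s0) sum(S >= s0 & !M) / sum(!M))
  o <- order(fpr, sens)              # increasing false positive rate
  x <- c(0, fpr[o], 1) ; y <- c(0, sens[o], 1)
  auc <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
  list(fpr = x, sens = y, auc = auc)
}
S <- c(0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.5)
M <- c(FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
roc(S, M)$auc    # 0.875 on this toy data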

A very common (and IMHO very dangerous) use of ROC analysis is to assess the
diagnostic value of a test starting from a sample of patients ordered in
"increasing order of (subjective) probability of presenting the disease" (e.g.
an ordering of radiographs). ROC analysis is valid for this kind of experiment
if and only if the ordering is really a total order relationship (that is, for
any A, B, C, A>B and B>C entails A>C). This assumption is necessary for the ROC
to have a meaning; this assumption, while extremely strong, is always made, and
very rarely checked. Caveat emptor ...

Of course, the cautions mentioned about missing values for the boolean case
remain valid; furthermore, attention should be paid to the distribution of S
(reproducibility, for example), and the impact of the S-measurement error on
apparent sensitivity/specificity should be assessed (by simulation or by
analysing samples of repeated measures).
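
# A quick simulation of that last point: add Gaussian measurement
# noise to S and watch the apparent sensitivity at a fixed cutoff
# drift (all numbers are invented for illustration):
set.seed(1)
n  <- 1000
M  <- rep(c(TRUE, FALSE), each = n)
S  <- rnorm(2 * n, mean = ifelse(M, 1, 0))   # "true" values of the sign
S0 <- 0.5
Sens0  <- mean(S[M] >= S0)                   # without measurement error
Snoisy <- S + rnorm(2 * n, sd = 1)           # sizeable measurement error
Sens1  <- mean(Snoisy[M] >= S0)              # apparent sensitivity drifts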

Hope this helps ...

Emmanuel Charpentier

--
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

```