[R-sig-ME] Logistic regression with spatial autocorrelation structure
Douglas Bates
bates at stat.wisc.edu
Mon Feb 7 19:56:51 CET 2011
On Mon, Feb 7, 2011 at 10:58 AM, <Dale.W.Steele at gmail.com> wrote:
> Dear Prof. Bates - Thanks for your response. How would I calculate a 95%
> confidence interval for
> the probability of appendicitis in a new patient given a particular set of
> covariates for
> 1) patients enrolled by a specific clinician (eg. doc == "18")?
> 2) some future clinician?
Didn't I answer 2 in my previous message in the part about
> fixef(m2)
(Intercept) priorprob genderMale wbc
-8.2886183 0.0502540 1.1870539 0.3412998
> eta <- crossprod(c(1, 55, 0, 9.45), fixef(m2))
> eta
[,1]
[1,] -2.299365
> binomial()$linkinv(eta)
[,1]
[1,] 0.09117557
It is certainly possible to automate the process a bit more with a
predict method but, for the time being, I think that should be the way
to go about it.
If you have a simple random effect for doc, as in model m2, then to
get a prediction for a specific physician you add the random effect
for, say, doc == "18", to the value of eta before transforming it.
You can get the conditional modes of the random effects with the ranef
extractor.
> Thanks. --Dale
>
> On Feb 1, 2011 1:08pm, Douglas Bates <bates at stat.wisc.edu> wrote:
>> On Tue, Feb 1, 2011 at 12:05 PM, Douglas Bates bates at stat.wisc.edu> wrote:
>>
>> > On Mon, Jan 31, 2011 at 11:45 AM, Dale Steele dale.w.steele at gmail.com>
>> > wrote:
>>
>> >> Dear mixed-modeling experts,
>>
>> >>
>>
>> >> I'm interested in modeling the probability of appendicitis in patients
>>
>> >> with abdominal pain.
>>
>> >>
>>
>> >> The R binary data file 'http://www.ped-em.org/appy.rda' contains
>>
>> >> the following variables from a pilot study of 138 children with
>>
>> >> abdominal pain.
>>
>> >
>>
>> > Thank you for providing the data.
>>
>> >
>>
>> >> 'dx' eventual diagnosis: 0=no appendicitis, 1=appendicitis
>>
>> >> 'gender' Male/Female
>>
>> >> 'wbc' total white blood cell count
>>
>> >> 'priorprob' Clinical predicted probability of appendicitis
>>
>> >> 'doc' doctor who assigned 'priorprob'
>>
>> >>
>>
>> >> After taking a history and performing a physical examination, the ER
>> >> doctor
>>
>> >> was asked to make a vertical mark on a 100 mm horizontal line to
>> >> represent
>>
>> >> her estimate of the (percent) probability that the patient had
>> >> appendicitis.
>>
>> >>
>>
>> >> My initial thought was to fit a multiple logistic regression model:
>>
>> >>
>>
>> >> m1
>> >>
>>
>> >> However, it seems likely that each doctor interpreted the probability
>> >> scale
>>
>> >> differently. The 23 doctors evaluated from 1 to 17 patients each. I'm
>>
>> >> not primarily interest in predictions by a specific clinician. Thus,
>>
>> >> it seems to make sense to fit a generalized linear mixed model.
>>
>> >>
>>
>> >> At this point I get muddled. Have I correctly specified a random
>>
>> >> intercept model (m2) and a random intercept/random slope model (m3)?
>>
>> >> Are there other sensible models?
>>
>> >
>>
>> > Yes, you have correctly specified them. However, if you check the
>>
>> > output from the fits
>>
>> >> (m2
>> > Generalized linear mixed model fit by the Laplace approximation
>>
>> > Formula: dx ~ priorprob + gender + wbc + (1 | doc)
>>
>> > Data: appy
>>
>> > AIC BIC logLik deviance
>>
>> > 108.1 122.7 -49.03 98.07
>>
>> > Random effects:
>>
>> > Groups Name Variance Std.Dev.
>>
>> > doc (Intercept) 0.82922 0.91062
>>
>> > Number of obs: 138, groups: doc, 23
>>
>> >
>>
>> > Fixed effects:
>>
>> > Estimate Std. Error z value Pr(>|z|)
>>
>> > (Intercept) -8.28862 1.45666 -5.690 1.27e-08
>>
>> > priorprob 0.05025 0.01291 3.892 9.96e-05
>>
>> > genderMale 1.18705 0.56752 2.092 0.0365
>>
>> > wbc 0.34130 0.07090 4.814 1.48e-06
>>
>> >
>>
>> > Correlation of Fixed Effects:
>>
>> > (Intr) prrprb gndrMl
>>
>> > priorprob -0.775
>>
>> > genderMale -0.293 0.130
>>
>> > wbc -0.790 0.353 0.040
>>
>> >> (m3
>> > Generalized linear mixed model fit by the Laplace approximation
>>
>> > Formula: dx ~ priorprob + gender + wbc + (priorprob | doc)
>>
>> > Data: appy
>>
>> > AIC BIC logLik deviance
>>
>> > 111.5 132 -48.76 97.52
>>
>> > Random effects:
>>
>> > Groups Name Variance Std.Dev. Corr
>>
>> > doc (Intercept) 0.02897010 0.170206
>>
>> > priorprob 0.00016799 0.012961 1.000
>>
>> > Number of obs: 138, groups: doc, 23
>>
>> >
>>
>> > Fixed effects:
>>
>> > Estimate Std. Error z value Pr(>|z|)
>>
>> > (Intercept) -8.23977 1.44408 -5.706 1.16e-08
>>
>> > priorprob 0.04957 0.01328 3.733 0.000189
>>
>> > genderMale 1.16593 0.57481 2.028 0.042521
>>
>> > wbc 0.34485 0.07056 4.887 1.02e-06
>>
>> >
>>
>> > Correlation of Fixed Effects:
>>
>> > (Intr) prrprb gndrMl
>>
>> > priorprob -0.757
>>
>> > genderMale -0.299 0.141
>>
>> > wbc -0.800 0.350 0.030
>>
>> >> anova(m2, m3)
>>
>> > Data: appy
>>
>> > Models:
>>
>> > m2: dx ~ priorprob + gender + wbc + (1 | doc)
>>
>> > m3: dx ~ priorprob + gender + wbc + (priorprob | doc)
>>
>> > Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
>>
>> > m2 5 108.07 122.70 -49.034
>>
>> > m3 7 111.52 132.01 -48.759 0.5501 2 0.7595
>>
>> >> (m4
>> > Generalized linear mixed model fit by the Laplace approximation
>>
>> > Formula: dx ~ priorprob + wbc + (1 | doc)
>>
>> > Data: appy
>>
>> > AIC BIC logLik deviance
>>
>> > 110.5 122.2 -51.24 102.5
>>
>> > Random effects:
>>
>> > Groups Name Variance Std.Dev.
>>
>> > doc (Intercept) 0.63458 0.7966
>>
>> > Number of obs: 138, groups: doc, 23
>>
>> >
>>
>> > Fixed effects:
>>
>> > Estimate Std. Error z value Pr(>|z|)
>>
>> > (Intercept) -7.62797 1.33760 -5.703 1.18e-08
>>
>> > priorprob 0.04834 0.01229 3.934 8.35e-05
>>
>> > wbc 0.34395 0.06771 5.080 3.77e-07
>>
>> >
>>
>> > Correlation of Fixed Effects:
>>
>> > (Intr) prrprb
>>
>> > priorprob -0.786
>>
>> > wbc -0.813 0.357
>>
>> >> anova(m4, m3)
>>
>> > Data: appy
>>
>> > Models:
>>
>> > m4: dx ~ priorprob + wbc + (1 | doc)
>>
>> > m3: dx ~ priorprob + gender + wbc + (priorprob | doc)
>>
>> > Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
>>
>> > m4 4 110.48 122.19 -51.240
>>
>> > m3 7 111.52 132.01 -48.759 4.9639 3 0.1745
>>
>>
>>
>> Sorry, that test should have been
>>
>>
>>
>> > anova(m4, m2)
>>
>> Data: appy
>>
>> Models:
>>
>> m4: dx ~ priorprob + wbc + (1 | doc)
>>
>> m2: dx ~ priorprob + gender + wbc + (1 | doc)
>>
>> Df AIC BIC logLik Chisq Chi Df Pr(>Chisq)
>>
>> m4 4 110.48 122.19 -51.240
>>
>> m2 5 108.07 122.70 -49.034 4.4138 1 0.03565
>>
>>
>>
>> > you will see that the random intercept/random slope model produces a
>>
>> > degenerate fit (estimated correlation of the within-doctor random
>>
>> > effects is -1.000), which is not significantly better than the random
>>
>> > intercept fit. I would therefore use m2. (Because the fixed-effect
>>
>> > for gender had a z-value close to 2 I performed the more reliable
>>
>> > likelihood-ratio test, just to check.)
>>
>> >
>>
>> >
>>
>> >> library(lme4)
>>
>> >> m2
>> >> family=binomial, data=appy)
>>
>> >>
>>
>> >> m3
>> >> family=binomial, data=appy)
>>
>> >>
>>
>> >> My ultimate goal is to estimate the probability of appendicitis
>>
>> >> (and a prediction interval), given a specific 'gender', 'wbc' and
>>
>> >> 'priorprob' assigned by a doctor with similar diagnostic ability to
>>
>> >> those who participated in our pilot study. I'm stuck on how to code
>> >> this
>>
>> >> prediction.
>>
>> >
>>
>> > The prediction will be based on the fixed-effects only. Because it
>>
>> > applies to a doctor not in this study, we assign the random effect for
>>
>> > that doctor to be zero, which is the expected value in the absence of
>>
>> > any information on that doctor.
>>
>> >
>>
>> > The fixef extractor returns the estimates of the fixed-effects
>>
>> > parameters. For a female patient with median prior probability (55%)
>>
>> > and median white blood cell count (9.45) the estimated linear
>>
>> > predictor (eta = -2.299) corresponds to a probability of 9.1%
>>
>> >
>>
>> >
>>
>> >> fixef(m2)
>>
>> > (Intercept) priorprob genderMale wbc
>>
>> > -8.2886183 0.0502540 1.1870539 0.3412998
>>
>> >> eta
>> >> eta
>>
>> > [,1]
>>
>> > [1,] -2.299365
>>
>> >> binomial()$linkinv(eta)
>>
>> > [,1]
>>
>> > [1,] 0.09117557
>>
>> >
>>
More information about the R-sig-mixed-models
mailing list