[R] Probit predictions outside (0,1) interval
John Fox
jfox at mcmaster.ca
Fri Mar 5 15:39:24 CET 2004
Dear Arnab,
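Several people have already noted that you're getting predicted values on
the wrong scale: by default, predict() for a glm object returns the linear
predictor (the link scale), so you need type="response" to get fitted
probabilities. Note, as well, that you fit a logit model rather than a
probit model, since the logit link is the canonical link for the binomial
family and is therefore the default; for a probit model, you need
family=binomial(probit).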
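
To make both points concrete, here is a minimal sketch on simulated data.
The seed, the coefficients, and the object names (g_probit, eta, p_hat,
g_logit, p_logit) are illustrative assumptions, not taken from the original
post, and the data-generating process is deliberately simpler than the one
in the quoted message.

```r
set.seed(1)  # hypothetical seed, only so the sketch is reproducible

# Simulated data (assumed coefficients; not the original poster's process)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- rbinom(n, size = 1, prob = pnorm(0.5 + 0.3 * x1 + 0.6 * x2 + x3))
dat <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3)

# Probit fit: the link has to be requested explicitly
g_probit <- glm(y ~ ., data = dat, family = binomial(link = "probit"))

# Default predict() is type = "link": the linear predictor, which is unbounded
eta <- predict(g_probit, newdata = dat)

# type = "response" applies the inverse link and returns probabilities
p_hat <- predict(g_probit, newdata = dat, type = "response")

range(eta)    # can fall outside (0, 1)
range(p_hat)  # stays inside (0, 1)

# For a probit model the inverse link is pnorm(), so the two should agree
all.equal(unname(p_hat), unname(pnorm(eta)))

# Logit fit for comparison: family = binomial uses the logit link by default
g_logit <- glm(y ~ ., data = dat, family = binomial)
p_logit <- predict(g_logit, newdata = dat, type = "response")
```

For the logit fit, the same conversion from the link scale can be done by
hand with plogis() in place of pnorm().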
John
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Arnab mukherji
> Sent: Friday, March 05, 2004 2:48 AM
> To: r-help at stat.math.ethz.ch
> Cc: r-help at stat.math.ethz.ch
> Subject: [R] Probit predictions outside (0,1) interval
>
> Hi!
>
> I was trying to implement a probit model on a dichotomous
> outcome variable and found that the predictions were outside
> the (0,1) interval that one should get. I later tried it with
> some simulated data with a similar result.
>
> Here is a toy program I wrote, and I can't figure out why I am
> getting such odd predictions.
>
> x1<-rnorm(1000)
> x2<-rnorm(1000)
> x3<-rnorm(1000)
> x4<-rnorm(1000)
> x5<-rnorm(1000)
> x6<-rnorm(1000)
> e1<-rnorm(1000)/3
> e2<-rnorm(1000)/3
> e3<-rnorm(1000)/3
> y<-1-(1-pnorm(-2+0.33*x1+0.66*x2+1*x3+e1)*1-(pnorm(1+1.5*x4-0.25*x5+e2)*
>   pnorm(1+0.2*x6+e3)))
> y <- y>runif(1000)
> dat<-data.frame(y = y, x1 = x1, x2 = x2, x3 = x3)
> g<-glm(y~., data = dat, family = binomial)
> summary(g)
> yhat<-predict(g, dat)
>
>
> Call:
> glm(formula = y ~ ., family = binomial, data = dat)
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -1.8383  -1.3519   0.7638   0.9249   1.3698
>
> Coefficients:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  0.71749    0.06901  10.397  < 2e-16 ***
> x1           0.10211    0.07057   1.447  0.14791
> x2           0.21068    0.07177   2.936  0.00333 **
> x3           0.35162    0.07070   4.974 6.57e-07 ***
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
> Null deviance: 1275.3 on 999 degrees of freedom
> Residual deviance: 1239.4 on 996 degrees of freedom
> AIC: 1247.4
>
> Number of Fisher Scoring iterations: 4
>
> > yhat<-predict(g, dat)
> >
> > range(yhat)
> [1] -0.4416826 2.0056527
> > range(y)
> [1] 0 1
>
> Any advice would be really helpful.
>