[R] Logit/probit model

Thu Mar 25 18:35:33 CET 2010

Perhaps I am missing something, but it appears that because X1 and X2 are
both random normal, the influence of X2 is much like a second sampling of
X1, and thus you would expect just what you observed, especially with a
large (1000) sample size.  Try making X2 and X1 different.

Charles Annis, P.E.

Charles.Annis at StatisticalEngineering.com
561-352-9699
http://www.StatisticalEngineering.com

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Ana De Barros
Sent: Thursday, March 25, 2010 12:19 PM
To: r-help at r-project.org
Subject: [R] Logit/probit model

Dear all,

I have a population with the following characteristics:

N=1000
X0=rep(1,N)
X1=rnorm(N)
X2=rnorm(N)

I also know that the population distribution is a linear logistic function
with parameters alpha0=0 (intercept), alpha1=0.4 and alpha2=1.1. So I can
easily get the dependent variable (in my case the response propensities) by
doing:
alpha=as.vector(c(0, 0.4, 1.1))
X=cbind(X0,X1, X2)
X=matrix(X, ncol=3, nrow=N)
P=X%*%alpha
propensity=1/(1+exp(-(P)))
proptrue=mean(propensity)

I have to estimate by sampling simulation the response propensity (dependent
variable), assuming I don't know the population distribution and assuming:
1. a linear logistic function adjusting for x1 only
    1.1     assuming I know the true parameters (alpha0=0 and alpha1=0.4)
    1.2     assuming I don't know the true parameters
2. a probit function adjusting for x1 only
    2.1     assuming I know the true parameters (alpha0=0 and alpha1=0.4)
    2.2     assuming I don't know the true parameters
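
Cases 1.1 and 2.1 above can be sketched as follows. This is only an
illustration under stated assumptions: the call to set.seed() and the names
prop_logit_x1 and prop_probit_x1 are not in the original post, and plogis()
is used as the built-in equivalent of 1/(1+exp(-P)).

```r
# Sketch of cases 1.1 and 2.1: propensities computed from x1 alone
# with the known parameters alpha0 = 0 and alpha1 = 0.4.
# ASSUMPTION: the seed is added only so the example is reproducible.
set.seed(1)
N  <- 1000
X1 <- rnorm(N)
X2 <- rnorm(N)

# True propensities from the full model (intercept 0, slopes 0.4 and 1.1)
propensity <- plogis(0 + 0.4 * X1 + 1.1 * X2)
proptrue   <- mean(propensity)

# Case 1.1: logistic function of x1 only (hypothetical variable name)
prop_logit_x1  <- plogis(0 + 0.4 * X1)
# Case 2.1: probit function of x1 only, same parameter values
prop_probit_x1 <- pnorm(0 + 0.4 * X1)

print(c(true      = proptrue,
        logit_x1  = mean(prop_logit_x1),
        probit_x1 = mean(prop_probit_x1)))
```

Because the intercept is 0 and both covariates are symmetric about 0, each
of these three means has expectation 0.5, so in this particular setup the
mean propensity is not very sensitive to leaving out X2.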

When I assume I don't know the true parameters, I sample by doing:
for (g in 1:replicas)
    {  
    labels=sample(N, sample.size, replace=FALSE)
    x0=X0[labels]
    x1=X1[labels]
    x2=X2[labels]
    propsample=propensity[labels]

    logitx1=glm(propsample~x1, family=binomial(link="logit"))
    coefx1= logitx1$coefficients
    fitx1= logitx1$fitted.values
    PSprob=mean(fitx1)

    probx1=glm(propsample~x1, family=binomial(link="probit"))
    c33= probx1$coefficients
    cc33= probx1$fitted.values
    PSprob=mean(cc33)
    }
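
A self-contained version of the loop above might look like the sketch
below. Several things here are assumptions, not part of the original post:
the values of replicas and sample.size, the seed, and the result vectors
PSlogit and PSprobit (used so that the probit result does not overwrite the
logit one). The quasibinomial family is swapped in for binomial because the
response is a propensity between 0 and 1 rather than a 0/1 outcome; it
gives the same coefficient estimates without the non-integer warning.

```r
# ASSUMPTIONS: seed, replicas, sample.size and the PSlogit/PSprobit
# vectors are illustrative choices, not taken from the original post.
set.seed(1)
N  <- 1000
X1 <- rnorm(N)
X2 <- rnorm(N)
propensity <- plogis(0.4 * X1 + 1.1 * X2)

replicas    <- 200
sample.size <- 100

PSlogit  <- numeric(replicas)   # mean fitted propensity, logit link
PSprobit <- numeric(replicas)   # mean fitted propensity, probit link

for (g in seq_len(replicas)) {
  # Simple random sample without replacement from the population
  labels <- sample(N, sample.size, replace = FALSE)
  d <- data.frame(propsample = propensity[labels], x1 = X1[labels])

  # Models adjusting for x1 only; x2 is deliberately omitted
  logitx1 <- glm(propsample ~ x1, family = quasibinomial(link = "logit"),
                 data = d)
  probx1  <- glm(propsample ~ x1, family = quasibinomial(link = "probit"),
                 data = d)

  PSlogit[g]  <- mean(fitted(logitx1))
  PSprobit[g] <- mean(fitted(probx1))
}

print(c(true   = mean(propensity),
        logit  = mean(PSlogit),
        probit = mean(PSprobit)))
```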

My problem is that although I omit x2 in the simulations, I still get
response propensities very close to the population response propensity,
and that doesn't make any sense to me...  I must be doing something wrong
but I can't find the error. Can you help me, please?
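
One way to examine this is to compare the fit at the level of the mean and
at the level of individual units. With a logit link and an intercept, the
estimating equations force the mean of the fitted values to equal the mean
of the response in the sample, whichever covariates are included, so the
average propensity alone cannot reveal that x2 is missing. The sketch below
assumes a single sample of size 100, a fixed seed, and the hypothetical
names fit_x1, mean_response, mean_fitted and rmse_individual.

```r
# ASSUMPTIONS: seed, sample size and object names are illustrative only.
set.seed(1)
N  <- 1000
X1 <- rnorm(N)
X2 <- rnorm(N)
propensity <- plogis(0.4 * X1 + 1.1 * X2)

labels <- sample(N, 100, replace = FALSE)
d <- data.frame(propsample = propensity[labels], x1 = X1[labels])

# Logit model adjusting for x1 only (x2 omitted)
fit_x1 <- glm(propsample ~ x1, family = quasibinomial(link = "logit"),
              data = d)

# Mean level: fitted and observed means agree up to convergence tolerance
mean_response <- mean(d$propsample)
mean_fitted   <- mean(fitted(fit_x1))
print(c(response = mean_response, fitted = mean_fitted))

# Unit level: the individual fitted propensities can still be far off
rmse_individual <- sqrt(mean((fitted(fit_x1) - d$propsample)^2))
print(rmse_individual)
```

Comparing unit-level errors, rather than only the mean, is one way to see
the effect of omitting x2.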

Thanks a lot
Ana
