[R] what's wrong with my simulation programs on logistic regression

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Aug 31 16:55:40 CEST 2006


On Thu, 31 Aug 2006, zhijie zhang wrote:

> Dear friends,
>  I'm running a simulation of a logistic regression model, but my program
> doesn't work. Please help me correct it and give some suggestions.
> My programs:
> data<-matrix(rnorm(400),ncol=8)  #sample size is 50
> data<-data.frame(data)
> names(data)<-c(paste("x",1:8,sep=""))  #8 independent variables,x1-x8;
> #logistic regression model is logit(y)=x1+x2+x3+x4+x5+x6+x7+x8

Rather, it is logit(p) = ..., with y ~ binomial(1, p): the linear predictor
models the success probability p, not the response y itself.

There is a different sort of 'logistic regression' with 

y = exp(eta)/(1+exp(eta)) + epsilon

but you fit that by nls, not glm.
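
To make the distinction concrete, here is a minimal sketch of fitting that
additive-error model by nls. The single covariate x, the true coefficients
(0.5 and 1.5), the noise level and the starting values are all made up for
illustration; none of them come from the original post.

```r
set.seed(2)                        # assumed seed, for reproducibility only
x   <- rnorm(200)                  # hypothetical single covariate
eta <- 0.5 + 1.5 * x               # hypothetical true linear predictor

# Continuous response: logistic mean curve plus additive Gaussian noise
y <- exp(eta) / (1 + exp(eta)) + rnorm(200, sd = 0.05)

# Nonlinear least squares; the start values are rough guesses
fit <- nls(y ~ exp(a + b * x) / (1 + exp(a + b * x)),
           start = list(a = 0, b = 1))
coef(fit)
```

Here y is a continuous measurement scattered around the logistic curve, so
least squares is the natural criterion; with a 0/1 response you would use glm
with family = binomial() instead.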

> data$y<-exp(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8)/(1+(data$x1+data$x2+data$x3+data$x4+data$x5+data$x6+data$x7+data$x8))

You need exp()/(1+exp()), and the second exp is missing.

Once you have p, you can use data$y <- rbinom(length(p), 1, p)
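
Putting the two corrections together, the simulation could be sketched as
follows. The seed, the name eta for the linear predictor, and the use of
plogis() in place of a hand-written exp(eta)/(1+exp(eta)) are assumptions
added for illustration.

```r
set.seed(1)                        # assumed seed, for reproducibility only

# 50 observations on 8 standard-normal covariates, named x1..x8
data <- data.frame(matrix(rnorm(400), ncol = 8))
names(data) <- paste("x", 1:8, sep = "")

# Linear predictor: logit(p) = x1 + ... + x8 (all coefficients 1, no intercept)
eta <- rowSums(data[, paste("x", 1:8, sep = "")])

# Inverse logit; plogis(eta) equals exp(eta) / (1 + exp(eta))
p <- plogis(eta)

# Bernoulli response drawn with success probability p
data$y <- rbinom(length(p), 1, p)

# Fit the logistic regression to the simulated data
logist <- glm(y ~ ., family = binomial(), data = data)
summary(logist)
```

With only 50 observations and eight covariates, glm may still warn about
fitted probabilities of 0 or 1 for some seeds, because the simulated classes
can be (nearly) separable; a larger sample makes that less likely.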

> logist<-glm(y~.,family=binomial(),data=simdata)
> Warning messages:
> 1: algorithm can't converge in: glm.fit(x = X, y = Y, weights = weights,
> start = start, etastart = etastart,
> 2: the probability is 0 or 1 in: glm.fit (x = X, y = Y, weights = weights,
> start = start, etastart = etastart,

You do not have a Bernoulli response: it often helps to look at your 
simulated data to see if it makes sense (just as you would look at real 
data, I hope).
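
One way to carry out such a check, sketched under assumptions: the seed and
the helper is_bernoulli() are hypothetical, and the linear predictor is
regenerated here rather than taken from the poster's session. A Bernoulli
response contains only 0s and 1s, whereas the formula as posted produces
arbitrary real numbers.

```r
set.seed(3)                                   # assumed seed
eta <- rowSums(matrix(rnorm(400), ncol = 8))  # sum of 8 standard-normal covariates

y_posted <- exp(eta) / (1 + eta)              # formula as posted (second exp missing)
y_bern   <- rbinom(length(eta), 1, plogis(eta))

# Since exp(eta) >= 1 + eta, the posted formula gives values that are
# negative (when eta < -1) or at least 1 (when eta > -1), never in between
summary(y_posted)
table(y_bern)                                 # only the values 0 and 1

# Hypothetical helper: is every value of the response 0 or 1?
is_bernoulli <- function(y) all(y %in% c(0, 1))
is_bernoulli(y_posted)
is_bernoulli(y_bern)
```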

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


