Peter Dalgaard p.dalgaard at biostat.ku.dk
Fri Jul 15 17:58:51 CEST 2005

```Robin Hankin <r.hankin at noc.soton.ac.uk> writes:

> Hi
>
> I am trying to make glm() work to analyze a toy logit system.
>
> I have a dataframe with x and y independent variables.  I have
>
> L=1+x-y          (ie coefficients 1,1,-1)
>
> then if I have a logit relation with L=log(p/(1-p)),
> p=1/(1+exp(L)).
>
> If I interpret "p" as the probability of  success in a Bernouilli
> trial, and I can observe the result (0 for "no", 1 for "yes")
> how do I retrieve the coefficients c(1,1,-1)
> from the data?
>
> n <- 300
> des <- data.frame(x=(1:n)/n,y=sample(n)/n)   # experimental design
> des <- cbind(des,L=1+des\$x-des\$y)            # L=1+x-y
> des <- cbind(des,p=1/(1+exp(des\$L)))         # p=1/(1+e^L)
> des <- cbind(des,obs=rbinom(n,1,des\$p))      # observation: prob of
> success = p.
>
>
> My attempt is:
>
>
> But it does not retrieve the correct coefficients of c(1,1,-1) ;
> I would expect a reasonably close answer with so much data.
>
> What is the correct glm() call to perform my logit analysis?

Apart from a sign error, the only fault seems to be that you think
that 300 is a large number... Try upping n to 30000 and you'll see.

(The sign error is of course that log(p/(1-p)) = L implies that
p = exp(L)/(1+exp(L)) = 1/(1+exp(-L))).

--
O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907

```