[R-sig-ME] lme4, cloglog vs. binomial link (peter dalgaard)

peter dalgaard pdalgd at gmail.com
Mon Jun 11 12:59:52 CEST 2012


On Jun 11, 2012, at 05:26 , Murray Jorgensen wrote:

> Hi Peter and List,
> 
> I confess that I have difficulty in seeing the connection with Poisson processes. When we are fishing indefinitely we can assume an endless supply of fish capture events so a Poisson process seems reasonable. In my case of an insect leaving a habitation each 'process' will just be a single departure event at some future time which is not exactly observed.
> 
> Perhaps I'm thinking about this in the wrong way?

(Perhaps a bit long-winded for the list, but now that we've started...)

To my mind, a (time-invariant) Poisson process is just a mechanism which generates events at "completely random" points in time. That is, the (infinitesimal) probability of an event in the next instant is the same at all times, regardless of what has happened before, which turns out to be equivalent to saying that the time until the next event has an exponential distribution.
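
A quick numerical sketch of that equivalence, in Python purely for illustration. The intensity `lam`, the time `t` and the step lengths `dt` are made-up values, not anything from the thread. If each small step independently carries an event with probability `lam * dt`, the chance of getting to time `t` without an event should approach the exponential survival function exp(-lam * t) as `dt` shrinks:

```python
import math

def survival_discrete(lam, t, dt):
    """P(no event by time t) when each step of length dt has event probability lam*dt."""
    steps = round(t / dt)
    return (1.0 - lam * dt) ** steps

def survival_exponential(lam, t):
    """P(no event by time t) for an exponential waiting time with rate lam."""
    return math.exp(-lam * t)

lam, t = 0.7, 2.0  # illustrative values
for dt in (0.1, 0.01, 0.001):
    # the two survival probabilities get closer as dt shrinks
    print(dt, survival_discrete(lam, t, dt), survival_exponential(lam, t))
```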

A survival process for a single item can be viewed as a Poisson process stopped at the first event. If you have n independent Poisson processes, the intensity of events is just the sum of the intensities. That is really all you need for your case of observing the first "death" out of n. (In classical survival analysis, the process would continue after the first death, but the intensity would be reduced to reflect the new population size.)
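
The "sum of the intensities" statement can be checked by simulation. This is only a sketch under assumed values (`n`, `lam`, `T`, the number of replicates and the seed are all illustrative): the first event among n independent processes with common intensity lam should occur by time T with probability 1 - exp(-n * lam * T).

```python
import math
import random

def first_event_time(n, lam, rng):
    """Time of the earliest event among n independent processes, each with intensity lam."""
    return min(rng.expovariate(lam) for _ in range(n))

def prob_event_by(n, lam, T, reps=100_000, seed=1):
    """Monte Carlo estimate of P(at least one of n processes has an event by time T)."""
    rng = random.Random(seed)
    hits = sum(first_event_time(n, lam, rng) <= T for _ in range(reps))
    return hits / reps

n, lam, T = 4, 0.3, 1.0  # illustrative values
# simulated probability next to the closed form 1 - exp(-n * lam * T)
print(prob_event_by(n, lam, T), 1 - math.exp(-n * lam * T))
```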

Notice that the number of events is not necessarily Poisson distributed if the process is stopped at a time that depends on the process itself (i.e., people usually only die once, not a Poisson-distributed number of times). However, a stopped Poisson process has the same probability for events that only depend on the behavior of the process until it is stopped. E.g., the probability of one or more events in a certain amount of time is the same in a process stopped at the first event as it is in the original process (in terms of electrical fuses, you could "unstop" the process by replacing the fuse).
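
A small simulation sketch of the fuse argument, with made-up values for `lam` and `T`. Each simulated path is used twice: once as the original process and once stopped at its first event. The event counts differ (the stopped version never exceeds one), but "one or more events by time T" happens on exactly the same paths, so its probability is unchanged.

```python
import math
import random

def event_times(lam, T, rng):
    """Event times on [0, T] of a Poisson process with constant intensity lam."""
    times = []
    t = rng.expovariate(lam)
    while t <= T:
        times.append(t)
        t += rng.expovariate(lam)
    return times

def compare(lam, T, reps=50_000, seed=2):
    """Run the original and the stopped process on the same simulated paths."""
    rng = random.Random(seed)
    any_full = any_stopped = 0
    max_full = max_stopped = 0
    for _ in range(reps):
        full = event_times(lam, T, rng)
        stopped = full[:1]  # same path, but nothing is recorded after the first event
        any_full += len(full) >= 1
        any_stopped += len(stopped) >= 1
        max_full = max(max_full, len(full))
        max_stopped = max(max_stopped, len(stopped))
    return any_full / reps, any_stopped / reps, max_full, max_stopped

lam, T = 1.5, 1.0  # illustrative values
p_full, p_stopped, max_full, max_stopped = compare(lam, T)
print(p_full, p_stopped, 1 - math.exp(-lam * T))  # P(one or more events by T)
print(max_full, max_stopped)  # largest event count seen in each version
```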

-pd


> 
> Murray
> 
> On 8/06/2012 10:52 p.m., peter dalgaard wrote:
>> Hi Murray,
>> 
>> I think this is pretty strongly related to proportional hazards modelling. If you are looking at it from a Poisson (*) process point of view, the expected number of events when observing a number of independent processes should be proportional to time and the number of processes, and the probability of at least one event in a fixed length of time T is then 1 - exp(-n T lambda) (or 1 - exp(-n Lambda(T)) if you have a time-varying intensity, Lambda being the integrated intensity).
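
Taking the complementary log-log of that probability makes the connection to the cloglog link explicit. This is a worked step for the constant-intensity case, writing $P$ for the probability of at least one event (notation introduced here, not in the original message):

```latex
\begin{align*}
P &= 1 - \exp(-n T \lambda) \\
-\log(1 - P) &= n T \lambda \\
\log(-\log(1 - P)) &= \log n + \log T + \log \lambda
\end{align*}
```

So on the cloglog scale, $\log n$ and $\log T$ enter additively with coefficient one, and $\log \lambda$ is the part a linear predictor would describe.
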
>> 
>> -pd
>> 
>> (*) Could be fun if this was actually about fish...
>> 
>> On Jun 6, 2012, at 23:21 , Murray Jorgensen wrote:
>> 
>>> *Hi Peter, Tibor et al.
>>> 
>>> I came across an ecological situation recently where a cloglog link seemed
>>> to be called for. I won't remove the context from the following explanatory
>>> note that I wrote, but I'm sure the same kind of situation could be
>>> reasonably common:
>>> 
>>> 
>>> We wish to explore the probability of one or more females departing a
>>> cavity between two site visits as a function of the habitation state of the
>>> cavity at the first visit. More strictly we study the probability of a
>>> decrease in the number of females inhabiting the cavity between the two
>>> visits. Clearly this probability will be zero if no females inhabit the
>>> cavity at the first visit. More generally the probability will be larger as
>>> the number of female inhabitants increases, as each has the opportunity to
>>> depart.
>>> 
>>> Although this dependence on the initial number of females is part of what
>>> we want to study we are more interested in questions such as the influence
>>> of the initial number of males on the probability of female decrease. We
>>> are indeed also interested in the effect of the initial numbers of females
>>> on the probability of female decrease, but more in the sense that we would
>>> like to know whether this is greater than, less than or equal to what would
>>> be predicted by a simple model.
>>> 
>>> One naive model that could be considered is that the decrease probability
>>> would be proportional to the number of females. This might work if the
>>> decrease probability was very low but for larger decrease probabilities
>>> would predict decrease probabilities greater than one. A less naive model
>>> would assume that each female departs with the same probability,
>>> independently of the other females.
>>> Then if the probability of a single female departing is $p$ and there are
>>> $x$ females in the cavity the probability of 1 or more departing is $p_x =
>>> 1 - (1-p)^x$.
>>> 
>>> The link function for the complementary log-log link is $\eta =
>>> \log(-\log(1-p))$. To examine the effect of multiple initial females we
>>> evaluate this at $p_x$.
>>> 
>>>                 1-p_x  =  (1-p)^x
>>>           \log(1-p_x)  =  x\log(1-p)
>>>    \log(-\log(1-p_x))  =  \log(x)+\log(-\log(1-p))
>>> 
>>> Thus the effect of an initial habitation of $x$ females is a shift of
>>> $\log(x)$ on the linear predictor scale if a complementary log-log link is
>>> used in a GLM or GLMM for the probability of female decrease. This means
>>> that the independent-departure model can be accommodated by including
>>> $\log(x)$ as an offset. If $x$ were also included as a covariate, a
>>> significant coefficient would indicate a departure from that model.
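
The shift of $\log(x)$ can be confirmed numerically. A minimal sketch, assuming an illustrative single-female departure probability `p = 0.2` (not a value from the note); the helper names are made up for the example:

```python
import math

def cloglog(p):
    """Complementary log-log link: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))

def prob_any_departure(p, x):
    """P(at least one of x females departs), each departing independently with probability p."""
    return 1.0 - (1.0 - p) ** x

p = 0.2  # illustrative single-female departure probability
for x in (1, 2, 5, 10):
    shift = cloglog(prob_any_departure(p, x)) - cloglog(p)
    # shift on the linear predictor scale next to log(x); the two should agree
    print(x, shift, math.log(x))
```

In R, a formula along the lines of `decrease ~ males + offset(log(females))` with `family = binomial(link = "cloglog")` would express this. The column names are hypothetical, and cavities with no females would have to be left out, since log(0) is not finite.
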
>>> 
>>> Regards, Murray
>>> 
>>> *
>>> 
>>>> Message: 5
>>>> Date: Wed, 6 Jun 2012 22:54:16 +0200
>>>> From: peter dalgaard<pdalgd at gmail.com>
>>>> To: Tibor Kiss<tibor at linguistics.rub.de>
>>>> Cc: r-sig-mixed-models at r-project.org
>>>> Subject: Re: [R-sig-ME] lme4, cloglog vs. binomial link
>>>> Message-ID:<7658F572-AEE2-4127-AB77-321B5B6C3D69 at gmail.com>
>>>> Content-Type: text/plain; charset=us-ascii
>>>> 
>>>> 
>>>> On Jun 4, 2012, at 13:07 , Tibor Kiss wrote:
>>>> 
>>>>> [...snippage...]
>>>>> My questions are as follows:
>>>>> 
>>>>> 1. Is it correct to assume that given a cloglog link, the less frequent
>>>>> response should be considered the success?
>>>> 
>>>> No, cloglog is asymmetric, so it will make a difference which outcome is
>>>> considered success, but there is no mathematical reason to choose between
>>>> them. In survival data, the cloglog comes out of the proportional hazards
>>>> model when you have death within a fixed time period as the response (exact
>>>> date of death not recorded). In that case, death is "success" (!);
>>>> hopefully, it is the least likely outcome, but it might not be. If cloglog
>>>> is just used as a generic link function, then no such logic applies.
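
The asymmetry is easy to see numerically. A short sketch with illustrative probabilities: for the logit link, swapping success and failure just flips the sign of the linear predictor, so logit(p) + logit(1 - p) is zero. For cloglog the corresponding sum is not zero, so recoding the response gives a genuinely different model.

```python
import math

def cloglog(p):
    """Complementary log-log link: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))

def logit(p):
    """Logit link: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

for p in (0.1, 0.3, 0.5):  # illustrative probabilities
    # logit sum is zero up to rounding; cloglog sum is clearly not
    print(p, logit(p) + logit(1 - p), cloglog(p) + cloglog(1 - p))
```
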
>>>> 
>>>>> 2. Is it correct to conclude that the changes in the model have led to
>>>>> less influence of the random factor?
>>>> 
>>>> No. The scales are different. At the very least, you need to somehow
>>>> compare it to the fixed effects on the same scale.
>>>> 
>>>>> 3. How shall I react to the increase in AIC?
>>>> 
>>>> (Or, equivalently, the deviance). The cloglog link model seems to give the
>>>> worse fit to data.
>>>> 
>>>>> A final question, which may not have an answer at all: I am most curious
>>>>> to learn about possible modifications of the model so that an observed
>>>>> random effect can be minimized (while its presence cannot be denied).
>>>> 
>>>> First, is that desirable, and why? The only logic that I can think of is
>>>> that you want to get the fixed-effect part of the model right, so that the
>>>> error is not mistakenly taken as part of the random variation.
>>>> 
>>>> --
>>>> Peter Dalgaard, Professor,
>>>> Center for Statistics, Copenhagen Business School
>>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>>> Phone: (+45)38153501
>>>> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
>>> Department of Statistics, University of Waikato, Hamilton, New Zealand
>>> Email: maj at waikato.ac.nz    majorgensen at ihug.co.nz      Fax 7 838 4155
>>> Phone  +64 7 838 4773 wk    Home +64 7 825 0441   Mobile 021 0200 8350
>> 
> 
> -- 
> Dr Murray Jorgensen      http://www.stats.waikato.ac.nz/Staff/maj.html
> Department of Statistics, University of Waikato, Hamilton, New Zealand
> Email: maj at waikato.ac.nz      majmurr at gmail.com         Fax 7 838 4155
> Phone  +64 7 838 4773 wk    Home +64 7 825 0441   Mobile 021 0200 8350

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


