[R] glmer with non integer weights

Emmanuel Charpentier charpent at bacbuc.dyndns.org
Mon Apr 19 20:17:38 CEST 2010

On Monday 19 April 2010 at 03:00 -0800, Kay Cichini wrote:
> hi emmanuel,
> thanks a lot for your extensive answer.
> do you think using the asin(sqrt()) transf. can be justified for publishing
> purpose or do i have to expect criticism.

Hmmm ... depends on your reviewers. But if a half-asleep dental surgeon
caught that after a bout of insomnia, you might expect that a fully
caffeinated reviewer will. Add Murphy's law to the mix and ... boom!

> naively i excluded that possibility, because of violated anova-assumptions,
> but if i did get you right the finite range rather poses a problem here.

No. Your problem is that you model a probability as a smooth (linear)
finite function of finite variables. Under those assumptions, you can't
get a *certitude* (probability 0 or 1). Your model is *intrinsically*
inconsistent with your data.

In other words, I'm unable to believe both your model (linear
whatchamacallit regression) and your data (which include certainties).
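
To make the inconsistency concrete, here is a minimal R sketch. It assumes
a logistic link, and the coefficients beta0 and beta1 and the covariate x
are made up for illustration; they are not taken from the data under
discussion. Whatever finite values the linear predictor takes, the modelled
probability stays strictly between 0 and 1, so an observed 0 or 1 can never
be reproduced exactly.

```r
# Hypothetical coefficients and covariate values, for illustration only
beta0 <- -1
beta1 <- 0.8
x <- seq(-5, 5, by = 1)

eta <- beta0 + beta1 * x   # finite linear predictor
p   <- plogis(eta)         # inverse logit: maps the reals into (0, 1)

range(p)                   # both ends strictly inside (0, 1)
all(p > 0 & p < 1)         # no fitted value is a certainty
```

For extreme predictors floating point arithmetic may round the result to 0
or 1, but mathematically the bound is never reached.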

I'd reconsider your 0s and 1s as meaning *censored* quantities (i.e. no
farther than some epsilon from 0 or 1), with *hard* data (i.e. not a
cooked-up estimate such as the ones I used) to estimate epsilon. There
are *lots* of ways to fit models with censored dependent variables.

> why is it in this special case an advantage? 

It's bloody hell *not* a specific advantage: if you want to fit a
linear model to a probability, you *need* some function mapping R to
the open ]0 1[ (i.e. all reals strictly greater than 0 and strictly
less than 1; I think that's denoted (0 1) in English/American usage).
Asin(sqrt()) does that.

However, (asin(sqrt()))^-1 has a big problem (mapping back [0 1] i.e.
*including* 0 and 1, *not* (0 1), to R) which *hides* the (IMHO bigger)
problem of the inadequacy of your model to your data! In other words,
it lets you shoot yourself in the foot after a nice sciatic nerve
articaïne block making the operation painless (but still harmful). On
the other hand, logit (or, as pointed out by Martin Maechler, qlogis) is
kind enough to choke on this (i.e. it returns infinite values, which
will make the regression program choke).
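
The contrast can be checked directly in R. This is only a sketch on three
made-up proportions (0, 0.5 and 1), not on the actual data: asin(sqrt())
quietly returns finite values at the boundaries (between 0 and pi/2), while
qlogis() returns infinite ones that a regression routine cannot silently
swallow.

```r
# Made-up proportions, including the two "certainties"
p <- c(0, 0.5, 1)

a <- asin(sqrt(p))   # arcsine square root transform: 0, pi/4, pi/2
l <- qlogis(p)       # logit: -Inf, 0, Inf

is.finite(a)         # finite everywhere, so the problem goes unnoticed
is.finite(l)         # not finite at p = 0 and p = 1
```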

So please quench my thirst: what exactly is MH.Index supposed to be?
How is it measured, estimated, guessed or divined?


Emmanuel Charpentier
