[R] Is profile.mle flexible enough?

Sat Jul 31 13:13:40 CEST 2010

Hi the list,

I am experiencing several issues with profile.mle (and consequently with 
confint.mle) (stat4 version 2.9.2), and I have to spend a lot of time to 
find workarounds to what looks like interface bugs. I would be glad to 
get feedback from experienced users to know if I am really asking too 
much or if there is room for improvement.

* Problem #1 with fixed parameters. I can't manage to get profile-based 
confidence intervals when some parameters of the likelihood function are 
fixed:

-----------8<--------------------------------
library(stats4)

minusLogL1 <- function(mu, logsigma2) { N*log(2*pi*exp(logsigma2))/2 + 
N*(var(x)+(mean(x)-mu)^2)/(2*exp(logsigma2)) }
minusLogL2 <- function(mu) { logsigma2 <- 0; 
N*log(2*pi*exp(logsigma2))/2 + 
N*(var(x)+(mean(x)-mu)^2)/(2*exp(logsigma2)) }

N <- 100
x <- rnorm(N, 0, 1)

fit <- mle(minusLogL1, start=list(mu=0, logsigma2=0))
confint(fit)

fit2 <- mle(minusLogL1, start=list(mu=0), fixed=list(logsigma2=0))
confint(fit2)

fit3 <- mle(minusLogL2, start=list(mu=0))
confint(fit3)
----------->8--------------------------------

Is it unreasonable to expect an identical result with fit2 and fit3? 
When looking into the code of the "profile" method for mle, one can find 
something like call$fixed <- fix ; i.e. the fixed effects in the call 
are completely replaced by the combination of fixed effects needed by 
the profile function. Perhaps I am missing something, but I don't 
understand why this is necessary.

* Problem #2 deals with the scope of the variables used in the "call" 
function. I understand that there are technical constraints, and similar 
remarks were previously answered on the Microsoft mode ("It's not a bug, 
it's a feature and you have to cope with it"), but the direct 
consequence is that the user has to understand the internals of the 
functions provided by the stats4 package, which does not look like a 
good idea to me. One of the simplest and most striking example is 
something like:

-----------8<--------------------------------

fit3 <- mle(minusLogL2, start=list(mu=0))
confint(fit3)
x <- rnorm(N, 0, 1)
confint(fit3)

----------->8--------------------------------

Although I understand the technical reason why the result is different, 
I think such side effects are catastrophic in terms of UI. The user 
should never have to know whether the information he/she wants is 
already stored in the object (vcov, coef) or if the function will 
recompute something.

Incidentally, such side effects make it very complicated to write 
wrappers for the mle (which is my actual problem). A straightforward way 
to solve it would be to store more information, including the data, in 
the mle object (as many others, e.g. lm, do), but if stats4 has been 
included among the core packages, it is probably because it was 
considered stable and flexible enough. Are there any tricks or 
alternatives to wrap mle calls properly?

Thanks in advance for your help.

Arnaud.