[Rd] suggested modification to the 'mle' documentation?

Luke Tierney luke at stat.uiowa.edu
Sun Dec 9 00:38:48 CET 2007


On Sat, 8 Dec 2007, Peter Dalgaard wrote:

> Luke Tierney wrote:
>
> [misc snippage]
>>> 
>>> But I'd prefer to avoid the necessity for users to manipulate the
>>> environment of a function.  I think the pattern
>>> 
>>> model( f, data=d )
>> 
>> For working with general likelihoods I think it is better to
>> encourage the approach of defining likelihood constructor functions.
>> The problem with using f, data is that you need to match the names
>> used in f and in data, so either you have to explicitly write out f
>> with the names you have in data or you have to modify data to use the
>> names f likes -- in the running example think
>>
>>     f <- function(lambda) -sum(dpois(x, lambda, log = TRUE))
>>     d <- data.frame(y=rpois(10000, 12.34))
>>
>> somebody has to connect up the x in f with the y in d.
> [more snippage]
>
> That's not really worse than having to match the names in a model formula to 
> the names of the data frame in lm(), is it?

Yes and no.

If the likelihood is simple enough to include in line, as in

     d <- data.frame(y=rpois(100,12.34))
     mle(function(lambda) -sum(dpois(d$y, lambda, log = TRUE)),
         start = list(lambda=10))

or nearly in line, e.g. in a with() or local() construct, like

     with(d, {
         f <- function(lambda) -sum(dpois(y, lambda, log = TRUE))
         mle(f, start = list(lambda=10))
     })

or

     local({
         y <- d$y
         f <- function(lambda) -sum(dpois(y, lambda, log = TRUE))
         mle(f, start = list(lambda=10))
     })

then I think it is essentially the same.  But if the function is
complex enough that you will want to define and debug it separately,
then you will probably want to be able to reuse your code directly
rather than by copy-paste-edit.  At that point things are different.
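
To make that concrete, here is a minimal sketch of the constructor
approach (the name makePoisNegLogLik is just illustrative, and mle()
here is the one from the stats4 package): the likelihood is written
once against its own argument name, and the constructor binds it to
whatever data you hand it, so the x-versus-y matching happens in
exactly one place.

     library(stats4)                       # for mle()

     ## constructor: returns a negative log-likelihood closed over its data
     makePoisNegLogLik <- function(x) {
         function(lambda) -sum(dpois(x, lambda, log = TRUE))
     }

     d <- data.frame(y = rpois(100, 12.34))
     nll <- makePoisNegLogLik(d$y)         # connect f's x to d's y, once
     mle(nll, start = list(lambda = 10))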

In a sense this difference exists with model formulas as well.  We
usually write formulas in line, rather than something like

     f <- y ~ x
     lm(f)

With simple formulas that is reasonable. But it would be nice to be
able to abstract out common patterns of more complex formulas for
simple reuse. A simple-minded example might be to define a
splitPlot formula operator so one can write

     yield ~ splitPlot(whole = fertilizer, sub = variety)
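
As a purely illustrative sketch (splitPlotFormula, its block argument,
and the expansion target are my own inventions, and it builds the
formula with an ordinary function rather than an operator used inside
one), such an abstraction could simply expand the shorthand into the
classical aov() split-plot formula:

     ## illustrative only: expand a split-plot shorthand into the usual
     ## aov() formula   response ~ whole * sub + Error(block/whole)
     splitPlotFormula <- function(response, whole, sub, block) {
         eval(substitute(resp ~ w * s + Error(b / w),
                         list(resp = as.name(response), w = as.name(whole),
                              s = as.name(sub), b = as.name(block))))
     }

     splitPlotFormula("yield", "fertilizer", "variety", "block")
     ## yield ~ fertilizer * variety + Error(block/fertilizer)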

This sort of thing would become more useful in more complicated
multi-level models.  I could be wrong but I don't think BUGS has the
ability to abstract out submodel patterns in this way.  Don't know if
any of the other multi-level modeling systems provide this.  Might be
worth looking into; it's not unrelated to the issues you raise below.

luke

>
> The thing that I'm looking for in these matters is a structure which allows 
> us to operate on likelihood functions in a rational way, e.g. reparametrize 
> them, join multiple likelihoods with some parameters in common, or integrate 
> them. The join operation is illustrative: You can easily do 
> negljoint <- function(alpha, beta, gamma, delta)
>   negl1(alpha, beta, gamma) + negl2(beta, gamma, delta)
>
> and with a bit of diligence, this could be the result of Join(negl1, negl2). 
> But if the convention is that likelihoods have their data as an
> argument, you also need to automatically define a data argument for
> negljoint (presumably a list of two) and arrange that the calls to negl1
> and negl2 contain the appropriate subdata. It is the sort of thing that
> might be doable, but you'd rather do without.
>
> -pd
>
>
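
As a rough sketch of the parameter-matching half of this (Join and the
by-name convention below are hypothetical, not an existing API, and
the data-argument problem Peter describes is deliberately left out):

     ## hypothetical sketch: join two negative log-likelihoods whose
     ## parameters are named arguments, matching shared parameters by name
     Join <- function(negl1, negl2) {
         function(...) {
             p <- list(...)
             do.call(negl1, p[names(formals(negl1))]) +
                 do.call(negl2, p[names(formals(negl2))])
         }
     }

     ## negljoint <- Join(negl1, negl2)
     ## negljoint(alpha = 1, beta = 2, gamma = 3, delta = 4)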

-- 
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


