[R] implementing Maximum Likelihood with distrMod when only the PDF is known

Matthias Kohl Matthias.Kohl at stamats.de
Wed Jun 24 12:28:24 CEST 2009


Dear Guillaume,

retrospectively, I'm not sure if the decision to have spezial initialize 
methods is the optimal way to go. In distrMod and our other packages on 
robust statistics we don't introduce special initialize methods, but use 
so-called generating functions. This approach has the advantage that the 
default initialize method can be used for programming where I find it 
very useful.

I would say it is the canonical (and recommended) way to first define a 
function as generic (like mu) and then implement methods for it.

Choosing names for which there is already an existing definition for a 
generic function may also not be what you want in general. By defining 
new methods you have to respect the definition of the generic function; 
that is, your method definition has to be with respect to the arguments 
in the given generic function and also dispatching will be on these 
arguments (cf. scale in the distrMod example).

In defining our classes we decided that at least the slot "r" has to be 
filled (of class "function") whereas d, p and q are of class 
"OptionalFunction". Hence, there are functions to fill d, p and q for 
given r but, so far not for filling r, p and q given d.

A way to avoid implementing r is given in
http://www.stamats.de/distrModExample1.R

I also do not fill the slots p and q in this example. This avoids the 
simulation of a large random sample and speeds up the computation of the 
MLE.

However, this is rather a quick and dirty solution and it would of 
course be better to have a valid definition for r, d, p and q.

Best,
Matthias


guillaume.martin schrieb:
> Dear Mathias,
> That's pretty amazing, thanks a lot ! I'll have to look all this 
> through because I don't easily understand why each part has to be set 
> up, in particular the "initialize method" part seems crucial and is 
> not easy to intuit. From what I get, the actual name we give to a 
> parameter (my original "mu" for example) is important in itself, and 
> if we introduce new variable names we have to define a new generic, 
> right? The simplest option then is to re-use an existing variable name 
> that has the same properties/range, right?
>
> Another general question: my actual pdf is of the same type but not 
> the exact same as the skew normal. In particular, I don't have a rule 
> for building the slot r (eg the one borrowed from the sn package in 
> your example); is it a problem? isn't it sufficient to give slot d, 
> and then you have automatic methods implemented to get from d() to r() 
> slots etc. is that right?
>
> Thanks a lot for your help and time !
>
> Best,
>
> Guillaume
>
>
> Matthias Kohl a écrit :
>> Dear Guillaume,
>>
>> thanks for your interest in the distrMod package.
>>
>> Regarding your question I took up your example and put a file under:
>>
>> http://www.stamats.de/distrModExample.R
>>
>> Hope that helps ...
>>
>> Don't hesitate to contact me if you have further questions!
>>
>> Best,
>> Matthias
>>
>> guillaume.martin schrieb:
>>> Dear R users and Dear authors of the distr package and sequels
>>>
>>> I am trying to use the (very nice) package distrMod as I want to 
>>> implement maximum likelihood (ML) fit of some univariate data for 
>>> which I have derived a theoretical continuous density (pdf). As it 
>>> is a parametric density, I guess that I should implement myself a 
>>> new distribution of class AbscontDistributions (as stated in the pdf 
>>> on "creating new distributions in distr"), and then use 
>>> MLEstimator() from the distrMod package. Is that correct or is there 
>>> a simpler way to go? Note that I want to use the distr package 
>>> because it allows me to implement simply the convolution of my 
>>> theoretical pdf with some noise distribution that I want to model in 
>>> the data, this is more difficult with fitdistr or mle.
>>> It proved rather difficult for me to implement the new class 
>>> following all the steps provided in the example, so I am asking if 
>>> someone has an example of code he wrote to implement a parametric 
>>> distribution from its pdf alone and then used it with MLEstimator().
>>>
>>> I am sorry for the post is a bit long but it is a complicate 
>>> question to me and I am not at all skillful in the handling of such 
>>> notions as "S4 - class", etc.. so I am a bit lost here..
>>>
>>> As a simple example, suppose my theoretical pdf is the skew normal 
>>> distribution (available in package sn):
>>>
>>> #skew normal pdf (default values = the standard normal N(0,1)
>>>
>>> fsn<-function(x,mu=0,sd=1,d=0) {u = (x-mu)/sd;  f = 
>>> dnorm(u)*pnorm(d*u); return(f/sd)}
>>>
>>> # d = shape parameter (any real), mu = location (any real), sd = 
>>> scale (positive real)
>>>
>>> #to see what it looks like try
>>> x<-seq(-1,4,length=200);plot(fsn(x,d=3),type="l")
>>>
>>> #Now I tried to create the classes "SkewNorm" and 
>>> "SkewNormParameter" copying the example for the binomial
>>> ##Class:parameters
>>> setClass("SkewNormParameter",
>>> representation=representation(mu="numeric",sd="numeric",d="numeric"),
>>> prototype=prototype(mu=0,sd=1,d=0,name=gettext("Parameter of the 
>>> Skew Normal distribution")),
>>> contains="Parameter"
>>> )
>>>
>>> ##Class: distribution (created using the pdf of the skew normal 
>>> defined above)
>>> setClass("SkewNorm",prototype = prototype(
>>>     d = function(x, log = FALSE){fsn(x, mu=0, sd=1,d=0)},
>>>     param = new("SkewNormParameter"),
>>>     .logExact = TRUE,.lowerExact = TRUE),
>>> contains = "AbscontDistribution"
>>> )
>>>
>>> #so far so good but then with
>>> setMethod("mu", "SkewNormParameter", function(object) object at mu)
>>>
>>> #I get the following error message:
>>>
>>> > Error in setMethod("mu", "SkewNormParameter", function(object) 
>>> object at mu) : no existing definition for function "mu"
>>>
>>> I don't understand because to me mu is a parameter not a function... 
>>> maybe that is too complex programming for me and I should switch to 
>>> implementing my likelihood by hand with numerical convolutions and 
>>> optim() etc., but I would like to know how to use distr, so if there 
>>> is anyone who had the same problem and solved it, I would be very 
>>> grateful for the hint !
>>>
>>> All the best,
>>> Guillaume
>>>
>>>
>>>
>>
>

-- 
Dr. Matthias Kohl
www.stamats.de




More information about the R-help mailing list