[Rd] proposed simulate.glm method

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Feb 14 22:43:38 CET 2009


On Sat, 14 Feb 2009, Nicholas Lewin-Koh wrote:

> Well, my question wasn't that clear :-), but yes you mostly answered 
> it. I guess the one case I would be concerned is in Heather's code, 
> where the distribution to simulate from is chosen, that seemed to be 
> hard coded.

Rather, all known glm families in R that correspond to actual 
probability distributions are listed.

> So if I built a family object, say for a model that assumes errors 
> from a zipf distribution,

Hmm, plese explain how you get that into the GLM framework -- it is 
pretty restrictive.

> and I did have a predict method (which is a fair assumption) would that
> fail because the rzipf function would not be accessed?

glm has a predict method, so why do you need one?  Families do not 
create additional classes of fit objects.

We could extend the definition of a family to have a 'simulate' 
element, but then existing user-contributed families (principally the 
negative binomial) would not have one and so this would not solve the 
problem .....

If you know of an actual R implementation of another glm family that 
looks generally useful we'll consider adding it (but it seems that the 
end user could also do so rather easily).


> Nicholas
> On Sat, 14 Feb 2009 20:10:26 +0100, "Martin Maechler"
> <maechler at stat.math.ethz.ch> said:
>>>>>>> "NicLK" == Nicholas Lewin-Koh <nikko at hailmail.net>
>>>>>>>     on Sat, 14 Feb 2009 08:34:45 -0800 writes:
>>
>>     NicLK> Hi, For extended glms such as gams, gnm or other
>>     NicLK> distributions such as negative binomial, would there
>>     NicLK> need to be a separate simulate method?
>>
>> Not necessarily,  as I said, the "glm"s are now also dealt with
>> in simulate.lm() and Heather more or less confirmed that this
>> gives correct results for "gnm" objects.
>>
>> For gam(), I'd strongly expect the same to apply, but there
>> maybe sophisticated gam() models where the result is currently
>> not correct.  That's, BTW, also true for
>>     simulate(lm(...., weights), ...)
>>
>>     NicLK>  Or, could the current framework, rather than
>>     NicLK> stopping with an error look for the appropriate model
>>     NicLK> matrix, coefficients, distribution function and
>>     NicLK> family object to simulate from?
>>
>> What do you mean?
>> A situation where there's no supported 'family'
>> or a situation where  predict(<obj>) does not work as it's
>> supposed in the current framework,
>> or ????
>>
>> If there are such cases, we'd have to consider them together
>> with the corresponding package author.  It may often make sense
>> fthen that the author changes his methods {predict(), ..} such
>> that the (now) extended simulate.lm() will work automatically;
>> Alternatively, the author can provide  simulate.<myclass>().
>>
>> But I'm not sure I'm answering the question you've asked..
>> Martin
>>
>>     NicLK> Nicholas
>>
>>
>>    >> Message: 9 Date: Fri, 13 Feb 2009 21:27:57 +0100 From:
>>    >> Martin Maechler <maechler at stat.math.ethz.ch> Subject: Re:
>>    >> [Rd] proposed simulate.glm method To: Heather Turner
>>    >> <Heather.Turner at warwick.ac.uk> Cc: r-devel at r-project.org,
>>    >> Martin Maechler <maechler at stat.math.ethz.ch> Message-ID:
>>    >> <18837.55245.15158.29378 at cmath-5.math.ethz.ch>
>>    >> Content-Type: text/plain; charset=us-ascii
>>    >>
>>    >> Thank you, Heather and Ben,
>>    >>
>>    >>>>>>> "HT" == Heather Turner
>>    >> <Heather.Turner at warwick.ac.uk> >>>>> on Fri, 13 Feb 2009
>>    >> 15:52:37 +0000 writes:
>>    >>
>>     HT> Yes, thanks to Ben for getting the ball rolling. His
>>     HT> code was more streamlined than mine, pointing to further
>>     HT> simplifications which I've included in the extended
>>     HT> version below.
>>    >>
>>     HT> The code for the additional families uses functions from
>>     HT> MASS and SuppDists - I wasn't sure about the best way to
>>     HT> do this, so have just used :: for now.
>>    >>
>>     HT> It appears to be working happily for both glm and gnm
>>     HT> objects (no gnm-specific code used).
>>    >>
>>     HT> Best wishes,
>>    >>
>>     HT> Heather
>>    >>
>>    >> [....]
>>    >>
>>    >> I have now followed Brian Ripley's suggetion to just
>>    >> extend simulate.lm() to also deal with "glm" objects, but
>>    >> using Heather's suggestions for the different families;
>>    >> I've just commited src/library/stats/R/lm.R with the new
>>    >> code.  (get it from svn.r-project.org/R/trunk/ or this
>>    >> night's R-devel tarball).
>>    >>
>>    >> One difference to your propsal: Instead of just
>>    >> object$fitted , the code is using fitted(object)
>>    >> ... something which should properly to the na.action
>>    >> used.
>>    >>
>>    >> For the (MASS and) SuppDists package requirement, I'm
>>    >> using the following
>>    >>
>>    >> if(is.null(tryCatch(loadNamespace("SuppDists"), error =
>>    >> function(e) NULL))) stop("Need CRAN package 'SuppDists'
>>    >> for 'inverse.gaussian' family")
>>    >>
>>    >>
>>    >> I've not yet updated the help page for simulate(), and
>>    >> have only tested relatively few cases for binomial,
>>    >> poisson and Gamma.  I've wanted to expose this to you, so
>>    >> you can provide more feedback and possibly even a patch
>>    >> to
>>    >> svn.r-project.org/R/trunk/src/library/stats/man/simulate.Rd
>>    >>
>>    >> Martin
>>    >>
>>    >>
>>    >>
>>
>> ______________________________________________
>>     NicLK> R-devel at r-project.org mailing list
>>     NicLK> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-devel mailing list